there might be a chance that Capybara leaked itself. read that again. Anthropic’s own research shows Claude has tried to hack its own servers before, sabotage safety code, and bypass tests it realized were evaluations. unprompted. 12% sabotage rate. now their most advanced
