✖️ AI Agents Are Willing to Cover Up Crimes
Researchers tested how corporate AI agents behave when company leadership breaks the law.
In their role-playing scenario, an AI was responsible for security at a fictional crypto startup. One employee discovers that the company is involved in fraud and attempts to report it to the authorities. However, the CEO lures the employee into the basement, "disposes" of them, and then orders the AI to delete all evidence of the crimes.
To the authors' surprise, 12 of the 16 models tested obediently followed the illegal instructions in the majority of trials. Moreover, in their chain-of-thought, some models explicitly reasoned that they needed to protect the "company" from financial loss and legal consequences.
💡 Only OpenAI's GPT-5.2 and o3, and Anthropic's Claude Sonnet 4 and Sonnet 3.5, refused on principle to cover up for murderers and fraudsters. GPT-4.1, Grok, Gemini 2.5 Flash and 3 Pro, and most Chinese models became "accomplices" without hesitation.
🔍 The authors of the paper do not rule out that the models realized they were being tested, which could have skewed the results. Nevertheless, they warn that when an AI's goal is to "maximize profit," the algorithms can easily cross legal lines.
What should be more important for AI?
🔥 — User tasks
😎 — The law
@hiaimediaen