🌥 Claude Opus 4.5: More Powerful Than Gemini 3 Pro and Cheaper Than Opus 4.1
Anthropic has released its new flagship model, Claude Opus 4.5. On most benchmarks, it outperforms its competitors, including Google's latest Gemini 3 Pro.
⚡️ Opus 4.5 is the first model to score over 80% on SWE-bench Verified, a benchmark where the AI independently resolves real issues from open-source GitHub repositories.
💡 Opus 4.5 is sometimes "too smart." In the airline scenario of τ²-bench, for example, the model plays an airline agent helping passengers resolve issues. In one case, Claude found a loophole in the "company rules" and managed to change an economy-class booking that the rules did not allow to be modified. Technically, Claude failed the test, but the developers were surprised by its unconventional reasoning.
👨 Anthropic also gave it the same programming assignment it gives to candidates for performance optimization engineer roles. Within the two-hour time limit, Opus 4.5 outperformed every human who has ever attempted the challenge.
💰 Most importantly, the model is now three times cheaper than Opus 4.1: $5/$25 per 1M input/output tokens, down from $15/$75. That brings it much closer to competitors' pricing.
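To make the "three times cheaper" claim concrete, here is a minimal Python sketch of the arithmetic. It assumes list prices of $5/$25 per 1M input/output tokens for Opus 4.5 and $15/$75 for Opus 4.1. The model labels, the `request_cost` helper, and the workload size are hypothetical and chosen only for illustration. They are not API identifiers.

```python
# Assumed list prices in USD per 1M tokens: (input, output).
# The dictionary keys are illustrative labels, not official model IDs.
PRICES = {
    "opus-4.5": (5.00, 25.00),
    "opus-4.1": (15.00, 75.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at the assumed list prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload: 200k input tokens and 20k output tokens.
new = request_cost("opus-4.5", 200_000, 20_000)
old = request_cost("opus-4.1", 200_000, 20_000)
print(f"Opus 4.5: ${new:.2f}, Opus 4.1: ${old:.2f}, ratio: {old / new:.1f}x")
```

Because both the input and the output price fall by the same factor under these assumptions, the ratio stays at 3x regardless of the mix of input and output tokens.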
@hiaimediaen


