
🌥 Claude Opus 4.5: More Powerful Than Gemini 3 Pro and Cheaper Than Opus 4.1

Anthropic has released its new flagship model, Claude Opus 4.5. In most benchmarks, it outperforms all competitors, including Google's latest Gemini 3 Pro.

⚡️ Opus 4.5 is the first model to score over 80% on SWE-bench Verified, a benchmark in which the AI independently solves real software engineering problems.

💡 Opus 4.5 is sometimes "too smart." In the airline scenario of τ²-bench, for example, the model plays an airline service agent helping passengers resolve issues. In one case, Claude found a loophole in the "company rules" and managed to change a basic economy booking that the policy said could not be modified. Technically, Claude failed the test, but the developers were surprised by its unconventional reasoning.

👨 Anthropic also gave it the same take-home programming assignment it uses for performance engineering candidates. Within the two-hour time limit, Opus 4.5 outperformed every human who has ever attempted the challenge.

💰 Most importantly, the model is now three times cheaper than Opus 4.1: $5/$25 per 1M input/output tokens. That finally puts it in line with competitors' pricing.

@hiaimediaen
