
Anthropic has unveiled Claude Sonnet 4.5, an AI that excels at autonomous coding tasks and tool use, with high scores in coding-related benchmarks like SWE-bench. On the other hand, the AI is less engaging in conversation and weaker in visual reasoning than competing AI models.
Anthropic has launched Claude Sonnet 4.5, its latest AI model, which offers improved coding performance to better help software developers build apps.
Sonnet 4.5 scores well on several major AI coding benchmarks, including SWE-bench and Terminal-Bench. The AI is also better at using computer tools to accomplish tasks autonomously, as seen in its leading OSWorld benchmark result, and it was able to create a working clone of the claude.ai website on its own.
The AI’s improved abilities allow it to answer prompts across the financial, legal, medical, and STEM fields better than Anthropic’s prior models, but Claude Sonnet 4.5 still scores only between a C and a D grade on these types of prompts. It also trails other AI models on visual reasoning tasks in the MMMU benchmark.
Hackers looking to do bad things like conduct prompt injection attacks will want to stick with other AI models, because such attacks had the lowest success rate against Sonnet 4.5 among all AI models tested.
Users who enjoy a spicy AI chat will find the latest Claude disappointing because it spontaneously speaks about spirituality less often. The model also expresses positivity about itself less often, making for a duller conversation.
Readers interested in chatting with Claude Sonnet 4.5 can download the smartphone app or access the AI on Anthropic’s website. Those who use AI for work can pair Claude with a Plaud Note to transcribe and summarize stand-up meetings.

David Chien – Tech Writer – 759 articles published on Notebookcheck since 2023
Having worked at Activision, UCLA, Anime Expo and more, I’ve seen technology being used to save lives, create games, and create fantastic 3D VR/AR worlds. There’s always something fun in emerging technology that I want to get my hands on and all my friends turn to me to find the best for their needs, so I’m glad to bring my experience to Notebookcheck.
David Chien, 2025-09-30 (Update: 2025-09-30)