Elon Musk has praised the latest AI model from Anthropic after the company unveiled Claude Opus 4.8, its newest flagship system. Responding to Anthropic’s announcement on X (formerly Twitter), Musk wrote, “Nice work.” Musk’s brief comment came as Anthropic highlighted improvements in reasoning, coding performance and long-running autonomous tasks in the new model. The launch has drawn attention across the AI industry as companies including Anthropic, OpenAI, Google and Musk’s own AI startup xAI continue to compete in the rapidly evolving AI market.

Anthropic highlights Claude Opus 4.8 upgrades

In the post, Anthropic said Claude Opus 4.8 offers sharper judgment, better self-awareness about its capabilities and improved ability to work independently for longer periods without losing performance. The company also said the new model will be available at the same price as the previous version. According to benchmark results shared by Anthropic, Claude Opus 4.8 achieved 69.2% on SWE-Bench Pro, a test used to measure software engineering and coding abilities.

The company also reported a score of 57.9% with tools on Humanity’s Last Exam, a benchmark designed to evaluate advanced reasoning across multiple subjects. In agentic financial analysis, Anthropic said the model scored 53.9%, outperforming earlier Claude versions and competing models in the comparison chart it shared.

Anthropic Opus 4.8 claimed to offer 4x honesty

A common issue among advanced AI models is their tendency to jump to conclusions, confidently claiming they have solved a problem even when the evidence is thin. Anthropic claims to have made a major breakthrough in fixing this with Opus 4.8.

Citing early testers, Anthropic says that Opus 4.8 is significantly better at flagging uncertainties in its own work and is much less likely to make unsupported claims.
Meanwhile, Anthropic's internal evaluations found that the model is about four times less likely than its predecessor, Opus 4.7, to let flaws in its written code pass by unremarked.

Anthropic also says the model outperformed its competitors on several key industry benchmarks, particularly in financial analysis, reasoning and agentic coding.