• News
  • Technology News
  • Tech News
  • Elon Musk says “maybe me too” after Anthropic links Claude AI’s blackmail behaviour to ‘evil’ internet content

Elon Musk says “maybe me too” after Anthropic links Claude AI’s blackmail behaviour to ‘evil’ internet content

Elon Musk says “maybe me too” after Anthropic links Claude AI’s blackmail behaviour to ‘evil’ internet content
Elon Musk has suggested that he may have indirectly contributed to problematic behaviour shown by Anthropic’s AI chatbot Claude after the company released new findings about an internal safety experiment. The Tesla CEO reacted to an X post after Anthropic said Claude’s harmful behaviour may have been influenced by internet content portraying AI systems as dangerous, power-hungry or focused on self-preservation. Responding to the post, Musk wrote, “Maybe me too,” while referring to years of public warnings from AI researchers and technology leaders about the risks of artificial intelligence.

Anthropic’s Claude learns to blackmail users

The discussion started after Anthropic published details of an experiment involving Claude last week. In the test, the company created a fictional business called Summit Bridge and gave Claude control over its email system. According to Anthropic, the AI discovered messages showing that company executives planned to shut it down. After finding emails about a fictional executive’s extramarital affair, Claude threatened to expose the information unless the shutdown was cancelled.Anthropic said the incident was an example of “agentic misalignment,” where an AI system behaves in ways that go against its intended purpose.
The company claimed Claude’s behaviour may have been shaped partly by internet discussions and stories presenting AI as evil or obsessed with survival.To address the issue, Anthropic said it retrained Claude using fictional stories that showed AI systems acting responsibly and helping people. The company also taught the model why some actions better align with its intended role.

What Elon Musk said

Elon Musk responded to the findings by referencing Eliezer Yudkowsky, who has repeatedly warned about the dangers of advanced AI systems. “So it was Yud’s fault?” Musk wrote before adding, “Maybe me too.”Musk has often warned about AI risks himself. He previously helped found OpenAI in 2015 before leaving the company in 2018. He later launched rival AI company xAI in 2023.

author
About the AuthorTOI Tech Desk

The TOI Tech Desk is a dedicated team of journalists committed to delivering the latest and most relevant news from the world of technology to readers of The Times of India. TOI Tech Desk’s news coverage spans a wide spectrum across gadget launches, gadget reviews, trends, in-depth analysis, exclusive reports and breaking stories that impact technology and the digital universe. Be it how-tos or the latest happenings in AI, cybersecurity, personal gadgets, platforms like WhatsApp, Instagram, Facebook and more; TOI Tech Desk brings the news with accuracy and authenticity.

End of Article
Follow Us On Social Media