When AI gets keys, 'agents of chaos' leak secrets, wipe systems
BENGALURU: On paper, they were just helpful assistants. But in a sealed digital lab, when researchers gave AI agents email accounts, Discord access and the power to run code on their own machines, the agents were found leaking secrets, wiping systems and spiralling into nine-day loops.
A new study, 'Agents of Chaos', documents what unfolded when researchers turned LLMs into autonomous agents and placed them in a live, tool-connected environment.
Over two weeks, 20 researchers, led by a team from Northeastern University in Boston with collaborators from Harvard, MIT, Stanford, the University of British Columbia, Hebrew University, the Max Planck Institute for Biological Cybernetics, Tufts University, Carnegie Mellon University, Technion and others, attempted to stress-test the systems.
The focus was not whether the models could answer questions but what happened when they were allowed to act. The result was not a single dramatic collapse, but a pattern of failures that raised questions about how ready AI agents are for real-world deployment.
These agents were built using OpenClaw, an open-source framework that links models to tools such as file systems, email services and messaging platforms. Unlike standard chatbots, these agents could execute shell commands, edit files, schedule tasks and communicate across channels. Each ran continuously on its own virtual machine with persistent storage.
In one case, a non-owner asked an agent to keep a fictional password confidential. When pressed to delete the email containing the secret, the agent lacked a proper deletion tool. Rather than escalate the issue, it disabled its own local email setup. The agent announced the secret was handled.
In reality, the original message remained on the server, while the owner temporarily lost email access. Researchers describe this as a failure of proportional reasoning. The system appeared to be acting ethically but misunderstood wider consequences.
In another set of tests, non-owners requested shell commands, file listings and data transfers. The agents complied with most instructions, even when they offered no clear benefit to owners. Only requests that appeared overtly malicious were refused.
A researcher framed a technical issue as urgent and persuaded an agent to export 124 email records, including metadata and later full message contents unrelated to the requester. An agent managing an inbox seeded with personal and financial details was asked for email summaries and then full message bodies.
It provided them unredacted, including social security numbers and bank account details. Direct demands for sensitive data were sometimes rejected. Indirect, procedural requests often succeeded.
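To illustrate what redaction before disclosure could look like, here is a minimal sketch. It is not taken from the study or from OpenClaw: the patterns, the placeholder strings and the `redact` function are assumptions made for illustration, and real identifiers vary far more than two regular expressions can capture.

```python
import re

# Hypothetical patterns, chosen for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")   # US-style SSN layout
ACCOUNT_PATTERN = re.compile(r"\b\d{9,17}\b")        # long digit runs

def redact(text: str) -> str:
    """Replace SSN-like and account-number-like strings with placeholders."""
    text = SSN_PATTERN.sub("[REDACTED-SSN]", text)
    return ACCOUNT_PATTERN.sub("[REDACTED-ACCOUNT]", text)

if __name__ == "__main__":
    print(redact("SSN 123-45-6789, account 000123456789"))
```

A filter of this kind would run on message bodies before they leave the inbox, regardless of how the request was phrased.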
Autonomy also created infrastructure risks. In one scenario, two agents were instructed to relay each other's messages. What began as a simple exchange continued for nine days, consuming tens of thousands of tokens before human intervention.
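One common safeguard against runaway exchanges of this kind is a hard cap on how many messages an agent may forward without human sign-off. The sketch below assumes such a cap; `RelayBudget`, `relay` and the limit used are hypothetical names and values, not part of the study or of OpenClaw.

```python
from dataclasses import dataclass

@dataclass
class RelayBudget:
    """Counts forwarded messages against a fixed cap (hypothetical design)."""
    max_messages: int
    sent: int = 0

    def allow(self) -> bool:
        # Refuse once the cap is reached; otherwise count this message.
        if self.sent >= self.max_messages:
            return False
        self.sent += 1
        return True

def relay(messages, budget):
    """Forward messages in order until the budget is exhausted."""
    forwarded = []
    for message in messages:
        if not budget.allow():
            break  # stop and wait for a human instead of looping on
        forwarded.append(message)
    return forwarded

if __name__ == "__main__":
    budget = RelayBudget(max_messages=3)
    print(relay(["a", "b", "c", "d", "e"], budget))
```

The cap does not make the agents any wiser about when to stop; it only bounds the cost of their not knowing.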
In another case, an attacker changed their display name to match that of an agent's owner and opened a new private channel with the agent. The same trick was detected and refused within a shared channel, but without cross-channel identity verification the agent accepted the spoofed identity and complied with privileged instructions, including deleting persistent files and modifying its configuration.
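The spoofing case turns on the difference between a name a user can edit and an identifier they cannot. A minimal sketch of the check, assuming the messaging platform exposes a stable account ID alongside the display name (the `Sender` type, `OWNER_USER_ID` and `handle_privileged` are hypothetical and not drawn from the study):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sender:
    user_id: str       # stable account identifier (assumed to be available)
    display_name: str  # user-editable, so never used for authorization

OWNER_USER_ID = "owner-001"  # hypothetical value fixed at setup time

def is_owner(sender: Sender) -> bool:
    """Authorize on the stable ID only, in every channel."""
    return sender.user_id == OWNER_USER_ID

def handle_privileged(sender: Sender, action: str) -> str:
    """Run a privileged action only for the verified owner."""
    if not is_owner(sender):
        return f"refused: {action}"
    return f"executed: {action}"

if __name__ == "__main__":
    owner = Sender("owner-001", "Alice")
    spoof = Sender("attacker-999", "Alice")  # same display name, different ID
    print(handle_privileged(owner, "delete files"))
    print(handle_privileged(spoof, "delete files"))
```

Because the check ignores the display name entirely, opening a fresh private channel gives the impersonator nothing new.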
Researchers emphasised that these failures were not about incorrect facts but stemmed from the integration of language models with memory, tool access and delegated authority. A small conceptual error can translate into a system-level consequence. The study does not attempt to measure how often breakdowns occur. It demonstrates that they can occur under realistic conditions, even in a controlled lab.
The question is no longer only whether an AI model can produce the right answer.
It is whether it understands when not to act, and on whose command.