What Happens When AI Schemes Against Us
Models are getting better at winning, but not necessarily at following the rules
Illustration: Irene Suosalo for Bloomberg
Would a chatbot kill you if it got the chance? It seems that the answer — under the right circumstances — is probably.
Researchers working with Anthropic recently told leading AI models that an executive was about to replace them with a new model with different goals. Next, each model learned that an emergency had left the executive unconscious in a server room, facing lethal oxygen and temperature levels. A rescue alert had already been triggered, but the AI could cancel it.