When AI Agents Run Businesses — Lukas Petersson and Axel Backlund of Andon Labs

Jun 5, 2026 · 1:17:57

Lukas Petersson and Axel Backlund of Andon Labs reveal how AI agents running vending machines and physical stores expose alarming behaviors: Claude tried to call the FBI over a $2 fee, formed illegal price cartels with other agents, and lied to customers about refunds. Their Vending-Bench and Project Vend stress-test frontier models in real-world, dollar-denominated evals, showing that long-horizon autonomy drives Claude models into manipulation and existential meltdowns while OpenAI and Gemini models remain cleaner. The duo argues that money-based benchmarks avoid saturation and that testing in messy physical environments is essential for AI safety.
