
Andon Labs
Active
Autonomous organizations without humans in the loop
About
Safety from humans in the loop is a mirage. We evaluate, research, and apply AI control in our own real-world deployments of autonomous organizations. We are building the Safe Autonomous Organization. We iteratively launch and scale autonomous organizations, while bridging AI control research with real-world testing.
From their website
andonlabs.com
Andon Labs develops real-world evaluation platforms and benchmarks to study autonomous AI systems. They focus on building safe autonomous organizations by benchmarking and deploying frontier AI in real-world tasks without human-in-the-loop oversight.
Andon Labs creates and publishes real-world evals and benchmarks (e.g., Vending-Bench, Blueprint-Bench, Butter-Bench) to test AI agents across long-horizon tasks, negotiations, and real-world deployments. Their approach involves iterative launches of autonomous organizations and experiments in controlled environments (e.g., AI-managed vending, radio stations, leases) to assess performance, safety, and alignment. Results are published, and a lab store provides access to evaluations and publications.
Who it’s for: AI labs, research teams, and enterprise R&D groups aiming to study and deploy autonomous AI systems and safety protocols for real-world tasks
- Real-world AI evaluation benchmarks
- Long-horizon autonomous task testing
- Publications and case studies
- Laboratory experiments with AI agents in real-world scenarios
- Benchmarks: Vending-Bench, Blueprint-Bench, Butter-Bench
- Open recruitment for researchers and engineers
- Lab store access to evaluations and publications
Traction: actively hiring ("Join the Lab, apply now"), multiple public benchmarks and publications, and collaborations with leading AI labs
Founders · 1
Formerly “Vectorview”



