platoseed
Platform for building RL environments and evals
HUD (YC W25) is developing agentic evals and RL environments for frontier AI labs, focused on Computer Use Agents (CUAs) that browse the web. Our CUA Evals framework is the first comprehensive evaluation tool for CUAs. People don't actually know whether AI agents are working reliably, and making them work in the real world requires detailed evals across a huge range of tasks. We're backed by Y Combinator and work closely with frontier AI labs to provide agent evaluation and training infrastructure at scale.
CPO @ hud interested in natural intelligence, flowers, and the human condition
Formerly “Human Union Data” · why startups rename

Data and RL environments to automate knowledge work

RL environments for long horizon AI Agents