platoseed
The API for real-world training data.
Hub.xyz provides real-world training data to frontier AI labs
Hub.xyz provides real-world datasets for training physical AI, created, annotated, and curated through a global network of contributors and verified businesses. It offers multi-modal data (egocentric video, images, and audio) with end-to-end pipelines and bespoke data collection options. The platform emphasizes network-driven data sourcing and fast, quote-driven custom dataset delivery.
Hub.xyz sources and annotates real-world data via a worldwide contributor network. It offers off-the-shelf datasets across four modalities (egocentric, image, video, audio) with pipeline-ready, labeled data. It also provides bespoke, end-to-end data collection projects priced by modality, volume, coverage, and complexity, delivering samples and quotes within hours and final delivery in the customerβs specified format and storage (e.g., S3). The workflow includes a marketplace-like hub for contributors, professional editing via Creative Studios, and a process that maps to end-to-end data pipelines from brief to delivery.
Who itβs for: SMBs, enterprises, and developers needing real-world, multi-modal training data for physical AI, including small to medium businesses engaging in data capture and custom dataset requests.
active with Y Combinator backing; ongoing contributor network; SMBs and enterprises engaging in data capture and custom projects
Making AI training sexy with Hub.xyz. Long agentic & robotics. LP in multiple VC funds.
Formerly βHubβ Β· why startups rename β

Multimodal data provider for robotics and world modeling

Training data for frontier AI labs