
Kalpa Labs
Active
Scaling Generalist Speech models
About
We're building the next frontier of speech models: generalist speech models that unlock in-context learning and strong instruction following while unifying existing speech capabilities such as speech-to-text, text-to-speech, and voice cloning.
Founders · 2
Pushing the frontier of speech models @ KalpaLabs. Previously led full-stack ML @ Google, scaling to billions of queries per month.
Pushing the frontier of speech models @ KalpaLabs. Previously built nanosecond-latency software at HFT firms.
Launch
Scaling Foundational speech models for In-context Learning & Instruction Following
KalpaLabs announces a generalist speech model that handles speech-to-text, text-to-speech, and cross-modal tasks with in-context learning and steerability, aiming to unify STT, TTS, and voice actions into one system. They pretrained 800M–4.8B parameter models on 2M hours of audio and highlight scalable, aligned, and efficient speech modeling with long context and audio-in-context prompts.