Research Engineer / Scientist, Speech Generation
Fixie.ai, Seattle, WA, United States
About Fixie:
We’re a Seattle-based AI startup (with support for working remotely). We’ve raised $17M in seed funding. Our vision is simple: build artificial intelligences that can communicate as naturally as humans. We’re a small team of researchers and engineers with a deep focus on speech and real-time technologies. Our core model, Ultravox, is open-source. We also build a serving stack that’s optimized for very low-latency interactions.
The Role:
As a Research Engineer / Scientist working on foundational multimodal models, you will lead the effort to develop next-generation speech understanding capabilities for Ultravox, our open-source speech-to-speech model.
What You'll Do:
- Lead critical research on speech understanding in both pre-training and post-training stages, addressing core challenges in linguistic and paralinguistic comprehension of human speech.
- Collaborate with a team of researchers and engineers to develop foundational multimodal models with comprehensive capabilities in speech understanding, speech generation, and full-duplex real-time communication.
- Develop novel models based on public and proprietary data sources.
- Build tools to improve our data flywheel and measure model quality.
- Drive the optimization and deployment of AI models for real-world applications in partnership with engineering and product teams.
Things We're Looking For:
- An incredibly strong AI researcher with a track record of contributions to AI research, systems, and products.
- Experience with large language models, speech models, and multimodal models.
- Strong experience in Python and, ideally, PyTorch.
- Ability to roll up your sleeves and get things done.
- A great communicator and team player.
Benefits:
- Generous equity package
- Unlimited PTO (take time when you need it)
- Top-of-market salary
- Great healthcare
- 401(k) with match