Fixie.ai

Research Engineer / Scientist, Speech Generation

Fixie.ai, Seattle, WA, United States

About Fixie:

We’re a Seattle-based AI startup (with support for working remotely). We’ve raised $17M in seed funding. Our vision is simple: build artificial intelligences that can communicate as naturally as humans. We’re a small team of researchers and engineers with a deep focus on speech and real-time technologies. Our core model, Ultravox, is open-source. We also build a serving stack that’s optimized for very low-latency interactions.

The Role:

As a Research Engineer / Scientist working on foundational multimodal models, you will lead the effort to develop next-generation speech understanding capabilities for Ultravox, our open-source speech-to-speech model.

What You'll Do:

Lead critical research on speech understanding in both pre-training and post-training stages, addressing core challenges in linguistic and paralinguistic comprehension of human speech.
Collaborate with a team of researchers and engineers to develop foundational multimodal models with comprehensive capabilities in speech understanding, speech generation, and full-duplex real-time communication.
Develop novel models based on public and proprietary data sources.
Build tools to improve our data flywheel and measure model quality.
Drive the optimization and deployment of AI models for real-world applications in partnership with engineering and product teams.

Things We're Looking For:

An incredibly strong AI researcher with a track record of contributions to AI research, systems, and products.
Experience with large language models, speech models, and multimodal models.
Strong experience in Python and, ideally, PyTorch.
Ability to roll up your sleeves and get things done.
A great communicator and team player.

Benefits:

Generous equity package
Unlimited PTO (take time when you need it)
Top-of-market salary
Great healthcare
401(k) with match