Fixie.ai

Research Engineer / Scientist, Speech Generation

Fixie.ai, Seattle, WA, United States


About Fixie:

Fixie is a Seattle-based AI startup (with support for working remotely). We’ve raised $17M in seed funding. Our vision is simple: build artificial intelligences that can communicate as naturally as humans. We’re a small team of researchers and engineers with a deep focus on speech and real-time technologies. Our core model, Ultravox, is open-source. We also build a serving stack that’s optimized for very low-latency interactions.

The Role:

As a Research Engineer / Scientist working on foundational multimodal models, you will lead the effort to develop next-generation speech understanding capabilities for Ultravox, our open-source speech-to-speech model.

What You'll Do:

  • Lead critical research on speech understanding in both pre-training and post-training stages, addressing core challenges in linguistic and paralinguistic comprehension of human speech.
  • Collaborate with a team of researchers and engineers to develop foundational multimodal models with comprehensive capabilities in speech understanding, speech generation, and full-duplex real-time communication.
  • Develop novel models based on public and proprietary data sources.
  • Build tools to improve our data flywheel and measure model quality.
  • Drive the optimization and deployment of AI models for real-world applications in partnership with engineering and product teams.

Things We're Looking For:

  • A strong AI researcher with a track record of contributions to AI research, systems, and products.
  • Experience with large language models, speech models, and multimodal models.
  • Strong experience in Python and, ideally, PyTorch.
  • Ability to roll up your sleeves and get things done.
  • A great communicator and team player.

Benefits:

  • Generous equity package
  • Unlimited PTO (take time when you need it)
  • Top-of-market salary
  • Great healthcare
  • 401k with match