Lumicity
Machine Learning Engineer
Lumicity, San Jose, California, United States, 95199
About the Company
This company develops generative video models that allow users to create animated pictures with ease, incorporating their own existing audio or utilizing text-to-speech models. Having raised over $10M and generating significant excitement with their first two foundational model releases, they are expanding their team in San Francisco.
About the Role
We're looking for passionate Machine Learning Engineers to join our team and help build cutting-edge systems for large-scale data collection, GPU training, and AI model inference optimization. If you have deep expertise in model quantization, parallel inference, and accelerating diffusion models, and you're excited about deploying state-of-the-art ML models in the cloud, this is the perfect opportunity for you.
What You'll Do
Build and scale distributed data collection and curation systems to support large-scale model training and inference.
Optimize GPU-based training pipelines for efficiency and speed, focusing on large-scale model deployment.
Accelerate inference for diffusion models and transformers, leveraging techniques like model quantization and parallel inference.
Implement and optimize CUDA kernels, Triton, and TensorRT to maximize inference performance.
Develop and maintain cloud-based infrastructure (AWS, Oracle) using Kubernetes and Terraform for scalable model deployment.
Architect REST APIs for distributed systems, ensuring high performance and low-latency responses.
What You Bring
5+ years of experience in Python or Golang, with a strong emphasis on performance optimization.
Expertise in model quantization, parallel inference, and deploying ML models in production.
Hands-on experience with PyTorch, TensorRT, Triton, and CUDA kernels for accelerating model inference, especially in large-scale applications.
Strong background with Kubernetes, Docker, and NVIDIA hardware (GPUs, Tensor Cores).
Experience scaling pipelines in AWS (SQS, Kafka) and implementing infrastructure as code using tools like Terraform.
A startup mindset: the ability to move fast, iterate quickly, and build impactful systems in a fast-evolving space.
Passion for deploying AI technologies at scale and driving innovation in generative models.
Send your resume today and join us in building the next generation of AI-driven video models!