Applied Intuition Inc.
ML Runtime Optimization Engineer
Applied Intuition Inc., Mountain View, California, US 94039
About the role
We are looking for a software engineer with expertise in optimizing ML models and deploying them to production-grade runtime environments and chips. You'll work across the entire ML framework/compiler stack (e.g., PyTorch, JAX, ONNX, TensorRT, CUDA, XLA, Triton).
At Applied Intuition, you will:
Build the optimization pipeline for deploying ML models to real-world hardware.
Build foundational libraries for analyzing and optimizing model performance, correctness, numerical stability, and cross-platform reproducibility.
Closely collaborate with ML developers on model architecture details to reduce compiled latency and resource usage.
Learn about the variety of production-grade boards our customers use and develop computational resource strategies for different customer needs.
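The second responsibility above centers on analyzing model performance. As a purely illustrative sketch (not Applied Intuition's actual tooling), a minimal latency-benchmark harness of the kind used in such analysis might look like this, with a stand-in function in place of a real compiled model:

```python
# Minimal latency-benchmark sketch: warm up, sample wall-clock latencies,
# and report percentile statistics. The "model" here is a stand-in function.
import statistics
import time


def benchmark(fn, warmup=10, iters=100):
    """Time `fn` after warmup runs; return latency stats in milliseconds."""
    for _ in range(warmup):  # warm caches / lazy initialization before measuring
        fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples),
        "p99_ms": samples[int(0.99 * (len(samples) - 1))],
        "mean_ms": statistics.fmean(samples),
    }


if __name__ == "__main__":
    # Stand-in "model": a small fixed computation.
    stats = benchmark(lambda: sum(i * i for i in range(10_000)))
    print(stats)
```

Real pipelines would additionally pin CPU frequency/affinity, control for thermal state, and compare percentiles rather than means, since tail latency is what matters on real-time hardware.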
We're looking for someone who has:
B.Sc. in Computer Science, Mathematics, or a related field
Knowledge of and experience with ML accelerator, GPU, CPU, and SoC architecture and microarchitecture
Proficiency in C++ and strong software development skills, with a focus on high-performance computing
Working experience with Python
Experience developing on or using deep learning frameworks (e.g., PyTorch, JAX, ONNX)
Nice to have:
M.Sc. or Ph.D. in an ML-related area
Built an ML compiler or optimization framework from scratch
Deployed ML solutions on embedded chips for real-time robotics applications
The salary range for this position is $125,000 - $222,000 USD annually. This salary range is an estimate, and the actual salary may vary based on the Company's compensation practices.