Databricks
GenAI Staff Machine Learning Engineer, Performance Optimization
Databricks, San Francisco, California, United States, 94199
P-984
Founded in late 2020 by a small group of machine learning researchers, Mosaic AI enables companies to create state-of-the-art AI models from scratch on their own data. From a business perspective, Mosaic AI is committed to the belief that a company's AI models are just as valuable as any other core IP, and that high-quality AI models should be available to all. From a scientific perspective, Mosaic AI is committed to reducing the cost of training state-of-the-art models, and to sharing our knowledge about how to do so with the world, so that everyone can innovate and create models of their own.
Now part of Databricks since July 2023 as the GenAI Team, we are passionate about enabling our customers to solve the world's toughest problems by building and running the world's best data and AI platform. We leap at every opportunity to solve technical challenges, striving to empower our customers with the best data and AI capabilities.
You will:
Explore and analyze performance bottlenecks in ML training and inference
Design, implement, and benchmark libraries and methods to overcome these bottlenecks
Build tools for performance profiling, analysis, and estimation for ML training and inference
Balance the tradeoff between performance and usability for our customers
Support our community through documentation, talks, tutorials, and collaborations
Collaborate with external researchers and leading AI companies on various efficiency methods
We look for:
Hands-on experience with the internals of deep learning frameworks (e.g., PyTorch, TensorFlow) and deep learning models
Experience with high-performance linear algebra libraries such as cuDNN, CUTLASS, Eigen, MKL, etc.
General experience with the training and deployment of ML models
Experience with compiler technologies relevant to machine learning
Experience with distributed systems development or distributed ML workloads
Hands-on experience writing CUDA code and knowledge of GPU internals (Preferred)
Publications in top-tier ML or systems conferences such as MLSys, ICML, ICLR, KDD, or NeurIPS (Preferred)
We value candidates who are curious about all parts of the company's success and are willing to learn new technologies along the way.