The AI Institute
Machine Learning Engineer
The AI Institute, Cambridge, Massachusetts, US, 02140
Our mission is to solve the most important and fundamental challenges in AI and Robotics to enable future generations of intelligent machines that will help us all live better lives.
Machine Learning Engineers work cross-functionally, creating new technology to improve machine learning pipelines for robots. If you have a passion for developing and implementing state-of-the-art model architectures and building infrastructure for model training, inference, optimization, and data processing, this is the place for you! We are onsite in our new Cambridge, MA office where we are building a collaborative and exciting new organization.
Responsibilities
Train, deploy, and maintain various ML models on cloud and on-premise infrastructure
Develop pipelines and tools for all components of the ML lifecycle, from training, evaluation, and optimization to deployment
Partner closely with research and engineering teams on model architecture design, implementation, experimentation, and productionalization
Promote quality and reliability through regular code reviews
Contribute to the vibrant research and learning environment of the Institute by staying up to date on the latest innovations in ML architectures, frameworks, and applications to robotics

Requirements
BS or MS in computer science, engineering, or equivalent technical experience
6+ years overall experience (3+ post-Masters or PhD) as a machine learning engineer, software engineer, or applied scientist
Experience writing production code for data processing, machine learning training, and/or serving in Python, C++, or similar
Experience with git, issue tracking, CI/CD, and modern software engineering practices
Experience with cloud ecosystems such as GCP and AWS
Experience with deep learning libraries and frameworks such as PyTorch, TensorFlow, and Flax
Hands-on experience implementing and training deep learning models
Experience with state-of-the-art deep learning techniques such as transformers, diffusion models, and multi-modal modeling, in domains such as robotics, computer vision, and NLP

Bonus - Nice to Have
Hands-on experience with one or more of reinforcement learning, imitation learning, incremental learning, inference optimization, and model compression
Experience with robotics simulation platforms such as MuJoCo, Isaac Sim, and Drake
Experience deploying models to devices such as robotic embodiments, and/or experience with ROS
Experience with parallelized data processing frameworks such as Hadoop, Spark, and Ray
Experience scaling training to multi-GPU and multi-node environments with Ray, PyTorch Lightning, Kubeflow, or similar
Experience with MLOps (model versioning, model and data lineage, monitoring, model hosting and deployment, scalability, orchestration, continuous learning)
Experience with Docker, Kubernetes, cloud computing, or similar technologies
Experience with orchestration workflows using tools such as Airflow, Kubeflow, or AWS Step Functions
DevOps experience (e.g. CI/CD pipelines, Infrastructure as Code, containers)