ZipRecruiter

Machine Learning Infrastructure Engineer

ZipRecruiter, San Francisco, California, United States, 94199


Job Description

Company Overview:

Welcome to the forefront of machine learning infrastructure! At our company, we're passionate about pushing the boundaries of artificial intelligence and machine learning. Our mission is to develop robust and scalable infrastructure solutions that empower data scientists and machine learning engineers to build, deploy, and manage cutting-edge machine learning models. Join us and be part of a dynamic team committed to shaping the future of machine learning infrastructure.

Position Overview:

As a Machine Learning Infrastructure Engineer, you'll play a crucial role in designing, building, and optimizing our machine learning infrastructure to support the needs of our organization. Working closely with cross-functional teams of data scientists, software engineers, and DevOps specialists, you'll ensure the reliability, scalability, and efficiency of our machine learning systems. If you're passionate about machine learning infrastructure and eager to drive innovation in AI, we want you on our team.

Key Responsibilities:

Machine Learning Infrastructure Design:

Design and architect scalable and reliable infrastructure solutions to support machine learning model development, training, and deployment.

Model Training and Experimentation:

Develop and maintain infrastructure for model training and experimentation, including distributed computing environments and GPU clusters.

Model Deployment and Serving:

Implement and manage infrastructure for deploying and serving machine learning models in production environments, ensuring low latency and high availability.

Model Monitoring and Management:

Develop monitoring and management tools for tracking model performance, health, and drift, and automating model retraining and redeployment.

Data Processing and Feature Engineering:

Develop pipelines and tools for data processing, feature engineering, and preprocessing to support machine learning model development.

Infrastructure Automation:

Implement infrastructure automation and orchestration using tools such as Kubernetes, Docker, Terraform, and Ansible to streamline deployment and management processes.

Performance Optimization:

Optimize infrastructure performance for speed, scalability, and cost-effectiveness, leveraging cloud services and distributed computing technologies.

Security and Compliance:

Implement security controls and compliance measures to protect sensitive data and meet regulatory requirements in machine learning workflows.

Qualifications:

Bachelor's degree or higher in Computer Science, Engineering, or a related field.
Strong background in infrastructure engineering, with hands-on experience in designing, building, and optimizing infrastructure solutions for machine learning.
Proficiency in programming languages such as Python, Java, or Go, and experience with machine learning frameworks such as TensorFlow, PyTorch, or scikit-learn.
Experience with cloud platforms such as AWS, Google Cloud Platform, or Microsoft Azure, and familiarity with cloud services for machine learning (e.g., SageMaker, AI Platform, Azure ML).
Knowledge of distributed computing technologies such as Apache Spark, Hadoop, or Dask, and experience with containerization and orchestration technologies such as Docker and Kubernetes.
Strong problem-solving abilities and analytical thinking, with keen attention to detail and a passion for tackling complex technical challenges.
Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams and communicate technical concepts to non-technical stakeholders.

Benefits:

Competitive salary: The industry standard salary for Machine Learning Infrastructure Engineers typically ranges from $150,000 to $230,000 per year, depending on experience and qualifications. Exceptional candidates may be eligible for higher compensation packages.
Comprehensive health, dental, and vision insurance plans.
Flexible work hours and remote work options.
Generous vacation and paid time off.
Professional development opportunities, including access to training programs, conferences, and workshops.
State-of-the-art technology environment with access to cutting-edge tools and resources.
Vibrant and inclusive company culture with team-building activities and social events.
Opportunities for career growth and advancement within the company.
Exciting projects with real-world impact in the field of artificial intelligence and machine learning.
Chance to work alongside top talent and industry experts in machine learning infrastructure.

Join Us:

Ready to shape the future of machine learning infrastructure? Apply now to join our team and be part of an exciting journey of innovation and discovery!
