ZipRecruiter

Senior Machine Learning Infrastructure Engineer

ZipRecruiter, San Francisco, CA, United States


Company Overview: Welcome to the cutting-edge of AI-driven innovation! At our company, we're pioneers in leveraging machine learning to revolutionize industries. We're committed to building robust infrastructure that powers our machine learning models at scale. Join us and be part of a dynamic team shaping the future of AI infrastructure engineering.

Position Overview: As a Senior Machine Learning Infrastructure Engineer, you'll lead the design, development, and optimization of our machine learning infrastructure. You'll work on challenging projects, from building scalable data pipelines to deploying and managing machine learning models in production environments. If you're a seasoned engineer with expertise in machine learning infrastructure technologies and a passion for building scalable, reliable, and efficient systems, we want you on our team.

Key Responsibilities:

  1. Infrastructure Design: Design and architect scalable and reliable infrastructure solutions to support machine learning workflows, including data ingestion, model training, evaluation, and deployment.
  2. Data Pipeline Development: Develop and maintain data pipelines to ingest, preprocess, and transform data for training machine learning models, ensuring data quality, integrity, and scalability.
  3. Model Training Infrastructure: Build and optimize infrastructure for training machine learning models at scale, leveraging distributed computing frameworks and accelerators for performance and efficiency.
  4. Model Deployment: Design and implement systems for deploying and managing machine learning models in production environments, ensuring reliability, scalability, and real-time inference capabilities.
  5. Monitoring and Logging: Implement monitoring and logging solutions to track the performance and health of machine learning infrastructure and models, proactively identifying and resolving issues.
  6. Automation and Orchestration: Develop automation and orchestration tools to streamline machine learning workflows, reducing manual intervention and improving operational efficiency.
  7. Security and Compliance: Implement security controls across machine learning infrastructure and workflows to protect sensitive data and ensure compliance with data privacy regulations.
  8. Documentation and Best Practices: Document infrastructure designs, processes, and best practices, providing clear and comprehensive documentation to facilitate understanding and collaboration among team members.
  9. Collaboration: Collaborate with data scientists, machine learning engineers, and software developers to understand requirements and deliver infrastructure solutions that meet business needs.
  10. Mentorship and Development: Mentor junior engineers, sharing expertise and best practices in machine learning infrastructure engineering, and facilitate knowledge sharing sessions within the team.

Qualifications:

  • Bachelor's degree or higher in Computer Science, Engineering, Mathematics, or a related field.
  • 5+ years of experience in infrastructure engineering, with a focus on machine learning infrastructure.
  • Proficiency in cloud platforms such as AWS, Azure, or Google Cloud Platform, and services like AWS SageMaker, Azure Machine Learning, or Google AI Platform.
  • Strong programming skills in languages such as Python, Java, or Scala, with experience in distributed computing frameworks like Apache Spark or TensorFlow.
  • Experience with containerization technologies such as Docker and container orchestration platforms such as Kubernetes.
  • Strong understanding of machine learning concepts and techniques, with experience deploying and managing machine learning models in production environments.
  • Strong problem-solving skills and analytical thinking, with the ability to design complex infrastructure and troubleshoot issues in it.
  • Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams and communicate technical concepts to non-technical stakeholders.

Benefits:

  • Competitive salary: The industry standard salary for Senior Machine Learning Infrastructure Engineers typically ranges from $170,000 to $230,000 per year, depending on experience and qualifications.
  • Comprehensive health, dental, and vision insurance plans.
  • Flexible work hours and remote work options.
  • Generous vacation and paid time off.
  • Professional development opportunities, including access to training programs, conferences, and workshops.
  • State-of-the-art technology environment with access to cutting-edge tools and resources.
  • Vibrant and inclusive company culture with opportunities for growth and advancement.
  • Exciting projects with real-world impact at the forefront of AI-driven innovation.

Join Us: Ready to shape the future of machine learning infrastructure engineering? Apply now to join our team and be part of the AI revolution!
