Clearpoint

Lead Machine Learning Engineer

Clearpoint, Houston, TX, United States


TITLE: Lead Machine Learning Engineer

LOCATION: Houston, Texas

TYPE: Direct Hire

SALARY: $195,000 - $230,000

SUMMARY:

The Lead Machine Learning Engineer will be responsible for establishing DevOps and MLOps processes throughout the Corporate Data & Analytics Team to support AI/ML applications. The role drives the adoption of best practices in DevOps and MLOps, resulting in faster deployment of AI/ML and data-driven solutions that satisfy business requirements. It requires extensive experience in DevOps and MLOps, a thorough grasp of InfraOps, and a comprehensive understanding of AI/ML data and analytics cloud services and components. You will work closely with data scientists, machine learning engineers, data engineers, software engineers, and platform architects, using cutting-edge tools and technologies to deploy and maintain AI/ML and advanced analytics solutions and to integrate analytic models with existing business applications.

DUTIES:

  • Enhance current DevOps processes to improve the entire AI/ML application development lifecycle.
  • Collaborate with development and cloud platform teams to verify that the infrastructure satisfies the application's needs.
  • Establish and maintain best practices for cloud security, compliance, and cost efficiency.
  • Create automated build and deployment methods to allow continuous delivery of software releases, and improve the existing CI/CD pipelines for AI/ML application development and deployment.
  • Collaborate with data scientists, data engineers, data analysts, software engineers, IT professionals, and stakeholders to accelerate the deployment of AI applications using CI/CD pipelines while maintaining the applications' SLAs on a single platform.
  • Design, develop, and maintain infrastructure using infrastructure-as-code tools such as Terraform, Ansible, and CloudFormation.
  • Use existing Databricks CLI code to manage the Databricks platform as code for AI/ML data pipelines (batch processing, batch streaming, and streaming) and model serving endpoints.

REQUIREMENTS:

  • 10+ years of experience in software engineering with a strong background in DevOps and Infrastructure as Code, supporting Machine Learning and Data Science workloads.
  • Expertise in code versioning tools such as GitLab, GitHub, Azure DevOps, and Bitbucket, and familiarity with branch-level code repository management.
  • Experience deploying Machine Learning solutions on cloud platforms (e.g., AWS, Azure, or GCP); Databricks and AWS are preferred.
  • Proficiency with GitHub Actions to automate testing and deployment of data and ML workloads from a CI/CD provider to Databricks.
  • Strong knowledge of infrastructure automation tools such as Terraform, Ansible, and CloudFormation.
  • Experience with data processing frameworks/tools/platforms such as Databricks, Apache Spark, Kafka, Flink, and AWS cloud services for batch processing, batch streaming, and streaming.
  • Experience containerizing analytical models using Docker and Kubernetes or other container orchestration platforms.
  • Technical expertise across all deployment models on public cloud, private cloud, and on-premises infrastructure.
  • Experience in event-driven and microservice architectures for enterprise-level platform development.
  • Expertise in Linux and knowledge of networking and security concepts.

EDUCATION:

  • Bachelor's Degree in Computer Science, Computer Engineering, Information Technology, Software Engineering, or an equivalent technical discipline.