Clearpoint
Lead Machine Learning Engineer
Clearpoint, Houston, Texas, 77246
TITLE: Lead Machine Learning Engineer
LOCATION: Houston, Texas
TYPE: Direct Hire
SALARY: $195,000 - $230,000

SUMMARY:
The Lead Machine Learning Engineer will be responsible for establishing DevOps and MLOps processes throughout the Corporate Data & Analytics Team to support AI/ML applications. The role drives the adoption of best practices in DevOps and MLOps, resulting in faster deployment of AI/ML and data-driven solutions that satisfy business requirements. This role requires extensive experience in DevOps and MLOps, a thorough grasp of InfraOps, and a comprehensive understanding of AI/ML data and analytics cloud services and components. You will work closely with data scientists, machine learning engineers, data engineers, software engineers, and platform architects, using cutting-edge tools and technologies to deploy and maintain AI/ML and advanced analytics solutions, as well as integrate analytic models with existing business applications.

DUTIES:
Enhance current DevOps processes to improve the whole AI/ML application development lifecycle.
Collaborate with development and cloud platform teams to verify that the infrastructure satisfies the application's needs.
Establish and maintain best practices for cloud security, compliance, and cost efficiency.
Create automated build and deployment methods to allow continuous delivery of software releases, and improve the existing CI/CD pipelines for AI/ML application development and deployment.
Collaborate with data scientists, data engineers, data analysts, software engineers, IT professionals, and stakeholders to accelerate the deployment of AI applications using CI/CD pipelines while maintaining the applications' SLAs on a single platform.
Design, develop, and maintain infrastructure using infrastructure as code tools such as Terraform, Ansible, CloudFormation, etc.
Use existing Databricks CLI code to manage the Databricks platform as code for AI/ML data pipelines (batch processing, batch streaming, and streaming) and model serving endpoints.

REQUIREMENTS:
10 years of experience in software engineering with a strong background in DevOps and Infrastructure as Code, supporting Machine Learning and Data Science workloads.
Expertise in code versioning tools such as GitLab, GitHub, Azure DevOps, and Bitbucket, and familiarity with branch-level code repository management.
Experience deploying Machine Learning solutions on cloud platforms (e.g., AWS, Azure, or GCP); Databricks and AWS are preferred.
Proficiency with GitHub Actions to automate testing and deployment of data and ML workloads from a CI/CD provider to Databricks.
Strong knowledge of infrastructure automation tools such as Terraform, Ansible, CloudFormation, etc.
Experience with data processing frameworks, tools, and platforms such as Databricks, Apache Spark, Kafka, Flink, and AWS cloud services for batch processing, batch streaming, and streaming.
Experience containerizing analytical models using Docker and Kubernetes or other container orchestration platforms.
Technical expertise across all deployment models on public cloud, private cloud, and on-premises infrastructure.
Experience in event-driven and microservice architectures for enterprise-level platform development.
Expertise in Linux and knowledge of networking and security concepts.

EDUCATION:
Bachelor's Degree in Computer Science, Computer Engineering, Information Technology, Software Engineering, or an equivalent technical discipline.