Idexx

Principal Data Engineer

Idexx, Frisco, Texas, United States, 75034

As a Principal Data Engineer, you will work with ML research scientists across IDEXX R&D to enable our product-related data to be leveraged for ML modeling. You will design and implement end-to-end data workflows for ML model building and monitoring for complex products, including imaging, clinical, and operational solutions data. Additional data engineering work will impact the search, retrieval, processing, tagging, and publishing of curated data sets, with a focus on improving ML model performance over time.

You must live within commuting distance of either Westbrook, Maine or Frisco, Texas, to be on site 1-3 days per week. Unfortunately, we are unable to provide relocation assistance or sponsorship for this role.

Department:

The IDEXX Data and AI Centre of Excellence develops and delivers data and AI assets and solutions to enhance IDEXX R&D, products, software, services, internal operations, and business practices.

In this role:

- You will design and implement scalable, reliable distributed data processing frameworks and analytical infrastructure using multiple technologies, including data sets or data warehouses, data virtualization and services, and repositories of semi-structured data sets.
- You will design automated software deployment functionality that efficiently manages applications across distributed platforms.
- You will monitor structural performance and utilization, identify problems, and implement solutions.
- You will lead the creation of standards, best practices, and new processes for the operational integration of new technology solutions.
- You will ensure environments are compliant with defined standards and operational procedures.
- You will implement measures to ensure data accuracy and accessibility, constantly monitoring and refining the performance of data management systems.
- You will understand structural requirements and define standards for storing, consuming, integrating, and managing data for machine learning applications.
- You will collaborate with data scientists and analysts to understand their data needs and develop solutions to meet those needs.
- You will develop and maintain documentation for data systems, processes, and procedures.
- You will complete problem tickets, including bug fixes, design modifications, and enhancements based on customer requirements.

What you need to succeed:

- You have 5 or more years of experience working in machine learning and have delivered solutions into production in a professional setting.
- You have 5 or more years of experience using Python, SQL, Databricks, and Spark with Big Data.

It is helpful if:

- Your technical background is in Artificial Intelligence (AI) and Machine Learning (ML).
- You have experience owning a technology product and assuming a technical lead role.
- You understand structural requirements and can define standards for storing, consuming, integrating, and managing data.
- You are proficient in coding and programming languages such as Structured Query Language (SQL) and Python. Familiarity with R will be an advantage.
- You are familiar with cloud platforms such as Amazon Web Services (AWS).
- You have experience with, or a good understanding of:
  - Hadoop-based technologies like MapReduce and Spark
  - SQL-based technologies like Oracle, PostgreSQL, and MySQL
  - Data processing tools, including DLT
  - Cloud-based data platforms, including Databricks and Snowflake
  - Data warehousing solutions and relational database theory
  - Industry-standard software APIs
- You have good verbal and written communication skills and can translate technical subject matter to non-technical audiences.
- You take the initiative in resolving problems and can balance conflicting requirements in partnership with others.
- You excel at customer service and building relationships.
- You have experience building distributed and cloud-based data pipelines.
