Grainger

Senior Data Engineer

Grainger, Chicago, Illinois, United States, 60290

Position Details:

The search team comprises the e-commerce search platform, backend services, machine learning endpoints, and our search engine, which enables grainger.com customers to find products and serves relevant content to online customers, internal support teams, and marketing programs. We are looking for a Senior Data Engineer. You will report to the Manager, Product Engineering.

You will:

- Design and implement technical solutions and processes to ensure data reliability and accuracy.
- Build pipelines that feed embedding models and vector databases while working with platform-oriented teams to ensure database response times meet expectations.
- Develop data models and mappings and build new data assets required by data science teams.
- Perform exploratory data analysis on existing products and datasets.
- Understand trends and emerging technologies and evaluate the performance and applicability of potential tools for our requirements.
- Enable data discovery and curation for analytical purposes by collaborating with data scientists and providing data integration solutions.
- Collaborate with stakeholders, including team and product managers, to create secure and efficient data products.
- Ensure that data processing pipelines and ETL jobs are designed for scalability, using distributed technologies to handle increasing data volumes efficiently.
- Work within an Agile delivery / DevOps methodology to deliver product increments in iterative sprints.

You Have:

- 3+ years of experience in batch and streaming ETL using Spark, Python, Scala, Snowflake, or Databricks for Data Engineering or Machine Learning workloads. Snowflake and Databricks are a must.
- 3+ years of experience orchestrating and implementing pipelines with workflow tools like Databricks Workflows, Apache Airflow, or Luigi.
- 3+ years of experience prepping structured and unstructured data for data science models.
- 3+ years of experience with containerization and orchestration technologies (Docker, Kubernetes); experience with shell scripting in Bash, Unix, or Windows shell is preferred.
- Experience working with vector databases like Milvus, Pinecone, or Weaviate.
- Experience using machine learning in data pipelines to discover, classify, and clean data.
- Experience implementing CI/CD with automated testing in Jenkins, GitHub Actions, or GitLab CI/CD.
- Familiarity with AWS services including, but not limited to, Glue, Athena, Lambda, S3, and DynamoDB.
- Demonstrated experience implementing the data management life cycle, using data quality functions like standardization, transformation, rationalization, linking, and matching.
