Data Engineer
Harnham, San Francisco, CA, United States
Data Engineer
1 Year Contract
Hybrid (2 Days Onsite)
$75-85/hr
We are partnered with a leading beauty retailer delivering innovative, personalized beauty experiences to its customers. They are looking for a highly skilled Data Engineer to join the team and help enhance their data platforms and analytics capabilities. You will work with cutting-edge technologies to process large volumes of data, integrate machine learning models, and build scalable data pipelines that power their e-commerce platforms and improve customer experiences.
Role Overview
As a Data Engineer, you will play a key role in shaping and implementing the company's data strategy. You will lead the development and deployment of data processing pipelines, drive machine learning model integration, and ensure high-quality, scalable, and reliable data platforms.
Key Responsibilities
- Drive the design, implementation, and optimization of data pipelines for high-volume data processing, including ETL workflows using Spark and Python.
- Build and enhance large-scale data platforms, data lakes, and data warehouses using technologies like Databricks and Delta Lake.
- Oversee the integration of machine learning models into production systems. Work through all phases of model development, including data preprocessing, feature engineering, and model deployment.
- Ensure batch, near-real-time, and real-time data pipelines are efficient, accurate, and meet required performance targets.
- Work hands-on with deep learning frameworks such as PyTorch, TensorFlow, and Keras to integrate ML models in a production environment.
Required Skills and Experience
- 8-10 years of experience in data engineering, data processing, data platforms, or equivalent.
- Strong proficiency in Python and Spark (5+ years of hands-on experience) for building scalable ETL workflows and data pipelines.
- Experience designing and implementing large-scale data processing pipelines for batch and real-time data flows.
- Experience building and deploying ML models in a production environment, with expertise in data preprocessing, feature engineering, and model deployment.
- Experience with Databricks or similar data platforms for managing and processing large-scale data workflows.