Mitchell Martin Inc.

Data Engineer

Mitchell Martin Inc., Chicago, Illinois, United States, 60290


Our client, one of the nation's leading insurers, is seeking a Data Engineer.

Location:

Remote

Position Type:

Contract to Hire

POSITION SUMMARY:

We are seeking an experienced and motivated data engineer to join a Lean/Agile team building and supporting our data science and analytics operational platform. As an experienced engineer of data extraction, transformation, and persistence, you will be designing and implementing various components of our data science collaboration and deployment platform. Working closely with Data Science and Analytics professionals, you will develop automated, streaming data pipelines for event capture, transformation, and feature extraction to assist the machine learning process.

The industry changes rapidly, so we are looking for candidates who can respond to change, pick up new technologies quickly, and adapt to shifting requirements. We also want candidates who are production-oriented and have a commitment to quality.

PRINCIPAL DUTIES AND RESPONSIBILITIES:

Build and maintain event capture/transformation flows, feature repositories, data cache for real-time analytics, and more.

Develop data pipelines that can be leveraged in both model training and production execution.

Collaborate with Data Architecture and other Data Engineering groups, maintaining a focus on operationalizing data flows in the service of data science and analytics groups.

Development of code to extract value from various structured, semi-structured, and unstructured data sources creating refined data repositories for ease of analysis.

MINIMUM JOB REQUIREMENTS:

5+ years in a data-related field

Strong Python data skills with Pandas, as well as XML/JSON parsing

Experience with AWS cloud technologies including S3, EC2 instances, and more

Strong SQL skills and ability to adapt those skills in multiple relational technologies and some NoSQL technologies (SAS, PROC SQL, Microsoft SQL, Snowflake, Dynamics)

Experience with the following technologies a plus:

Redshift, Hive, SparkSQL, etc.

Experience in additional languages such as Java or Scala helpful

ETL tools such as Informatica, Pentaho, SAP, etc.

Messaging systems such as Amazon Kinesis or Apache Kafka

AWS technologies such as Glue, DynamoDB

Apache Spark or PySpark a plus

Workflow scheduling tools such as Apache Airflow, Windows Scheduler, or Luigi

Experience calling third-party REST APIs and working with JSON data
