Highbrow LLC
Data Engineer
Highbrow LLC, Hartford, Connecticut, US, 06112
Job Title: Data Engineer
Job ID: 2022-11553
Job Location: Hartford, CT or Remote
Job Travel Location(s):
# Positions: 1
Employment Type: W2
Candidate Constraints:
Duration: Long Term
# of Layers: 0
Work Eligibility: All Work Authorizations are Permitted – No Visa Transfers
Key Technology:
Spark, PySpark, big data infrastructure, AWS, ETL/Teradata, GitLab
Job Responsibilities:
As a Data Engineer, you will have the opportunity to expand your skills in a variety of areas while working on a data-focused team. On this team, you will help architect and deliver a wide variety of code artifacts. You will build scalable and secure ETL solutions for operational, reporting, and analytical data needs. In addition, you’ll gain experience in CI/CD by utilizing Jenkins, Terraform, and Ansible. Our group focuses on full-cycle engineering, from requirements to production support, which means you will have the opportunity to work on a variety of problems while gaining Big Data experience. Although this is a specialized role, you will have the freedom to expand your responsibilities and try new things you may be interested in.
You will be challenged to:
Design, develop, and maintain fault-tolerant, highly distributed, and robust ETL platforms for various business use cases.
Analyze large sets of structured and semi-structured data for business analytics and ETL design.
Translate business needs and vision into roadmaps, project deliverables, and organizational strategies.
Design and implement ETL solutions leveraging cloud-native platforms.
Collaborate with analytics and business teams to design data models that feed business intelligence tools, increasing data accessibility and encouraging data-driven solutions.
Skills and Experience Required:
Solid experience designing and developing data pipelines for data ingestion and transformation using Spark (a brief illustrative sketch follows this list).
Distributed computing experience using PySpark.
Good understanding of the Spark framework and architecture.
Experience working with cloud-based big data infrastructure.
Excellent at troubleshooting performance and data-skew issues.
Must have a good understanding of Spark runtime metrics and be able to tune applications based on them.
Deep knowledge of partitioning and bucketing concepts for data ingestion.
Good understanding of AWS services such as Glue, Athena, S3, Lambda, and CloudFormation.
Working knowledge of data lake ETL implementation using AWS Glue, Databricks, etc. is preferred.
Experience with data modelling techniques for cloud data stores and on-premises databases such as Teradata and Teradata Vantage (TDV).
Working experience in ETL development in Teradata Vantage and data migration from on-premises systems to Teradata Vantage is preferred.
Proficiency in SQL, relational and non-relational databases, query optimization, and data modelling.
Experience with source code control systems such as GitLab.
Experience with large-scale distributed relational and NoSQL database systems.
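For illustration only, the following is a minimal PySpark sketch of the kind of ingestion-and-transformation pipeline described above. Bucket names, paths, and column names are placeholders assumed for the example, not requirements taken from this posting.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Illustrative only: all paths and column names below are placeholders.
spark = SparkSession.builder.appName("example-etl").getOrCreate()

# Ingest raw, semi-structured data from S3 (placeholder bucket/path).
raw = spark.read.option("header", "true").csv("s3://example-bucket/raw/orders/")

# Transform: cast types, derive a partition column, drop incomplete rows.
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount").isNotNull())
)

# Load: write Parquet partitioned by date for efficient downstream queries
# (for example, from Athena or other Spark jobs).
(orders.write
       .mode("overwrite")
       .partitionBy("order_date")
       .parquet("s3://example-bucket/curated/orders/"))

Partitioning the output by a date column, as shown, is one common way to keep downstream queries efficient; bucketing is applied in a similar way when writing to managed tables.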