About Infinitive
Infinitive is a data and AI consultancy that helps clients modernize, monetize, and operationalize their data to generate lasting value. The firm prides itself on deep industry and technology expertise, which it uses to drive and sustain the adoption of new capabilities. Infinitive is committed to aligning its team with each client's culture, bringing the right mix of talent and skills to deliver a high return on investment. Infinitive has been named one of the "Best Small Firms to Work For" by Consulting Magazine seven times, most recently in 2024, and has also been honored as a "Top Workplace" by the Washington Post, a "Best Place to Work" by the Washington Business Journal, and a "Best Place to Work" by Virginia Business.
Job Summary
We are seeking a skilled Data Engineer to join our team and play a key role in designing, building, and maintaining robust data pipelines and platforms. The ideal candidate will have strong experience with Python, AWS (Glue, S3, Lambda, CloudWatch), Databricks, Apache Spark, SQL, Snowflake, and DynamoDB. This role involves working with large-scale data processing systems, optimizing ETL/ELT workflows, and ensuring the reliability, scalability, and security of data solutions.
Key Responsibilities
- Design, develop, and maintain scalable ETL/ELT pipelines using AWS Glue, Databricks, and Apache Spark (see the first sketch after this list).
- Work with structured and semi-structured data in Snowflake, DynamoDB, and S3.
- Optimize and troubleshoot Spark jobs for performance and cost efficiency.
- Develop and maintain Lambda functions to support real-time and batch data processing (see the second sketch after this list).
- Implement data quality, validation, and monitoring using CloudWatch and other observability tools.
- Design and optimize complex SQL queries for data transformation, aggregation, and reporting.
- Collaborate with cross-functional partners, including Data Scientists, Analysts, and DevOps engineers, to support data-driven decision-making.
- Implement best practices for data security, governance, and compliance in the cloud environment.
- Automate data workflows and CI/CD pipelines using infrastructure-as-code and Git-based version control.
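To make the pipeline work above concrete, here is a minimal PySpark sketch of the kind of ETL/ELT job this role involves: reading semi-structured JSON from S3, cleaning and aggregating it, and writing partitioned Parquet back to S3. The bucket names, paths, and column names are hypothetical placeholders, not part of any actual Infinitive project; on AWS Glue the same logic would typically run inside a Glue job with a GlueContext.

```python
# Minimal PySpark ETL sketch: extract raw JSON from S3, transform it,
# and load partitioned Parquet for downstream consumers.
# All bucket names, paths, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-daily-etl").getOrCreate()

# Extract: semi-structured JSON events landed in a raw S3 prefix
raw = spark.read.json("s3://example-raw-bucket/orders/2024/")

# Transform: drop malformed rows, normalize types, aggregate per customer
clean = (
    raw.where(F.col("order_id").isNotNull())
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
)
daily = (
    clean.groupBy(F.to_date("order_ts").alias("order_date"), "customer_id")
         .agg(F.sum("amount").alias("total_spend"),
              F.count("order_id").alias("order_count"))
)

# Load: write partitioned Parquet to a curated S3 prefix
(daily.write.mode("overwrite")
      .partitionBy("order_date")
      .parquet("s3://example-curated-bucket/orders_daily/"))
```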
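The second sketch illustrates the Lambda and observability responsibilities together: an S3-triggered Lambda handler that counts newly landed records and publishes a custom CloudWatch metric that dashboards or alarms can watch. The metric namespace, metric name, and JSON-lines file format are assumptions for illustration only.

```python
# Minimal AWS Lambda sketch: triggered by an S3 event, it counts the
# records in the new object and publishes a custom CloudWatch metric.
# Namespace, metric name, and file format are hypothetical assumptions.
import json
import boto3

s3 = boto3.client("s3")
cloudwatch = boto3.client("cloudwatch")

def handler(event, context):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Read the newly landed object (assumed to be JSON lines)
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        rows = [json.loads(line) for line in body.splitlines() if line.strip()]

        # Publish a data-quality signal for monitoring and alerting
        cloudwatch.put_metric_data(
            Namespace="ExamplePipeline",  # hypothetical namespace
            MetricData=[{
                "MetricName": "RecordsIngested",
                "Value": len(rows),
                "Unit": "Count",
                "Dimensions": [{"Name": "Bucket", "Value": bucket}],
            }],
        )
    return {"status": "ok"}
```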
Required Qualifications
- 3+ years of experience in Data Engineering or a similar role.
- Strong proficiency in Python for data processing and automation.
- Hands-on experience with AWS services including Glue, S3, Lambda, CloudWatch, and DynamoDB.
- Expertise in Databricks and Apache Spark for large-scale data processing.
- Proficiency in SQL for querying and manipulating structured data.
- Experience working with Snowflake, including schema design and performance tuning.
- Knowledge of data lake and data warehouse architectures.
- Familiarity with data security, IAM roles, and cloud-based authentication mechanisms.
- Strong problem-solving and debugging skills in a cloud-based data environment.
Preferred Qualifications
- Experience with streaming data architectures using Kafka, Kinesis, or Pub/Sub.
- Knowledge of orchestration tools such as Airflow, Step Functions, or MWAA.
- Experience with Terraform or CloudFormation for infrastructure automation.
- Understanding of data governance platforms (e.g., Dataplex, Alation, or Unity Catalog).
- Experience in CI/CD and DevOps practices for data engineering pipelines.
- Hands-on experience with monitoring and logging tools such as Splunk and New Relic.
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology
Industries: Business Consulting and Services