Inficare
Data Engineer
Inficare, Bentonville, Arkansas, United States, 72712
Job Position:
Data Engineer
Job Location:
Bentonville, AR
Job duration:
Contract
Responsibilities:
Design, develop, and maintain robust and scalable ETL workflows and data pipelines using tools like Hive, Spark, and Airflow.
Implement and manage data storage and processing solutions using Apache Hudi and BigQuery.
Develop and optimize data pipelines for structured and unstructured data in GCP environments, leveraging GCS for data storage.
Write clean, maintainable, and efficient code in Scala and Python to process and transform data.
Ensure data quality, integrity, and consistency by implementing appropriate data validation and monitoring techniques.
Work with cross-functional teams to understand business requirements and deliver data solutions that drive insights and decision-making.
Troubleshoot and resolve performance and scalability issues in data processing and pipelines.
Stay updated with the latest developments in big data technologies and tools and incorporate them into the workflow as appropriate.
Required Skills and Qualifications:
Proven experience as a Data Engineer, preferably in a big data environment.
Expertise in Hive, Spark, and Apache Hudi for big data processing and storage.
Hands-on experience with BigQuery and Google Cloud Platform (GCP) services such as GCS, Dataflow, and Pub/Sub.
Strong programming skills in Scala and Python, with experience in building data pipelines and ETL processes.
Proficiency with workflow orchestration tools like Apache Airflow.
Solid understanding of data warehousing concepts, data modelling, and schema design.
Knowledge of distributed systems and parallel processing.
Strong problem-solving skills and ability to work with large datasets in a fast-paced environment.
Must-have skills:
Overall Experience level:
7+ years of recent GCP experience
Apache Hudi for big data processing and storage
7+ years of hands-on experience with Hadoop, Hive or Spark, Airflow or a workflow orchestration solution
Experience with programming languages: Python, Java, Scala, etc.
Experience with scripting languages: Perl, Shell, etc.