JobRialto
Azure Databricks Developer
JobRialto, Louisville, Kentucky, US, 40201
Job Summary:
We are looking for an experienced Data Engineer to join our team. In this role, you will design and implement high-performance data ingestion pipelines from multiple sources, leveraging Apache Spark and/or Azure Databricks. You will also be responsible for delivering and presenting proofs of concept to stakeholders and developing scalable, reusable frameworks for ingesting geospatial data sets. This is an exciting opportunity to contribute to cutting-edge data engineering initiatives in a fast-paced environment.
Key Responsibilities:
Design and Implement Data Pipelines: Build and optimize high-performance data ingestion pipelines from diverse sources using Apache Spark and/or Azure Databricks, ensuring scalability and reliability.
Proofs of Concept: Deliver and present proofs of concept for key technology components, showcasing new approaches and solutions to project stakeholders.
Geospatial Data Integration: Develop scalable and reusable frameworks for efficiently ingesting and processing geospatial data sets, supporting both structured and unstructured data formats.
Collaborate with Teams: Work closely with cross-functional teams (data scientists, analysts, product owners) to gather requirements and ensure successful integration of data pipelines.
Optimize Performance: Continuously monitor and improve the performance, scalability, and cost-efficiency of the data pipelines and ingestion frameworks.
Ensure Data Quality: Apply best practices to ensure data quality, security, and integrity across the entire ingestion process.
Required Qualifications:
Experience: Proven experience in data engineering, specifically in building data pipelines with Apache Spark and/or Azure Databricks.
Geospatial Data Expertise: Familiarity with handling geospatial data sets, including their structure and integration techniques.
Data Engineering Frameworks: Experience in developing scalable and reusable data frameworks for complex data processing.
Programming: Proficiency in Python, Scala, or Java for building data pipelines and processing large data sets.
Cloud Technologies: Experience with cloud platforms, especially Azure, and familiarity with data processing tools such as Azure Data Lake, Databricks, or HDInsight.
Collaborative Skills: Strong communication skills and experience working in cross-functional teams.
Preferred Qualifications:
Experience with Apache Hadoop, Kafka, or other big data frameworks.
Familiarity with machine learning pipelines or data processing in AI/ML environments.
Experience with ETL/ELT processes and data warehousing solutions.
Certifications (if applicable):
Microsoft Certified: Azure Data Engineer Associate, or other relevant certifications in data engineering or cloud technologies.
Education:
Bachelor's degree