Culinovo, Inc.
Job Title: Data Engineer or Databricks Engineer
Location: Plano, TX or Houston, TX
Hybrid
Overview:
We are seeking a talented and motivated Databricks Engineer to join our growing team. In this role, you will leverage the Databricks platform to build, optimize, and manage data engineering pipelines. Your primary tasks will include working with Apache Spark, integrating data sources, optimizing ETL workflows, and deploying data models into production. You will work closely with data scientists, data analysts, and other teams to deliver actionable insights and solutions.
Key Responsibilities:
- Data Engineering: Design, develop, and maintain scalable data pipelines using Databricks, Apache Spark, and other cloud-based tools (AWS, Azure, GCP).
- ETL Processes: Build and optimize ETL pipelines to move data from source to destination efficiently and effectively.
- Platform Management: Set up, configure, and manage the Databricks platform, ensuring optimal performance and cost-effectiveness.
- Data Integration: Integrate data from multiple sources, including structured, semi-structured, and unstructured data.
- Collaboration: Work closely with data scientists, data analysts, and other stakeholders to translate business requirements into technical solutions.
- Optimization: Continuously monitor and improve the performance of existing data pipelines, ensuring they run efficiently and without errors.
- Automation: Develop and implement automation strategies for data ingestion, transformation, and deployment using Databricks jobs, notebooks, and workflows.
- Security: Ensure proper security practices are followed for data storage, access, and sharing within the platform.
- Documentation: Maintain clear documentation of pipeline architectures, configurations, and processes for knowledge sharing within the team.

Required Skills and Qualifications:
- Experience: 8+ years of experience in data engineering or a similar role, with a focus on Databricks and cloud data platforms.
- Databricks Expertise: Strong experience with the Databricks environment, including managing clusters, working with notebooks, and building pipelines.
- Apache Spark: Advanced knowledge of Spark for large-scale data processing, optimization, and debugging.
- Programming Languages: Proficiency in Python, Scala, or SQL for data engineering tasks.
- Cloud Platforms: Experience working with at least one major cloud provider (AWS, Azure, or GCP).
- ETL Frameworks: Solid understanding of ETL frameworks and tools, such as Apache Airflow, dbt, or similar.
- Data Warehousing: Familiarity with modern data warehousing concepts and tools (e.g., Snowflake, Redshift, BigQuery).
- Database Skills: Experience with relational (SQL) databases and NoSQL databases (e.g., MongoDB, Cassandra).
- Version Control: Proficiency with version control systems such as Git.
- Problem-solving: Strong analytical and troubleshooting skills.
- Communication: Excellent communication skills for collaborating with cross-functional teams and stakeholders.

Preferred Qualifications:
- Certifications: Databricks Certified Associate or Professional.
- Big Data Tools: Experience with tools like Kafka, Delta Lake, or Apache Flink.
- Machine Learning: Exposure to machine learning models and working knowledge of ML pipelines.
- Data Visualization: Familiarity with data visualization tools like Tableau, Power BI, or Databricks SQL Analytics.