RIT Solutions, Inc.
Principal Data Engineer (GCP)
RIT Solutions, Inc., Phoenix, Arizona, United States, 85223
Location: 100% Remote
6-month contract to hire
Our client is seeking a Principal Data Engineer who is passionate about data in all its forms, whether stored in relational databases, data warehouses, data lakes, and lakehouses, or in transit through ETL pipelines. The role involves architecting and implementing data solutions that deliver insights, visualizations, or better predictions for their clients. The Principal Data Engineer will support software development teams, data analysts, and data scientists using market-relevant products and services.
Key Responsibilities
- Oversee the entire technical lifecycle of a cloud data platform, including framework decisions, feature breakdowns, technical requirements, and production readiness
- Design and implement a robust, secure data platform in GCP using industry best practices, native security tools, and integrated data governance controls
- Translate defined data governance strategies into technical requirements, implementing controls, documenting processes, and fostering a data-driven culture
- Apply advanced SQL skills to work with relational databases, BigQuery, and a variety of other databases
- Build analytics tools that leverage the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other business performance metrics
- Design and implement scalable and reliable data pipelines on GCP
- Implement Change Data Capture (CDC) techniques and manage Delta Live Tables for real-time data integration and analytics
- Configure and manage data lakes in GCP to support diverse data types and formats for scalable storage, processing, and analytics
- Design API architecture, including RESTful services and microservices, and integrate machine learning models into production systems
- Build the infrastructure required for ETL of data from various sources using SQL and GCP
- Migrate data pipelines and infrastructure from AWS or Azure to GCP, and create new ones in GCP
- Write and maintain robust, efficient, scalable Python scripts for data processing and automation
- Apply a strong understanding of data pipeline design patterns to determine the best solution for each use case
- Work with unstructured datasets and build processes supporting data transformation, structures, metadata, dependency, and workload management
- Collaborate with stakeholders to assist with data-related technical issues and support their data infrastructure needs
- Ensure the stability and security of data in transit and at rest
- Build internal processes, frameworks, and best practices for the data engineering domain
- Foster cross-functional collaboration between engineering and other project disciplines
- Mentor and support the growth of other data engineers
- Participate in the internal leadership of the data engineering domain and provide technical feedback and recommendations
- Assess the technical skills of prospective candidates and provide recommendations to hiring managers
- Assist with sales requests by providing technical recommendations and estimates to prospective clients
Skills & Qualifications
- Bachelor's degree in Computer Science or a related field, or equivalent experience, required
- At least 15 years of overall technical experience
- 3 years leading large GCP projects
- Extensive knowledge of GCP data services such as BigQuery, Dataflow, Dataproc, and Pub/Sub
- Experience designing and implementing data governance and compliance policies
- Proficiency in Python and SQL
- Experience migrating data pipelines and infrastructure to GCP
- Deep understanding of data modeling, ETL processes, and data warehousing principles
- Familiarity with data pipeline orchestration tools and practices
- Excellent problem-solving and analytical skills
- Strong communication skills, with the ability to convey technical information to non-technical stakeholders
- Proactive collaborator with a history of mentoring colleagues
- Experience building and optimizing big data pipelines and datasets
- Experience with APIs and additional database management systems is a plus