Anblicks, Dallas, TX, United States
Job Title: Senior Data Engineer
JOB DUTIES:
Participate in daily Agile and Scrum processes to understand changing business requirements, including examining system configurations and operating procedures, as well as gathering functional requirements. Ingest and prepare business-ready data in the cloud using Azure Data Factory (ADF) to build ELT (Extract, Load, and Transform)/ETL (Extract, Transform, and Load) data pipelines, then move the data into a data warehouse (Dedicated SQL Pool) and create data lake zones for data analytics and visualization. Work with a combination of Azure Data Factory and Azure Databricks to extract, load, and transform data from cloud sources and on-premises databases such as Oracle, SAP, and SQL Server into Data Lake Storage and Azure Synapse Analytics. Create Azure Data Factory pipeline templates to migrate an on-premises data platform to the Azure cloud using batch processing with incremental or full loads, and build an ADF config-driven framework that pulls data from multiple sources with different table structures with less manual work and lower resource usage. Analyze, design, and build modern data solutions using Azure PaaS services to support data visualization. Understand the current production state of the application and the impact of new implementations on existing business processes. Enable private endpoints, firewall settings, and Azure Key Vault for robust data security. Analyze existing SSIS packages and integrate them with Azure Data Factory, using SSIS transformations such as Lookup, Derived Column, Data Conversion, Aggregate, and Conditional Split, along with the SQL, Script, and Send Mail tasks. Create JSON structures for data storage in Azure Cosmos DB (SQL API), write stored procedures and functions, and work with the API team to create Cosmos DB queries that consume fewer request units. Model data in Snowflake and perform ELT using Snowflake SQL, implementing complex stored procedures and applying data warehouse and ETL best practices. Design and customize dimensional data models for the data warehouse in Azure Synapse Analytics, selecting the appropriate distribution method for dimension and fact tables to load data efficiently, and implement complex stored procedures following data warehouse best practices. Build distributed in-memory Spark applications and perform analytics efficiently on large datasets using Python and Spark SQL; also use Spark SQL to implement transformation logic in Databricks and to mount/unmount Azure Blob Storage. Read data in different file formats (Parquet, Avro, CSV, and JSON) using PySpark (the Python API for Spark) in Azure Databricks, perform data extraction and transformation to uncover insights into customer usage patterns, and insert curated data into the data warehouse. Create data visualization reports and dashboards in Power BI using data from the data warehouse, flat files, and Azure SQL. Fix problems and investigate SQL queries and stored procedures related to long-running jobs and Azure service performance. Use Azure Monitor and Alert services to create monitors, alerts, and notifications for Data Factory, Synapse Analytics, and Data Lake Storage. Perform the required daily Git support for various projects, maintaining Git repositories and access control procedures. Create CI/CD using Azure DevOps pipelines to deploy Azure services (Storage, Data Factory, Key Vault, and Logic Apps) using ARM templates.
JOB REQUIREMENT:
Master's degree in Computer Science, Computer Information Systems, or an Engineering-related or Technical-related field, plus 2 years of experience. In lieu of the above, we will also accept a Bachelor's degree in Computer Science, Computer Information Systems, or an Engineering-related or Technical-related field, plus 5 years of progressively responsible post-baccalaureate experience. A foreign degree equivalent is acceptable. We will also accept any suitable combination of education, training, and/or experience. Experience must include working with Azure Data Factory (ADF), Oracle, SQL Server, Azure Databricks, Azure Synapse Analytics, Data Lake Storage, Azure PaaS services, SSIS packages, Azure Cosmos DB (SQL API), Python, Spark applications, Azure SQL, SQL queries, Azure Monitor and Alert services, Git support, and ARM templates.
HOURS: M-F, 8:00 a.m. - 5:00 p.m.
JOB LOCATION: Dallas, Texas. Travel is not required, but candidates must be willing to relocate to unanticipated work locations across the country per contract demand.
CONTACT: Email resume referencing job code# SDE01302023ANB to Maruthi Technologies INC. DBA Anblicks at [email protected]