CloudBC Labs
Lead Azure Data Engineer
CloudBC Labs, Seattle, Washington, US, 98127
Location: Seattle, WA / Remote
Job Type: Long Term
Interview Mode: Hands-on coding round on SQL, Python, and PySpark
Key Skills: SQL, Python, PySpark, Databricks, Synapse Analytics, ADF/ADLS, Data Warehousing, Data Modelling, Architecture, Design
Experience: 12 years of experience is a must.

Job Description:
Lead large-scale, complex, cross-functional projects and build the technical roadmap for the WFM Data Services platform.
Lead and review design artifacts.
Build and own the automation and monitoring frameworks that present reliable, accurate, easy-to-understand metrics and operational KPIs on data pipeline quality to stakeholders.
Execute proofs of concept on new technologies and tools to select the best tools and solutions.
Support business objectives by collaborating with business partners to identify opportunities and drive resolution.
Communicate status and issues to senior Starbucks leadership and stakeholders.
Direct the project team and cross-functional teams on all technical aspects of the projects.
Lead the engineering team in building and supporting real-time, highly available data, data pipeline, and technology capabilities.
Translate strategic requirements into business requirements to ensure solutions meet business needs.
Define and implement data retention policies and procedures.
Define and implement data governance policies and procedures.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability.
Enable the team to pursue insights and applied breakthroughs while driving solutions to Starbucks' scale.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of structured and unstructured data sources using big data technologies.
Build analytics tools that use the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
Work with stakeholders, including the Executive, Product, Data, and Design teams, to assist with data-related technical issues and support their data infrastructure needs.
Perform root cause analysis to identify permanent resolutions to software or business process issues.

Basic Qualifications:
10 years of experience with object-oriented/object function scripting languages: Python, Java, etc.
8 years of experience leading development of large-scale cloud-based services on platforms like AWS, GCP, or Azure, and developing and operating cloud-based distributed systems.
Experience building and optimizing data pipelines, architectures, and data sets.
Knowledge of Incorta ETL pipelines.
Experience building processes that support data transformation, data structures, metadata, dependency, and workload management.
Strong computer science fundamentals in data structures, algorithm design, problem solving, and complexity.
Working knowledge of message queuing, stream processing, and highly scalable "big data" data stores.
Software development experience with big data technologies: Databricks, Hadoop, Hive, Spark (PySpark).
Familiarity with distributed systems and computing at scale.
Advanced working experience with SQL and NoSQL databases is required.
Proficiency in data processing using technologies like Spark Streaming and Spark SQL.
Expertise in developing big data pipelines using technologies like Kafka and Storm.
Experience with large-scale data warehousing, mining, or analytic systems.
Ability to work with analysts to gather requirements and translate them into data engineering tasks.
Aptitude to independently learn new technologies.
Experience automating deployments with continuous integration and continuous delivery systems.
Experience with DevOps and automation using Terraform or similar products is preferred.
Preferred Qualifications:
Ability to apply knowledge of multidisciplinary business principles and practices to achieve successful outcomes in cross-functional projects and activities.
Effective communication skills.
Excellent problem-solving skills.
Proficiency in debugging, troubleshooting, performance tuning, and relevant tooling.
Proven ability to manage and deploy big data implementations.
Experience building cloud-native enterprise software.
Solid understanding of data design patterns and best practices.
Proficiency in logging and monitoring tools, patterns, and implementations.
Understanding of enterprise security, REST/SOAP services, and best practices around enterprise deployments.
Proven ability and desire to mentor others in a team environment.
Working knowledge of data visualization tools such as Tableau is a plus.
Bachelor's degree in computer science, management information systems, or a related discipline.