Silverchair
Lead Data Engineer
Silverchair, Charlottesville, Virginia, United States, 22904
About Silverchair
Silverchair is the premier independent platform partner for scholarly and professional publishers, dedicated to expanding the reach of the world's most valuable knowledge. By connecting creators, publishers, and users, we amplify the impact of scholarship and enhance the accessibility of critical information. Our global teams develop, build, and host websites, online products, and digital libraries for prestigious publishers, including the American Medical Association, MIT Press, and Oxford University Press.
DEI Statement
At Silverchair, we celebrate and embrace diversity in all its forms. We are committed to fostering an inclusive environment from the moment you consider joining our team. We actively encourage candidates from diverse backgrounds to apply, believing that a variety of perspectives and experiences enriches our community, drives innovation, and strengthens our impact.
Equity and inclusion are at the core of our hiring practices, and we strive to build a team that reflects a broad spectrum of cultures, experiences, and viewpoints. We are particularly committed to increasing representation from groups historically underrepresented in technology careers. Your unique experiences and perspectives are not just welcomed but are integral to our collective success. Join us in our mission to create a culture that unites and brings out the best in all of us.
Learn more about our commitment to diversity, equity, and inclusion at Silverchair.
Overview
Silverchair seeks a seasoned Lead Data Engineer with deep expertise in Microsoft Azure to play a pivotal role in advancing our analytic capabilities. With a demonstrated history of success in developing, deploying, and managing Azure data engineering projects, this hands-on leader will bring a robust background in data engineering, cloud computing, and analytics, coupled with the capacity to spearhead projects and provide guidance to team members.
As Lead Data Engineer, you will have the opportunity to advance offerings in scholarly publishing analytics, serving world-renowned client organizations. You'll be joining Silverchair, a company that combines long-established industry presence with the agility of a nimble software organization. Our analytics team operates with a startup-like ethos offering high autonomy and robust support from leadership within a growing company. Your strategic vision and technical expertise will be key in shaping the future of our data platform as you deliver innovative data solutions to the scholarly publishing industry.
Essential Functions :
Data Platform Leadership : Take an ownership mindset to fully understand the existing data estate and partner with the Technical Director in planning the short- and long-term development path to achieve data platform enhancement goals.
Data Transformation Design : Create transformations that ensure robust and reliable data flow from source data to report delivery. Acquire existing data transformations and evolve them to improve client reporting standards and exceed client reporting expectations.
Data Pipeline Development : Design, construct, and maintain robust data pipelines using Microsoft ETL tools like Azure Data Factory, Azure Synapse Analytics, and Microsoft Fabric, ensuring optimal data flow for analytics.
Data Storage Solutions : Implement and manage Azure-based solutions (Data Lakes, Synapse SQL servers, Fabric Data Warehouses and Lakehouses), maintaining data integrity and accessibility.
Performance Tuning : Ensure efficient, reliable data handling and retrieval by monitoring and optimizing the performance of data lakes, Synapse SQL Servers and Fabric assets.
Mentor and Team Player : Willingness and ability to promote knowledge sharing and skill development among the team in a collaborative and inclusive manner. Mentor junior data engineers and provide technical guidance to cross-functional teams.
Documentation : Document data engineering projects, including architecture and established processes, using industry standard visualization tools like Lucid or Visio, DBML diagrams, and written Standard Operating Procedures.
Required Skills :
Microsoft Azure Expertise : At least 5 years of hands-on experience with Azure data services in a data engineering context, including a comprehensive understanding of Azure data solutions and best practices.
Data Modeling and Warehousing : Proficient in constructing and managing modern data warehouse and data lakehouse architectures, ensuring scalable and efficient data storage.
Pipeline Construction and Orchestration : Skilled in building and maintaining robust data pipelines and orchestrations, with a focus on scalability and reliability.
ETL/ELT Processes : Deep knowledge of ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes for data transformation and integration, particularly using Azure Data Factory and Azure Synapse.
Streaming Data Processing : Demonstrated proficiency in designing and implementing solutions for real-time event processing, including experience with managed Kafka services like Azure Event Hubs or Confluent.
Data Processing Patterns : Experience with Change Data Capture for data ingestion, as well as strategies for data movement optimization, including incremental ingestion and preservation of historical data.
SQL and Data Manipulation : Strong command of T-SQL and familiarity with Spark SQL, with the ability to perform complex data transformations, cleansing, enrichment, and performance optimization.
Analytical Problem-Solving : Exceptional problem-solving skills, with the capability to dissect complex issues, articulate them clearly, and drive towards effective solutions.
Communication and Collaboration : Excellent communication skills essential for effective teamwork and cross-functional collaboration with data analysts, business analysts, and other stakeholders.
Programming Skills : Proficiency in Python and PySpark, along with experience in scripting languages, and the development of infrastructure as code using Azure Resource Manager templates, Terraform, YAML pipelines, and PowerShell.
CI/CD Implementation : Practical experience in implementing and managing Continuous Integration and Continuous Deployment (CI/CD) practices using tools like ARM templates, Azure DevOps, and automated pipelines.
Desired Experience :
Power BI Expertise : Experience building Power BI semantic models and reporting with demonstrated capability to define dimensional models to produce effective semantic models.
Microsoft Certifications : Holding certifications such as Data Engineering on Microsoft Azure (DP-203), Azure Solutions Architect Expert (AZ-305), and Microsoft Certified: Azure Enterprise Data Analyst Associate (DP-500) will be advantageous.
DevOps and Agile : Experience with Azure DevOps, Git, and Agile Scrum environments.
Pentaho Experience : Familiarity with Pentaho is a significant plus, given existing legacy systems.
Machine Learning Integration : Experience integrating data solutions with machine learning models.
Location and Remote Work : Open for remote candidates with the ability to work within the Eastern Time Zone. Candidates based in or around Charlottesville, Virginia, are a plus.
Disclosures
At this time, we cannot sponsor a new applicant for employment authorization for this position. No agencies please.