Blue Origin

Data Engineer III - New Glenn

Blue Origin, Seattle, Washington, US, 98127


At Blue Origin, we envision millions of people living and working in space for the benefit of Earth. We're working to develop reusable, safe, and low-cost space vehicles and systems within a culture of safety, collaboration, and inclusion. Join our diverse team of problem solvers as we add new chapters to the history of spaceflight!

At Blue Origin, we're at the forefront of innovation in aerospace manufacturing and operations. Our mission is to leverage cutting-edge technology to design, build, and maintain rockets that are not only safe and reliable but also environmentally friendly. We are currently seeking a Data Engineer to join our team and play a pivotal role in leveraging the Databricks platform to support the New Glenn Data Analytics Team.

As a Data Engineer, you will guide the definition of common standards for data pipelines, ensuring high data quality and reliability. You will be at the helm of creating patterns and standards for the data models in our data factory. You will also lead the design and implementation of DevOps practices, data platform monitoring, and observability to ensure operational excellence and high availability of data services, and you will design and implement complex data pipeline orchestration for a knowledge graph build.

Responsibilities include but are not limited to:

- Develop and define data pipeline standards: Establish common standards for data pipelines, focusing on data quality, RBAC, and data classification to maintain integrity and security.
- Develop data patterns and standards: Create and implement patterns and standards for CDC and incremental loads, enhancing the platform's responsiveness and flexibility (a minimal sketch follows the technology stack list below).
- Create standards, playbooks, and processes for the team: Define the processes the data team uses to manage complex data models with dbt, Dagster, and GitLab (see the Dagster sketch at the end of this posting).
- Mentor the data team: Provide leadership and mentorship, promoting the adoption of standards, patterns, and DataOps practices across the team.
- Maintain relationships: Keep team standards and practices aligned with enterprise standards.
- DevOps practices: Design and implement DevOps practices for data operations, ensuring continuous integration, continuous delivery, and automated testing are at the core of the data engineering workflow.
- Monitoring and observability: Establish comprehensive monitoring and observability frameworks for the data platform, enabling proactive issue detection and resolution to maintain high service availability.
- Monitor, maintain, and optimize data pipelines, databases, and related infrastructure to ensure high availability, reliability, and performance.
- Troubleshoot and resolve issues related to data ingestion, processing, and storage in a timely manner.
- Conduct root cause analysis for incidents and implement corrective actions to prevent and mitigate system failures and performance bottlenecks.

Technology stack experience:

- Programming languages: Python, PySpark, SQL
- Databases: Redshift, Postgres
- Cloud services: AWS, Azure, GCP (storage, compute, databases, catalogs, streaming, replication, queueing and notification, logging and monitoring services); experience with any one cloud platform is acceptable, but AWS is preferred
- Metadata catalogs: DataHub, Informatica, Collibra (experience with any one is sufficient)
- Data quality platforms: Anomalo, Informatica, Bigeye (experience with any one is sufficient)
- Event platforms: Kafka, MSK
- Data platforms: Palantir Foundry, Databricks
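To illustrate the CDC and incremental-load pattern referenced in the responsibilities above, here is a minimal sketch of an incremental upsert into a Delta table on Databricks using PySpark. The table, path, and column names (analytics.telemetry, record_id, op) are hypothetical placeholders, not part of this posting; a production standard would add schema enforcement, data quality checks, and audit columns.

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical source: change records landed since the last run,
    # each tagged with an op code ('I' insert, 'U' update, 'D' delete).
    changes = spark.read.format("parquet").load("s3://example-bucket/cdc/telemetry/")

    # Hypothetical target Delta table.
    target = DeltaTable.forName(spark, "analytics.telemetry")

    # Incremental upsert: match on the business key, then apply deletes,
    # updates, and inserts in a single atomic MERGE.
    (
        target.alias("t")
        .merge(changes.alias("s"), "t.record_id = s.record_id")
        .whenMatchedDelete(condition="s.op = 'D'")
        .whenMatchedUpdateAll(condition="s.op <> 'D'")
        .whenNotMatchedInsertAll(condition="s.op <> 'D'")
        .execute()
    )

In practice the change feed would first be deduplicated to the latest record per key, so a single MERGE applies at most one change per row.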
Qualifications:

- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- Proven experience as a Data Engineer, including hands-on experience with Databricks.
- A motivated self-starter with excellent communication skills; you will support a diversely skilled team of data professionals.
- Strong command of DevOps practices and tools for infrastructure as code, continuous integration, and deployment automation (for data models).
- Solid knowledge of data
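For candidates less familiar with Dagster, the orchestrator named in the responsibilities, here is a minimal sketch of its software-defined asset style, in which each data model is declared as an asset and dependencies are expressed through function parameters. The asset names and logic are hypothetical, not part of this posting; in this role the assets would typically wrap dbt models rather than inline Python.

    from dagster import Definitions, asset

    @asset
    def raw_telemetry():
        # Hypothetical extract step: pull the latest batch of records.
        return [{"record_id": 1, "value": 42}, {"record_id": 2, "value": None}]

    @asset
    def cleaned_telemetry(raw_telemetry):
        # Hypothetical transform step: Dagster wires the upstream asset in
        # by matching the parameter name; drop incomplete records here.
        return [r for r in raw_telemetry if r["value"] is not None]

    defs = Definitions(assets=[raw_telemetry, cleaned_telemetry])

Running dagster dev against this module loads the asset graph, and Dagster materializes cleaned_telemetry only after raw_telemetry succeeds, which is the dependency discipline the team's dbt and GitLab process builds on.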