Patternbio
Senior Data Engineer (Open Level)
Patternbio, San Francisco, California, United States, 94199
At Pattern Bio, we are creating next-generation cancer therapies at the intersection of synthetic biology and machine learning, with data as the cornerstone of our platform. Our mission is to transform disease treatment, starting with cancer, by leveraging our innovative biomolecular computing technology. By utilizing multi-input, molecular-level computation within individual cells, we aim to deliver curative therapies where the traditional one drug-one target approach has fallen short. Pattern Bio is assembling a world-class team to bring two decades of advancements in DNA computing and machine learning into the clinic.

About the Role

We are seeking a highly skilled Senior Data Engineer (will consider Staff or Principal depending on experience) to build and manage our data infrastructure. In this role, you will design, deploy, and maintain a robust data engine that supports the lifecycle of experimental and large-scale omics data. Your expertise will be critical in advancing our pipeline of innovative cancer therapies.

What You'll Do

Develop Data Infrastructure: Build and manage scalable data pipelines for batch ingestion of large-scale omics datasets (e.g., RNA-seq, DNA-seq) into a cloud data warehouse.
Enforce Data Schemas: Develop and implement enforceable data schemas, data versioning strategies, and data migration to maintain data integrity and consistency.
Optimize Databases: Design and optimize relational databases to support complex queries across diverse datasets from multiple experiments.
Integrate ELNs: Connect Electronic Lab Notebooks (ELNs) with our data infrastructure, managing schema control metadata and data tables from lab experiments.
API Development: Develop robust Python APIs to facilitate data queries and analyses, and to build data dashboard visualizations and download tools.
Automate Processes: Implement automated data quality checks, pre-processing steps, and triggers for algorithmic analysis on new data arrivals.
Generate Reports: Automate the creation of analytical reports and visualizations, integrating them into documentation workflows and team communication channels.
Collaborate: Work closely with cross-functional teams to support data-driven research and development efforts.

Qualifications

Education: MS or PhD in Computer Science, Data Engineering, Bioinformatics, or a related field.
Experience: 5+ years of industry experience in data engineering, including 3+ years in biotech or life sciences.
Technical Skills:
  Strong intuitive understanding of data infrastructure and architecture principles.
  Demonstrated track record of building schema-controlled databases to manage diverse data types.
  Experience with diverse omics data types, and methods for normalization, batch correction, and harmonization.
  Expertise in Python programming and data engineering frameworks.
  Proficiency with relational databases, particularly PostgreSQL.
  Experience with cloud platforms (AWS preferred) and data warehouse technologies.
  Expertise in data versioning and migration strategies.
  Knowledge of data visualization tools and API development.
  Experience with FDA compliant databases preferred.

This is an in-person role based at our headquarters in South San Francisco, CA. To apply, send your resume, cover letter, and portfolio to careers@patternbio.com.