Storm3
Senior Data Architect
Storm3, San Francisco, California, United States, 94199
Data Architect / Engineering Lead - ML/AI Drug Discovery Unicorn (Series B)
Remote within US
Competitive base + hefty equity

This start-up is truly a leader in deep learning & generative AI, merging cutting-edge models with biology research to accelerate disease treatment & drug discovery.

They are seeking an experienced, self-driven Data Architect / Engineer Lead - ML (note: final level determined through interview process) to join their fast-growing AI/ML team. This Lead will design and develop the data model and lakehouse, and work alongside ML architects & scientists to scale data curation and pipelines for ML on a modern tech stack that enables the productization of unique phenotypic, target-agnostic drug discovery platforms.

Responsibilities:
- Design, implement and support data lakehouse infrastructure using a modern cloud-service stack (AWS preferred): Python, S3, Batch, Lambda, EKS, IAM, REST (Redshift, Glue, Athena, ECR, Parquet are a plus)
- Develop ETLs and real-time data pipelines to source & curate data from internal and public data sources (experience with biopharma data: image, omics, molecular is a plus)
- Own the end-to-end data model (for structured and unstructured data) for ML training & inference and statistical analysis

Requirements:
- Bachelor's degree (Master's preferred, or equivalent years of industry experience) in computer science, engineering, analytics, mathematics, statistics or equivalent
- Deep expertise working with large data sets, data visualization, building complex data processes, performance tuning, bringing together data from disparate data stores and programmatically identifying patterns to optimize ML utilization
- 5+ years of software development experience working on large-scale cloud-based services & data environments:
  o Python for data modeling, warehousing and ETL
  o Relational SQL and NoSQL databases (experience with large non-relational DBs/stores: object, graph, columnar DBs/stores is a plus)
  o Automated build processes with CI/CD in cloud, cluster & workload management
  o Adherence to production environments (agile, regression testing, version control)
  o Familiarity with big and real-time data governance and ML workflow orchestration (experience with Spark, Databricks, MLflow is a plus)

Bonus qualifications:
- Experience developing software components in a start-up environment
- Experience in biotech and drug discovery
- A self-reliant problem-solver who also excels in teamwork, characterized by strong data-driven, first-principles decision-making and superb communication skills

Remote within the USA

Interested in applying? Please click on the 'Easy Apply' button or alternatively email me your resume at stefani.lukic@

Storm3 is a HealthTech recruitment firm with clients across major Tech hubs in Europe, APAC and North America. To discuss open opportunities or career options, please visit our website and follow the Storm3 LinkedIn page for the latest jobs and intel.