Karkidi

Principal Data Scientist, Proteomics

Karkidi, South San Francisco, California, 94083


Who We Are:
Calico (Calico Life Sciences LLC) is an Alphabet-founded research and development company whose mission is to harness advanced technologies and model systems to increase our understanding of the biology that controls human aging. Calico will use that knowledge to devise interventions that enable people to lead longer and healthier lives. Calico's highly innovative technology labs, its commitment to curiosity-driven discovery science, and, with academic and industry partners, its vibrant drug-development pipeline, together create an inspiring and exciting place to catalyze and enable medical breakthroughs.

Position Description:
Calico is seeking a statistically savvy Principal Proteomics Data Scientist to join our multidisciplinary team of mass spectrometrists, biologists, and computer scientists. This candidate will help bridge the gap between state-of-the-art proteomics measurements and questions around the biology of aging by inventing new approaches to proteomics data analysis as well as applying existing state-of-the-art methods. The Proteomics team at Calico generates extensive quantitative proteomics datasets from both large, unique cohort studies and non-standard proteome profiling experiments. These datasets, representing organisms ranging from yeast to human, require rigorous statistical analyses, novel signal extraction and processing pipelines, and input into experimental design and validation. The ideal candidate has exceptional statistical and computational skills, a proven history of applying statistical concepts to complex biological problems, direct experience with proteomics data analysis techniques, and a strong record of publication and tool development. The candidate will primarily work with mass spectrometry proteomics data, but experience with non-mass spectrometry proteomics data would also be highly valued. In this role, the candidate would work collaboratively with members of the Proteomics group, the Computing team, and the broader R&D organization.

Position Responsibilities:
- Invent and develop data analysis strategies, write algorithms, and broadly support the analysis of proteomics data, in particular data from large patient cohorts and non-standard proteome profiling experiments
- Maintain and continue to develop in-house proteomics algorithms and software tools
- Explore and deploy external algorithms, software, and tools as needed to complement in-house software
- Work with software engineers, data scientists, the proteomics team, and scientists at Calico to develop novel platforms for extracting biological insight from experimental data at multiple spatial and temporal scales
- Interact closely with scientists in the proteomics team, basic research, and drug development to understand their data analysis needs and provide answers to technical questions through custom analyses, 1-on-1 communication, presentations, and written documents, including scientific publications

Position Requirements:
- Master's degree or Ph.D. in a quantitative discipline such as statistics, biostatistics, computer science, bioinformatics, or computational biology
- 7 years of experience post-Ph.D. (or 10 years post-M.S.) in either academic or industry settings
- Expertise in statistical data analysis
- Experience working with proteomics data and proteomics software (e.g., PEAKS, Sequest, Byonic, PD, MaxQuant)
- Fluent coding skills in one or more programming languages (preferably R and/or Python). Advanced proficiency with the R programming language is strongly preferred.
- Outstanding communication and collaborative work ethic
- Intellectual curiosity, attention to detail, and good follow-through
- Proactive approach to collaborations and a demonstrated ability to work in a team environment
- Must be willing to work onsite at least four days a week

Nice to Have:
- Experience with non-standard proteomics workflows (e.g., microbiome, de novo, HDX, FPOP, CFMS)
- Experience with DIA (Data Independent Acquisition)
- Experience with post-translational modifications, and in particular glycoproteomics
- Experience working with large cohort and population-scale datasets
- Experience with additional programming languages (e.g., C++, MATLAB, SQL, Julia) and cloud computing environments
- Good understanding of human physiology and molecular biology and how proteomics data can be applied to age-related biological questions
- Experience in multi-omics data analysis and combining orthogonal proteomics methods with genomics (e.g., PhIP-seq, single-cell)
- Experience with non-mass spectrometry proteomics data (e.g., Olink, SomaLogic)
- Experience with software development best practices (version control, CI/CD, etc.)

The estimated base salary range for this role is $199,000 - $203,000. Actual pay will be based on a number of factors including experience and qualifications. This position is also eligible for two annual cash bonuses.