Prometheus Federal Services
Informatics Data Scientist Lead
Prometheus Federal Services, Washington, District of Columbia, US, 20022
Prometheus Federal Services (PFS), a trusted partner to federal health and social services agencies, has an opening for an Informatics Data Scientist Lead. This position is responsible for developing and maintaining our Python codebase, focusing on Extract-Transform-Load (ETL) processes and bioinformatics pipelines. The role requires a blend of technical expertise in data science and bioinformatics, with a strong emphasis on Python programming, data processing, and high-performance computing.

Essential Duties and Responsibilities
The successful candidate may be responsible for, among other things:
- Develop, maintain, and document Python code for ETL processes and bioinformatics pipelines
- Ensure that code is well-documented, version-controlled, and adheres to industry standards such as PEP 8
- Implement automated testing frameworks (e.g., pytest) to ensure the reliability and performance of code
- Create logging mechanisms to monitor processes and troubleshoot issues
- Design and implement ETL processes to extract data from various sources, transform it as needed, and load it into relational databases
- Enhance and maintain existing ETL processes, ensuring they are well-documented and tested
- Align and harmonize data from multiple sources for integration into master datasets
- Develop bioinformatics pipelines for tasks such as variant calling, gene expression analysis, and data annotation
- Work within a Linux-based high-performance computing environment using command-line tools
- Use Python-based workflow tools such as Snakemake to create and manage complex workflows
- Perform testing and validation of bioinformatics pipelines, ensuring accuracy and efficiency
- Collaborate with cross-functional teams, including data engineers, researchers, and project managers
- Participate in regular meetings to discuss project progress, challenges, and goals
- Provide support to research and data teams, helping to structure and prepare data for analysis and modeling

Minimum Qualifications
- Bachelor’s degree in Data Science, Computer Science, Bioinformatics, or a related field
- Minimum of eight (8) years of experience
- Minimum of five (5) years of federal consulting experience
- Strong experience in Python programming, particularly in the context of ETL processes and bioinformatics
- Familiarity with version control systems (e.g., Git) and workflow management tools like Snakemake
- Experience working in Linux-based high-performance computing environments
- Knowledge of relational databases and data integration techniques
- Experience with automated testing and logging best practices
- Strong analytical and problem-solving skills
- Excellent communication and documentation skills
- Ability to work both independently and as part of a team
- Authorized to work in the U.S. indefinitely without sponsorship
- Ability to obtain a public trust clearance

Preferred Qualifications
- Experience in healthcare, life sciences, or related industries
- Master’s degree in Data Science, Computer Science, Bioinformatics, or a related field
- VHA experience
- Knowledge of bioinformatics tools and pipelines
- Familiarity with AI/ML concepts and their application to data science