Core4ce
Senior Data Scientist
Core4ce, Arlington, Virginia, United States, 22201
Job Description
Core4ce is looking for Senior Data Scientists to join our team supporting the Chief Digital and Artificial Intelligence Office (CDAO), which is responsible for accelerating the DoD's adoption of data, analytics, and AI to generate decision advantage from the boardroom to the battlefield. The CDAO is the lead for all AI work within the DoD.
Responsibilities
- Design, configure, develop, test, and support informatics and data science solutions for a wide array of technical use cases.
- Collaborate with cross-functional teams, including data scientists and software engineers, to integrate AI solutions developed by other elements of CDAO or the DoD community into Search Portfolio products when appropriate.
- Optimize AI models for performance, scalability, and efficiency, leveraging cloud-based resources and distributed computing frameworks, specifically Apache Spark/Databricks. Ability to adapt the code base to also run on GPU-enabled Kubernetes clusters.
- Stay updated on and contribute to the latest advancements in AI research, applying new findings to improve Search Portfolio products.
- Manage the lifecycle of AI/ML components used in Search Portfolio products, from research and development to deployment and optimization.
- Apply analytical methodologies to diagnose data-related challenges, implement solutions, and evaluate performance.
- Document and present requirements, design alternatives, and findings to team members and clients.
- Ability to develop strategic, baselined data modeling processes; ability to accurately determine cause-and-effect relationships.
- Experience with integrated development environments, data integration, data visualization, data mining, and analysis tools.
- Maintain and guide the development of common libraries and tools used by multiple teams.
- Aid in formulating a strategy for achieving rapid prototyping.
Requirements
- Bachelor's degree plus 7-10 years of experience, or a Master's degree plus 5 years of experience.
- Experience with ML fields, e.g., natural language processing, computer vision, statistical learning theory.
- Hands-on experience with Natural Language Processing (NLP), Large Language Models, text embedding, semantic query, use of generative AI for text, and retrieval-augmented generation (RAG).
- Familiarity with data preprocessing, feature engineering, and model evaluation techniques essential for machine learning projects.
- Strong understanding of various machine learning algorithms, including supervised and unsupervised learning, reinforcement learning, and neural networks.
- Experience with version control systems like Git, enabling effective collaboration and code management.
- Experience in an ML engineer or data scientist role building ML models.
- Experience writing code in Python, R, Scala, Java, C++ with documentation for reproducibility.
- Experience using Apache Spark/Databricks distributed compute environments for AI/ML workloads.
- Experience handling petabyte-size datasets, diving into data to discover hidden patterns, using data visualization tools, writing SQL, and working with GPUs to develop models.
- Experience with cloud-based data persistence products, especially RDS PostgreSQL and PostgreSQL extensions such as pgvector.
- Experience writing and speaking about technical concepts to business, technical, and lay audiences and giving data-driven presentations.
Active TS (SCI Eligibility) Required
All qualified applicants will receive consideration for employment without regard to race, color, sex, sexual orientation, gender identity, religion, national origin, disability, veteran status, age, marital status, pregnancy, genetic information, or other legally protected status.