Baer Group
GenAI Data Scientist (16565)
Baer Group, Baltimore, Maryland, United States, 21276
** Federal Project - Applicant must be a United States Citizen, with the ability to obtain a Public Trust. **
Baer is looking for a GenAI Data Scientist for a 9-month Federal project located in Baltimore, MD.
Title: GenAI Data Scientist
Location: Remote and Baltimore, MD (Future Onsite)
Duration: 9 months
Rate: All-inclusive
Alignment: W2 or C2C (Vendors Not Permitted)
Description:
Formulate, design, and deliver AI/ML-based decision-making frameworks and generative models for business outcomes.
Develop and fine-tune AI models for NLP tasks, such as summarization, Named Entity Recognition (NER), text classification, and sentiment analysis, focusing on unstructured clinical records.
Implement dynamic prompt engineering strategies to optimize generative AI model outputs and improve overall performance.
Analyze and preprocess large datasets, especially unstructured medical records (e.g., physician notes, discharge summaries), using libraries like Pandas, NLTK, and spaCy.
Conduct experiments to evaluate AI model performance, utilizing metrics such as precision, recall, and F1-score, and continuously improve models through hyperparameter tuning.
Collaborate with cross-functional teams, including data scientists and software engineers, to integrate AI models into cloud-based production environments (e.g., AWS, Azure).
Incorporate human-in-the-loop feedback to refine AI models and improve outcomes for clinical use cases.
Stay updated with the latest research and advancements in AI and NLP, applying cutting-edge techniques such as transfer learning, attention mechanisms, and fine-tuning pre-trained models.
Design and implement data pipelines, structuring both SQL and NoSQL databases (e.g., PostgreSQL, MongoDB) for efficient data storage and retrieval.
Deploy AI models using cloud platforms (AWS, Azure) and containerization (Docker), leveraging CI/CD pipelines for scalability and performance optimization.
Requirements:
5+ years of experience in AI/ML development with a strong focus on NLP and Generative AI, using frameworks such as TensorFlow, PyTorch, and Hugging Face.
Mastery of Python and proficiency in libraries such as Transformers, NLTK, spaCy, and Gensim, with experience in data manipulation using Pandas and NumPy.
Demonstrated expertise in generative AI models (e.g., OpenAI’s GPT, LLaMA) and libraries like vLLM.
Experience in prompt engineering strategies for generative AI model enhancement.
Familiarity with cloud platforms (AWS, Azure) and containerization (Docker), with hands-on experience deploying machine learning models using CI/CD pipelines.
Experience with human-in-the-loop systems, integrating feedback from clinicians to refine AI models.
Strong analytical and statistical modeling skills, with experience evaluating model performance and iterating on improvements.
Experience working with healthcare data standards such as HL7, FHIR, ICD codes, and SNOMED.
Familiarity with SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, Elasticsearch) databases, and experience optimizing them for AI model integration.
Ability to articulate technical challenges and solutions effectively, with strong written and verbal communication skills.
Master’s degree in Data Science, AI, Computer Science, or a related field + 10 years of experience, or PhD + 4 years.
Public Trust Security Clearance is the lowest level of additional background screening that the federal government requires for applicants to certain jobs. It includes completing Standard Form 85 (SF85).