Unity Health Toronto
Senior Data Scientist GEMINI Data Ops
Unity Health Toronto, San Francisco, California, United States, 94199
GEMINI (www.geminimedicine.ca) is a unique big data platform in the Canadian healthcare landscape that uses advanced methods and analytics to extract and standardize data captured in hospital electronic health records. GEMINI currently operates at 30+ Ontario hospitals and supports the Ontario General Medicine Quality Improvement Network (GeMQIN), a provincial network led by Ontario Health to improve care for general medicine hospital patients. General medicine patients represent 40% of emergency admissions to hospital and are the largest group of hospitalized patients. GEMINI is a collaborative data and analytics platform for all Ontario hospitals to accelerate research and quality improvement, leading to excellent hospital care.
The main roles of a Senior Data Scientist are to lead and design scientifically rigorous approaches to examining data, apply advanced analytical techniques to healthcare data, and translate the findings into meaningful, applied knowledge for researchers and other collaborators. This includes developing machine learning algorithms, creating training and validation pipelines, and performing and interpreting model evaluation for real-world healthcare applications.
The Senior Data Scientist will have excellent data science and programming skills, and an aptitude for applying machine learning to deliver real outcomes in the clinical environment. They will provide leadership and mentorship to the team and collaborate with external stakeholders.
Duties and Responsibilities:
Data Analysis and machine learning (45%)
Conduct analysis with supervised and unsupervised machine learning methods.
Apply natural language processing methods to unstructured text data, such as bag of words, TF-IDF vectorization, and word embeddings.
Perform hyperparameter searches to find the best-fitting models.
Apply common algorithms to answer project objectives using libraries from R/Python, e.g. linear regression, logistic regression, GLM, GAM, penalized regression, SVM, random forest, XGBoost, neural networks, and common time series models (ARIMA, Holt-Winters, state space models, etc.).
Develop and apply advanced machine learning models, such as neural networks (feed-forward, convolutional, recurrent), using the following libraries: Keras, TensorFlow, PyTorch.
Analyze and interpret results of projects and draw conclusions for review and incorporation into reports and presentations.
Compare and contrast results from different models using appropriate classification, regression, and calibration performance metrics.
Work closely with DevOps teams to put models into production.
Build systems to closely monitor model performance post-deployment.
Clearly document all model details, code, and iterations using appropriate commenting and version control.
Lead and develop project analytic plans outlining key components of analytical approaches (40%)
Work with collaborators and researchers to understand which analytical approach would be best suited to answer the project objectives.
Provide on-demand consultative data science expertise for ad-hoc requests and recommend data science approaches to meet collaborator needs.
Understand data requirements based on stated project goals.
Integrate with team members, including Data Scientists, to develop multi-disciplinary analytic approaches, and assist in best practices and process development.
Mentorship and Communication (15%)
Mentor and lead junior team members to establish coding techniques, structure datasets for studies, and develop sound analytical plans.
Mentor other team members and provide project/analytical guidance.
Prepare presentations and manuscripts for various large audiences.
Provide recommendations for analytical approaches to other Data Scientists and external investigators.
Provide recommendations based on results of analyses and project objectives.
Construct elegant visualizations and dashboards to communicate findings/output to end users, managers, and senior executives.
Document and communicate errors in data/code to team members, including senior staff.
Qualifications:
Master's Degree Required.
Skilled in deploying and maintaining scalable models in production environments using CI/CD pipelines to ensure high reliability and performance.
Proven experience developing, testing, and optimizing performance of NLP models, including tokenization, named entity recognition, and language generation, with open source frameworks such as spaCy and Hugging Face's Transformers.
Expert in SQL, R and/or Python, and git.
Expertise in a broad range of machine learning techniques for structured and unstructured data with the ability to adapt and apply new methods as needed.
Deep understanding of applied prediction tasks, including model evaluation, model fairness, and implementation of real-world tools.
Proficiency in regression for continuous normal, skewed, binary, count, and time-to-event outcomes.
Able to independently identify and implement methods that most effectively address project-specific objectives, without substantial methodological oversight.
Working understanding of causal inference and common study designs used in observational research.
Apply advanced analytical techniques to healthcare data, and translate the findings into meaningful, applied knowledge for internal and external stakeholders.
Mentor other team members and provide project and analytical guidance.
Familiarity with open source large language models is a plus.
Experience with cloud object storage (e.g., MinIO, Amazon S3).
Understanding of managing real-time transactional PostgreSQL databases is a plus.
Please Note: Registering and making an account with Unity Health does not mean you have submitted an application for the position you would like to apply for. Please ensure you register and make an account with Unity Health AND apply to the position. Both steps must be completed for your application to be considered.
Thank you for applying.