Microsoft Corporation
Principal Applied Data Scientist
Microsoft Corporation, Redmond, Washington, United States, 98052
Are you passionate about large-scale machine learning? Are you interested in large language models? We are the Web Data Platform Document Understanding team, and we are looking for a Principal Applied Data Scientist: a committed team contributor who will develop cutting-edge solutions for various document understanding tasks.
We build a web-scale index along with a deep understanding of document content to retrieve rich facets, semantic views, and dense representations while weeding out poor-quality pages, powering relevant and delightful experiences across products such as Bing Search Engine, Personalized Recommendations, Ads, and Windows experiences.
The Document Understanding team plays a central role in understanding the public web information needs of users and in building next-generation models that stay ahead of the curve, so that platform capabilities advance hand in hand with modeling improvements. Work on our team is unique: you will work with industry-leading scales of data (raw and training data), computing resources, and best-quality large language models such as GPT4 and multimodal large language models.
In this role, you will lead projects from idea creation through implementation and experimentation to delivering improvements in real-world scenarios, working closely with various partners. We are looking for a passionate and motivated team member who has solid, hands-on experience in transforming business problems into ML problems, collecting high-quality labels, and developing state-of-the-art models to address product challenges and drive value for end users. You will also leverage your software engineering skills and machine learning (ML) expertise in fields such as natural language processing and information retrieval to help create the next generation of text representation and understanding techniques at Microsoft.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
Lead innovation and development of deep learning models for document understanding and their usage in downstream tasks, e.g., Copilot, Generative Search, Search, Spam, QA, and recommendation.
Identify opportunities in the Web Data space that can be solved using ML at a scale of hundreds of billions of documents.
Push the state-of-the-art in those areas through multiple aspects, for example:
Defining the problem space.
Gathering training data at scale.
Exploring model design and architecture.
Exploring learning objectives and tasks.
Build Feature Generation Algorithms used at Index Generation time.
Build Automated Document Understanding Training Pipelines.
Guide team members to develop new technologies that lead to solutions that impact real production scenarios in Microsoft.
Work closely with various teams in WebXT to understand common needs and build a technical roadmap for addressing them.
Qualifications
Required Qualifications
Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 6+ years related experience (e.g., statistics, predictive analytics, research)
OR Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 4+ years related experience (e.g., statistics, predictive analytics, research)
OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 3+ years related experience (e.g., statistics, predictive analytics, research)
OR equivalent experience.
Hands-on experience in developing algorithms and models using deep learning frameworks such as TensorFlow, PyTorch, etc.
Other Requirements
Ability to meet the Microsoft, customer, and/or government security screening requirements for this role. These requirements include, but are not limited to, the following specialized security screenings:
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Preferred Qualifications
Doctorate in Computer Science (CS), Electrical Engineering (EE), or a related STEM field with a focus on natural language processing (NLP), machine learning (ML), or computer vision (CV)
OR Master's Degree in CS, EE, or a related STEM field with a focus on natural language processing (NLP), machine learning (ML), or computer vision (CV)
8+ years of experience in product development in the areas of Software Engineering and ML (Deep Learning).
A demonstrated track record of excellent communication and collaboration skills.
Potential to think big while showing progress with real-world impact during design and development.
Actively conducting research in at least one of the following areas: artificial intelligence, data science, information retrieval, machine learning, and natural language processing.
Experience with search and recommendation systems.
Knowledge of web document understanding concepts, methods, applications, and challenges.
Applied Sciences IC5 - The typical base pay range for this role across the U.S. is USD $137,600 - $267,000 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $180,400 - $294,000 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
Microsoft will accept applications for the role until November 22, 2024.
#Bing
Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).