Capital One
Manager, Data Scientist, Generative AI Systems (Genesis)
Capital One, Chicago, Illinois, United States, 60290
Center 1 (19052), United States of America, McLean, Virginia
Data is at the center of everything we do. As a startup, we disrupted the credit card industry by individually personalizing every credit card offer using statistical modeling and the relational database, cutting-edge technology in 1988! Fast-forward a few years, and this little innovation and our passion for data have skyrocketed us to a Fortune 200 company and a leader in the world of data-driven decision-making.
As a Data Scientist at Capital One, you’ll be part of a team that’s leading the next wave of disruption at a whole new scale, using the latest in computing and machine learning technologies and operating across billions of customer records to unlock the big opportunities that help everyday people save money, time and agony in their financial lives.
Team Description
The Generative AI Systems (Genesis) team within Card Data Science builds state-of-the-art, generative AI-based solutions for dialogue, text summarization, reading comprehension, speech recognition, and image/document processing, as well as time-series sequence modeling. We partner with product, tech, and design teams to deliver internal applications based on these solutions that drive efficiency in our business and data analytics teams, as well as customer-facing applications that enhance the customer experience. You will work with a seasoned group of natural language processing (NLP), speech, and computer vision specialists, experimenting with emerging technologies in generative AI, delivering software implementing these technologies, and contributing research to major NLP and AI/ML conferences.
Role Description
In this role, you will:
Train, fine-tune, and customize large language models across multiple modalities (text, speech, and vision) for use in downstream applications such as dialogue and image processing.
Partner with a cross-functional team of data scientists, software engineers, and product managers to deliver a product customers love.
Leverage a broad stack of technologies — PyTorch, Hugging Face, Spark, LangChain and more — to reveal the insights hidden within huge volumes of numeric and textual data.
Build machine learning models through all phases of development, from design through training, evaluation, validation, and implementation.
Contribute research to top-tier NLP conferences such as the Annual Meeting of the Association for Computational Linguistics (ACL) and the Conference on Empirical Methods in Natural Language Processing (EMNLP).
The Ideal Candidate is:
Innovative. You continually research and evaluate emerging technologies. You stay current on published state-of-the-art methods, technologies, and applications and seek out opportunities to apply them.
Creative. You thrive on bringing definition to big, undefined problems. You love asking questions and pushing hard to find answers. You’re not afraid to share a new idea.
Technical. You’re comfortable with open-source languages and are passionate about developing further. You have hands-on experience developing data science solutions using open-source tools and cloud computing platforms.
Statistically-minded. You’ve built models, validated them, and backtested them. You know how to interpret a confusion matrix or a ROC curve. You have experience with clustering, classification, sentiment analysis, time series, and deep learning.
Basic Qualifications:
Currently has, or is in the process of obtaining, a Bachelor’s Degree plus 6 years of experience in data analytics; or currently has, or is in the process of obtaining, a Master’s Degree plus 4 years of experience in data analytics; or currently has, or is in the process of obtaining, a PhD plus 1 year of experience in data analytics, with an expectation that the required degree will be obtained on or before the scheduled start date.
At least 2 years’ experience in open-source programming languages for large-scale data analysis.
At least 2 years’ experience with machine learning.
At least 2 years’ experience with relational databases.
Preferred Qualifications:
PhD in a STEM field (Science, Technology, Engineering, or Mathematics) or a non-STEM but AI-adjacent field, plus 3 years of experience in data analytics.
At least 1 year of experience working with AWS.
At least 4 years’ experience in Python for large-scale data analysis.
At least 4 years’ experience with machine learning.
At least 4 years’ experience with SQL.
Capital One will consider sponsoring a new qualified applicant for employment authorization for this position.
This role is expected to accept applications for a minimum of 5 business days. No agencies please. Capital One is an equal opportunity employer committed to diversity and inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to sex (including pregnancy, childbirth or related medical conditions), race, color, age, national origin, religion, disability, genetic information, marital status, sexual orientation, gender identity, gender reassignment, citizenship, immigration status, protected veteran status, or any other basis prohibited under applicable federal, state or local law.