Amazon

Language Data Scientist, Amazon

Amazon, Seattle, Washington, US, 98127


Job ID: 2649441 | Amazon.com Services LLC

The Shopping Tech Foundation Team is looking for a Language Data Scientist to collaborate in developing solutions for LLM prompt engineering, LLM evaluation/benchmarking, and annotation efficiency. This position is an opportunity to apply your linguistic and data science expertise in a challenging but supportive environment.

Do you want to be part of the team developing the future technology that impacts the customer experience of ground-breaking products? Then come join us and make history.

Our team works on a variety of projects, including state-of-the-art generative AI, LLM finetuning, alignment, prompt engineering, and benchmarking solutions. We are customer obsessed and committed to delivering results with the highest quality and integrity.

As a Language Data Scientist, you will start by diving deep into a couple of critical LLM-related projects. You will collaborate with fellow applied scientists, language data scientists, and program managers, as well as stakeholders in engineering, annotation operations, and product teams, to understand the role data plays in developing models that meet customer needs. You will analyze, follow, and improve processes for collecting, assessing, and improving LLM inputs and outputs, automating them where appropriate.

You will apply state-of-the-art Generative AI techniques to analyze how well our data represents human language and run experiments to gauge downstream interactions. You will work collaboratively with other language data scientists and scientists to design and implement principled strategies for data optimization.

Key job responsibilities

Source, validate, and deliver high-quality language model artifacts and linguistic data
Collaborate with stakeholders to design data collection and LLM development efforts
Oversee the progress and quality of several data collection, model development, and annotation projects at a time
Advocate for strict adherence to data guidelines and quality thresholds
Extend existing data collection, annotation, and quality assurance efforts to support feature and language expansion
Innovate on data collection and LLM finetuning/prompt engineering methodologies, guidelines, and quality metrics to support new requests
Automate repetitive workflows and improve existing processes

BASIC QUALIFICATIONS

2+ years of data scientist experience
3+ years of data querying languages (e.g. SQL), scripting languages (e.g. Python) or statistical/mathematical software (e.g. R, SAS, Matlab, etc.) experience
3+ years of machine learning/statistical modeling data analysis tools and techniques, and parameters that affect their performance experience
Experience applying theoretical models in an applied environment
Master's degree in a quantitative field such as statistics, mathematics, data science, business analytics, economics, finance, engineering, or computer science

PREFERRED QUALIFICATIONS

Experience in Python, Perl, or another scripting language
Experience in a ML or data scientist role with a large technology company
Knowledge of relevant statistical measures such as confidence intervals, significance of error measurements, development and evaluation data sets, etc.

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.
