Amazon

Data Engineer II, Amazon Robotics

Amazon, North Reading, Massachusetts, US, 01864

Description

Are you excited about developing generative AI and foundation models to revolutionize automation, robotics and computer vision? Are you looking for opportunities to build and deploy them on real problems at truly vast scale? At Amazon Fulfillment Technologies and Robotics we are on a mission to build high-performance autonomous systems that perceive and act to further improve our world-class customer experience - at Amazon scale. We are looking for scientists, engineers and program managers for a variety of roles.

The Infrastructure and Operations team within the Fleet Mobility AI program is seeking a passionate Data Engineer to supply large-scale data for training state-of-the-art Foundation Models and power the future of Amazon’s fleet of over 750,000 mobile robots. This includes merging data from multiple sources to help build multi-robot models that can predict, reason about, and generate scenarios for multi-robot systems. It also includes supporting new AI initiatives by owning the data pipelines that fuel the Foundation Models and downstream innovations. If you are a passionate Data Engineer with a knack for extracting value from complex data sources, this is the perfect opportunity to make your mark in an exciting, dynamic field.

Key job responsibilities

Work with cross-functional teams to gather and analyze the data requirements for building state-of-the-art AI models.

Design, develop, and maintain data pipelines to collect, clean, and store data from multiple diverse sources.

Implement data quality and validation mechanisms to ensure data and model integrity.

Work closely with Science teams to support the downstream use cases of their models.

Optimize data processing, storage, and retrieval solutions, balancing scalability, cost, and performance.

Report data issues and opportunities to the relevant teams, and support the improvement of data collection practices and processes.

A day in the life

Participate in the team's agile ceremonies, including planning, daily updates, review, and retrospectives.

Work with data scientists, software engineers, and data professionals to gather and clarify requirements, set goals and success metrics, and check progress against requirements.

Dive into data exploration, profiling, and cleaning to support data analysis and model building.

Design and implement data pipelines.

Troubleshoot data issues and share your findings with stakeholders.

Share knowledge across the organization and play a role in the Data Engineering Community of Practice.

Amazon offers a full range of benefits for you and eligible family members, including domestic partners and their children. Benefits can vary by location, the number of regularly scheduled hours you work, length of employment, and job status such as seasonal or temporary employment. The benefits that generally apply to regular, full-time employees include:

Medical, Dental, and Vision Coverage

Maternity and Parental Leave Options

Paid Time Off (PTO)

401(k) Plan

If you are not sure that every qualification listed in this posting describes you exactly, we'd still love to hear from you! At Amazon, we value people with unique backgrounds, experiences, and skillsets. If you’re passionate about this role and want to make an impact on a global scale, please apply!

About the team

You will be joining a research and innovation team of diverse, passionate engineers and data scientists who are dedicated to pushing the boundaries of Warehouse Robotics. The team follows a highly collaborative innovation management process and an agile framework. Your expertise will contribute to the groundbreaking advancements the team aspires to, from ideation to production.

Basic Qualifications

3+ years of data engineering experience

3+ years of experience analyzing and interpreting data with Redshift, Oracle, NoSQL, etc.

Knowledge of distributed systems as they pertain to data storage and computing

Experience with data modeling, warehousing and building ETL pipelines

Experience working on and delivering end-to-end projects independently

Experience programming with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, Ruby

Experience with Redshift, Oracle, NoSQL etc.

Master's degree in computer science, engineering, analytics, mathematics, statistics, IT or equivalent

Familiarity and comfort with Python, SQL, Docker, and shell scripting. Java preferred but not required.

Preferred Qualifications

Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions

Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)

Experience as a data engineer or related specialty (e.g., software engineer, business intelligence engineer, data scientist) with a track record of manipulating, processing, and extracting value from large datasets

Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or Node.js

Experience with Apache Spark / Elastic MapReduce

Experience with continuous delivery, infrastructure as code, and microservices, as well as designing and implementing automated data solutions using Apache Airflow, AWS Step Functions, or equivalent

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.