Amazon Web Services (AWS)

Software Engineer - AI/ML, AWS Neuron Inference - Multimodal

Amazon Web Services (AWS), Cupertino, California, United States, 95014


Description

AWS Neuron is the complete software stack for AWS Inferentia (Inf1/Inf2) and Trainium (Trn1), our cloud-scale machine learning accelerators. This role is for a machine learning engineer on the Inference team for AWS Neuron, responsible for the development, enablement, and performance tuning of a wide variety of ML model families, including massive-scale Large Language Models (LLMs) such as GPT and Llama, as well as Stable Diffusion, Vision Transformers (ViT), and many more.

The ML Inference team works side by side with chip architects, compiler engineers, and runtime engineers to create, build, and optimize distributed inference solutions on Trainium/Inferentia instances. Experience training these large models and optimizing their inference using Python/C++ is a must. Model parallelization, quantization, and memory optimization are central to this work; vLLM, DeepSpeed, and other distributed inference libraries underpin it, and extending them for the Neuron-based system is key.

Key job responsibilities

You will help lead efforts to build distributed inference support into PyTorch, JAX, and TensorFlow via XLA, the Neuron compiler, and the runtime stack. You will help optimize these models to ensure the highest performance and maximum efficiency on the custom AWS Trainium and Inferentia silicon and the Trn1 and Inf1/Inf2 servers. Strong software development skills (Python and C++) and machine learning knowledge (multimodal, computer vision, speech) are both critical to this role.

Basic Qualifications

- Bachelor's degree in computer science or equivalent
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language
- Experience in machine learning, data mining, information retrieval, statistics or natural language processing

Preferred Qualifications

- Master's degree in computer science or equivalent
- 3+ years of full software development life cycle experience, including coding standards, code reviews, source control management, build processes, testing, and operations
- Experience in computer architecture
- Previous software engineering expertise with PyTorch/JAX/TensorFlow, distributed libraries and frameworks, and end-to-end model training
- Previous experience training multimodal models for understanding and generating images/videos/audio

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $129,300/year in our lowest geographic market up to $223,600/year in our highest geographic market. Pay is based on a number of factors, including market location, and may vary depending on job-related knowledge, skills, and experience.

Company - Annapurna Labs (U.S.) Inc.
Job ID: A2700036
