Advanced Micro Devices, Inc.
Machine Learning Performance Engineer
Advanced Micro Devices, Inc., San Jose, California, United States, 95199
WHAT YOU DO AT AMD CHANGES EVERYTHING
We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.
AMD together we advance_
THE ROLE:
We are seeking a Machine Learning Performance Engineer to focus on ML model optimization, profiling, bottleneck analysis, and optimal mapping to GPUs. If you are passionate about performance optimization, getting the best out of the hardware, and shaping the future of AI performance, then this role is for you.
THE PERSON:
As an ML Performance Engineer, you will analyze and explore recent ML models, understand their compute and memory requirements, and optimize them on our various compute hardware for both inference and training. In addition to profiling and analyzing various workloads on current hardware, you will come up with new ways to improve their performance.
The ideal candidate will have strong experience with software optimization, GPU programming, and hardware architecture, and will be passionate about getting the best performance out of various hardware under different tradeoffs.
KEY RESPONSIBILITIES:
Benchmark, analyze, and optimize performance of key machine learning applications and participate in the co-design across AMD's ML hardware and software stack, on single and multi-GPU systems.
Design, implement, and test GPU kernels and algorithms for tensor operations like matrix multiplication and convolutions used in a variety of high-performance machine learning libraries and frameworks.
Deliver high-quality code and documentation following best practices for open-source software development.
Communicate and collaborate with key technical experts across AMD and with our partners and customers to improve ROCm applications, libraries, and tools, as well as hardware.
PREFERRED EXPERIENCE:
Understanding of CPU and GPU architectures and low-level optimization techniques including memory hierarchy, instruction scheduling, and performance tradeoffs.
Strong background developing applications and libraries in C++, especially high-performance computing and/or scientific software.
GPU software development using HIP, CUDA, or OpenCL.
In-depth knowledge of best practices in software development, including testing, profiling, debugging, documentation, version control, issue tracking, and planning.
Excellent written, verbal, and presentation skills.
ACADEMIC CREDENTIALS:
A PhD or Master's degree plus equivalent experience in computer science, electrical engineering, or a related field.