Apple
On-device Machine Learning Infrastructure Engineer (Compiler & Runtime)
Apple, Cupertino, California, United States, 95014
Machine Learning and AI
The On-Device Machine Learning team at Apple is responsible for taking cutting-edge machine learning models from research to production, powering magical user experiences on Apple’s hardware and software platforms. The team builds critical infrastructure spanning the onboarding of the latest machine learning architectures onto embedded devices, optimization toolkits for these models, machine learning compilers and runtimes that execute them efficiently, and the benchmarking, analysis, and debugging toolchain needed to improve new model iterations. This infrastructure underpins most of Apple’s critical machine learning workflows across Camera, Siri, Health, Vision, and more, and is an integral part of Apple Intelligence. Our group is looking for an ML Infrastructure Engineer with a focus on graph compilers and runtimes. The role entails building the world’s foremost ML graph compilation and runtime system, capable of optimizing and executing ML models efficiently on Apple products and services.
Description
As an engineer in this role, you will focus primarily on building graph compilers that optimize ML graphs from popular ML frameworks (PyTorch, JAX, MLX, etc.) to execute performantly on Apple Silicon. The graph compiler and runtime provide out-of-the-box execution of ML models while also offering extensibility hooks that let users tailor them to specific goals. The role also involves building higher-level APIs and tooling that enable developers to visualize, diagnose, and debug correctness and performance issues while onboarding models for on-device deployment. The ML compiler is the backbone of this infrastructure stack. The role requires an understanding of ML operator primitives, common compiler optimizations (frontend and middle-end), runtimes, and systems software engineering.
Key Responsibilities:
Define and build the on-device graph compiler, runtime, and kernels executing ML operators.
Build production-critical system software for executing ML models on Apple Silicon.
Optimize model execution for various system goals such as performance, energy efficiency, and thermals.
Minimum Qualifications:
Bachelor’s degree in Computer Science, Engineering, or a related discipline.
Highly proficient in C++. Familiarity with Python.
Familiarity with operating systems, embedded programming, and parallel programming.
Experience with any compiler stack (MLIR/LLVM/TVM/...).
Sound understanding of ML fundamentals, including common architectures such as Transformers.
Good communication skills, including the ability to communicate with cross-functional audiences.
Preferred Qualifications:
Experience with any on-device ML stack, such as TFLite, ONNX, ExecuTorch, etc.
Experience with any ML authoring framework (PyTorch, TensorFlow, JAX, etc.) is a strong plus.
Experience with accelerators and GPU programming is a strong plus.