Acceler8 Talent

Machine Learning Systems

Acceler8 Talent, San Francisco, California, United States, 94199

Revolutionize AI with Us by Helping Everyone Save Time

Join our mission to redefine human-computer collaboration and automate workflows with cutting-edge AI products. Be part of a team shaping the future of enterprise operations, leveraging Large Language Models (LLMs) to elevate organizational impact.

Your Impact:

Collaborate on delivering captivating experiences through LLMs.
Architect Scalable ML Systems: Design and implement scalable machine learning and distributed systems for LLMs.
Optimize Under the Hood: Innovate at the lower levels of the stack, creating high-performance infrastructure with custom kernels.
Master Parallelism Methods: Develop parallelism methods for large-scale distributed LLM training.

Your Skills:

Experience training LLMs with Megatron, DeepSpeed, or similar frameworks, and deploying them with vLLM, TGI, TensorRT-LLM, or similar tools.
Strong grasp of the architectures of cutting-edge AI accelerators such as TPU, IPU, and HPU, and of their associated tradeoffs.
Proficiency working under the hood with kernel languages such as OpenAI Triton and Pallas, and with compilers such as XLA.
Proven hands-on experience tuning LLM workloads.
Familiarity with MLPerf or production workloads is a plus.

If you're passionate about driving AI innovation and pushing the boundaries of what's possible, we invite you to join our collaborative and forward-thinking team.