CentML

Senior Software Engineer - Compiler

CentML, San Francisco, California, United States, 94199


Overview:

Do you want to help drive the development of high-performance, power-efficient datacenter solutions for Deep Learning? Are you interested in how system architecture across GPU, networking, CPU and IO relates to brand new generative AI capabilities? Come join our team, and bring your experience and interests to help us optimize our next generation of inference and training frameworks and redefine the deep learning industry once again.

Responsibilities:

- Communicate with our product teams and profile ML/DL workloads to acquire an in-depth understanding of the problems (e.g., slow kernels)
- Use profilers to identify the bottlenecks in slow GPU kernels
- Optimize the GPU kernels
- Write tests and benchmarks to validate and evaluate our solutions

Who you are:

- Bachelor's or higher degree in Computer Science or Engineering
- Excellent communication skills and the ability to work in a team
- Strong coding skills (in at least one of Python and C++)
- Solid fundamentals in other computer science and computer engineering topics: algorithms and data structures, operating systems, computer architecture, etc.
- Strong academic record (for candidates with a bachelor's degree)

You will stand out from the crowd if you have:

- 5+ years of experience in researching or contributing to HPC/ML/DL systems, frameworks or libraries (including time spent as a graduate student)
- Experience with GPU architecture and GPGPU programming:
  - NVIDIA GPUs: CUDA programming and libraries and toolkits (e.g., cuDNN, cuBLAS, CUTLASS, nvprof, Nsight Compute, Nsight Systems)
  - AMD GPUs: ROCm and its related libraries and toolkits
  - OpenCL
- Experience with developing high-performance kernels for CPUs
- Experience in developing ML or traditional compilers
- Experience with TPU
- Strong publication record in top HPC/ML/DL or computer system and architecture venues
