Greylock

GPU / CUDA Engineers (Multiple Openings)

Greylock, San Francisco, CA, United States

Several of our growth-stage investments in San Francisco, CA, are looking for experts in GPU optimization and inference acceleration.

In general, these are the responsibilities:

Primarily focused on GPGPU programming to increase the performance of the product: writing, debugging, and optimizing CUDA code from the GPU kernel level upward to improve the overall performance of new AI models
Play a key role in creating the tooling and associated infrastructure to increase the performance of the company, from fairly straightforward projects (profilers) to incredibly complex ones (new inference engines)

In general, these are the expectations:

Proven background in CPU acceleration and/or GPU optimization (the latter preferred), with a strong preference for candidates who have expertise in CUDA kernel hacking
Experience working in deep learning environments and/or on products targeting high-performance ML systems
Strong coding skills in high-performance environments (C/C++)

Please note: Due to the volume of applicants we typically receive, a follow-up email will not be sent unless a match is identified (sorry).

About us:

We are full-time, salaried employees of Greylock, and there are no fees associated with any of the work we do. Our team provides free candidate referrals/introductions to all of our active investments (one of the many services we provide), and we're always looking to add new people to our network of talent.