Nvidia
Senior Technical Program Manager - Infrastructure Capacity Management
Nvidia, Santa Clara, CA
Hardware Infrastructure is seeking a Senior Technical Program Manager to lead the strategy and execution of programs that support capacity forecasting, planning, allocation and management across our internal clusters. The GPU infrastructure we build and operate enables NVIDIA's most advanced AI and hardware researchers and engineers to create the future of computing. The scope of the capacity management work spans compute, storage and network to ensure we have infrastructure that is functional, performant and reliable. This is a fast-paced and evolving landscape that requires a senior TPM leader to guide engineering roadmaps to be delivered with high-quality outcomes and a strong foundation of operational excellence. They will partner both internally within Hardware Infrastructure and externally with senior management and partner teams to scale the capacity management lifecycle. They will develop and standardize planning, reporting and execution methodologies and metrics to enable meeting the challenging objectives.

What You'll Be Doing:
- Work across multiple internal customer teams to build robust demand models that accurately provide a comprehensive picture of capacity requirements across compute, storage and network
- Assist and play a key role in shaping the technical strategy and execution for how our internal serving platform meets internal customer needs
- Nurture a culture of continuous improvement, finding new opportunities across tooling, automation and processes to scale overall capacity management
- Take the lead in defining strategies that will help increase the efficiency and utilization of resources across internal clusters to minimize capacity waste
- Guide a diverse set of engineering efforts in an agile program methodology across planning, prioritization, design, dependency management, implementation and execution
- Bring a data-first approach to programs (metrics, OKRs, KPIs) to measure program success and identify areas of improvement
- Create effective communication channels to provide varying audience levels insights into program status, risks and opportunities
- Act as an effective technical and non-technical liaison between developers, customers and partners to drive organization alignment across a multi-functional matrixed set of leads

What We Need To See:
- B.S. (or equivalent experience) in Computer Science or a related technical field
- 10+ years of experience across software engineering and/or technical program management roles with demonstrated expertise and mastery of technical and management practices
- Prior experience developing processes and tools to forecast, allocate and manage infrastructure resources across a diverse and large portfolio ($billions)
- Prior experience leading programs that span multiple teams and engineers (100+)
- Experience managing large-scale HPC and/or AI infrastructure deployments that stretch across hardware and software
- Exceptional communication and presentation skills for diverse technical and non-technical audiences
- Strong multitasking abilities with a focus on thoroughness and rapid context switching
- Knowledge of agile methodologies and best-in-class project management tools
- Proactive and enthusiastic in identifying and implementing positive changes in software engineering and release management within a fast-paced environment

Ways To Stand Out From The Crowd:
- Prior experience bringing up new datacenter capacity across cloud service providers and on-premise locations
- Prior background in working with AI researchers and/or EDA developers
- Software development, release and support methodology, and DevOps

NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most forward-thinking and hardworking people in the world on our team and our collaborative talent continues to drive NVIDIA's growth. We are seeking creative and independent engineers with real passion for technology!

The base salary range is 188,000 USD - 299,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.

NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Summary
Location: US, CA, Santa Clara
Type: Full time