Exciting Opportunity at MSK
Principal HPC Engineer
Research Technology Services, High-Performance Computing (HPC) is seeking an experienced Research Solution Engineer to complement our growing HPC team. You would support Memorial Sloan Kettering Cancer Center as it strives to achieve its singular mission: ending cancer for life. In this position, you will join a team of experienced engineers, architects, and application specialists working toward this challenging goal. You will have access to state-of-the-art equipment and support from world-leading researchers. Under the supervision of the Senior Director of Research Technology, the HPC Engineer will provide support for a complex multi-datacenter high-performance computing system.
Position Overview:
Design, propose, and deliver solutions that are appropriate for the business and technology strategies.
Facilitate the design of technology solutions including architecting and implementing solutions requiring integration of multiple platforms, operating systems, and applications across the enterprise.
Review and advise on standard software and hardware builds, system options, risks, costs versus benefits and impact on the enterprise business process and goals.
Develop and document the framework for integration and implementation for changes to technical standards.
Approve and modify designs and architectures to ensure compliance.
Assist in the development of, and manage, an architecture governance process.
Key Responsibilities:
Design and support scalable and fault-tolerant HPC systems, including network design and resource allocation.
Leverage accelerators like GPUs to optimize HPC workloads.
Perform hands-on system administration for HPC environments.
Support and troubleshoot job scheduling and data management within the HPC environment.
Compile, install, and debug scientific applications tailored for research needs.
Develop and deliver training materials and sessions to educate researchers on HPC resources.
Key Qualifications:
Proven experience in an HPC environment, including job scheduling, data management, and system administration.
Infrastructure Knowledge: Ability to work on the infrastructure side of HPC systems, including designing scalable solutions and resource allocation.
Hands-on experience using accelerators such as GPUs to speed up HPC workloads.
Proficiency in scripting/programming languages such as Bash and Python.
Understanding of scientific workflows and life sciences research.
Additional Information:
Schedule: 2 days a week on-site, 3 days remote. Location: Zuckerman Research Center, NYC.
Reporting to Associate Director, Infrastructure.
Pay Range:
$164,100.00 - $270,900.00