Insight Global

Senior Linux System Engineer

Insight Global, Rockville, Maryland, US, 20849


As an HPC Systems Engineer, you will help ensure today is safe and tomorrow is smarter. Our work depends on an HPC Systems Engineer joining our team to bridge the gap between our researchers and our high-performance computing resources. You will be one of the faces of our High Performance Compute (HPC) clusters to the client's research community, who will rely on you to help them get their important research work done. You will focus on supporting HPC hardware, installing scientific applications, optimizing submission scripts and running jobs, and monitoring the health of our client's two HPC clusters: a GPU-focused cluster with 4,000+ cores and a second cluster with 1,500+ cores.

How an HPC Systems Engineer Will Make an Impact:
• Work with a GPU-focused 4,000+ core HPC cluster and a 1,500+ core HPC cluster, supporting the hardware and operating system environments
• Support bioinformatics applications for a large and diverse research community with needs in genomics, cryo-electron microscopy, and AI/ML
• Monitor the portfolio of software applications and be proactive in planning upgrades and license renewals
• Monitor and report on cluster performance and generate data to show usage and trends
• Triage support requests from the research community and work with others in the Scientific Infrastructure team to resolve issues and complete service requests
• Collaborate with researchers to guide them in effective use of the HPC resources, such as job scheduler submission, data formats, and building data workflows
• Engage with researchers to understand their HPC needs, including data life cycle management, integration of scientific instruments with HPC, and storage capacity and compute requirements
• Provide input to the Scientific Infrastructure team leader for setting priorities for cluster operations, scheduling policies, resources needed, etc.
• Attend and actively participate in daily standup meetings to provide updates on progress, discuss obstacles, and coordinate tasks with other team members
• Work collaboratively in a team environment to achieve project goals
• Engage in open communication, share knowledge, and support fellow teammates
• Provide feedback and contribute to the continuous improvement of team processes

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com.

To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/ .

Required Skills & Experience
• BS/BA (or equivalent)
• Five years of related experience
• Minimum of five years of experience with servers, datacenters, networking, and related technologies
• Minimum of five years of experience managing Linux systems
• Experience with the Spack package manager, including making packages from PyPI, R, and GitHub
• Experience installing and packaging GPU applications and optimizing job submission scripts that are used for ML model training, data mining operations, or high-res graphics rendering
• Experience with Python scripting
• Experience using Git distributed workflows
• Experience with Ansible to manage system configuration
• Experience with Terraform for provisioning systems
• Must be able to obtain an NIH Public Trust
• Ability to translate technical concepts in HPC and research computing to scientists and other non-technical personnel
• Ability to determine meaningful metrics and usage data for leadership

Nice to Have Skills & Experience

HPC scheduler experience (esp. SLURM)

Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.