Apex Systems

HPC Linux Systems Administrator

Apex Systems, Boulder, Colorado, United States, 80301


Job #: 2050704

Job Description:

Our client is looking for an HPC Linux Systems Administrator to join their team to support the National Oceanic and Atmospheric Administration (NOAA) Research and Development High Performance Computing Systems (RDHPCS) customer at the NOAA Global Systems Laboratory in Boulder, Colorado.

The qualified candidate will bring hands-on technical and system administration expertise on-site to maintain the operational readiness and availability of NOAA's high performance computing systems, manage and support new technology insertions, and provide remote technical support to, and collaborate with, our other supported NOAA sites at Fairmont, West Virginia and Princeton, New Jersey.

We are looking for an individual to join our client's team to deploy, operate, and support leading-edge technology for NOAA RDHPCS. Specific technology training will be provided.

HOW A SYSTEMS ENGINEER ADVISOR WILL MAKE AN IMPACT

- Apply current systems administration skills.
- Learn and deploy new technologies.
- Develop and deploy monitoring capabilities.
- Develop and implement tools for cluster administration.
- Provide technical support with a team of HPC System & Storage Administrators to resolve operational issues.
- Solve problems and troubleshoot independently to advance quickly toward viable resolutions.
- Perform hardware break/fix support, which may include node, blade, or board-level replacements and replacement of backplanes, failed DIMMs, hard drives, controller boards, failed cables, network switches, and other failed components.
- Manage and maintain spare part inventories.
- Perform tracking, shipping, and receiving of vendor RMAs.
- Develop, improve, and enhance user and system administration online documentation repositories.
- Support HPC system users by leveraging the helpdesk ticketing system.

WHAT YOU’LL NEED TO SUCCEED:

- Education: Bachelor’s degree or 8+ years of experience.
- Required Experience: Experience with systems administration or IT support with diverse responsibilities.
- Required Technical Skills:
  - Hands-on experience with computer hardware maintenance and troubleshooting, such as identifying and replacing failed processors, DIMMs, disk drives, PCIe cards, and other field-replaceable components.
  - Programming or scripting knowledge in at least one language (e.g., Bash, Perl, Python).
- Security Clearance Level: Must be able to achieve T1 (Public Trust).
- Required Skills and Abilities:
  - Demonstrated experience deploying and managing large-scale HPC systems using OS provisioning tools (e.g., xCAT, Warewulf).
  - Demonstrated experience using configuration management tools (e.g., Ansible, Puppet).
  - Linux system administration experience (e.g., RedHat or Rocky Linux).
  - Batch management/scheduling experience, Slurm preferred.
  - Network interconnect configuration and monitoring experience (e.g., InfiniBand, Ethernet).
  - Strong writing skills for technical documents, system procedures, user wikis, and FAQs.
- Other specific skills or competencies:
  - Team player with the ability to work with a diverse team in both local and remote technical support environments.
  - Resourceful, with the initiative to perform independent technical troubleshooting and to identify and recommend solutions and improvements.
  - Willingness and motivation to learn and grow, and to retain and apply acquired knowledge on future projects.
  - Disciplined troubleshooting skills balanced with creative problem-solving skills to tackle highly complex, large-scale technical problems.
  - Attention to detail in areas such as time management, pre-planning, analytical thinking, observation, and active listening.
- Location: Remote, Hybrid, On Customer Site (Boulder, CO)
- US Citizenship or Green Card Required

EEO Employer

Apex Systems is an equal opportunity employer.
We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law.

Apex Systems is a world-class IT services company that serves thousands of clients across the globe. When you join Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package.
