General Dynamics Corporation

Senior HPC Systems Engineer

General Dynamics Corporation, Falls Church, Virginia, United States, 22042


Responsibilities for this Position

Location: USA VA Home Office (VAHOME)

Full Part/Time: Full time

Job Req: RQ180317

Type of Requisition: Pipeline

Clearance Level Must Currently Possess: None

Clearance Level Must Be Able to Obtain: None

Suitability:

Public Trust/Other Required:

Job Family: Systems Engineering

Job Qualifications:

Skills: High-Performance Computing (HPC) Systems, Linux System Administration, Systems Management

Certifications: None - N/A

Experience: 10+ years of related experience

US Citizenship Required: Yes

Job Description:

At GDIT, people are our differentiator. Our work depends on a Senior HPC Systems Engineer joining our team to support the National Oceanic and Atmospheric Administration (NOAA) Weather and Climate Operational Supercomputer System (WCOSS). This position is remote with some travel required.

WCOSS provides NOAA the operational High Performance Computing (HPC) resources essential to process sophisticated numerical models used to predict and understand atmospheric and oceanic phenomena for weather and climate operational use. Operating 24/7, the next 10-year WCOSS program will deliver significant computational capability that will evolve over time to keep pace with NOAA's growing environmental modeling needs.

We are looking for individuals to join GDIT's team to deploy, operate and support leading-edge technology for WCOSS. Specific technology training will be provided. CANDIDATES MUST HAVE AN ACTIVE PUBLIC TRUST CLEARANCE OR ABOVE TO BE CONSIDERED.

In this role, a typical day will include:

Applying current HPC systems administration skills, with a desire to learn and deploy new technologies.

Developing and deploying monitoring capabilities.

Developing and implementing tools for cluster administration.

Providing technical support with a team of HPC System & Storage Administrators to resolve operational issues.

Providing off-hour on-call support on a rotating basis.

Positions are available at facility sites or remote; some travel is required.

REQUIRED QUALIFICATIONS

Bachelor's degree or equivalent and 10+ years of experience with HPC systems operations.

Experience working in a 24X7 operational environment.

DESIRED QUALIFICATIONS

Demonstrated experience deploying and managing large-scale HPC systems using OS provisioning tools (e.g., xCAT, HPCM).

Demonstrated experience using configuration management tools (e.g., Ansible, Puppet).

Linux system administration experience (e.g., SLES, Red Hat or CentOS).

Batch management/scheduling experience, PBS Pro preferred.

Parallel filesystem configuration and monitoring experience (e.g., Lustre, NFS).

Network interconnect configuration and monitoring experience (e.g., InfiniBand, Ethernet).

Programming or scripting in at least two languages (e.g., Bash, Perl, Python, C).

Strong writing skills for technical documents, system procedures, user wikis and FAQs.

Experience developing regression tests (e.g., Pavilion, ReFrame).

Ability to work both independently and as part of a team.

Scheduled Weekly Hours: 40

Travel Required: None

Telecommuting Options: Remote

Work Location: Any Location / Remote

GDIT is an Equal Opportunity/Affirmative Action employer.
