Saxon Global

Linux Systems Engineer

Saxon Global, Needham Heights, Massachusetts, United States


Linux Systems Engineer - Hybrid onsite - W2 w/referral - Needham, MA

Hourly pay range:

Benefits: health insurance, dental insurance, 401k, PTO

Visa types accepted: USC/GC

Principal Duties and Responsibilities

Cluster and Systems Administration: Manage and administer production systems used by researchers and Research Centers.
Ansible Automation: Refactor code to deploy and maintain systems and applications in Ansible templates.
Analyze the results of server monitoring and implement changes to improve performance, processing and utilization.
Propose, maintain and enforce policies, practices and security procedures.
Work with users to deploy required applications, including Docker/Singularity applications.
Analyze and resolve customer and technical problems: tuning cluster scheduling parameters, memory/CPU contention, scientific application compilation and run-time issues.
Develop and maintain system documentation as well as user-facing knowledge base articles and how-to guides.
Evaluate, select and deploy hardware and/or cloud solutions for research scientific computing. This includes CPU and GPU-based compute, high-speed networking and data storage.
Comfortable working within an Agile team (Scrum).

Qualifications

BA/BS degree in engineering, a quantitative field or system administration required, or an equivalent combination of skills and experience.
5+ years minimum experience working in systems administration in Linux environments for a scientific domain, including NVIDIA GPU implementations.
3+ years of experience with automation and configuration management using Ansible.
3+ years of Docker and Kubernetes experience.
A combination of education and experience may be substituted for requirements.
Demonstrated ability in providing systems administration of up to several hundred Linux servers in an on-premises environment.
Hands-on experience writing and maintaining Ansible code.
Strong skills writing Linux shell scripts (Bash).
Experience with monitoring software such as Open XDMoD or Prometheus.
Experience with server deployment technologies (Kickstart, PXE, IPMI).
Understanding of DHCP, DNS, TCP/IP, NFS, SMB and HTTP network protocols.
Strong verbal and written communication, with the ability to write clear technical documentation.
High level of initiative and eagerness to learn new technologies.
Familiarity with information technology security and data privacy considerations applicable to a healthcare environment is advantageous.
Knowledge of HPC job scheduling platforms like LSF or Slurm.
Experience with Git and Jira tools.
Ability to multitask and prioritize work requirements, keeping team and management informed.
Experience with Kerberos authentication.
Experience providing support to research investigators with diverse computing needs.