Alcority

HPC Engineer

Alcority, Dallas, Texas, United States, 75215


About the Role:

As an HPC (High-Performance Computing) Datacenter Engineer, you will implement and support state-of-the-art datacenter infrastructure for high-performance computing and scientific research. You will collaborate with cross-functional teams, including researchers, system administrators, network engineers, and data scientists, to understand their requirements and create efficient, scalable datacenter solutions. Your expertise in HPC technologies and emerging trends will be instrumental in driving innovation and optimizing performance within the datacenter environment. Join our team and contribute to the advancement of scientific research by designing and optimizing cutting-edge datacenter infrastructure.

Responsibilities:

Datacenter Architecture Design: Develop and refine datacenter architecture blueprints and guidelines, considering performance, scalability, security, and efficiency aspects. Design and implement solutions for compute, storage, networking, and cooling infrastructure that align with HPC requirements.

HPC Infrastructure Optimization: Continuously evaluate and enhance the datacenter infrastructure to maximize HPC performance and resource utilization. Identify and address potential bottlenecks and performance gaps, employing industry best practices and cutting-edge technologies.

System Integration and Deployment: Collaborate with system administrators and engineers to ensure seamless integration and deployment of HPC systems. Oversee hardware and software installation, configuration, and testing activities.

Research and Evaluation: Stay up to date with emerging HPC technologies, tools, and methodologies. Conduct research and feasibility studies on new hardware and software solutions to enhance datacenter capabilities. Evaluate vendor offerings and provide recommendations for procurement.

Performance Monitoring and Troubleshooting: Monitor and analyze datacenter performance metrics to identify issues and implement necessary optimizations. Troubleshoot complex system problems, working closely with technical teams to ensure efficient resolution and minimal impact on operations.

Security and Compliance: Collaborate with security teams to design and implement robust security measures within the datacenter infrastructure. Ensure compliance with relevant industry standards and regulations, such as HIPAA or GDPR, in data handling and storage.

Documentation and Reporting: Create comprehensive technical documentation, including architectural diagrams, standard operating procedures, and configuration guidelines. Prepare regular reports on datacenter performance, capacity planning, and future infrastructure requirements.

Team Collaboration and Leadership: Collaborate effectively with cross-functional teams, fostering a culture of knowledge sharing and innovation. Provide technical leadership and mentorship to junior team members, guiding them in adopting best practices and enhancing their skill sets.

Requirements:

Bachelor's or master's degree in computer science, engineering, or a related field, or equivalent experience.
Minimum 5 years of experience as an HPC engineer or in a similar role, with a strong focus on engineering and optimization.
In-depth knowledge of HPC technologies, including parallel computing, distributed storage systems, InfiniBand and Ethernet networking, GPU acceleration, and job scheduling frameworks.
Experience with ZFS and NiFi is a plus.
Experience with automation tools: Python, Ansible, Puppet, Chef.
Experience with monitoring tools: Prometheus, Ganglia, Nagios, SNMP, and Telegraf.
Experience with CFD (Computational Fluid Dynamics) workloads and associated HPC optimization is a plus.
Must have familiarity with industry-standard tools and software used in HPC environments, such as Slurm, PBS Pro, Lustre, GPFS, OpenStack, and containerization technologies (e.g., Docker, Kubernetes).
Strong problem-solving and analytical skills, with the ability to identify and resolve complex technical issues.
Excellent communication and interpersonal skills, with the ability to collaborate effectively with diverse teams and stakeholders.
Detail-oriented mindset with a strong focus on documentation and adherence to standards.
Familiarity with security protocols and compliance requirements in the context of datacenter operations.
Ability to adapt to a fast-paced and rapidly evolving technological landscape.

It is impossible to list every requirement for, or responsibility of, any position. Similarly, we cannot identify all the skills a position may require since job responsibilities and the Company's needs may change over time. Therefore, the above job description is not comprehensive or exhaustive. The Company reserves the right to adjust, add to or eliminate any aspect of the above description. The Company also retains the right to require all employees to undertake additional or different job responsibilities when necessary to meet business needs.

Must be legally authorized to work in the United States without the need for employer sponsorship, now or at any time in the future.

Benefits & Perks:

Time Off:

25 days of PTO for full-time employees and 12 company holidays.

Company Paid Benefits:

Life insurance, Short-term disability, Long-term disability, Paid parental leave, Employee Assistance Program, and medical insurance in our high deductible health plan.

Optional Employee Paid Benefits:

Medical insurance in our EPO plan, Dental benefits, and Vision benefits. We also offer Health Savings Accounts, Flexible Spending Accounts, Supplemental Life insurance, and more.

401(k):

Eligible after 60 days. Discretionary company match of 50% up to the first 6% of contributions.

EQUAL OPPORTUNITY EMPLOYER

ALCORITY IS AN EQUAL EMPLOYMENT OPPORTUNITY EMPLOYER. THE COMPANY'S POLICY IS NOT TO DISCRIMINATE AGAINST ANY APPLICANT OR EMPLOYEE BASED ON RACE, COLOR, RELIGION, NATIONAL ORIGIN, GENDER, AGE, SEXUAL ORIENTATION, GENDER IDENTITY OR EXPRESSION, MARITAL STATUS, MENTAL OR PHYSICAL DISABILITY, AND GENETIC INFORMATION, OR ANY OTHER BASIS PROTECTED BY APPLICABLE LAW. THE FIRM ALSO PROHIBITS HARASSMENT OF APPLICANTS OR EMPLOYEES BASED ON ANY OF THESE PROTECTED CATEGORIES.