Amazon

Principal Availability Engineer, DCE Availability

Amazon, Herndon, Virginia, United States, 22070


Job ID: 2730214 | Amazon Data Services, Inc.

Availability Engineers are responsible for consultative and peer review of designs across all design disciplines within Amazon data centers (DCs) worldwide. Beyond design, we work directly with operations, security teams, field engineering, and construction management to implement processes and procedures that are functional, practical, and innovative, with a primary focus on improving system availability.

As an Availability Engineer, you will evaluate the impact of data center products and features to meet ever-evolving customer needs as we continue expanding our fleet to hyper-scale.

As an ideal candidate, you:
- Possess strong engineering judgment and are able to provide recommendations despite uncertainty
- Are detail- and data-oriented
- Have experience managing engineering projects and consultants
- Build trust and relationships with different stakeholders (e.g., Operations, Commissioning, Construction, and Design)
- Are inclined to get into the field to see things up close

Each day you will interact with different teams responsible for all aspects of the data centers. You will prioritize your activities to support data center availability, focusing on the actions that are most impactful. You will have the responsibility to think globally about all initiatives.

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we’re looking for talented people who want to help.

You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

Key Job Responsibilities

Availability Engineers provide the following deliverables:
- Auditing and peer reviewing DC Infrastructure Engineering designs with a focus on availability
- Performing engineering-oriented analysis of past availability events
- Providing technical oversight for global COE action items
- Developing availability forecast models
- Overseeing and reviewing availability performance metrics
- Providing multi-root-cause categorical analysis for availability events
- Overseeing and reviewing Failure Mode and Effect Analysis studies
- Conceiving, initiating, and leading availability projects with widespread impact on infrastructure design, innovation, and implementation
- Setting the standard for customer-serving coordination and repeatable processes as they relate to engineering, test, construction, commissioning, operations, and best practices
- Providing technical oversight of the Regional Electrical and Mechanical Basis of Design (BOD), Construction Documentation, SoWs, procurement initiatives, supplier management, commissioning scripts, operation and maintenance manuals and procedures, and all relevant products and processes that drive and impact availability
- Serving as a technical advisor for AWS data center electrical, mechanical, structural, site, civil, security, network, fire detection and suppression, and all other design disciplines as they relate to and drive increased availability
- Working with internal teams to understand customer availability requirements
- Providing technical oversight and review for LSE/CSE COE

BASIC QUALIFICATIONS

- Bachelor’s degree in Electrical or Mechanical Engineering, or equivalent experience.
- Cumulative 3+ years partnering with cross-functional teams working in critical facilities.
- Cumulative 10+ years of experience with mission-critical facilities, to include the following:
- Understanding of uninterruptible power supplies, diesel generators, electrical switchgear, power distribution units, and automatic/static transfer switches.
- Understanding of chillers, cooling towers, direct and indirect evaporative cooling, variable speed drives, and fan systems.
- Knowledge of building codes and regulations including Life Safety, IBC, NFPA, NEC, NESC, and OSHA.
- Direct experience with the design, construction, operation, or maintenance of data centers.
- Ability to research new designs, technologies, construction methods, and innovative operations procedures of data center equipment and facilities.
- Ability to critically audit and provide customer-representative feedback on design concepts through exploration, development, deployment/construction, and operations.
- Ability and willingness to think outside of the box to find creative and innovative solutions to improve availability through improved quality, reliability, and maintainability.
- Ability to perform complex business case analysis to justify technical decisions and present the justification to management in a high-level review.
- Excellent communication skills, attention to detail, and high quality standards.

PREFERRED QUALIFICATIONS

- Organized, with the ability to set priorities and meet deadlines and budgets.
- Experience using a variety of web-based and other software tools for data analysis and visualization.
- Direct experience with the design, construction, operation, and maintenance of mission-critical facilities, especially data centers.
- Experience as a resident engineer or hands-on (in the field) design consultant.
- Knowledge of building codes and regulations including Life Safety, IBC, NFPA, NEC, NESC, and OSHA.
- Experience reading, interpreting, and creating construction drawings, specifications, and submittal documents.
- Ability to carry design concepts through exploration, development, and into deployment/mass production.
- Ability to research new designs, technologies, construction methods, and innovative operations procedures of data center equipment and facilities.
- Ability to critically audit and provide customer-representative feedback on design concepts through exploration, development, deployment/construction, and operations.
- Ability and willingness to think outside of the box to find creative and innovative solutions to improve availability through improved quality, reliability, and maintainability.
- Ability to perform complex business case analysis to justify technical decisions and present the justification to management in a high-level review.
- Excellent communication and writing skills, attention to detail, and high quality standards.
- Detailed understanding of both mechanical and electrical equipment/design related to data centers (including but not limited to uninterruptible power supplies, diesel generators, electrical switchgear, power distribution units, variable frequency drives, automatic/static transfer switches, chillers [air-cooled and water-cooled], pumps, cooling towers, heat exchangers, air handlers, economizers, etc.).
- EPMS/SCADA/BMS controls system experience (software and/or hardware).
- Registered Professional Engineer.
- Advanced degree in engineering, business, or related field.
- Experience with large-scale technical operations or large-scale compute facilities.

Amazon is committed to a diverse and inclusive workplace.

Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.
