SpaceX

Sr. Site Reliability Engineer - Top Secret Clearance

SpaceX, Hawthorne, California, United States, 90250

As a Senior Site Reliability Engineer, you will architect, develop, and test key aspects of the infrastructure for an in-house solution for analysis, simulation, prototyping, and operation of software in support of all SpaceX flight systems. You will have full ownership of the automation and technical infrastructure to support scalable high-performance web applications that manage large volumes of data in addition to a suite of simulation and test products. In this high-impact role, you will work across engineering groups to build a high-throughput distributed system that will be used to develop, demonstrate, and operate cutting-edge software and hardware. We are looking for smart, motivated software engineers who enjoy taking on complex challenges, work well in dynamic environments, and care about software best practices.

RESPONSIBILITIES:
- Architect application and database clusters leveraging microservices in support of SpaceX flight systems for both on-premises and cloud deployments
- Develop automation to deploy and manage applications both on-premises and in the cloud, utilizing infrastructure as code where necessary
- Collaborate with software engineers to create highly scalable, operable, and maintainable products
- Collaborate with IT and software engineers to develop a test automation suite leveraging DevOps infrastructure
- Develop policies and automation in collaboration with IT and software engineers to ensure compliance with DevSecOps best practices
- Collaborate with IT and software engineers to develop policies and automation to ensure security compliance requirements are fulfilled
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement

BASIC QUALIFICATIONS:
- Bachelor's degree in computer science, information systems/IT, or an engineering discipline; OR 5+ years of professional experience in software, DevOps, or site reliability engineering in lieu of a degree
- 3+ years of experience with Linux operating systems
- Experience building and managing production systems leveraging containerization technologies (e.g., Docker, Kubernetes)
- Experience with designing and managing solutions in cloud environments such as AWS, Azure, or GCP
- Experience in Bash, Python, and/or other scripting languages
- Active Secret, Top Secret, Top Secret SCI, OR ability and willingness to obtain a Top Secret clearance

PREFERRED SKILLS AND EXPERIENCE:
- 5+ years of systems administration, site reliability engineering, or DevOps experience
- 3+ years of experience working with Kubernetes, Docker, or similar technologies
- Strong understanding of message queue technologies such as RabbitMQ or Kafka
- Strong understanding of virtualization and hypervisor technologies
- Understanding of databases and performance tuning
- Experience with identity management and authentication protocols
- Focus on performance bottlenecks and performance improvement techniques
- Excellent communication skills with the ability to communicate with customers, peers, management, etc. in both formal and informal situations
- Ability to quickly learn new tools and frameworks

ADDITIONAL REQUIREMENTS:
- Willing to work extended hours and weekends when needed
