Palo Alto Networks
Principal Site Reliability Engineer (Advanced Threat Protection Infrastructure)
Palo Alto Networks, Santa Clara, California, US 95053
At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before.

Who We Are

We take our mission of protecting the digital way of life seriously. We are relentless in protecting our customers, and we believe that the unique ideas of every member of our team contribute to our collective success. Our values were crowdsourced by employees and are brought to life through each of us every day. As a member of our team, you will be shaping the future of cybersecurity. We work fast, value ongoing learning, and respect each employee as a unique individual.

Job Description

Your Career

We are looking for an exceptional Principal Site Reliability Engineer to strengthen our ATP Infra team. This role will produce mission-critical platforms, tools, and processes that ensure the highest levels of availability and reliability for all our applications.

Your Impact

Write automation code for provisioning and operating infrastructure at massive scale.
Design, build, and operate cloud infrastructure to enable reliable and rapid deployment of microservices with effective monitoring and resilient operations.
Work with development teams to ensure that applications are production-ready, scalable, and reliable from the ground up.
Identify and drive opportunities to improve automation for code deployment, management, and visibility of application services.
Develop tools and frameworks to automate operational tasks and the deployment of machines, services, and applications.
Establish end-to-end monitoring and alerting on all critical components of the application.
Participate in the on-call rotation supporting the platform and/or the production application.
Direct root cause analysis of critical business and production issues.
Develop and mentor other SREs on standard methodology, from infrastructure orchestration to troubleshooting application services in production.
Represent SRE in design reviews and work cross-functionally with Engineering teams on operational readiness.

Qualifications

Your Experience

BS or MS in Computer Science or a related field, or equivalent professional or military experience, required.
Expertise in configuration management with frameworks such as Terraform, Ansible, and Helm.
Strong experience with Kubernetes.
Strong Linux administration, internals, and network troubleshooting skills.
Expertise in Google Cloud Platform (GCP) and in resource management and operations on its related services.
Proficiency with a programming language like Python and with shell scripting to automate tasks.
Strong experience with CI/CD pipelines and tooling such as GitHub, Jenkins, and Artifactory.
Strong experience with metrics and monitoring tools such as Grafana and Prometheus.
Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions.
Strong fundamentals in API gateways such as Nginx or Envoy.
Experience with cloud infrastructure and its performance and cost optimization.
Experience with AWS is a big plus.
Excellent interpersonal skills and the ability to work well in a team.
Passion for learning, understanding, and dissecting new technology stacks quickly and independently.
Experience in building and managing large relational database clusters (MySQL, Percona, etc.) is a plus.

The Team

Our engineering team is at the core of our products, connected directly to the mission of preventing cyberattacks. We are constantly innovating, challenging the way we, and the industry, think about cybersecurity.

Compensation Disclosure

The compensation offered for this position will depend on qualifications, experience, and work location. The starting base salary is expected to be between $147,000 and $225,000 per year. The offered compensation may also include restricted stock units and a bonus.

Our Commitment

We're problem solvers who take risks and challenge cybersecurity's status quo. We are committed to providing reasonable accommodations for all qualified individuals with a disability. Palo Alto Networks is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to legally protected characteristics.