Apple

Sr. Site Reliability Engineer (SRE) - iCloud

Apple, Cupertino, California, United States, 95014


Software and Services

The Apple Service Engineering - iCloud SRE team is looking for Site Reliability Engineers to build and run the services that hundreds of millions of customers use every day. This team provides systems that are foundational for many of Apple’s services such as iCloud, iMessage, and FaceTime. The best candidates will have both demonstrated Software Development skills and strong Linux / Systems / Cloud expertise. Our customers count on us to provide extraordinary availability, scalability, and security for services that “just work.” We're looking for a talented and passionate person who loves designing, engineering, and running systems and infrastructure that will help millions of customers.

Description

The services that Apple and iCloud run are massive; iCloud comprises a set of platforms and products which are foundational for both users and other Apple Services. As an SRE @ Apple, you'll need to solve problems using data, teamwork, and your own expertise. SREs @ Apple own the full infrastructure stack; from device driver performance debugging to content delivery network traffic management, our responsibilities are both broad and deep. Systems are run both directly on Linux and in the Cloud. We run a mix of open source and internally developed tools for system & configuration management, provisioning, software deployment, and monitoring. You'll learn these tools and have opportunities to improve them. Our team is collaborative; we work closely with the development teams we support to deliver the best results for Apple. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.

Responsibilities:

Deploy, support, and monitor new and existing services, platforms, and application stacks.

Use scale testing to measure, tune, and optimize system performance.

Enhance, architect, author, and deliver software to improve the availability, scalability, and security of Apple's internet services.

Build and manage systems, infrastructure, and applications through automation.

Participate in periodic on-call duties.

Minimum Qualifications:

Strong sense of ownership, customer service, and integrity demonstrated through clear communication.

Experience in building and scaling distributed systems in a public, private, or hybrid cloud environment.

Experience with deploying, supporting, and monitoring new and existing services, platforms, and application stacks.

Excellent troubleshooting and problem-solving skills.

Passion for eliminating repetitive manual processes using automation and improving them through repeated iteration.

Proven track record of writing programs in a high-level programming language such as Java, Go, Python, or Perl.

Experience handling large numbers of diverse systems with configuration management systems like Puppet, Chef, Ansible, or Salt.

Understanding of the Linux Operating System, including Kernel, Memory, Process, Threads, Static/Shared Libraries, IPC, and Signals.

BS in Computer Science or related field, or equivalent work experience.

Preferred Qualifications:

Experience with scale testing, disaster recovery, and capacity planning.

Inclination toward efficient programming, with an emphasis on improvement through complexity analysis.

Understanding of standard networking protocols and components such as HTTP, DNS, ECMP, TCP/IP, ICMP, the OSI Model, Subnetting, and Load Balancing strategies.

Apple is an equal opportunity employer that is committed to inclusion and diversity.
