Omni Inclusive

Sr. SRE (Linux & Windows)

Omni Inclusive, Los Angeles, California, United States, 90079


The Systems Reliability Engineering (SRE) team helps elevate SRE practices at TWDC by promoting and onboarding new technologies, solving complex problems and integrating with next-generation digital platforms.

Systems Reliability Engineers use a software engineering approach to architect, design, automate, monitor, and build applications at scale. This includes operating and engineering software with close business segment alignment to deliver platforms through efficient, effective and resilient architectures. SREs are talented engineers who focus on improving quality through a data-driven approach: instrumentation, automation, and functional/unit testing.

The Senior SRE will help create, build and deliver amazing experiences for our guests, fans and businesses. Primary responsibilities include helping existing, new and emerging business teams onboard technologies or platforms to accelerate their businesses. This includes consulting on, designing, building, and supporting development pipelines; automating infrastructure and operations; creating telemetry for monitoring; engineering for high reliability; and reinforcing best practices to secure our company and guest data.

The Senior SRE is expected to have systems administration skills on Linux and Windows platforms, and must have experience with software development (e.g. Python, Go, Java, Node), CI pipeline tools (e.g. Jenkins), Git source management, cloud hosting (AWS, GCP and Azure), container computing (e.g. Docker, OCI), web technologies and the DevOps team culture. The position also requires a working knowledge of systems, networking, operational excellence, application stability, security, performance, capacity management, and documentation.

The Senior SRE must be prepared to work with engineering, creative and production teams in an extremely collaborative and high-energy environment to brainstorm, architect, gather requirements, troubleshoot, and provide stellar customer support. The ideal Senior SRE is passionate about constantly learning and applying technology to solve complex problems, and is a highly motivated, optimistic, proactive, creative thought leader and project manager.

The Senior SRE will:

Translate ideas into tangible products that shape experiences by focusing on a systematic approach to automation, resiliency, efficiency, stability, security, performance, capacity management, and documentation. Serve as a subject matter expert through internal and external tech talks and conferences.

Support initial discovery, architecture, design, automation, implementation and operationalization, including:
- Business Engagement and Requirements Gathering
- Architectural Review, Proof of Concept Work, and Onboarding
- Project: Build and Operationalize New Systems/Sites/Services/Products
- Systematic Load Testing, Troubleshooting, Optimization and Tuning
- Create System and Application Monitors, Trending Metrics and Reports
- Development: Tools and Automation Frameworks
- Hosting Platforms and Infrastructure Design and Support
- Documentation: Creation of Application Infrastructure Design documents, Operational Runbooks, and Knowledge Base Articles

- Fluent in multiple scripting languages, with advanced skills in programming languages (e.g. Go, Python, Ruby, Dart, Node, Java, or similar) and the ability to build test coverage for all software being developed.
- Systems administration skills on Linux and Windows platforms.
- Networking skills and protocols (e.g. HTTP, TLS, SSH, DNS).
- Software development Continuous Integration (CI) pipeline knowledge (e.g. Jenkins, GitLab CI).
- Experience with distributed systems and container platforms (e.g. Kubernetes/GKE, ECS, Mesos, Fargate, Nomad).
- Experience with source control management systems (e.g. Git).
- Expertise in public and private cloud hosting services (AWS, Google Cloud, Azure).
- Recognized as an expert on at least one OS and proficient in multiple operating systems, including OS performance monitoring, setup, configuration, tuning, and troubleshooting.
- Proficient in web server technologies (e.g. Apache, Node.js, NGINX, Tomcat, IIS, Caddy Server), including setup, configuration, performance monitoring, tuning, clustering, and debugging (e.g. JConsole).
- Proficient with data technologies (e.g. NoSQL, MySQL, MongoDB, Redis, Elastic), including being able to perform basic setup, configuration, and troubleshooting.
- Able to implement existing base standards for new systems and/or applications for all of the following:
  o Site/Systems monitoring and instrumentation
  o Application monitoring and instrumentation
  o System monitoring and instrumentation
  o Resilience, performance and telemetry data
- Able to diagnose simple to complex system and process problems.
- Demonstrate exceptional troubleshooting methodology, including the ability to author new methodologies and teach them to the SRE team.
- Independently resolve moderately to highly complex system and application incidents.
- Able to identify and propose system and application fixes for performance bottlenecks.
- Able to evaluate new application requirements for capacity and run-time best practices.
- Able to evaluate new systems and/or infrastructure solutions for technical feasibility against known requirements and standards.
- Effective at dealing with change: able to transition in role or handle a significant modification or technology with minimal ramp-up time and very little guidance.

Communication and Leadership Requirements

- Excellent verbal and written communication to all levels in the organization.
- Demonstrates curiosity, continuous learning and self-improvement.
- Ability to write operational specs, architectural diagrams and test plans, and to manage requirements.
- Communication of ideas and solutions in a clear and organized manner.
- Clear and effective presentations to groups of people, including internal and external conference presentations.
- Construction of concise and complete technical documentation.
- Mentoring of other engineers on technical material.
- Able to quickly and adeptly understand the needs of the business and translate those needs into actionable items.

BS in Computer Science or related field with 5+ years of experience

Strong communication skills

Most important technical skills to have:
- CI/CD pipelines
- GitHub, GitLab
- Cloud environment experience
- Scripting (multiple languages)
- Terraform