Cognizant North America

Lead Site Reliability Engineer

Cognizant North America, Plano, Texas, US, 75086


About Cognizant's Digital Engineering Practice:

At Cognizant Digital Engineering, a small cross-functional team composed of a Product Manager, an Architect, Full-Stack Developers, UI/UX designers, and Big Data analysts builds higher-quality software faster than siloed individuals working independently. Small, nimble engineering teams generate collective empathy and camaraderie, increasing their ability to anticipate unforeseen changes in development scope and maintain high-quality deliverables. Across our US Studio system or within client development sites, our Digital Engineering teams ideate and develop innovative cloud-based solutions following a Lean-Agile process with a DevOps culture. Working in Cognizant Digital Engineering gives DevOps engineers consistent opportunities to push digital boundaries while growing their exposure to transformational technologies.

The Role:

Cognizant is looking for an experienced and innovative Lead Site Reliability Engineer (SRE) to serve our diverse base of global clients. As a member of our team, you will build cutting-edge, cloud-based software that powers modern business. An ideal candidate is someone who enjoys working in a diverse, collaborative, geographically distributed team. Similarly, the ideal candidate is an expert engineer who values the 'team', drives continuous improvement, and is unafraid to challenge the legacy status quo with creative cloud-based solutions.

Location: Plano, Texas

Responsibilities:

- Strong SRE experience with Java, AWS, DevOps, deployment strategy, and monitoring tools. Hands-on experience with Dynatrace, Splunk, CI/CD, Grafana, etc.
- Application troubleshooting experience, focusing on core SRE metrics before going to production, including uptime vs. availability, monitoring vs. observability, and incident management.
- Familiarity with SLO, SLA, SLI, and other SRE terminology.
- Experience deploying through CI/CD pipelines and debugging and troubleshooting issues in coordination with application teams working in Java, Spring Boot, Python, .NET, etc.
- Ability to perform API performance testing using tools such as JMeter or BlazeMeter.
- Experience performing root cause analysis (RCA) for production issues in AWS environments with multiple microservices.
- Expertise in Terraform for managing infrastructure as code, and in troubleshooting and resolving technical issues to ensure smooth operation of applications.
- Champion site reliability culture and practices, exerting technical influence throughout the team.
- Lead initiatives to improve the reliability and stability of applications and platforms using data-driven analytics.
- Collaborate with team members to identify comprehensive service level indicators and establish reasonable service level objectives and error budgets with customers.
- Demonstrate a high level of technical expertise within one or more technical domains and proactively solve technology-related bottlenecks.
- Act as the main point of contact during major incidents, quickly identifying and solving issues to avoid financial losses.
- Document and share knowledge within the organization via internal forums and communities.

Required Skills:

- 8+ years of relevant work experience.
- Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices.
- Fluency in Java programming.
- Proficiency in observability tools such as Splunk, Grafana, Dynatrace, Prometheus, and Datadog.
- Experience with continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform).
- Experience with containers and container orchestration (e.g., Docker, ECS, Kubernetes).
- Experience with infrastructure as code tools such as Terraform and with managing and supporting cloud-based applications, preferably on AWS.
- Excellent communication skills.

Benefits:

Cognizant offers the following benefits for this position, subject to applicable eligibility requirements:

- Medical/Dental/Vision/Life Insurance
- Paid holidays plus Paid Time Off
- 401(k) plan and contributions
- Long-term/Short-term Disability
- Paid Parental Leave
- Employee Stock Purchase Plan

Disclaimer: The salary, other compensation, and benefits information is accurate as of the date of this posting. Cognizant reserves the right to modify this information at any time, subject to applicable law.
