ZipRecruiter

Senior DevOps Engineer with SRE

ZipRecruiter, Los Angeles, California, United States, 90079

Job Description

Overview:

The Senior DevOps Engineer with SRE plays a critical role in our organization, responsible for designing and implementing solutions that improve our software development and operations processes. This position is vital to ensuring the reliability, scalability, and performance of our systems.

Key Responsibilities:

Collaborate with development teams to create and maintain CI/CD pipelines

Design, implement, and maintain infrastructure as code using tools like Terraform

Monitor and improve system stability, performance, and reliability

Automate manual processes to improve efficiency and reduce human error

Implement and maintain security measures for the infrastructure

Troubleshoot and resolve issues in development, testing, and production environments

Collaborate with cross-functional teams to ensure smooth deployment and operation of systems

Participate in on-call rotation and respond to incidents as needed

Implement and maintain best practices in areas such as logging, monitoring, and alerting

Train and mentor junior team members

Contribute to the continuous improvement of DevOps processes and procedures

Evaluate new tools and technologies to improve DevOps processes

Manage and optimize cloud resources

Develop and maintain disaster recovery and business continuity plans

Participate in capacity planning and scalability assessments

Required Qualifications:

Bachelor's degree in Computer Science, Engineering, or related field

5+ years of experience in a DevOps or SRE role

Strong understanding of cloud platforms such as AWS, Azure, or GCP

Proficiency in at least one scripting language such as Python, Ruby, or Bash

Experience with containerization and orchestration tools like Docker and Kubernetes

Deep knowledge of version control systems such as Git

Expertise in automation and configuration management tools like Ansible, Chef, or Puppet

Ability to troubleshoot and optimize software and infrastructure performance

Experience with monitoring and logging tools like Prometheus, ELK stack, or similar

Strong understanding of networking and security principles

Excellent communication and collaboration skills

Ability to work effectively in a fast-paced, dynamic environment

Relevant certifications such as AWS Certified DevOps Engineer or Certified Kubernetes Administrator are a plus

Experience with agile and Scrum methodologies

Proven track record of implementing and managing scalable, secure, and highly available systems