ZipRecruiter

Cloud Site Reliability Engineer (SRE)

ZipRecruiter, Washington, District of Columbia, US, 20022


Job Description

We are seeking a skilled Cloud Site Reliability Engineer who is legally authorized to work in the US. Do you have an interest in infrastructure engineering, software architecture design, and cloud computing? SRE/Cloud Engineers are responsible for creating infrastructure designs and guiding the development and implementation of cloud applications, systems, and processes. You will help build cutting-edge proofs of concept and collaborate with other engineering teams during implementation. The ideal candidate must be able to handle multiple tasks in a fast-paced team environment.

Key Job Functions

• Design an automation implementation strategy for business and IT processes using the latest AWS technologies.

• Design, implement, and maintain robust infrastructure pipelines and deployment processes for machine learning models and AI applications, ensuring reliability and efficient resource utilization.

• Build and maintain a robust infrastructure platform to support and enhance business processes and engineering capabilities through self-service deployment capabilities.

• Assist in automating Infrastructure as Code in Amazon Web Services with CloudFormation or Terraform.

• Build CI/CD pipelines using the AWS suite of services to orchestrate provisioning and deployment of both large- and small-scale systems.

• Write unit tests and integration tests to validate infrastructure code.

• Identify engineering defects in the existing code base and continuously improve code quality.

• Establish and mature standards for observability, monitoring, and troubleshooting to detect and respond to production issues effectively and proactively.

• Automate routine operational tasks to improve the efficiency of ML Ops processes and reduce manual intervention.

• Support incident response processes, including troubleshooting and root cause analysis of system failures and performance degradation.

• Define and document best practices and strategies regarding application development and deployment activities.

Qualifications:

Education

• Bachelor’s degree in Computer Science, MIS or related technical field required.

Minimum Experience

• Minimum 3-5 years of infrastructure and DevOps-related engineering experience.

Specialized Knowledge & Skills

• Proficiency in at least one high-level scripting language (PowerShell, Python, Bash, etc.).

• Legally authorized to work in the US.

• Working knowledge of Continuous Integration and Continuous Deployment methodologies.

• Experience in microservice migration, including containerization and serverless architecture.

• Experience building custom AWS CloudFormation templates (CFT).

• Experience building pipelines with CI/CD tools.

• Experience architecting for high availability and cost-effective scalability in AWS.

• Knowledge of a wide variety of open-source technologies and cloud services.

• Experience with Linux/Unix and Windows administration.

• Hands-on experience following an iterative, agile SDLC.

• Experience with system capacity and planning, as well as functional configuration and audit.

• Must be available, on a rotational basis, for occasional weekend or after-hours work related to infrastructure releases, maintenance, and patching as needed.
