Inworld AI

Staff Cloud DevOps/Site Reliability Engineer (SRE) - USA

Inworld AI, Mountain View, CA, United States

Our Technical Operations team manages the infrastructure, DevOps, and site reliability of our platform. We are looking for a Staff Cloud DevOps/Site Reliability Engineer to join the team.

Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field
7+ years of experience as a DevOps, Infrastructure, Operations, or Site Reliability Engineer (or as a software engineer with relevant experience)
At least 2 years of experience with each of the following:

Terraform
Helm
Kubernetes
AWS, Azure, or GCP
CI/CD using modern tools (GitOps)

Optional (not required but considered a plus):

MLOps (building, orchestrating, and maintaining Machine Learning Pipelines)
Prometheus / Grafana
Multi-cloud deployments (2 or more)
ArgoCD
Network management and VPNs

Responsibilities

Infrastructure: Maintain and contribute to Infrastructure-as-Code (Terraform)
DevOps and CI/CD Pipelines: Orchestrate pipelines using GitHub Actions, Helm, and ArgoCD
Microservices scalability: Kubernetes Administration
Cloud Administration
Site Reliability: Measure and monitor availability, latency, and overall service health; drive incident management and post-mortem analysis

Location

In-office location: Mountain View, CA, United States.
Remote location: United States.

Compensation

The US base salary range for this full-time position is $180,000 - $280,000. In addition to base pay, total compensation includes equity and benefits. Within the range, individual pay is determined by work location, level, and additional factors, including competencies, experience, and business needs. The base pay range is subject to change and may be modified in the future.
