ZipRecruiter
Staff Software Engineer
ZipRecruiter, San Francisco, California, United States, 94199
Job Description
Join us at Cleric
We're building an autonomous AI agent that investigates and resolves production incidents. Our agent combines LLMs with tools to understand systems, reason through problems, and take corrective actions - even for issues it hasn't encountered before. The goal is to let engineers focus on building products, not fighting fires.
We're a small team of AI and infrastructure veterans backed by leading AI investors. Our product is already running in production at high-scale companies across fintech, ride-sharing, and autonomous vehicles.
About the role
As Staff Software Engineer at Cleric, you'll build the core systems that power our autonomous AI agent. Your work will span the entire agent architecture - from the reasoning engine and tool integration framework, to the systems that help the agent learn from its actions and improve over time.
You'll design and implement the foundations that enable our agent to understand complex production environments, make decisions, and take actions. This includes building the evaluation frameworks to measure agent performance, systems to capture and learn from historical actions, and platforms to test agent behavior safely before deployment.
A significant part of your role will focus on creating the infrastructure that allows our agent to expand its capabilities. You'll develop systems for tool integration, memory management, and continuous learning that help the agent handle increasingly complex scenarios. You'll also build the observability and monitoring systems needed to understand agent behavior and validate its decisions.
You'll work directly with our early customers, translating their needs into technical requirements and helping shape our product direction. As one of our early engineers, you'll establish engineering practices, mentor team members, and make key architectural decisions that will scale with our growth.
You have
6+ years of software engineering experience in production systems
Strong Python experience and software engineering fundamentals
Strong understanding of observability practices and tools for distributed systems
Extensive experience with Kubernetes, Helm, Terraform, and major cloud providers
An obsession with continuous delivery and helping engineering teams ship quickly
History of on-call and production incident management
Experience mentoring engineers and driving technical decisions
Ability to challenge assumptions and propose pragmatic solutions
Nice to have
Experience with LLM-based systems or LLM-based agents
Background in observability, monitoring, or production systems
Previous startup experience
What you’ll do
Design and build our AI agent's core reasoning engine, tool framework, memory, learning, and evaluation systems
Create testing environments for safe agent development
Collaborate with ML engineers to expand agent capabilities
Develop platforms to observe and improve agent behavior
Work with early customers to rapidly iterate on features
Help shape our technical architecture and engineering culture
How we work
Small teams, big impact: We believe that small teams can deliver great products.
Culture matters: We value radical candor in a positive and inclusive work environment.
In-person collaboration: We believe in working closely to deliver the best results.
AI-first approach: We don't simply build AI products; we augment ourselves with AI.
Interview process (you'll meet most of the team via the process)
Intro Call
Discuss your experience, the company, product, and the role.
Software Engineering Session (1 hour)
Collaboratively build an application.
Focus on practical software engineering, not algorithm challenges.
System Design Session (90 mins)
Work through a system design problem relevant to your daily work.
Product Thinking and Engineering Practices (60 mins)
Talk about your perspectives on building a great product.
Deep dive on engineering practices and culture.
Compensation Range: $160K - $220K