METR
DevOps Engineer
METR, Berkeley, California, United States, 94709
Berkeley, California, or hybrid. We’re open to remote for exceptional candidates.

METR is looking for a senior DevOps engineer to manage our cloud-based infrastructure, notably the deployment of our open-source evals platform, Vivaria.

Responsibilities
- Manage our cloud infrastructure (AWS)
- Implement and advise on best practices for the use of containerization services (Docker, Kubernetes), including GPU and multi-cloud workloads
- Manage our networking infrastructure (Tailscale)
- Advise on and implement best practices to increase the scalability and reliability of our systems (on the order of hundreds of concurrently running containers), including helping transition to a serverless architecture where appropriate
- Advise on and/or help implement our growing data pipelines
- Establish patterns and best practices for CI/CD across the organization

What we’re looking for
An ideal candidate has substantial expertise in the above technologies and can advise on nonstandard usages. For example, we create large numbers of containers whose process state must be non-ephemeral, which is an unusual requirement for Kubernetes. Experience with DevSecOps and application/infrastructure hardening is also valuable.

We expect candidates to have 5+ years of DevOps experience, with strong expertise in Kubernetes in particular, although we are open to candidates who can demonstrate substantial expertise through open-source projects or portfolios.

In this role, you’d have a voice in shaping the technology and architecture of METR’s evaluation platform as we scale it to the next order of magnitude.

About us
METR is a non-profit doing empirical research to test whether frontier AI models possess the capability to permanently disempower humanity. We develop scientific methods to assess these risks accurately and work with frontier AI companies (e.g., OpenAI, Anthropic) and government agencies to deploy these assessments. Our work helps ensure the safe development and deployment of transformative AI systems.

Some highlights of our work so far:

- Establishing autonomous replication evaluations: Thanks to our work, it’s now an industry norm to test models for autonomous capabilities (such as self-improvement and self-replication).
- Pre-release evaluations: We’ve worked with OpenAI and Anthropic to evaluate their models pre-release, and our research has been widely cited by policymakers, AI labs, and government bodies.
- Inspiring lab evaluation efforts: Multiple leading AI companies are building their own internal evaluation teams, inspired by our work.

We are a motivated, fast-paced, growing team (currently ~20 people). Candidates should be excited about working entrepreneurially in a rapidly changing environment while helping to strengthen the organization’s operational rigor.

Logistics
Successful candidates will likely complete a few rounds of paid work tests and interviews, followed by an in-person trial (where possible). Registering interest is quick: there’s just a single required question, which can be answered with a few bullet points.

Deadline to apply:
None; applications will be reviewed on a rolling basis.

Location: This role would be in person, out of our beautiful coworking space in Berkeley, CA.

Apply for this job
We encourage you to apply even if your background may not seem like the perfect fit!
We would rather review a larger pool of applications than risk missing out on a promising candidate. If you lack US work authorization, we can likely sponsor a cap-exempt H-1B visa for this role.

We are committed to diversity and equal opportunity in all aspects of our hiring process. We do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We welcome and encourage all qualified candidates to apply for our open positions.