Infrastructure Engineer
Roboflow - San Francisco, California, United States, 94199
Who We Are
Our mission is to make the world programmable. Sight is one of the key ways we understand the world, and soon this will be true for the software we use, too. At Roboflow, we’re building the tools, community, and resources needed to make the world programmable with artificial intelligence. Roboflow simplifies building and using computer vision models. Today, over 1 million developers, including those from half the Fortune 100, use Roboflow’s open source and hosted machine learning tools.
What We're Looking For
Primarily, you like to make great things with passionate colleagues. You are someone who likes to own outcomes, not only inputs. You’re motivated by having responsibility and accountability. You’re eager to ‘do the work,’ big and small.
What You'll Do
The focus of this role is on securing, scaling, and maintaining the infrastructure that powers our product backend, including our cloud architecture, databases, file storage, search cluster, micro-services, and machine learning pipelines.
Skillset
Some or all of the following would be helpful:
- Production experience with Kubernetes
- Infrastructure-as-code: Terraform, Kubernetes Helm charts, Bash scripting, and Python-based automation in production environments
- Scale: operating infrastructure for large-scale applications, especially in the machine learning/AI space
- Site reliability: alerting, monitoring, and scaling services in AWS and GCP clouds
- Node.js and Python programming skills; ability to work with full-stack developers on designing, developing, and operating SaaS applications
- Experience with machine learning/big data at scale (GPU, Docker, and Kubernetes)
- Experience with CI/CD automation (for example GitHub Actions, Spacelift)
- Prior experience with machine learning libraries and stacks (PyTorch, TensorFlow, OpenCV) is a plus
- Awareness of security best practices and tightening infrastructure for highly secure cloud operations; ideally experience with GDPR, ISO 27001, and/or SOC 2 certification for SaaS applications
Examples of tasks
- Run a high-availability machine learning inference service
- Work with customer security teams to securely integrate Roboflow with their systems
- Develop infrastructure-as-code solutions to scale Roboflow in a cost-effective manner
- Work with the engineering team to define SLAs and SLOs
- Dive into cost optimization opportunities across the Roboflow stack
- Participate in on-call rotations
Who You'll Be Working With
Our team of ~60 attracts talent such as executives who wanted to return to building, founders with a 100M+ exit, Roboflow users turned team members, open source contributors, and many exceptional others.
Where You'll Work
Roboflow is distributed across the US and Europe. We currently have hubs in New York City and San Francisco. We provide opportunities to work in person with other team members as much as you'd like, while also supporting remote team members.
When You'll Work
Roboflow primarily operates during US daytime hours, and there are some synchronous meetings you’ll be expected to attend each week. Apart from that, we have a flexible schedule.
What You'll Receive
The target compensation for this role is USD $165,000 base. In addition to our cash compensation, we offer generous perks and benefits.
Interview Process (~5 hours)
Below is the interview process you can expect for this role. We are all motivated to work with an exceptional team and don't currently have in-house recruiters.
Not sure if this is you?
We want a diverse, global team with a broad range of experience and perspectives. If this job sounds great but you’re not sure whether you qualify, apply anyway: we carefully consider every application.
Learn More About Us
We are building a diverse team that is distributed across the globe. Roboflow is an equal opportunity workplace; we welcome people from all backgrounds, communities, and experiences.