Telly

Principal Infrastructure Engineer

Telly, Los Angeles, California, United States, 90079

Telly is reinventing the television and reimagining advertising as a Fast Company 'Most Innovative Company' for 2024. Join people from across the entertainment, tech, and hardware space to not just change the channel, but build the ultimate television experience for the largest possible audience. We think the TV is an essential gathering spot. A reflection of the people who share in its smarts. A unifying hub to stay entertained, informed, fit, and connected. We call this Telly and it’s the heartbeat of your home.

If you’re all in on making the biggest innovation in TV since color, read on!

Position Summary: A Principal Infrastructure Engineer at Telly is tasked with leading the design, architecture, and automation of scalable, secure infrastructure solutions while optimizing CI/CD pipelines and monitoring systems. They handle high-priority incidents, drive scalability, and ensure security and cost efficiency, all while promoting infrastructure as code practices. In addition to their technical expertise, they mentor junior engineers, lead cross-team collaboration, and contribute to the technical roadmap. They manage infrastructure-related projects, improve processes, and communicate system performance to stakeholders, while participating in hiring, team development, and budgeting.

Key Responsibilities:

Infrastructure Design & Architecture: Lead the design, architecture, and implementation of scalable, secure, and reliable cloud infrastructure solutions.

Automation & Tooling: Develop and maintain automation scripts and tools to streamline infrastructure deployment, management, and monitoring.

CI/CD Pipelines: Design, implement, and optimize continuous integration and continuous deployment (CI/CD) pipelines to ensure fast and reliable code releases.

Monitoring & Observability: Oversee the implementation of monitoring, logging, and observability solutions to track system health, performance, and uptime.

Incident Management & Troubleshooting: Act as the top-level escalation point for resolving high-priority infrastructure issues and incidents, conducting root cause analysis and post-mortems.

Infrastructure as Code (IaC): Lead the adoption and maintenance of infrastructure as code practices using tools like Terraform.

Security & Compliance: Ensure infrastructure security practices are adhered to, including managing firewalls, access controls, vulnerability assessments, and compliance with industry standards.

Cost Optimization: Assist in managing cloud and infrastructure costs, identifying optimization opportunities without sacrificing performance or availability.

Scalability & Performance Optimization: Drive initiatives to improve system scalability, reliability, and performance, anticipating future needs based on business growth.

Team Mentorship & Development: Provide mentorship and guidance to junior and mid-level engineers, promoting career growth and continuous skill development.

Cross-Team Collaboration: Serve as the primary technical liaison between engineering teams (development, QA, product) and other stakeholders for infrastructure and operations-related decisions.

Technical Roadmap & Strategy: Contribute to defining the technical vision and roadmap for DevOps, SRE, and infrastructure practices in alignment with business objectives.

Project Leadership: Lead infrastructure-related projects, ensuring timely delivery, appropriate resource allocation, and effective communication across teams.

Process Improvement: Identify areas for process improvement and implement best practices for engineering workflows, incident management, and deployment strategies.

Team Leadership: May have direct or dotted-line reports, participating in hiring, performance reviews, and setting team goals.

Stakeholder Communication: Report on infrastructure performance, reliability metrics, and project progress to senior management, presenting complex technical details in an understandable manner.

Budgeting & Vendor Management: Assist in budgeting for infrastructure tools, services, and external vendors, ensuring cost-effective decision-making aligned with organizational goals.

Qualifications:

Minimum of 10 years of experience in infrastructure engineering, with a proven track record in designing and managing scalable, high-availability systems.

Deep expertise in Amazon Web Services (AWS), with extensive experience in critical services such as API Gateway, Route 53, Cognito, Elastic Container Service (ECS), Lambda, and IoT Core.

Demonstrated ability to architect and implement scalable, low-latency environments, ensuring performance and reliability at scale.

Mastery of Infrastructure as Code (IaC), with a strong emphasis on Terraform, to drive efficient, repeatable infrastructure deployments.

Extensive experience in implementing Continuous Integration and Continuous Delivery (CI/CD) pipelines, and developing tools to enhance developer productivity.

Proven experience leading and mentoring small technical teams, with a focus on driving results and fostering collaboration.

Exceptional analytical, problem-solving, and communication skills, both written and verbal, with the ability to influence and guide technical direction.

Highly self-organized and motivated, with the ability to independently lead projects, manage dependencies, and deliver results in a complex environment.

Exposure to cost optimization and FinOps.

Preferred Experience:

Extensive hands-on experience in one or more statically typed languages (e.g., Java, C#, C, C++) or dynamically typed languages (e.g., JavaScript, Python, Ruby).

Strong understanding of FinOps (Cloud Financial Management) principles, with the ability to optimize cloud costs while maintaining high performance and scalability.

Experience in the TV entertainment domain, with a deep understanding of industry-specific challenges and solutions.

Proven experience working with remote and distributed teams across various time zones and cultures, with the ability to lead and collaborate effectively in a global context.

What We Offer:

Competitive salary and benefits package.

Opportunity to lead and shape the SRE and DevOps function at an innovative and fast-growing company.

Collaborative and dynamic work environment with a focus on continuous learning and development.