Resource Logistics

SRE Lead

Resource Logistics, Houston, TX, United States


Job Summary:

We seek an experienced SRE Lead to guide our team in ensuring system reliability, performance, and scalability. The candidate will drive infrastructure automation, optimize performance, and lead incident management while fostering a culture of continuous improvement.

Key Responsibilities:
Technical Leadership: Build and mentor a team of SREs; set goals, conduct reviews, and drive SRE best practices.
System Reliability: Oversee the design and maintenance of high-availability systems; lead performance monitoring and issue resolution.
Automation & CI/CD: Lead development of automation scripts and enhance CI/CD pipelines using tools such as Terraform and Ansible.
Observability: Deploy and manage tools (e.g., New Relic) for system monitoring; develop dashboards and alerts.
Incident Management: Lead Root Cause Analysis (RCA) and refine incident response processes.
Performance Optimization: Provide strategic insights to enhance application and database performance (Java, Kafka, SQL).

Qualifications:
• Proven experience managing SRE or related teams in an eCommerce or highly distributed systems environment.
• Strong skills in automation tools (Terraform, Ansible) and observability solutions (New Relic), with an emphasis on managing large-scale distributed systems.
• Experience working with SAP modules in conjunction with custom applications or microservices architectures.
• Good understanding of storage technologies (SAN/NAS), network infrastructure (load balancers, firewalls), and their impact on system performance in high-throughput environments.
• Background in optimizing performance for Java-based applications, Spring Boot services, Kafka message brokers, SQL/NoSQL databases, and middleware components.
• Familiarity with middleware technologies such as Kafka in distributed environments.
• Excellent leadership, problem-solving, and communication skills, with experience working cross-functionally across development teams, infrastructure teams, and business stakeholders.