Resource Logistics
SRE Lead
Resource Logistics, Houston, TX, United States
Job Summary:
We seek an experienced SRE Lead to guide our team in ensuring system reliability, performance, and scalability. The candidate will drive infrastructure automation, optimize performance, and lead incident management while fostering a culture of continuous improvement.
Key Responsibilities:
• Technical Leadership: Build and mentor a team of SREs; set goals, conduct reviews, and drive SRE best practices.
• System Reliability: Oversee the design and maintenance of high-availability systems; lead performance monitoring and issue resolution.
• Automation & CI/CD: Lead development of automation scripts and enhance CI/CD pipelines using tools such as Terraform and Ansible.
• Observability: Deploy and manage tools (e.g., New Relic) for system monitoring; develop dashboards and alerts.
• Incident Management: Lead Root Cause Analysis (RCA) and refine incident response processes.
• Performance Optimization: Provide strategic insights to enhance application and database performance (Java, Kafka, SQL).
Qualifications:
• Proven experience managing SRE or related teams in an eCommerce or highly distributed systems environment.
• Strong skills in automation tools (Terraform, Ansible) and observability solutions (New Relic), with an emphasis on managing large-scale distributed systems.
• Experience working with SAP modules in conjunction with custom applications or microservices architectures.
• Good understanding of storage technologies (SAN/NAS), network infrastructure (load balancers, firewalls), and their impact on system performance in high-throughput environments.
• Background in optimizing performance for Java-based applications, Spring Boot services, Kafka message brokers, SQL/NoSQL databases, and middleware components.
• Familiarity with middleware technologies such as Kafka in distributed environments.
• Excellent leadership, problem-solving, and communication skills, with experience working cross-functionally between development teams, infrastructure teams, and business stakeholders.