GEICO
Senior Manager, Engineering - SRE Network Operations Center (NOC)
GEICO, Chevy Chase, Maryland, United States, 20815
GEICO is seeking a dynamic, highly motivated Senior Manager to join our Reliability Engineering organization and oversee our Network Operations Center (NOC), a central point of communication and incident management across our TECH organization. You will be part of a team that facilitates incident calls and measures and improves the production performance, availability, and reliability of our mission-critical systems through sustainable engineering practices. You will work closely with our Product, Platform, Security, and other Infrastructure teams to drive continuous automation and improve our products' availability to our customers. You will also manage a team of NOC engineers with varied technology expertise who are passionate about triaging issues and collaborating with Product groups across the organization to resolve them effectively and efficiently.
The Senior Manager, Engineering - SRE Network Operations Center (NOC) is the cornerstone of the NOC's transformation into an SRE-centric organization. This role demands a visionary leader who can bridge the gap between traditional NOC practices and modern Incident Response SRE methodologies, ensuring that the NOC operates with maximum efficiency, reliability, and resilience. The Senior Manager is responsible for the strategic direction, leadership, and operational management of the NOC SRE team, ensuring the team delivers on its 24/7/365 mission.
Key Responsibilities:
Strategic Leadership:
Define the strategic direction for the NOC with a focus on adopting and embedding SRE practices across all operational processes. This includes promoting a culture of continuous improvement, automation, and reliability engineering.
Team Leadership:
Lead a team of 15+ Incident Response SRE engineers, providing guidance, mentorship, and support to ensure high performance. This includes managing performance reviews, professional development, and career growth opportunities for team members.
Incident Management:
Serve as the ultimate incident commander during critical incidents, ensuring that the incident response process is handled efficiently and effectively from detection to resolution. This includes overseeing incident communication and ensuring that all stakeholders are informed and aligned.
Operational Oversight:
Develop and maintain a robust schedule that ensures 24/7/365 coverage by the NOC SRE team, optimizing shift patterns and staffing levels to meet operational demands.
SRE Transformation:
Drive the adoption of SRE practices, working closely with senior leadership to implement changes that reduce toil, enhance reliability, and improve the overall incident response process. This includes spearheading cultural changes within the organization to embrace SRE principles.
Executive Communication:
Oversee the creation and delivery of high-quality incident communication reports to executives and key stakeholders, ensuring that all communications are clear, accurate, and actionable.
Observability and Monitoring:
Oversee the development and maintenance of observability dashboards using Grafana and Prometheus, ensuring that these tools provide meaningful insights and actionable data for the NOC and other teams.
Notification and Escalation:
Establish and maintain a robust notification, escalation, and paging process that ensures support teams are mobilized as quickly as possible during incidents.
Simulations and Drills:
Plan and execute regular simulations and dry runs to build muscle memory for incident response processes, ensuring that the team is prepared for any eventuality.
Backlog Management:
Oversee the continual maintenance and grooming of the NOC SRE backlog, prioritizing improvements and ensuring that the team remains focused on high-impact tasks.
Education, Work Experience, and Qualifications:
• Bachelor's degree in Computer Science, Information Technology or a related field
• Cloud Certifications are a plus (preferably Azure or AWS)
• 10+ years of hands-on work experience supervising personnel in a technical environment is a plus
• Must have excellent troubleshooting skills and a thorough understanding of Operations, Incident Management, systems engineering, and infrastructure support in a cross-platform environment
• Must be on-call to provide support for Major Incidents (MI) and NOC issues as needed
• Experience leading a team in a fast-paced environment is highly desirable
• Excellent verbal & written communication skills are required. Must have excellent technical writing skills and be able to create procedural documentation and departmental policies when required.
• Must have an overall understanding of cloud computing and other internet technologies and the ability to communicate and discuss those technologies with all levels of management and internal customers.
Desired Qualifications:
Skill Breadth and Depth and Projects:
• Ability to facilitate resolution of multiple incidents at any given time
• Excellent team building and leadership skills
• Strong interpersonal and presentation skills
• Excellent understanding of incident management
• Excellent understanding of infrastructure systems and tools
Technical Skills:
• Experience in Agile methodologies such as Kanban and Scrum
• Experience with observability tools such as Splunk, NetQoS, Dynatrace, Aternity, Moogsoft, etc.
• Strong knowledge and experience in IT and network systems troubleshooting
• Excellent understanding of Windows and Linux operating systems
• Excellent understanding of Networking technologies
• Good understanding of Cloud Computing technologies and concepts (SaaS, PaaS, IaaS, etc.)
• Azure Fundamentals (AZ-900) Certification
• Azure PowerShell or any shell-scripting language - Bash/KSH/Perl/Python/etc.
• Experience with full stack engineering & development from Java front end services to backend storage systems in both SQL and NoSQL contexts
• Solid ability to communicate effectively with both technical staff and end users while providing exceptional client service.
• Strong problem solving, time management, flexibility, and communication skills.
• Ability to multi-task, organize and document many tasks at one time.
• Self-starter with strong organizational and time management skills, self-directed and able to handle multiple priorities with demanding timeframes.
• Willingness and ability to work flexible hours and travel (up to 5%), which may include overnights and weekends.
• Unquenchable thirst for knowing everything within our GEICO platform and learning new technologies.
Benefits:
At GEICO, we make sure you have the support and resources to leverage and develop your skills, secure your financial future, and take care of your health and well-being. GEICO continually seeks to provide a workplace where everyone can be their authentic self. To help achieve this goal, we support associate-led Employee Resource Groups that foster a true sense of community. Through GEICO's competitive benefits offerings and various training and development opportunities, we have you covered with our Total Rewards Program* that includes:
• Premier Medical, Dental and Vision Insurance with no waiting period**
• Paid Vacation, Sick and Parental Leave
• 401(k) Plan
• Tuition Assistance including Direct Billing and Reimbursement payment plan options
• Paid Training, Licensures, and Certificates
*Benefits may be different by location. Benefit eligibility requirements vary and may include length of service.
**Coverage begins on the date of hire. Must enroll in New Hire Benefits within 30 days of the date of hire for coverage to take effect.
The safety of our associates, both current and future, is GEICO's highest priority. At this time, most of our associates are working remotely due to the current COVID-19 pandemic. Candidates who are selected for this position will be trained remotely and must be able to work from home in a designated work area.
GEICO is proud to be an equal opportunity employer. We are committed to cultivating an environment where equal employment opportunities are available to all associates and job applicants regardless of race, color, religious creed, national origin, ancestry, age, gender, pregnancy, sexual orientation, gender identity, marital status, familial status, disability or genetic information, in compliance with applicable federal, state and local law. GEICO celebrates diversity and believes it is critical to our success. As such, we are committed to recruit, develop, and retain the most talented individuals to join our team.
Annual Salary: $130,000.00 - $260,000.00
The above annual salary range is a general guideline. Multiple factors are taken into consideration to arrive at the final hourly rate/annual salary to be offered to the selected candidate. Factors include, but are not limited to, the scope and responsibilities of the role, the selected candidate's work experience, education and training, the work location as well as market and business considerations.
At this time, GEICO will not sponsor a new applicant for employment authorization for this position.
Benefits:
As an Associate, you'll enjoy our Total Rewards Program* to help secure your financial future and preserve your health and well-being, including:
• Premier Medical, Dental and Vision Insurance with no waiting period**
• Paid Vacation, Sick and Parental Leave
• 401(k) Plan
• Tuition Reimbursement
• Paid Training and Licensures
*Benefits may be different by location. Benefit eligibility requirements vary and may include length of service.
**Coverage begins on the date of hire. Must enroll in New Hire Benefits within 30 days of the date of hire for coverage to take effect.
The equal employment opportunity policy of the GEICO Companies provides for a fair and equal employment opportunity for all associates and job applicants regardless of race, color, religious creed, national origin, ancestry, age, gender, pregnancy, sexual orientation, gender identity, marital status, familial status, disability or genetic information, in compliance with applicable federal, state and local law. GEICO hires and promotes individuals solely on the basis of their qualifications for the job to be filled.
GEICO reasonably accommodates qualified individuals with disabilities to enable them to receive equal employment opportunity and/or perform the essential functions of the job, unless the accommodation would impose an undue hardship to the Company. This applies to all applicants and associates. GEICO also provides a work environment in which each associate is able to be productive and work to the best of their ability. We do not condone or tolerate an atmosphere of intimidation or harassment. We expect and require the cooperation of all associates in maintaining an atmosphere free from discrimination and harassment with mutual respect by and for all associates and applicants.