Senior Cloud and Server Engineer
Madison Square Garden Network, Las Vegas, NV
Madison Square Garden Entertainment Corp. (MSG Entertainment) is a leader in live entertainment, delivering unforgettable experiences while forging deep connections with diverse and passionate audiences. The Company’s portfolio includes a collection of world-renowned venues – New York’s Madison Square Garden, The Theater at Madison Square Garden, Radio City Music Hall, and Beacon Theatre, as well as The Chicago Theatre – that showcase a broad array of sporting events, concerts, family shows, and special events for millions of guests annually. In addition, the Company features the original production, the Christmas Spectacular Starring the Radio City Rockettes, which has been a holiday tradition for 90 years. More information is available at www.msgentertainment.com.
Who are we hiring?
The Senior Cloud and Server Engineer manages and optimizes cloud environments in AWS, Azure, and Nutanix, along with physical Windows and Linux servers. This involves resolving complex server and storage issues, analyzing logs, troubleshooting hardware, and ensuring maximum performance and availability. The Engineer is responsible for configuring and supporting S3 bucket workflows, optimizing cloud and server processes, monitoring storage health, and managing data backups, archives, and media across multiple servers. With a focus on delivering exceptional service, the Senior Cloud and Server Engineer will continuously improve infrastructure to support business goals and ensure efficient, reliable operations.
What will you do?
- Ensure that bare-metal, Nutanix, and cloud-based systems (AWS, Azure) are optimized for NAS storage utilization and resource allocation.
- Identify and resolve internal errors and manage capacity across all virtual, Nutanix, and physical environments.
- Examine, evaluate, and monitor incoming server/storage change requests.
- Offer Level 2 and Level 3 support that includes operations and change management, fulfilling requests involving Hyperconverged Infrastructure, Nutanix, NAS, SAN, and Rubrik backup solutions.
- Accurately document and maintain infrastructure diagrams and equipment lists, and record all infrastructure changes.
- Conduct periodic reviews and apply system firmware upgrades, application software updates, and OS patching for all infrastructure systems.
- Create and manage production support availability, outages, runbooks, process documentation, and status reports. Collaborate with other support teams to identify and document monitoring and measurement requirements.
- Perform daily server/storage performance monitoring, troubleshooting, and fault analysis, including hardware troubleshooting and repair.
- Generate and respond to incident tickets, monitor interfaces, and handle escalations.
- Deploy and maintain server/storage monitoring, analysis, and reporting tools.
- Install and maintain server/storage hardware and software, ensuring system patching and updates are performed in a timely manner.
- Collaborate with network infrastructure teams to ensure seamless integration and functionality of storage and server systems.
- Adapt to the evolving needs of the organization, providing flexible and scalable solutions for Nutanix, server, and storage infrastructure.
- Participate in a 24x7 on-call rotation for critical infrastructure support, including a scheduled system patching rotation to maintain up-to-date security and performance standards.
What do you need to succeed?
- 5+ years of enterprise infrastructure operations experience
- 5+ years of experience working with high-performance system platforms such as compute, storage, data centers, and automation
- 3+ years of vendor management experience
- Bachelor’s degree in engineering, computer science, or computer engineering, or equivalent experience
- Previous experience at an enterprise-level company supporting all infrastructure with absolute fault tolerance in mind
- Experience deploying various tool sets which support monitoring and alerting across diverse platforms
- Integration experience working with event correlation platforms and ticketing systems such as ServiceNow
- Experience with advanced server/storage operations utilizing various product platforms such as Nutanix, Teradici, AWS, Azure, EMC, and Qumulo
- Ability to technically troubleshoot server- and storage-related outages
- Self-directed, solutions-oriented initiative with sound decision-making skills
- Previous experience working closely with production, external partners, and creative development teams
- Production-minded, hands-on work style with a focus on customer service and operational excellence
Special Requirements
- Must be available for on-call and overnight patching responsibilities on a rotating schedule as needed
At MSG, we recognize the importance of upskilling employees’ talents and strengths so they can drive their careers forward. We are proud to offer a robust set of tools and resources to help employees understand their interests and purpose, harness their talents and obtain the skills they need to reach the next step in their careers. Growth and longevity for our employees are top priorities here.
We value diversity and are looking for extraordinary employees of all backgrounds! MSG is an Equal Opportunity Employer and provides equal employment opportunities to all employees and applicants for employment without regard to race, color, religion, gender, sexual orientation, gender identity or expression, sexual and reproductive health choices, national origin, citizenship, age, genetic information, disability, or veteran status. In addition to federal law mandates, MSG complies with all applicable state and local laws governing nondiscrimination in all locations and will consider requests for reasonable accommodations as required.