MSG Sphere Studio

Burbank, California - United States

Senior Storage Server Engineer


Who are we hiring?

The Senior Storage Server Operations Engineer is responsible for maintaining and optimizing Post Production, Pipeline, VFX, and Playback storage systems. This role works with design and engineering teams to support a studio-wide server/storage infrastructure. Ideal candidates will have a demonstrated track record of resolving complex issues that involve analyzing server/storage logs, troubleshooting hardware, and ensuring maximum performance and storage availability. This role is a crucial member of the team that monitors and maintains storage server health and assists with data backup and archive, as well as media management between multiple servers. The Senior Storage Server Operations Engineer will provide Level 2 and Level 3 server/storage support, including operation and change management, and will fulfill requests involving hyperconverged infrastructure, NAS, SAN, cloud backup, and VMware. This Engineer will examine, evaluate, and monitor all incoming server/storage change requests in a mission-critical post-production environment, with an emphasis on meeting strict Production deadlines.

What will you do?

• Ensure that bare metal and VMware systems are optimized for the utilization of NAS storage and resource allocation.
• Identify and repair internal errors in all virtual environments, and manage their capacity.
• Maintain accurate Hyperconverged Infrastructure drawings and equipment lists, and document all infrastructure changes.
• Periodically review system firmware upgrades, application software updates, and OS patching on all storage devices.
• Create production support documentation, including availability and outage reports, runbooks, process documentation, and status reports. Work in coordination with other support teams to identify and record monitoring and measurement requirements. Provide administration and reporting services in the areas of incident and problem management.
• Create a program to visualize the performance of the storage infrastructure, monitoring for health and reporting on problem areas.
• Day-to-day server/storage performance monitoring, troubleshooting, and fault analysis; hardware troubleshooting and repair
• Trouble ticket generation and response; monitoring interface and escalations
• Deployment and maintenance of server/storage monitoring, analysis, and reporting tools
• Installation and maintenance of server/storage hardware and software
• Participate in a 24x7 call-out rotation

What do you need to succeed?

• At least 8 years of enterprise infrastructure operations experience; applicant must be “hands on keyboard” with the ability to technically troubleshoot server/storage-related outages
• Previous experience at an enterprise-level company supporting all infrastructure with absolute fault tolerance in mind
• Proven hands-on experience deploying tool sets that support monitoring and alerting across diverse platforms, including integration experience with event correlation platforms and ticketing systems such as ServiceNow
• 5 years of vendor management experience preferred but not required
• 10 years of experience working with high-performance systems platforms such as compute, storage, data centers, and automation
• Experience with advanced server/storage operations utilizing various product platforms such as VMware, Teradici, AWS, Azure, EMC, NetApp, and Qumulo
• Bachelor’s degree in engineering, computer science, or computer engineering, or equivalent experience, is preferred

The Company requires that all individuals, subject to certain limited exceptions, be fully vaccinated against COVID-19. The Company will consider requests for reasonable accommodations regarding this requirement.