Mood Media
Sr. Cloud IT Reliability Engineer
Mood Media, Fort Mill, South Carolina, United States, 29715
Cloud Infrastructure Administrator
Role Overview
As an AWS Systems Admin, you will be responsible for site reliability and for configuring, administering, and supporting AWS environments within the IT infrastructure/operations team, in a Linux (Ubuntu) and Windows server environment.
Run the production environment by monitoring availability and taking a holistic view of system health
Support systems to manage platform infrastructure and applications
Improve reliability, quality, and time-to-market of software solutions in a cloud environment
Measure and optimize system performance, with an eye toward pushing our capabilities forward and innovating to continually improve the cloud/infrastructure environment
Provide primary operational support for multiple large, distributed software applications
Implement best practices, ensure maintenance and backups are completed, and recommend AWS services and automation opportunities.
Monitor current AWS workloads and provide early warning to the business about impacted services and resolutions.
Work on the infrastructure team while collaborating with the applications team and corporate infrastructure teams to ensure business needs are met and projects are completed in a timely manner.
Ensure the environments are securely implemented and generating all necessary security logs.
Responsibilities include:
Configure and support EC2, S3, Auto Scaling, CloudFront, CloudWatch, and IAM security services as needed.
Administer the development, test, and production cloud-hosted environments.
Analyze issues and needs, then recommend and implement solutions that address system needs.
Work directly with application engineers to identify and resolve issues
Ensure system performance, uptime and support levels meet or exceed SLAs.
Assist project team as needed with capacity planning and all AWS service and application deployments
Manage virtual and physical cloud resources as required with an overall objective of improving the scalability, reliability, performance, and availability of the cloud infrastructure.
Develop a detailed understanding of application functionality and architecture.
Partner with application teams to develop practical monitoring solutions and participate in cross functional team meetings to collaborate and ensure successful executions.
Maintain internal documentation that fully reflects all activity related to an application and environment to be used by applicable teams
Troubleshoot and resolve operational issues, assisting with issues arising from product upgrades, installations, and configurations
All other duties as assigned
Job Requirements
Education:
Bachelor's degree in Management Information Systems, Computer Science or related major, or equivalent experience required. Equivalent years of experience are defined as one year of professional experience for each year of college requested
Experience:
5+ years of experience operating in a production environment, including systems support in Linux (preferably Ubuntu) and/or Windows environments
3+ years of experience in Infrastructure and systems support and administration in a cloud or hybrid environment
2+ years of experience within Cloud Infrastructure configuration management, administration, & support
2+ years of experience with AWS (Azure experience is acceptable, provided there is some knowledge of AWS)
Experience in AWS system administration and configuration management and knowledge of AWS versioning system.
Experience with AWS services, e.g., EC2, CloudWatch, RDS, S3, EKS, and VPC, including support and configuration management activities such as cloud infrastructure, cloud provisioning, cloud service management, and cloud backup and restoration processes.
Experience with troubleshooting and resolving operational issues, assisting with issues arising from product upgrades, installations, and configurations.
Experience with clustering, backup configuration, and DR exercises
Experience with the AWS CLI (command line interface) for automating administrative tasks
Ability to work in a distributed team environment where team members are spread across numerous locations and often communicate virtually to support clients
Strong written and verbal communications skills
Basic scripting experience (any language)
Skills & Certifications:
Experience with Site Reliability Engineering (SRE)
Experience with VPC, AZs, Subnets, Route53, CloudWatch, ALB/NLB, Security Groups, EKS, and EC2.
AWS Certifications
Experience working through a cloud transformation involving both on-prem technologies and public cloud, ideally having helped move on-prem technologies to the AWS public cloud
Knowledge of networking concepts, e.g., OSI model, DNS, TCP/UDP, and IPv4/IPv6. Experience with AWS network connectivity configuration and network security (using security groups, keys, etc.), and with maintaining AWS network connectivity for standard services such as VPC, VPN, DNS, NFS, DHCP, and FTP.
Automation, Orchestration & Provisioning; Container orchestration experience using AWS EKS clusters in a production environment.
Remediating vulnerabilities and patching operating systems using AWS Systems Manager, creating hardened AWS AMIs, and other security-related activities.
DevOps: Terraform, Docker, Ansible, Git, YAML, and JSON (JavaScript Object Notation).
Supporting multiple continuous build environments, code repository administration, and code packaging and deployments to multiple development, QA and production environments.
Supervisory Responsibility:
This position does not include supervisory responsibilities
FLSA Status
This position is classified as Salaried Exempt, and is not eligible for payment of time-and-one-half the regular rate of pay for hours worked over 40 in a week.
Work Environment & Physical Requirements
Work Environment: This job operates in a normal professional office environment, 2 to 5 days on-site, and routinely uses standard office equipment such as computers, phones, photocopiers, fax machines, and filing cabinets.
Physical Requirements: While performing the duties of this job, the employee is regularly required to talk or hear; frequently sits, stands, and walks; uses hands to handle or feel; and reaches with hands and arms.