Edjuster

Sr. Site Reliability Engineer

Edjuster, Atlanta, Georgia, United States, 30383


The Senior Site Reliability Engineer will assist with the design, development, and implementation of cloud architecture across cloud, hybrid, and on-premise systems. This position will directly contribute to the overall implementation of enterprise cloud architecture while working closely with staff to enhance and develop new designs and strategies across all types of cloud-based applications. The Site Reliability Engineer will collaborate with both Information Technology and Business Units to ensure open lines of communication and a clear understanding of objectives within each project. The successful candidate possesses excellent interpersonal and communication skills, required for collaborating with internal business units and resources as well as external partners and integrators.

RESPONSIBILITIES
Develop and monitor dashboards to detect problems related to applications, infrastructure, and potential security incidents on a daily basis
Run the production environment by monitoring availability and taking a holistic view of system health
Build software and systems to manage platform infrastructure and applications
Improve reliability, quality, and time-to-market of our suite of software solutions by creating sustainable systems and services through automation and uplifts
Provide primary operational support and engineering for multiple large, distributed software applications
Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
Ensure appropriate sizing of solutions, technology fit, and DR are assessed and accounted for

EDUCATION AND EXPERIENCE QUALIFICATIONS
4-year degree in IT or related field preferred; equivalent experience may be substituted in lieu of education
4-6 years of experience with architecting and/or engineering in cloud environments
4-6 years of experience with Azure and/or AWS cloud platforms
2-4 years of experience with CI/CD automation
4-6 years in an operations support role

REQUIRED KNOWLEDGE, SKILLS or ABILITIES
Hands-on experience with Microsoft Azure is required, specifically Azure Security Center, Azure monitoring, Azure Key Vault, Azure Kubernetes Service, Azure Dedicated HSM, Blob Storage, Azure Backup, Azure Functions, Virtual Machines, Service Fabric, and Container Instances
Hands-on experience with Python, Bash, and/or PowerShell with a focus on orchestration and automation of underlying services, systems, provisioning, and security hardening
Understanding of Windows and Linux operating systems at a detailed level, including processes, memory allocation, and networking, with an understanding of how applications function and impact other OS components and cloud services
Expert-level debugging/troubleshooting skills
Experience developing and/or maintaining production-grade cloud solutions in virtualized environments such as Pivotal Cloud Foundry and Kubernetes
Experience with creating and deploying AWS and ARM templates
Able to pick up and learn new AWS/Azure technologies and create internal training docs
Experience with database technologies (SQL, cluster technology and creation, Always-On, migration, log shipping)
Hands-on experience with log aggregation tools
Experience architecting solutions within Azure
Working knowledge of common and industry-standard cloud-native/cloud-friendly authentication mechanisms (OAuth, OpenID, etc.)
Experience with automation systems: e.g. Ansible, Jenkins, Chef, Git
Experience with monitoring solutions: e.g. Splunk, SolarWinds, Nagios
Experience with Atlassian tools: e.g. Jira, Confluence
Experience with APM tools: e.g. Dynatrace, AppDynamics, New Relic, Stackify, Raygun
Experience with cloud security
Experience with C#, Python, NodeJS, JSON, Java, etc.
Experience working with cloud security and governance tools, cloud access security brokers (CASBs), and server virtualization technologies
Experience with enterprise applications (architecture, development, support, and troubleshooting)
Experience with enterprise architecture and working as part of a cross-functional team to implement solutions
Experience handling production support incidents and liaising between Developer and Operations teams to perform deep-dive root cause analysis (RCA) and implement code fixes
Strong interpersonal and communication skills; ability to work in a team environment
Ability to work independently with minimal direction; self-starter/self-motivated
Technical writing experience
