The Cypress Group
Linux Engineer – SRE – Trading
The Cypress Group, New York, New York, US, 10261
Dynamic Linux Site Reliability Engineer (SRE)

Are you passionate about creating robust, high-performing systems that keep everything running smoothly? We’re on the hunt for an innovative Linux Site Reliability Engineer (SRE) who can bring their expertise to the table and help elevate our Linux-based infrastructure to new heights! If you thrive on solving complex challenges and have a knack for automation and optimization, this role is tailor-made for you!

What You’ll Be Doing:

System Maestro:
Take charge of managing, configuring, and maintaining our Linux servers and related infrastructure. You’ll ensure our systems are always at peak performance and ready to handle anything.

Monitoring Maven:
Develop and implement advanced monitoring solutions to identify and address performance issues before they become problems. You’ll be the first line of defense, swiftly responding to incidents to maintain seamless operations.

Automation Guru:
Lead the charge in automating operational tasks. Craft scripts and utilize configuration management tools (like Ansible, Puppet, and Chef) to streamline our processes, reduce manual effort, and boost efficiency.

Capacity Planner:
Keep a close eye on system performance and resource usage. You’ll plan for and execute scaling activities to ensure we’re always ahead of the demand curve.

Collaboration Champion:
Work hand-in-hand with our talented software engineers and key stakeholders. You’ll provide insights into system requirements, offer valuable feedback on design, and support the deployment of new features.

Security Sentinel:
Safeguard our infrastructure by enforcing security best practices and policies. You’ll be vigilant in ensuring our systems are secure and compliant.

Documentation Dynamo:
Create and maintain detailed documentation for system configurations, processes, and procedures, ensuring that we have a solid knowledge base to rely on.

What We’re Looking For:

Experience:
7-10 years of experience in Linux administration or Site Reliability Engineering. Experience in a supervisory or management role for 2-3 years is a big plus.

Technical Skills:
Mastery of Linux operating systems (like RHEL, CentOS), a strong grasp of networking, scripting skills in languages like Bash and Python, and hands-on experience with containerization technologies such as Docker and Kubernetes.

Tools Expertise:
Skilled in using monitoring tools (like Prometheus, Grafana), configuration management tools (like Ansible, Puppet, Chef), and version control systems (like Git).

Problem-Solving Prowess:
You’re a natural at diagnosing and resolving complex technical issues with your analytical and troubleshooting skills.

Communication Pro:
Excellent verbal and written communication skills. You know how to collaborate effectively and thrive in a fast-paced team environment.

Educational Background:
A bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

Bonus Points If You Have:

Cloud Savvy:
Experience working with cloud platforms such as AWS, Azure, or Google Cloud.

DevOps Knowledge:
Familiarity with CI/CD pipelines and DevOps practices.

Simplification Skills:
A proven track record of simplifying tools and processes in complex environments, leading to faster and more reliable changes.

High-Availability Expertise:
Experience managing large-scale systems and ensuring high availability.

Ready to make an impact? Join us and be part of a team that’s at the forefront of technology, driving innovation and excellence in every step.