JobRialto
Senior Dev Operations Engineer
JobRialto, Pleasanton, California, United States, 94566
Job Description:
Responsibilities
Monitoring sites, environments, and software by implementing tools and automation to achieve 99.9% uptime.
Measurement, optimization, and tuning of system performance and ensuring that systems will run reliably and are highly available in a 24/7 production environment.
Automate system and application monitoring using monitoring and automation tools
Anticipating potential problems before they occur and coming up with solutions.
Conducting post-incident reviews and Root Cause Analysis.
Documenting your work to turn findings into repeatable actions.
Coding automation within a site infrastructure.
Implement production monitoring systems.
Utilize strong analytical and problem-solving skills.
Security assessments and addressing vulnerabilities.
Design and deploy AWS solutions using AWS services (e.g., EC2, S3, Glacier, ELB, RDS, IAM, Route 53, VPC, Auto Scaling, CloudWatch, CloudTrail, CloudFormation, Security Groups, API Gateway, SSM, route tables, endpoint services).
Provision, management, and day-to-day operations of AWS environments
Implement alarms / alerts / notifications using AWS services (e.g., CloudWatch)
Implement AWS Multi-AZ accounts for HA and DR
Design AWS infrastructure that minimizes operational costs through push-button deployment at scale with near-zero downtime.
Develop and maintain configuration management solutions.
Provide technical guidance, knowledge transfer, and mentorship to State Fund internal engineering peers as required, and lead technical staff responsibilities.
Server Maintenance based on updates, system requirements, data usage, and antivirus requirements.
Responsible for the design, implementation, and support of large scale web farm infrastructure across multiple data centers supporting the Infrastructure as a Service (IaaS) offering.
Help engineering implement new technologies in development for future production deployment.
Working with the team to analyze and design infrastructure, which includes virtualization, clustering, database, disaster recovery, and geographic redundancy.
Triage and provide technical solutions to environment-related issues encountered by new and existing applications
Support developers with change requests, uptime, and performance-related issues.
Documentation of work regarding bug reports, systems analysis, application monitoring, and common task reporting
Author internal documentation, such as environment diagrams, installation/configuration documents and release notes.
Assist in establishing and implementing configuration management program and policies.
Troubleshoot and debug environment and infrastructure problems found in the production and non-production environments.
Collaborating with software developers, engineers, and operations teams.
Provide 24/7 production support
Top 3-5 Must Haves
Experience setting up alerts / alarms / notifications in AWS cloud (CloudWatch / Dynatrace).
Experience with AWS solutions using AWS services including Kafka, ECS, EKS.
Experience with IaC (Infrastructure as Code): CDK or Terraform.
Technical Knowledge and Skills:
6+ years of overall IT experience
4+ years of AWS Cloud management experience with the skill set below
AWS Certified DevOps Engineer and/or Solutions Architect certification
Experience in AWS provisioning, operations, and management of AWS environments.
Experience setting up alerts / alarms / notifications in AWS cloud (CloudWatch / Dynatrace).
Experience with AWS solutions using AWS services including Kafka, ECS, EKS.
Experience with IaC (Infrastructure as Code): CDK or Terraform.
Experience setting up / maintaining multi-AZ infrastructure including HA and DR in AWS.
Experience with code repositories: Azure DevOps Server, Git, GitLab, SVN
Experience with continuous integration tools: Jenkins, Azure Pipelines
Excellent knowledge of Linux systems
Experience with system automation and configuration management tools including Ansible
Experience with Python scripting
Strong background in networking, load balancing, and firewalls
High-level understanding of networking standard protocols and components such as: HTTP, DNS, TCP/IP, ICMP, the OSI Model, Subnetting and Load Balancing
Thorough understanding of and experience with managing web applications in a highly available environment
Experience in software development is a plus
Familiarity with deploying and configuring Java and .NET applications.
Experience with Application Security Testing tools (Coverity, Tenable, Black Duck, etc.) is a plus
Understanding of SQL, PL/SQL, and T-SQL commands
Preferred Skills:
Passion for AWS cloud architecture, provisioning, monitoring, and maintenance management
Passion for improving software development processes and a desire to automate any repetitive work you do. Familiarity with configuration management.
Enthusiasm for working closely with developers to understand ops requirements
Experience with large project rollouts at an enterprise level
Detailed knowledge of Windows and Linux Operating Systems
Good knowledge of SCM (software configuration management)
Working knowledge of web services, web application development, Oracle Database Server, and multi-tier application systems
Good knowledge of software configuration, source control, and build engineering, scripting and system administration is required
Database management expertise: MS SQL Server, Oracle
Strong troubleshooting and problem-solving skills, including application and network-level troubleshooting ability
Technical writing skills
Knowledge/experience with troubleshooting, installing, and configuring SSL certificates
Understanding of TCP/IP, UDP, IP routing, SSH/SFTP/SCP, DNS, FTP, SMTP
Education:
Bachelor's degree