Rollbar, Inc.

Platform Engineer

Rollbar, Inc., Dallas, Texas, United States, 75215


Costco IT is responsible for the technical future of Costco Wholesale, the third largest retailer in the world with wholesale operations in fourteen countries. Despite our size and explosive international expansion, we continue to provide a family, employee-centric atmosphere in which our employees thrive and succeed. As proof, Costco ranks eighth in Forbes “World’s Best Employers”.

This is an environment unlike anything in the high-tech world and the secret of Costco’s success is its culture. The value Costco puts on its employees is well documented in articles from a variety of publishers including Bloomberg and Forbes. Our employees and our members come FIRST. Costco is well known for its generosity and community service and has won many awards for its philanthropy. The company joins with its employees to take an active role in volunteering by sponsoring many opportunities to help others.

Come join the Costco Wholesale IT family. Costco IT is a dynamic, fast-paced environment, working through exciting transformation efforts. We are building the next generation retail environment where you will be surrounded by dedicated and highly professional employees.

Position Summary

This position is responsible for the design, support, installation, and configuration of Costco Enterprise Container Platforms (ECP) using Google GKE, Azure AKS, and OpenShift in both Cloud and On-Prem Environments.

Job Duties/Essential Functions

Platform Management:

Design, deploy, and manage Kubernetes clusters on GKE, AKS, and OpenShift.

Automation:

Develop and implement automation scripts for deployments, scaling, and management of containerized applications using tools such as Terraform, Ansible, or similar.

CI/CD Pipeline Integration:

Collaborate with the CI/CD team to integrate containerized applications into automated build and deployment pipelines.

Monitoring and Logging:

Set up and maintain monitoring and logging solutions to ensure health and performance using tools like Prometheus, Grafana, and the ELK stack.

Security:

Implement and manage security best practices for containerized environments, including RBAC and vulnerability scanning.

Perform container Day 2 activities: cluster and add-on upgrades and administration.
Enable self-service solutions for customers to onboard to the container platform.
Design and develop standardized solutions across the OpenShift, AKS, and GKE platforms, such as container registry, security, logging, and monitoring.
Design and implement an Active-Active and HA/DR strategy for critical business applications.
Collaborate with delivery pods and Architects to improve production deployments, platform operations, and environment stability.
Enable cost management and build a chargeback reporting model for customers.
Seeks opportunities to learn, automate, document, share, educate, and improve processes where appropriate.
Follows corporate security standards and best practices, including the promotion of awareness on current threats.
Understands and adheres to Costco’s project methodology and framework, Costco’s Mission Statement and Code of Conduct.
Regular and reliable workplace attendance at your assigned location.

Required Skills:

Kubernetes Expertise: Deep understanding and hands-on experience with Kubernetes, including deployments, scaling, and management of clusters.
Cloud Platform: Experience with Google Cloud Platform (GCP) and Azure, especially with GKE and AKS.
Infrastructure as Code: Proficiency with infrastructure as code tools such as Terraform and Ansible.
Scripting and Automation: Strong scripting skills in languages such as Python, Bash, or Go for automation and tooling.
CI/CD Tools: Familiarity with CI/CD tools like GitLab CI, Jenkins, or similar.
Problem Solving: Strong analytical and problem-solving skills.

Experience, Skills, Education & Licenses/Certifications

Required:

2+ years of experience operating enterprise container orchestration platforms in production.
2+ years of Kubernetes administration experience on AKS (preferred), Red Hat OpenShift (preferred), EKS, GKE, Rancher, or similar.
2+ years of experience managing infrastructure on Azure (preferred), AWS, or GCP.
2+ years of experience with tools for CI/CD like Azure Pipelines (preferred), GitLab, Jenkins, or similar.
2+ years of experience supporting products running on AIX or Linux servers.
2+ years of experience with enterprise monitoring and logging solutions like Dynatrace, Splunk, Prometheus, ELK, or similar.
2+ years of experience working with tools for IaC and automation like Terraform, Ansible, Puppet, or similar.
Competency in at least one high-level programming language such as Golang (preferred), Python, Java, JavaScript/Node.js, or similar.
Practical understanding of Agile concepts, SDLC, and test-driven development.
Excellent verbal and written communication skills.
Willing to learn third-party applications and tools.
Willing to split time between development and operations tasks.
Fundamental networking knowledge.
Ability to participate in a 24x7 on-call rotation, including evenings, weekends, and/or holidays, as necessary.

Recommended:

Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), or Cloud certification (GCP, Azure).
Unix Administration Certification.
SSL/SSO experience.
DevOps Mindset.

Pay Ranges:

Level 1 - $85,000 - $110,000
Level 2 - $105,000 - $135,000
Level 3 - $130,000 - $160,000
Senior - $150,000 - $190,000, Bonus and Restricted Stock Unit (RSU) eligible.
