Sunrise Group Inc.
Site Reliability Engineer (SRE) - Observability Specialist
Sunrise Group Inc., Las Vegas, NV, United States
We are seeking a skilled Site Reliability Engineer (SRE) with expertise in Observability to design, implement, and maintain monitoring, logging, and tracing solutions. This role focuses on improving system reliability, scalability, and performance through effective observability practices while collaborating with development, operations, and business teams.
Role: Site Reliability Engineer (SRE) - Observability Specialist
Experience: 6-9 Years
Location: Las Vegas, NV
Duration: 6+ Month Contract
Key Responsibilities:
Observability Solutions: Design and integrate tools for monitoring, logging, and tracing (e.g., Prometheus, Grafana, Elasticsearch, Datadog).
Monitoring & Alerting: Define KPIs, SLOs, and SLIs; implement actionable alerts to ensure reliability.
System Reliability: Analyze observability metrics to identify risks and collaborate on mitigations.
Collaboration: Partner with teams to embed observability into the software lifecycle and advocate best practices.
Automation: Automate routine observability tasks such as dashboard creation and log parsing.
Documentation: Maintain documentation for observability tools and processes, ensuring visibility for stakeholders.
Qualifications:
Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
Proven experience with observability platforms (Prometheus, Grafana, Splunk, OpenTelemetry).
Proficiency in programming/scripting languages (Python, Go, Bash).
Strong knowledge of distributed systems, cloud platforms (AWS, Azure, GCP), and containerization (Kubernetes, Docker).
Familiarity with KPIs, SLOs, and SLIs for monitoring and reporting.
Preferred:
Certifications in observability tools or cloud platforms, and experience with Infrastructure as Code (e.g., Ansible, Terraform).
Additional Requirements:
Must obtain and maintain a valid Nevada Gaming License.