A curated list of useful resources for SLIs/SLOs:
- The Site Reliability Workbook
- Implementing Service Level Objectives: A Practical Guide to SLIs, SLOs, and Error Budgets
- Alerting on SLOs like Pros
- Prometheus: Apdex alerting
- Application Performance Index – Apdex Technical Specification
- How to choose Apdex T: the final word
- The SLO Development Lifecycle
- SLOs eased
- Origin of Service Level Objectives
- SLO vs. SLA - Explained
- SLO formulas implementation in PromQL step by step
- SLO
- Multi tier SLO
- SLO: Elastic vs Datadog vs Grafana
- SLIs, SLOs, SLAs, oh my! (class SRE implements DevOps)
- Risk and Error Budgets (class SRE implements DevOps)
- LISA18 - SLO Burn—Reducing Alert Fatigue and Maintenance Cost in Systems of Any Size
- Alerting on error budget burn rate
- How to Include Latency in SLO-Based Alerting
- SLOconf: Should SLOs be request-based or time-based? why neither really works
- SLOconf 2022: Fred Moyer - Don't chase percentiles use histograms if you want precision SLO latency
- SLOconf: GitLab's journey to SLO Monitoring - by Andrew Newdigate
- SLOconf 2022: Andrew Newdigate & Bob Van Landuyt - Everyone can contribute to our SLO
- New Relic APM Tutorial: Understanding Apdex