Introduction

Welcome to the Site Reliability Framework documentation. This section introduces the principles, patterns, and practices we use in production excellence consulting.

What You'll Find Here

This framework documents our approach to:

  • Reliability engineering — designing systems that stay up
  • Incident management — responding effectively when things go wrong
  • Observability — knowing what your systems are doing

Getting Started

Browse the sidebar to explore topics, or press Ctrl+K (⌘K on macOS) to search.

Example Code