
Observability Playground
Interactive observability lab demonstrating metrics, logs, and tracing.
The Problem
Many systems lack proper monitoring and tracing, making it hard to diagnose production issues.
The Solution
Observability Playground demonstrates metrics, logs, and traces through simulated failure scenarios.
Implementation Details
You can't fix what you can't see. This project is a curated environment where "chaos" is invited so that we can learn how to observe it.
The Three Pillars
I integrated the full LGTM stack (Loki for logs, Grafana for dashboards, Tempo for traces, and Mimir/Prometheus for metrics) to show how logs, metrics, and traces correlate. When a simulated memory leak occurs, you can watch the resident memory metric spike, find the matching error logs, and follow the trace of the exact request that triggered the leak.
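One common way to make that correlation possible is to stamp every log line with the ID of the trace it belongs to, so a log entry can be matched to its trace. The snippet below is a minimal, standard-library-only sketch of that idea, not the playground's actual code. The logger name `playground.demo`, the `handle_request` function, and the `/leak` path are hypothetical names chosen for illustration.

```python
import io
import json
import logging
import secrets


def new_trace_id() -> str:
    # 16 random bytes rendered as 32 hex characters, the same width as a
    # W3C Trace Context trace-id. A real tracer would supply this value.
    return secrets.token_hex(16)


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object that carries the trace ID."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Set through the `extra` argument at the call site.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    # "playground.demo" is a hypothetical logger name for this sketch.
    logger = logging.getLogger("playground.demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def handle_request(logger: logging.Logger, path: str) -> str:
    # Hypothetical handler: one trace ID per request, attached to every log
    # line so the log entry can later be joined to the matching trace.
    trace_id = new_trace_id()
    logger.error(
        "allocation failed while serving %s", path, extra={"trace_id": trace_id}
    )
    return trace_id


if __name__ == "__main__":
    buffer = io.StringIO()
    trace_id = handle_request(build_logger(buffer), "/leak")
    print(buffer.getvalue().strip())
    print("look up this trace:", trace_id)
```

With logs shaped like this, a log backend can index or parse the `trace_id` field and a dashboard can link from an error line to the corresponding trace. How the playground wires that link up is not shown here.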
Simulated Failures
The playground includes a "Chaos Dashboard" where you can trigger network latency, 5xx error storms, and CPU saturation events. It gives SREs a safe place to practice spotting and diagnosing these failures before they meet them in production.
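To give a feel for what triggering these failures might involve, here is a small sketch of a fault injector covering the three scenarios above: added latency, 5xx responses, and CPU saturation. It is an illustration under stated assumptions, not the dashboard's real implementation. `ChaosConfig`, `burn_cpu`, and `handle` are hypothetical names, and the 503 status code is just one possible choice of 5xx error.

```python
import random
import time
from dataclasses import dataclass


@dataclass
class ChaosConfig:
    """Hypothetical knobs a chaos dashboard could expose."""

    latency_seconds: float = 0.0   # extra delay added to each request
    error_rate: float = 0.0        # probability of a 5xx, from 0.0 to 1.0
    cpu_burn_seconds: float = 0.0  # busy-loop time spent on each request


def burn_cpu(seconds: float) -> None:
    # Spin instead of sleeping so the process actually consumes CPU time.
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


def handle(config: ChaosConfig, rng: random.Random) -> int:
    """Simulate one request and return its HTTP status code."""
    if config.latency_seconds > 0:
        time.sleep(config.latency_seconds)
    if config.cpu_burn_seconds > 0:
        burn_cpu(config.cpu_burn_seconds)
    if rng.random() < config.error_rate:
        return 503
    return 200


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so a run can be repeated
    storm = ChaosConfig(latency_seconds=0.01, error_rate=0.5)
    statuses = [handle(storm, rng) for _ in range(20)]
    print("5xx responses:", statuses.count(503), "of", len(statuses))
```

Keeping the faults behind one config object means a dashboard toggle only has to change a few numbers, and each fault shows up in a different signal: latency in traces, errors in logs, and CPU burn in metrics.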