Cracking Observability: Mastering Logs, Metrics & Traces
Achieving Observability with Prometheus, Grafana, cAdvisor, Redis, Loki & Promtail
In today's cloud-native world, observability is no longer a nicety; it's a requirement. As DevOps engineers, we need to know what is happening inside our applications and infrastructure in real time. Observability rests on three foundational pillars:
✅ Logs – What happened?
✅ Metrics – How healthy is the system?
✅ Traces – Where did the request go?
Prerequisites
Before getting started, ensure you have the following setup:
Cloud Provider: AWS
Instance Type: Ubuntu t2.medium
Storage: 15GB
After connecting to the server over SSH, run the following commands to update packages, install Docker and Docker Compose, and set up permissions:
sudo apt-get update # Update package lists to ensure the latest versions are available
sudo apt-get install docker.io docker-compose-v2 -y # Install Docker and Docker Compose
sudo usermod -aG docker $USER && newgrp docker #Add the current user to the Docker group and reload group membership
docker compose version #Verify that the Compose v2 plugin is installed
mkdir observability #Create a directory for the observability stack
cd observability
ls
git clone <repo> #clone the django app repo
git checkout dev #switch to the dev branch (run this inside the cloned repository)
Build the Docker Image and Run the Docker Container
docker build -t notes-app .
docker run -d -p 8000:8000 notes-app
Open Port 8000 in AWS Security Group
Stopping the Docker Container and Switching to Docker Compose
Now that we have run the application using a single Docker container, let's stop it and move to a Docker Compose setup for better manageability.
docker stop <id>
version: "3.8"
services:
  notes-app:
    build:
      context: django-notes-app/.
    container_name: notes-app
    ports:
      - "8000:8000"
docker compose up
To track application health and container performance, we need to set up Prometheus to scrape metrics. It gives us insight into how the application and its containers are behaving.
Setting Up Prometheus for Metrics Scraping
1️⃣ Update docker-compose.yml to add Prometheus
2️⃣ Create a Prometheus config file (prometheus.yml)
vim docker-compose.yml
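The Prometheus service added in this step could look roughly like the sketch below. This is a minimal sketch, not the exact file from the repository: the image tag, the container name, and the location of prometheus.yml on the host are assumptions. The mount target is the default config path in the official prom/prometheus image.

```yaml
# Hypothetical fragment: merge under the existing `services:` key in docker-compose.yml
services:
  prometheus:
    image: prom/prometheus:latest   # assumed tag; pin a specific version in practice
    container_name: prometheus
    ports:
      - "9090:9090"                 # Prometheus web UI and HTTP API
    volumes:
      # Assumes prometheus.yml sits next to docker-compose.yml
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
```

Naming the service prometheus is what lets other containers on the same Compose network reach it at http://prometheus:9090.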
Specify the scrape targets:
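As a starting point, set the target to localhost:8000 (your application). If Prometheus runs in a container on the same Compose network, localhost refers to the Prometheus container itself, so use the service name instead (notes-app:8000).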
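A starting prometheus.yml along these lines would define that target. The job name and the 15-second interval are assumptions, and the sketch presumes the application actually serves Prometheus-format metrics at /metrics on port 8000, which the text does not confirm.

```yaml
# prometheus.yml: minimal sketch with illustrative values
global:
  scrape_interval: 15s              # assumed; how often each target is scraped

scrape_configs:
  - job_name: "notes-app"           # hypothetical job name
    metrics_path: /metrics          # the Prometheus default, shown for clarity
    static_configs:
      # localhost:8000 as in the text; from inside a Prometheus container,
      # the Compose service name (notes-app:8000) is what resolves to the app
      - targets: ["localhost:8000"]
```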
Then update the scrape targets to add cAdvisor for container monitoring.
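The extra scrape job for cAdvisor might look like this. It is a sketch that assumes a Compose service named cadvisor listening on port 8080; the job name is hypothetical.

```yaml
# Hypothetical addition: append this job to the existing `scrape_configs:` list in prometheus.yml
scrape_configs:
  - job_name: "cadvisor"
    static_configs:
      - targets: ["cadvisor:8080"]  # assumed service name and port
```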
After that, run docker compose up -d
Open port 9090 in the security group.
Now that Prometheus is running, open your browser and navigate to http://<your-server-ip>:9090
If you visit http://<your-server-ip>:9090/metrics, you will see the metrics Prometheus exposes about itself.
Now that Prometheus is in place, the next step is checking target health: verifying that Prometheus can actually scrape our application's metrics. Monitoring the app alone, however, is not enough; we need visibility into container performance as well.
cAdvisor is an open-source tool created by Google that provides real-time monitoring and resource usage analysis for running containers. It automatically collects, aggregates, and exposes CPU, memory, disk, and network usage metrics for every container, which makes it a valuable observability tool.
Let's take it a step further by adding Redis to our stack. Redis is typically used as a caching layer that improves application performance by reducing database load. Monitoring Redis helps maintain performance and availability and catch bottlenecks early.
Now, open the Compose file with vim docker-compose.yml
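The cAdvisor and Redis services could be sketched as below. The image tags and container names are assumptions, and the Redis exporter (the oliver006/redis_exporter image, which listens on port 9121) is included on the assumption that Redis metrics should reach Prometheus; the repository may wire this differently.

```yaml
# Hypothetical fragment: merge under the existing `services:` key in docker-compose.yml
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest   # assumed tag
    container_name: cadvisor
    ports:
      - "8080:8080"                          # cAdvisor web UI and /metrics
    volumes:                                 # read-only host mounts cAdvisor reads container stats from
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro

  redis:
    image: redis:latest                      # assumed tag
    container_name: redis
    ports:
      - "6379:6379"

  redis-exporter:
    image: oliver006/redis_exporter:latest   # assumed exporter image and tag
    container_name: redis-exporter
    environment:
      - REDIS_ADDR=redis://redis:6379        # points the exporter at the redis service above
    ports:
      - "9121:9121"                          # exporter metrics endpoint
    depends_on:
      - redis
```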
docker compose down
docker compose up -d
Open port 8080 for cAdvisor in the security group.
You can check whether it is running at http://<your-server-ip>:8080
To monitor system-level metrics such as CPU, memory, disk, and network usage, we need to add Node Exporter to our stack. Node Exporter collects these host metrics and exposes them in a format Prometheus can scrape.
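A Node Exporter service for this purpose might look like the following sketch. The image tag, container name, and host mount layout are assumptions; the flags tell Node Exporter to read the host's filesystems from the mounted paths instead of the container's own.

```yaml
# Hypothetical fragment: merge under the existing `services:` key in docker-compose.yml
services:
  node-exporter:
    image: prom/node-exporter:latest   # assumed tag
    container_name: node-exporter
    ports:
      - "9100:9100"                    # Node Exporter metrics endpoint
    volumes:                           # read-only views of the host for host-level metrics
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - "--path.procfs=/host/proc"
      - "--path.sysfs=/host/sys"
      - "--path.rootfs=/rootfs"
```

For Prometheus to collect these metrics, prometheus.yml would also need a scrape job whose target is node-exporter:9100 (assuming the service name above).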
Now open port 9100 in the security group for Node Exporter.
Now that we have Prometheus, Node Exporter, cAdvisor, and Redis Exporter running, let's add Grafana to visualize and analyze our metrics and logs in real time. Grafana offers interactive dashboards and alerting, which makes it a good fit for observability.
Modify your docker-compose.yml to include Grafana, then restart the stack:
docker compose down
docker compose up -d
Then open port 3000 in the security group.
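The Grafana service referred to above could be sketched as follows. The image tag, container name, and the named volume are assumptions; the volume is there so dashboards and data sources survive a docker compose down.

```yaml
# Hypothetical fragment: merge into docker-compose.yml
services:
  grafana:
    image: grafana/grafana:latest      # assumed tag
    container_name: grafana
    ports:
      - "3000:3000"                    # Grafana web UI
    volumes:
      - grafana-data:/var/lib/grafana  # persists Grafana's database and settings
    depends_on:
      - prometheus                     # assumes a service named `prometheus`

volumes:
  grafana-data:                        # hypothetical named volume
```

On a fresh install, Grafana's default login is admin / admin, and it asks for a new password on first sign-in.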
Adding Prometheus as a Data Source in Grafana
After you've logged into Grafana, proceed as follows to add Prometheus as a data source:
Step 1: Access Data Sources
Navigate to Configuration (⚙️) → Data Sources
Click Add Data Source
Step 2: Configure Prometheus
Choose Prometheus from the list of available data sources.
In the URL field, enter:
http://prometheus:9090
Leave Scrape Interval as default or modify it according to your requirements.
Click Save & Test.
If all is well, you should get a "Data source is working" message. ✅
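The same data source can also be defined in a file instead of through the UI. This is a minimal sketch of a Grafana provisioning file, assuming it is mounted into the container under /etc/grafana/provisioning/datasources/; the file name is hypothetical.

```yaml
# datasources.yml (hypothetical file name): Grafana data source provisioning
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                      # Grafana's backend makes the requests to Prometheus
    url: http://prometheus:9090        # same URL as entered in the UI steps
    isDefault: true
```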
Now that Prometheus is configured as a data source, let's import a Node Exporter dashboard to track system metrics such as CPU, memory, disk, and network usage.
Importing Node Exporter Dashboard in Grafana
1️⃣ Navigate to Dashboards → Manage → Import
2️⃣ Dashboard ID: (Node Exporter Full Dashboard)
3️⃣ Click Load → Select Prometheus → Import
Metrics Monitored:
✅ CPU Usage | ✅ Memory | ✅ Disk | ✅ Network Traffic
Loki is a log aggregation system similar to Prometheus but for logs, providing efficient and cost-effective log storage. Promtail acts as the log collector, forwarding logs from Docker containers to Loki for analysis.
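A Promtail configuration for this role might look like the sketch below. It assumes Docker's default json-file logging driver, a Loki service reachable as loki on port 3100, and that the host's /var/lib/docker/containers directory is mounted into the Promtail container; the job name and label are hypothetical.

```yaml
# promtail-config.yml (hypothetical file name): minimal sketch
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml        # where Promtail remembers how far it has read

clients:
  - url: http://loki:3100/loki/api/v1/push   # Loki push endpoint; assumes a service named `loki`

scrape_configs:
  - job_name: docker
    static_configs:
      - targets: [localhost]
        labels:
          job: docker                  # label attached to every log line from this job
          __path__: /var/lib/docker/containers/*/*-json.log   # Docker json-file log location
```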
Now, let's add Loki and Promtail to our Docker Compose setup for log aggregation and integrate them into Grafana.
Modify docker-compose.yml to include the following services:
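The two services could be sketched like this. Image tags, container names, and the promtail-config.yml file name are assumptions; the Loki command points at the sample config shipped inside the grafana/loki image.

```yaml
# Hypothetical fragment: merge under the existing `services:` key in docker-compose.yml
services:
  loki:
    image: grafana/loki:latest         # assumed tag
    container_name: loki
    ports:
      - "3100:3100"                    # Loki HTTP API
    command: -config.file=/etc/loki/local-config.yaml

  promtail:
    image: grafana/promtail:latest     # assumed tag
    container_name: promtail
    volumes:
      - /var/lib/docker/containers:/var/lib/docker/containers:ro   # Docker container log files
      - ./promtail-config.yml:/etc/promtail/config.yml:ro          # assumed config file next to docker-compose.yml
    command: -config.file=/etc/promtail/config.yml
    depends_on:
      - loki
```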
Then run docker compose up -d
Go to the security group and open port 3100 for Loki.
Loki is now reachable on port 3100; give it a few seconds after startup to become ready.
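Before logs can be queried, Grafana needs Loki registered as a data source, either through the same UI steps used for Prometheus or from a provisioning file. A minimal sketch of the file-based variant, assuming a Compose service named loki and a hypothetical file name:

```yaml
# loki-datasource.yml (hypothetical file name): Grafana data source provisioning
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100              # assumes Grafana and Loki share the Compose network
```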
Now import a Loki dashboard into Grafana.
Logs should appear in real time.
By integrating Prometheus, Loki, and Grafana, we've built an observability stack that monitors metrics and logs in real time. Traces, the third pillar, are not covered by this setup. The added visibility helps DevOps teams resolve issues proactively and optimize performance.
The full setup, including the Docker Compose files and the Prometheus, Loki, and Grafana configurations, can be found in my GitHub repository.
🔗 GitHub Repo: https://github.com/fauzeya67/observability
Try implementing this observability stack on your cloud instance! If you have questions or suggestions, let me know 🚀🚀🚀