This document describes how to troubleshoot monitoring problems when using the Grafana/Prometheus monitoring tool.
No data points on any of the data charts.
- Prometheus may be pointing to the wrong target. Check your prometheus/node_exporter_servers.yml. Make sure in both cases Prometheus is pulling data from the Scylla server.
- Your dashboard and Scylla version may not be aligned. If you are running Scylla 2.0.x, you need to start the monitoring server with:
./start-all.sh -v 2.0.1
More on start-all.sh options.
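To review the first point above, it can help to print the targets Prometheus is configured to scrape and compare them with the real Scylla server addresses. The following is a minimal sketch, assuming targets are written as host:port entries (as in Prometheus file-based service discovery). The function name list_targets is made up for this example and is not part of the monitoring stack.

```shell
# Hypothetical helper: print every unique host:port entry found in a
# Prometheus target file. Assumes entries look like 172.17.0.2:9100.
list_targets() {
    grep -oE '[A-Za-z0-9_.-]+:[0-9]+' "$1" | sort -u
}

# Example (run from the monitoring directory):
#   list_targets prometheus/node_exporter_servers.yml
```

If an address in the output does not belong to a Scylla server, or a server is missing, correct the file and restart the monitoring stack.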
Run this procedure on the Monitoring server.
All Grafana charts show an error (!) sign. There is a problem with the connection between Grafana and Prometheus. On the monitoring server:
1. Check that Prometheus is running using sudo docker ps. If it is not running, check prometheus.yml for errors.
CONTAINER ID   IMAGE     COMMAND                  CREATED          STATUS          PORTS                                                    NAMES
41bd3db26240   monitor   "/docker-entrypoin..."   25 seconds ago   Up 23 seconds   7000-7001/tcp, 9042/tcp, 9160/tcp, 9180/tcp, 10000/tcp   monitor
2. If it is running, go to “Data Source” in the Grafana GUI, choose Prometheus, and click Test Connection.
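The container check in step 1 can also be scripted. The sketch below reads container names and statuses, as printed by docker ps with a --format template, and succeeds only if the named container is up. The helper name container_is_up is hypothetical, and the container name monitor is taken from the sample output above; it may differ in your setup.

```shell
# Hypothetical helper: succeed if the container named $1 has an "Up" status.
# Expects lines of "<name> <status...>" on stdin, for example from:
#   sudo docker ps --format '{{.Names}} {{.Status}}'
container_is_up() {
    awk -v name="$1" '$1 == name && $2 == "Up" { found = 1 } END { exit !found }'
}

# Example:
#   sudo docker ps --format '{{.Names}} {{.Status}}' | container_is_up monitor \
#       && echo "running" || echo "not running: check prometheus.yml"
```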
Grafana shows server level metrics like disk usage, but not Scylla metrics. Prometheus fails to fetch metrics from Scylla servers.
- Use curl <scylla_node>:9180/metrics to fetch binary metric data from Scylla. If curl does not return data, the problem is the connectivity between the monitoring server and the Scylla server. Please check your IPs and firewalls.
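To tell "no data" apart from "some data" at a glance, the curl output can be piped through a small counter. This is a sketch only: it assumes the endpoint answers in the Prometheus text format, where lines starting with # are comments, and the name count_samples is invented for this example.

```shell
# Hypothetical helper: count metric sample lines read from stdin.
# Comment lines (starting with '#') and blank lines are not counted.
count_samples() {
    grep -Evc '^(#|[[:space:]]*$)'
}

# Example (replace <scylla_node> with a Scylla server address):
#   curl -s --max-time 5 "<scylla_node>:9180/metrics" | count_samples
```

A result of 0 means curl returned nothing usable, which points to the connectivity problem described above.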
Grafana dashboard shows Scylla metrics, such as load, but not server metrics like disk usage.
Prometheus fails to fetch metrics from node_exporter.
1. Make sure node_exporter is running on each Scylla server.
node_exporter is installed by
If it is not running, make sure to install and run it.
2. If it is running, use curl <scylla_node>:9100/metrics (where <scylla_node> is a Scylla server IP) to fetch binary metric data from Scylla. If curl does not return data, the problem is the connectivity between the monitoring server and the Scylla server. Please check your IPs and firewalls.
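When curl returns no data, its exit code often hints at whether the cause is a wrong address, a stopped service, or a firewall. The sketch below maps a few standard curl exit codes (6, 7 and 28) to a likely cause. The function name explain_curl_exit and the suggested causes are illustrative assumptions, not a definitive diagnosis.

```shell
# Hypothetical helper: translate a curl exit code into a likely cause.
explain_curl_exit() {
    case "$1" in
        0)  echo "connected" ;;
        6)  echo "could not resolve host: check the hostname or IP" ;;
        7)  echo "failed to connect: check that node_exporter is running" ;;
        28) echo "timed out: a firewall may be dropping packets" ;;
        *)  echo "curl failed with exit code $1" ;;
    esac
}

# Example (replace <scylla_node> with a Scylla server address):
#   curl -s -o /dev/null --max-time 5 "<scylla_node>:9100/metrics"
#   explain_curl_exit $?
```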
No metrics shown in Scylla monitor.
1. Install Wireshark.
2. Capture the traffic between the Scylla monitor and the Scylla node using the following command:
tshark -i <network interface name> -f "dst port 9180"
tshark -i eth0 -f "dst port 9180"
Capture from Scylla node towards Scylla monitor server.
Scylla is running:
Monitor ip       Scylla node ip
18.104.22.168 -> 172.16.12.142 TCP 66 59212 > 9180 [ACK] Seq=317 Ack=78193 Win=158080 Len=0 TSval=79869679 TSecr=3347447210
Scylla is not running:
Monitor ip       Scylla node ip
22.214.171.124 -> 172.16.12.142 TCP 74 60440 > 9180 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=79988291 TSecr=0 WS=128
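The two captures differ in their TCP flags: with Scylla running, the monitor sends [ACK] packets on an established connection, while with Scylla down only the opening [SYN] appears. As a rough sketch, the helper below applies that rule to tshark output read from stdin. The name classify_capture and its three labels are made up for this example, and it assumes the capture filter shown above, so only packets sent to port 9180 are present.

```shell
# Hypothetical helper: summarize tshark output for "dst port 9180".
# Prints "established" if any packet carries an ACK flag,
# "syn-only" if only connection attempts were seen,
# and "no-traffic" if nothing matched.
classify_capture() {
    awk '
        /\[SYN\]/           { syn++ }
        /\[[A-Z, ]*ACK\]/   { est++ }
        END {
            if (est > 0)      print "established"
            else if (syn > 0) print "syn-only"
            else              print "no-traffic"
        }'
}

# Example (stop the capture after 20 packets):
#   tshark -i eth0 -f "dst port 9180" -c 20 | classify_capture
```

A result of syn-only suggests that the monitor is reaching the node but nothing is answering on port 9180; no-traffic suggests the requests never arrive.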