Scylla Monitoring Stack¶
This document describes how to set up the Scylla Monitoring Stack, which is based on the Scylla Prometheus API.
The monitoring stack needs to be installed on a dedicated server, external to the Scylla cluster. Make sure the monitoring server has access to the Scylla nodes so that it can pull the metrics over the Prometheus API.
For evaluation, you can run the Scylla Monitoring Stack on any server (or laptop) that can handle two Docker instances at the same time. For production, see the recommendations below.
Minimal Production System Recommendations¶
- CPU - at least 2 physical cores / 4 vCPUs
- Memory - 15GB+ DRAM
- Disk - 1TB+ of persistent disk storage
- Network - 1GbE/10GbE preferred
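The recommendations above can be turned into a quick preflight check. The following is a minimal Python sketch, not part of the monitoring stack: `shortfalls` and `read_host` are hypothetical helper names, gigabytes are assumed to be decimal (10^9 bytes), and the memory lookup relies on Linux-specific `os.sysconf` names. Network speed is not checked.

```python
import os
import shutil

GB = 10**9  # assumption: decimal gigabytes

# Thresholds copied from the recommendations list (4 vCPUs, 15 GB DRAM, 1 TB disk).
MIN_VCPUS = 4
MIN_MEMORY_BYTES = 15 * GB
MIN_DISK_BYTES = 1000 * GB


def shortfalls(vcpus, memory_bytes, disk_bytes):
    """Return a list of problems; an empty list means the host meets the minimums."""
    problems = []
    if vcpus < MIN_VCPUS:
        problems.append(f"CPU: {vcpus} vCPUs, need at least {MIN_VCPUS}")
    if memory_bytes < MIN_MEMORY_BYTES:
        problems.append(f"Memory: {memory_bytes / GB:.1f} GB, need at least 15 GB")
    if disk_bytes < MIN_DISK_BYTES:
        problems.append(f"Disk: {disk_bytes / GB:.0f} GB, need at least 1000 GB")
    return problems


def read_host(data_dir="/"):
    """Collect vCPU count, physical memory and disk size on a Linux host."""
    vcpus = os.cpu_count() or 0
    memory_bytes = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    disk_bytes = shutil.disk_usage(data_dir).total
    return vcpus, memory_bytes, disk_bytes
```

On the monitoring server, `shortfalls(*read_host("/path/to/data_dir"))` would list anything below the minimums, where the path is whichever directory will hold the Prometheus data.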
Scylla monitoring stack consists of three components, wrapped in Docker containers:
- prometheus - collects and stores metrics
- alertmanager - handles alerts
- grafana - dashboard server
- Download and extract the latest Scylla Monitoring Stack binary; for example, for release 2.0.0
wget https://github.com/scylladb/scylla-grafana-monitoring/archive/scylla-monitoring-2.0.tar.gz
tar -xvf scylla-monitoring-2.0.tar.gz
cd scylla-grafana-monitoring-scylla-monitoring-2.0
As an alternative, you can clone and use the git repository directly.
- Start docker service if needed
centos $ sudo service docker start
ubuntu $ sudo systemctl restart docker
- Update the Scylla target file and prometheus/node_exporter_servers.yml with the target IPs (the servers you wish to monitor).
For every server there are two targets. The first goes under the Scylla job and is used for the Scylla metrics.
Use port 9180.
The dc label in the target files must match the datacenter names used by Scylla.
Use the nodetool status command to validate the datacenter names used by Scylla.
- targets:
  - 172.17.0.2:9180
  - 172.17.0.3:9180
  labels:
    cluster: cluster1
    dc: dc1
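One way to check that the dc labels agree with Scylla is to compare them against the "Datacenter:" lines that nodetool status prints. The sketch below assumes you have captured that output as text; the function names are hypothetical, the sample output is shortened and made up for illustration, and the comparison is exact and case-sensitive.

```python
import re


def datacenters_from_nodetool(status_output):
    """Extract datacenter names from the 'Datacenter: <name>' lines of `nodetool status`."""
    return set(re.findall(r"^Datacenter:\s*(\S+)", status_output, flags=re.MULTILINE))


def unknown_dc_labels(dc_labels, status_output):
    """Return the dc labels that match no datacenter reported by Scylla."""
    return sorted(set(dc_labels) - datacenters_from_nodetool(status_output))


# Illustrative, shortened sample of `nodetool status` output (not real data).
sample = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load    Tokens  Owns  Host ID  Rack
UN  172.17.0.2  1.2 MB  256     ?     aaaa     rack1
UN  172.17.0.3  1.1 MB  256     ?     bbbb     rack1
"""

# "DC1" differs from "dc1" only in case, so it is reported as unknown.
print(unknown_dc_labels(["dc1", "DC1"], sample))  # prints ['DC1']
```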
From Scylla Monitoring version 2.2, you no longer need to configure
node_exporter_server or the port numbers.
Instead, Prometheus uses the same targets it uses for Scylla and assumes that a node_exporter
is running on each Scylla server.
The second target is used for general node information (disk, network, etc.). Add the server to
prometheus/node_exporter_servers.yml. Use port 9100.
- targets:
  - 172.17.0.2:9100
  - 172.17.0.3:9100
  labels:
    cluster: cluster1
    dc: dc1
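Because Prometheus pulls the metrics, the monitoring server must be able to open a TCP connection to each target. A minimal sketch of such a check, to be run from the monitoring server, follows. `unreachable` is a hypothetical helper; it only shows that the port accepts connections, not that metrics are actually served, and it does not handle bracketed IPv6 addresses.

```python
import socket


def unreachable(targets, timeout=2.0):
    """Return the 'host:port' targets that do not accept a TCP connection."""
    failed = []
    for target in targets:
        host, _, port = target.rpartition(":")
        try:
            # Open and immediately close a connection to the target.
            with socket.create_connection((host, int(port)), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts and unresolvable hosts.
            failed.append(target)
    return failed
```

For example, `unreachable(["172.17.0.2:9180", "172.17.0.2:9100"])` would return the subset of those two targets that could not be reached, or an empty list if both ports are open.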
You can use your own target files instead of updating
node_exporter_servers.yml. Pass the
-s flag for the Scylla target file and the
-n flag for the node target file, for example:
./start-all.sh -s my_scylla_server.yml -n my_node_exporter_servers.yml -d data_dir
In many deployments the contents of those files are very similar: the same servers are listed, differing only in the ports that Scylla and node_exporter listen on. To generate the target files automatically, use the
genconfig.py script, with the
-n and -s flags controlling which files get created:
./genconfig.py -ns -d myconf 192.168.0.1 192.168.0.2
After that, the monitoring stack can be started with the generated target files:
./start-all.sh -s myconf/scylla_server.yml -n myconf/node_exporter_servers.yml
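To make the relationship between the two files concrete, here is a rough Python sketch of generating both from one list of IPs. This is not the genconfig.py implementation: `render_targets` and `write_target_files` are hypothetical helpers, the cluster1/dc1 defaults are borrowed from the examples in this document, and the output file names mirror the ones passed to start-all.sh above.

```python
from pathlib import Path

SCYLLA_PORT = 9180         # Scylla metrics port used in this document
NODE_EXPORTER_PORT = 9100  # node_exporter port used in this document


def render_targets(ips, port, cluster, dc):
    """Render one target group (a YAML list entry) for the given IPs and port."""
    lines = ["- targets:"]
    lines += [f"  - {ip}:{port}" for ip in ips]
    lines += ["  labels:", f"    cluster: {cluster}", f"    dc: {dc}"]
    return "\n".join(lines) + "\n"


def write_target_files(out_dir, ips, cluster="cluster1", dc="dc1"):
    """Write both target files into out_dir and return their paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    scylla = out / "scylla_server.yml"
    node = out / "node_exporter_servers.yml"
    # Same servers and labels in both files; only the port differs.
    scylla.write_text(render_targets(ips, SCYLLA_PORT, cluster, dc))
    node.write_text(render_targets(ips, NODE_EXPORTER_PORT, cluster, dc))
    return scylla, node


print(render_targets(["192.168.0.1", "192.168.0.2"], SCYLLA_PORT, "cluster1", "dc1"), end="")
```

Calling `write_target_files("myconf", ["192.168.0.1", "192.168.0.2"])` would produce two files that differ only in the port number.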
Use Labels to mark different Data Centers
As can be seen in the examples, each target has its own set of labels to mark the cluster name and the data center (dc). You can add multiple targets in the same file for multiple clusters or multiple data centers.
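As an illustration of a file that covers more than one data center, the sketch below groups nodes by their (cluster, dc) pair and renders one target group per pair. The helper names are hypothetical, and the 172.17.1.2 node and the dc2 data center are invented for the example.

```python
from collections import defaultdict


def group_targets(nodes, port=9180):
    """Group (ip, cluster, dc) tuples into one target group per (cluster, dc) pair."""
    groups = defaultdict(list)
    for ip, cluster, dc in nodes:
        groups[(cluster, dc)].append(f"{ip}:{port}")
    return [
        {"targets": targets, "labels": {"cluster": cluster, "dc": dc}}
        for (cluster, dc), targets in sorted(groups.items())
    ]


def to_yaml(groups):
    """Render the groups as a YAML list, one entry per target group."""
    lines = []
    for group in groups:
        lines.append("- targets:")
        lines += [f"  - {target}" for target in group["targets"]]
        lines.append("  labels:")
        lines += [f"    {key}: {value}" for key, value in group["labels"].items()]
    return "\n".join(lines) + "\n"


nodes = [
    ("172.17.0.2", "cluster1", "dc1"),
    ("172.17.0.3", "cluster1", "dc1"),
    ("172.17.1.2", "cluster1", "dc2"),  # hypothetical node in a second data center
]
print(to_yaml(group_targets(nodes)), end="")
```

The output contains two list entries, one labeled dc1 with two targets and one labeled dc2 with a single target.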
4. Connect to Scylla Manager by updating its target file.
If you are using Scylla Manager, set its IP address there.
# List Scylla Manager end points
- targets:
  - 127.0.0.1:56090
Note that you do not need to add labels to the Scylla Manager targets.
Start and Stop¶
./start-all.sh -d data_dir
Setting Specific Version¶
By default, start-all.sh will start with dashboards for the latest two Scylla versions and the latest Scylla Manager version.
You can select specific Scylla versions with the
-v flag and a Scylla Manager version with the -M flag. For example:
./start-all.sh -v 3.0,master -M 1.3
will load the dashboards for Scylla versions
3.0 and master and the dashboard for Scylla Manager 1.3.
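If you launch the stack from automation, the flags used in this document can be assembled programmatically. The sketch below is only an illustration: `start_all_command` is a hypothetical helper that covers just the -s, -n, -d, -v and -M flags shown here, and it builds the argument list without running anything.

```python
import shlex


def start_all_command(data_dir=None, scylla_targets=None, node_targets=None,
                      scylla_versions=None, manager_version=None):
    """Build the start-all.sh argument list from the flags used in this document."""
    argv = ["./start-all.sh"]
    if scylla_targets:
        argv += ["-s", scylla_targets]
    if node_targets:
        argv += ["-n", node_targets]
    if data_dir:
        argv += ["-d", data_dir]
    if scylla_versions:
        # Multiple Scylla versions are passed as one comma-separated value.
        argv += ["-v", ",".join(scylla_versions)]
    if manager_version:
        argv += ["-M", manager_version]
    return argv


argv = start_all_command(scylla_versions=["3.0", "master"], manager_version="1.3")
print(shlex.join(argv))  # prints ./start-all.sh -v 3.0,master -M 1.3
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from the directory that contains start-all.sh.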
View Grafana Monitor¶
Point your browser to
By default, Grafana authentication is disabled. To enable it and set a password for user admin use the
- Choose the disk and network interface
The dashboard has drop-down menus in its upper-left corner for the disk and the network interface. Choose the relevant disk and interface so that the dashboard shows the graphs.