
Install Scylla Monitoring Stack¶

Note

You are not reading the most recent version of this documentation. Go to the latest version of Scylla Monitoring Stack Documentation.

This document describes the setup of Scylla Monitoring Stack, based on Scylla Prometheus API.

The Scylla Monitoring Stack needs to be installed on a dedicated server, external to the Scylla cluster. Make sure the Scylla Monitoring Stack server has access to the Scylla nodes so that it can pull the metrics over the Prometheus API.

For evaluation, you can run the Scylla Monitoring Stack on any server (or laptop) that can handle three Docker instances at the same time. For production, see the recommendations below.

Minimal Production System Recommendations¶

  • CPU - at least 2 physical cores / 4 vCPUs

  • Memory - 15GB+ DRAM

  • Disk - persistent disk storage is proportional to the number of cores and Prometheus retention period (see the following section)

  • Network - 1GbE/10GbE preferred

Calculating Prometheus Minimal Disk Space requirement¶

Prometheus storage disk performance requirement: a persistent block volume, for example an EC2 EBS volume.

Prometheus storage disk volume requirement: proportional to the number of metrics it holds. The default retention period is 15 days, and the disk requirement is around 200MB per core, assuming the default scraping interval of 15s.

For example, when monitoring a 6-node Scylla cluster, each node with 16 CPU cores, and using the default 15-day retention time, you will need a minimum disk space of:

6 * 16 * 200MB = 19,200MB ~ 20GB

To account for unexpected events, like replacing or adding nodes, we recommend allocating at least 4 to 5 times that space, in this case ~100GB. The Prometheus storage disk does not have to be as fast as the Scylla disk. EC2 EBS, for example, is fast enough and provides HA out of the box.
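
The sizing rule above can be sketched as a small shell function. This is an illustration only: the function name prometheus_disk_mb and its optional headroom argument are hypothetical, and the 200MB-per-core figure is the estimate quoted above for the default retention period and scraping interval.

```shell
# Estimate Prometheus disk space in MB (hypothetical helper, not part of the stack).
# Usage: prometheus_disk_mb NODES CORES_PER_NODE [HEADROOM_FACTOR]
prometheus_disk_mb() {
    nodes=$1
    cores_per_node=$2
    headroom=${3:-1}     # 1 = bare minimum; 4 or 5 = recommended headroom
    mb_per_core=200      # estimate for 15 days retention and a 15s scrape interval
    echo $(( nodes * cores_per_node * mb_per_core * headroom ))
}

prometheus_disk_mb 6 16      # prints 19200 (about 20GB)
prometheus_disk_mb 6 16 5    # prints 96000 (about 100GB)
```

If you change the retention period or the scraping interval, the per-core figure no longer applies as is and should be re-estimated.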

Prerequisites¶

  • Follow the Installation Guide and install Docker on the Scylla Monitoring Stack server. This server can be the same server that runs Scylla Manager. Alternatively, you can Deploy Scylla Monitoring Without Docker.

Docker Post Installation¶

See the Docker post-installation guide for the full procedure.

Note

Do not run the container as root.

To avoid running Docker as root, add the user that will run Scylla Monitoring Stack to the docker group.

  1. Create the docker group.

sudo groupadd docker

  2. Add your user to the docker group.

sudo usermod -aG docker $USER

  3. Enable the Docker service so that it starts at boot:

sudo systemctl enable docker
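
To confirm that the group change took effect, you can check the groups of the current session. The sketch below is an assumption-laden illustration: has_group is a hypothetical helper, and it only inspects the output of the standard id command. Group membership is read at login, so a session opened before running usermod will not show the new group until you log out and back in.

```shell
# Succeed if GROUP appears in the space-separated GROUP_LIST (hypothetical helper).
# Usage: has_group "GROUP_LIST" GROUP
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# id -nG prints the group names of the current process.
if has_group "$(id -nG)" docker; then
    echo "current session is in the docker group"
else
    echo "current session is not in the docker group (log out and back in after usermod)"
fi
```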

Install Scylla Monitoring Stack¶

Procedure

  1. Download and extract the Scylla Monitoring Stack release:

wget https://github.com/scylladb/scylla-monitoring/archive/branch-3.4.tar.gz
tar -xvf branch-3.4.tar.gz
cd scylla-monitoring-branch-3.4

As an alternative, you can clone and use the Git repository directly.

git clone https://github.com/scylladb/scylla-monitoring.git
cd scylla-monitoring
git checkout branch-3.4
  2. Start the Docker service if needed:

sudo systemctl restart docker

Configure Scylla Monitoring Stack¶

To monitor the cluster, Scylla Monitoring Stack (specifically, the Prometheus server) needs to know the IPs of all the nodes and the IP of the Scylla Manager server (if you are using Scylla Manager).

This configuration can be done from files or by using the Consul API.

Scylla Manager 2.0 and higher supports the Consul API.

Configure Scylla nodes from files¶

  1. Create prometheus/scylla_servers.yml with the targets’ IPs (the servers you wish to monitor).

Note

It is important that the name listed in dc in the labels matches the datacenter names used by Scylla. Use the nodetool status command to validate the datacenter names used by Scylla.

For example:

- targets:
      - 172.17.0.2
      - 172.17.0.3
  labels:
      cluster: cluster1
      dc: dc1

Note

If you want to add your managed cluster to Scylla Monitoring Stack, add the IPs of the nodes as well as the cluster name you used when you added the cluster to Scylla Manager. It is important that the label cluster name and the cluster name in the Scylla Manager match.
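
One way to compare the dc labels in the targets file with the datacenter names Scylla reports is sketched below. The helper names list_datacenters and list_dc_labels are hypothetical, and the sketch assumes that nodetool status prints one "Datacenter: <name>" line per datacenter and that each dc label sits on its own line, as in the example above.

```shell
# Print the datacenter names found in `nodetool status` output read from stdin.
list_datacenters() {
    awk '/^Datacenter:/ { print $2 }'
}

# Print the distinct dc label values found in a targets file.
# Usage: list_dc_labels prometheus/scylla_servers.yml
list_dc_labels() {
    awk '$1 == "dc:" { print $2 }' "$1" | sort -u
}

# Typical use, run on a Scylla node and on the monitoring server respectively:
#   nodetool status | list_datacenters
#   list_dc_labels prometheus/scylla_servers.yml
```

The two lists should contain the same names. Any name that appears in only one of them points to a mismatched label.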

Using IPv6

To use IPv6 inside scylla_servers.yml, add the IPv6 addresses in square brackets, followed by the port numbers.

For example:

- targets:
      - "[2600:1f18:26b1:3a00:fac8:118e:9199:67b9]:9180"
      - "[2600:1f18:26b1:3a00:fac8:118e:9199:67ba]:9180"
  labels:
      cluster: cluster1
      dc: dc1

Note

For IPv6 to work, both the Scylla Prometheus address and node_exporter’s --web.listen-address should be set to listen on an IPv6 address.
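
The bracket-and-port convention for IPv6 targets can be applied mechanically. The following is a minimal sketch: format_target is a hypothetical helper, the default port 9180 is taken from the example above, and IPv4 addresses are passed through unchanged, as in the earlier IPv4 example.

```shell
# Format an address for the targets list (hypothetical helper).
# Usage: format_target ADDRESS [PORT]
format_target() {
    addr=$1
    port=${2:-9180}    # port used in the IPv6 example above
    case "$addr" in
        *:*) printf '"[%s]:%s"\n' "$addr" "$port" ;;   # contains ':' so treat as IPv6
        *)   printf '%s\n' "$addr" ;;                  # IPv4 is left as is
    esac
}

format_target 2600:1f18:26b1:3a00:fac8:118e:9199:67b9
format_target 172.17.0.2
```

The quotes around the IPv6 form are printed deliberately, because a YAML value that starts with a square bracket would otherwise be parsed as a list.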

For general node information (disk, network, etc.) Scylla Monitoring Stack uses the node_exporter agent that runs on the same machine as Scylla does. By default, Prometheus will assume you have a node_exporter running on each machine. If this is not the case, you can override the node_exporter targets configuration file by creating an additional file and passing it with the -n flag.

Note

By default, there is no need to create node_exporter_server.yml. Prometheus will use the same targets it uses for Scylla and it will assume you have a node_exporter running on each Scylla server.

If needed, you can set your own target file instead of the default prometheus/scylla_servers.yml, using the -s flag for Scylla target files.

For example:

./start-all.sh -s my_scylla_server.yml -d data_dir

Mark the different Data Centers with Labels.

As can be seen in the examples, each target has its own set of labels to mark the cluster name and the data center (dc). You can add multiple targets in the same file for multiple clusters or multiple data centers.

You can use the genconfig.py script to generate the server file. For example:

./genconfig.py -d myconf -dc dc1:192.168.0.1,192.168.0.2 -dc dc2:192.168.0.3,192.168.0.4

This will generate a server file for four servers in two datacenters: 192.168.0.1 and 192.168.0.2 in dc1, and 192.168.0.3 and 192.168.0.4 in dc2.

Alternatively, the genconfig.py script can generate the server file from the output of nodetool status, using the -NS flag:

nodetool status | ./genconfig.py -NS

  2. Connect to Scylla Manager by creating prometheus/scylla_manager_servers.yml. If you are using Scylla Manager, set its IP in this file.

You must add a scylla_manager_servers.yml file even if you are not using Scylla Manager. See prometheus/scylla_manager_servers.example.yml for an example.

For example:

# List Scylla Manager end points

- targets:
  - 172.17.0.7:56090

Note that you do not need to add labels to the Scylla Manager targets.
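
Because the file must exist even when Scylla Manager is not in use, one option is to create it from the shipped example when it is missing. The sketch below is only an illustration: ensure_manager_targets is a hypothetical helper, and it assumes you run it from the root of the scylla-monitoring directory, where the prometheus directory and its example file are located.

```shell
# Create scylla_manager_servers.yml from the example file if it does not exist yet.
# Usage: ensure_manager_targets [PROMETHEUS_DIR]   (defaults to ./prometheus)
ensure_manager_targets() {
    dir=${1:-prometheus}
    if [ ! -f "$dir/scylla_manager_servers.yml" ]; then
        cp "$dir/scylla_manager_servers.example.yml" "$dir/scylla_manager_servers.yml"
    fi
}

# Typical use from the scylla-monitoring directory:
#   ensure_manager_targets prometheus
```

An existing scylla_manager_servers.yml is never overwritten, so the helper is safe to run again after you have edited the file.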

Configure Scylla nodes using Scylla-Manager Consul API¶

Scylla Manager 2.0 has a Consul-like API.

When using Scylla Manager as the configuration source, there is no need to create any of the files. Instead, set the Scylla Manager IP from the command line using the -L flag.

For example:

./start-all.sh -L 10.10.0.1

Note

If you are running Scylla Manager on the same host as Scylla Monitoring Stack, use the -l flag so that the localhost address is available from within the container.

Start and Stop Scylla Monitoring Stack¶

Start¶

./start-all.sh -d data_dir

Stop¶

./kill-all.sh

Start a Specific Scylla Monitoring Stack Version¶

By default, start-all.sh will start with dashboards for the latest two Scylla versions and the latest Scylla Manager version.

You can specify the Scylla version with the -v flag and the Scylla Manager version with the -M flag.

For example:

./start-all.sh -v 3.1,master -M 2.0 -d /prometheus-data

This command loads the dashboards for Scylla versions 3.1 and master, and the dashboard for Scylla Manager 2.0.

Accessing the localhost¶

The Prometheus server runs inside a Docker container. If it needs to reach a target on the localhost, either Scylla or Scylla Manager, it must use the host network and not the Docker network. To do that, run ./start-all.sh with the -l flag. For example:

./start-all.sh -l -d /prometheus-data

View Grafana Dashboards¶

Point your browser to your-server-ip:3000. By default, Grafana authentication is disabled. To enable it and set a password for the admin user, use the -a option.
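
Before opening the browser, you can check from the command line that Grafana is answering. This is a hedged sketch: grafana_health_url is a hypothetical helper, port 3000 is the port given above, and it assumes the Grafana version shipped with the stack exposes the /api/health endpoint.

```shell
# Build the URL of the Grafana health endpoint (hypothetical helper).
# Usage: grafana_health_url HOST [PORT]
grafana_health_url() {
    printf 'http://%s:%s/api/health\n' "$1" "${2:-3000}"
}

# Typical use, replacing your-server-ip with the monitoring server address:
#   curl -fsS "$(grafana_health_url your-server-ip)"
```

A non-zero exit status from curl usually means that the container is not running or that the port is not reachable from where you run the check.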

© 2022, ScyllaDB. All rights reserved.
Last updated on 10 May 2022.