Observability
Monitoring your cluster is essential to ensure the health, efficiency, and reliability of your storage infrastructure. The following technologies are implemented to monitor your cluster:
- Prometheus to collect data
- Grafana to visualize your data
Together, Prometheus and Grafana provide a robust solution for collecting metrics, visualizing data, and creating alerts.
# Monitoring Setup
OpenEBS provides a basic cloud-native monitoring stack, built on Prometheus and Grafana, as an add-on Helm chart. The chart includes pre-configured dashboards that visualize metrics from the various OpenEBS storage engines. Grafana uses Prometheus as its data source.
# Set Up the Monitoring Helm Repository
Add the monitoring Helm repository to your Helm client.
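A minimal sketch of this step follows. The repository alias `monitoring` and the URL `https://openebs.github.io/monitoring/` are assumptions that the text above does not state, so verify them against the OpenEBS monitoring project before use. The helper name `add_monitoring_repo` is hypothetical.

```shell
# Assumed values: the repository alias and URL are not given in the text above.
MONITORING_REPO_NAME="monitoring"
MONITORING_REPO_URL="https://openebs.github.io/monitoring/"

# Register the chart repository with the local Helm client,
# then refresh the local index so the newest chart versions are visible.
add_monitoring_repo() {
  helm repo add "$MONITORING_REPO_NAME" "$MONITORING_REPO_URL"
  helm repo update
}
```

Calling `add_monitoring_repo` in a shell with `helm` on the `PATH` runs both commands in order.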
# Install the Helm Chart
Install the monitoring Helm chart in your cluster. This installation deploys the Prometheus and Grafana pods.
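The installation step could look like the sketch below. The release name `monitoring`, the chart reference `monitoring/monitoring`, and the namespace `openebs` are assumptions chosen for illustration, and both helper names are hypothetical. Adjust them to match your environment.

```shell
# Assumed values: release name, chart reference, and namespace are illustrative.
MONITORING_RELEASE="monitoring"
MONITORING_CHART="monitoring/monitoring"
MONITORING_NAMESPACE="openebs"

# Install the chart, creating the namespace if it does not exist yet.
install_monitoring_chart() {
  helm install "$MONITORING_RELEASE" "$MONITORING_CHART" \
    --namespace "$MONITORING_NAMESPACE" --create-namespace
}

# List the pods in the namespace to check that Prometheus and Grafana were deployed.
list_monitoring_pods() {
  kubectl get pods --namespace "$MONITORING_NAMESPACE"
}
```

Run `install_monitoring_chart` first, then `list_monitoring_pods` to check the state of the deployed pods.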
# Accessing Grafana Dashboard
- View the Grafana pod in the namespace where the monitoring chart is installed.
- You can access the Grafana dashboard using the NodeIP (public IP) and NodePort (Grafana service port) of your Kubernetes cluster.
- Visit http://NodeIP:NodePort. For example, if your node IP address is node-ip and the assigned NodePort is 12345, you would access Grafana at http://node-ip:12345.
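The steps above can be sketched as follows. The namespace `openebs` is an assumption, and the helper names are hypothetical. The `grep -i grafana` filter only assumes that the Grafana pod and service names contain the word "grafana".

```shell
# Assumed namespace; use the one the monitoring chart was installed into.
MONITORING_NAMESPACE="openebs"

# Show the Grafana pod(s) in the monitoring namespace.
find_grafana_pod() {
  kubectl get pods --namespace "$MONITORING_NAMESPACE" | grep -i grafana
}

# Show the Grafana service(s); the PORT(S) column includes the assigned NodePort.
find_grafana_service() {
  kubectl get svc --namespace "$MONITORING_NAMESPACE" | grep -i grafana
}

# Build the dashboard URL from a node IP (first argument) and a NodePort (second argument).
grafana_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}
```

For example, `grafana_url node-ip 12345` prints `http://node-ip:12345`, matching the example in the list above.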
The default Grafana login credentials are:
- username: admin
- password: admin
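As a quick check that Grafana is reachable with these credentials and that Prometheus is configured as a data source, you could query Grafana's HTTP API. This is a sketch only: the helper name `list_grafana_datasources` is hypothetical, and it assumes the default credentials have not been changed.

```shell
# Default credentials from the list above; change these if you have updated them.
GRAFANA_USER="admin"
GRAFANA_PASSWORD="admin"

# Query Grafana's data source endpoint using basic authentication.
# Argument: the Grafana base URL, for example http://node-ip:12345.
list_grafana_datasources() {
  curl --silent --user "${GRAFANA_USER}:${GRAFANA_PASSWORD}" "$1/api/datasources"
}
```

Calling `list_grafana_datasources http://node-ip:12345` returns a JSON list of the configured data sources, which should include a Prometheus entry if the stack is set up as described.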
Note: If a public IP is not available, you can access Grafana by forwarding a local port to it and then visiting http://127.0.0.1:[grafana-forward-port].
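A minimal port-forwarding sketch follows. The helper name `forward_grafana` is hypothetical, and all four arguments are placeholders that you supply: the namespace, the Grafana pod name, the local port, and the port Grafana listens on inside the cluster.

```shell
# Forward a local port to the Grafana pod.
# Arguments: namespace, Grafana pod name, local port, Grafana port in the cluster.
forward_grafana() {
  kubectl port-forward --namespace "$1" "pods/$2" "$3:$4"
}
```

For example, `forward_grafana openebs [grafana-pod-name] 3000 3000` would make Grafana available at http://127.0.0.1:3000 while the command is running. Port 3000 is Grafana's default HTTP port, but the port used by this chart may differ.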
# Pre-Configured Dashboards
After you log in to Grafana, the following pre-configured dashboards for the OpenEBS storage engines are available in the default directory:
# Replicated PV Mayastor Dashboard
| Dashboard | Panel |
|---|---|
| DiskPool Information | Pool Status |
| | Total Pool Size |
| | Used Pool Size |
| | Available Pool Size |
| | DiskPool IOPS (Read/Write) |
| | DiskPool Throughput (Read/Write) |
| | DiskPool Latency (Read/Write) |
| Volume Replica Information | Volume Replica IOPS (Read/Write) |
| | Volume Replica Throughput (Read/Write) |
| | Volume Replica Latency (Read/Write) |
| Volume Information | Volume IOPS (Read/Write) |
| | Volume Throughput (Read/Write) |
| | Volume Latency (Read/Write) |
# Local PV LVM Dashboard
| Dashboard | Panel |
|---|---|
| Volume Group Stats | Volume Group Capacity |
| | Volume Group Metadata Capacity |
| | Volume Group Permission |
| | Volume Group Allocation Policy |
| | Volume Group Volumes Count |
| | Volume Group PV Count |
| | Volume Group Metadata Count |
| | Volume Group Snapshot Count |
| | Volume Group Volumes |
| Volume Group Performance Stats | Volume Group I/O Read |
| | Volume Group I/O Write |
| | Volume Group R/W Data |
| | Volume Group I/O Utilisation |
| Thin Pool Stats | Health Status |
| | Behaviour when Full |
| | Pool Capacity |
| | Pool Metadata Capacity |
| | Pool Snapshot Full % |
| | Pool Permission |
| | Thin Volumes |
| Thin Pool Performance Stats | Thin Pool I/O Read |
| | Thin Pool I/O Write |
| | Thin Pool R/W Data |
| | Thin Pool I/O Utilisation |
| Volume Group PV Stats | Volume Group PV |
# Local PV ZFS Dashboard
| Dashboard | Panel |
|---|---|
| Volume Capacity | Used Space |
| Pools | ZPOOL-Time |
| | ZPOOL-OPS |
| ARC | ARC Hit % |
| | ARC Hit/ARC Misses |
| | ARC Size |
| ARC L2 | ARC L2 Hit % |
| | ARC L2 Hit/ARC L2 Misses |
| | ARC L2 Size |