# Observability
Monitoring your cluster is essential to ensuring the health, efficiency, and reliability of your storage infrastructure. The following technologies are used to monitor your cluster:
- Prometheus to collect data
- Grafana to visualize your data
Together, Prometheus and Grafana provide a robust solution for collecting metrics, visualizing data, and creating alerts.
## Monitoring Setup
OpenEBS provides a basic cloud-native monitoring stack, built with Prometheus and Grafana, as an add-on Helm chart. It ships with pre-configured dashboards for visualizing metrics from the various OpenEBS storage engines. Grafana uses Prometheus as its data source.
### Set up the Monitoring Helm Repository
Set up the monitoring Helm repository by using the following command:
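A minimal sketch of this step is below. The repository alias and URL are assumptions based on the OpenEBS monitoring add-on chart; confirm the exact URL against the OpenEBS release documentation for your version.

```shell
# Add the OpenEBS monitoring chart repository
# (URL is an assumption; verify against the OpenEBS docs for your release).
helm repo add openebs-monitoring https://openebs.github.io/monitoring/

# Refresh the local chart index so the newly added repository is searchable.
helm repo update
```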
### Install the Helm Chart
Install the Helm chart by using the following command:
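A hedged example of the install step follows. The release name (`openebs-monitoring`), chart name, and namespace (`openebs`) are illustrative assumptions; adjust them to match your repository setup and cluster conventions.

```shell
# Install the monitoring stack into its own namespace.
# Release name, chart path, and namespace below are assumptions.
helm install openebs-monitoring openebs-monitoring/openebs-monitoring \
  --namespace openebs --create-namespace
```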
With this installation, Prometheus and Grafana pods will be deployed.
## Accessing the Grafana Dashboard
- You can view the Grafana Pod by using the following command:
- You can access the Grafana dashboard using the NodeIP (Public IP) and NodePort (Grafana service port) of your Kubernetes cluster.
- Visit http://NodeIp:NodePort. For example, if your node IP address is `node-ip` and the `NodePort` assigned is `12345`, you would access Grafana at http://node-ip:12345.
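The steps above can be carried out with `kubectl`. The namespace below assumes the chart was installed into `openebs`; substitute the namespace you used.

```shell
# View the Grafana pod and confirm it is Running
# (namespace is an assumption; use the one from your install).
kubectl get pods -n openebs

# Find the NodePort assigned to the Grafana service.
kubectl get svc -n openebs

# Get a node IP (EXTERNAL-IP or INTERNAL-IP column) to pair with the NodePort.
kubectl get nodes -o wide
```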
The default Grafana login credentials are:
- username: admin
- password: admin
**Note:** If a public IP is not available, you can access Grafana via port-forwarding by using the following command and then visiting http://127.0.0.1:[grafana-forward-port].
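A sketch of the port-forward, assuming the Grafana service is named `openebs-monitoring-grafana` and listens on port 80 (check `kubectl get svc` for the actual name and port in your install); with this example the forward port is 3000:

```shell
# Forward local port 3000 to the Grafana service port.
# Service name, namespace, and ports are assumptions; adjust as needed.
kubectl port-forward svc/openebs-monitoring-grafana 3000:80 -n openebs
```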
## Pre-Configured Dashboards
After logging in to Grafana, you can view the following pre-configured dashboards for the various OpenEBS storage engines:
### Replicated PV Mayastor Dashboard

| Dashboard | Panel |
|---|---|
| DiskPool Information | Pool Status |
| | Total Pool Size |
| | Used Pool Size |
| | Available Pool Size |
| | DiskPool IOPS (Read/Write) |
| | DiskPool Throughput (Read/Write) |
| | DiskPool Latency (Read/Write) |
| Volume Replica Information | Volume Replica IOPS (Read/Write) |
| | Volume Replica Throughput (Read/Write) |
| | Volume Replica Latency (Read/Write) |
| Volume Information | Volume IOPS (Read/Write) |
| | Volume Throughput (Read/Write) |
| | Volume Latency (Read/Write) |
### Local PV LVM Dashboard

| Dashboard | Panel |
|---|---|
| Volume Group Stats | Volume Group Capacity |
| | Volume Group Metadata Capacity |
| | Volume Group Permission |
| | Volume Group Allocation Policy |
| | Volume Group Volumes Count |
| | Volume Group PV Count |
| | Volume Group Metadata Count |
| | Volume Group Snapshot Count |
| | Volume Group Volumes |
| Volume Group Performance Stats | Volume Group I/O Read |
| | Volume Group I/O Write |
| | Volume Group R/W Data |
| | Volume Group I/O Utilisation |
| Thin Pool Stats | Health Status |
| | Behaviour when Full |
| | Pool Capacity |
| | Pool Metadata Capacity |
| | Pool Snapshot Full % |
| | Pool Permission |
| | Thin Volumes |
| Thin Pool Performance Stats | Thin Pool I/O Read |
| | Thin Pool I/O Write |
| | Thin Pool R/W Data |
| | Thin Pool I/O Utilisation |
| Volume Group PV Stats | Volume Group PV |
### Local PV ZFS Dashboard

| Dashboard | Panel |
|---|---|
| Volume Capacity | Used Space |
| Pools | ZPOOL-Time |
| | ZPOOL-OPS |
| ARC | ARC Hit % |
| | ARC Hit/ARC Misses |
| | ARC Size |
| ARC L2 | ARC L2 Hit % |
| | ARC L2 Hit/ARC L2 Misses |
| | ARC L2 Size |