This cStor user guide covers advanced topics such as expanding a cStor volume, taking snapshots and clones of a cStor volume, scaling up cStor pools, block device tagging, tuning cStor pools and tuning cStor volumes:
- Scaling up cStor pools
- Snapshot and Clone of a cStor volume
- Expanding a cStor volume
- Block Device Tagging
- Tuning cStor Pools
- Tuning cStor Volumes
Once the cStor storage pools are created, you can scale up your existing cStor pools. To scale up the pool size, you need to edit the CSPC YAML that was used for creation of the CStorPoolCluster.
Scaling up can be done by two methods: adding a new node spec to the CSPC (which creates a new pool), or adding block devices to an existing pool spec (which expands an existing pool).
Note: The dataRaidGroupType: can be set to either stripe or mirror as per your requirement. In the following example it is configured as stripe.
A new node spec needs to be added to the previously deployed YAML.
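As an illustration only, a scaled-up CSPC might append a pool spec like the following; the node name and block device name below are placeholders, not values from this guide:

```yaml
# Hypothetical fourth pool spec appended under spec.pools of the existing CSPC.
# Node and block device names are placeholders for your environment.
spec:
  pools:
    # ... existing pool specs for the first three nodes ...
    - nodeSelector:
        kubernetes.io/hostname: worker-node-4
      dataRaidGroups:
        - blockDevices:
            - blockDeviceName: blockdevice-aaaabbbbccccddddeeeeffff00001111
      poolConfig:
        dataRaidGroupType: stripe
```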
Now verify the status of CSPC and CSPI(s):
As a result of this, we can see that a new pool has been added, increasing the number of pools to 4.
New blockDevices entries need to be added to the previously deployed YAML. Execute the following command to edit the CSPC:
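As a sketch, the edit appends a new blockDeviceName entry under the existing dataRaidGroups; the second device name below is a placeholder:

```yaml
# Within the CSPC (opened e.g. via `kubectl edit cspc <cspc-name> -n openebs`;
# the CSPC name is a placeholder), append a block device to the existing raid group.
- nodeSelector:
    kubernetes.io/hostname: worker-node-1
  dataRaidGroups:
    - blockDevices:
        - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f36
        - blockDeviceName: blockdevice-00112233445566778899aabbccddeeff  # newly added, placeholder
  poolConfig:
    dataRaidGroupType: stripe
```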
An OpenEBS snapshot is a set of reference markers for data at a particular point in time. A snapshot acts as a detailed table of contents, with accessible copies of data that the user can roll back to at the required point in time. Snapshots in OpenEBS are instantaneous and are managed through kubectl.
During the installation of OpenEBS, a snapshot-controller and a snapshot-provisioner are set up, which assist in taking the snapshots. During snapshot creation, the snapshot-controller creates VolumeSnapshot and VolumeSnapshotData custom resources. The snapshot-provisioner is used to restore a snapshot as a new Persistent Volume (PV) via dynamic provisioning.
Before proceeding to create a cStor volume snapshot and use it further for restoration, it is necessary to create a VolumeSnapshotClass. Copy the following YAML specification into a file called snapshot_class.yaml:

```yaml
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
metadata:
  name: csi-cstor-snapshotclass
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: cstor.csi.openebs.io
deletionPolicy: Delete
```
The deletion policy can be set as Delete or Retain. When it is set to Retain, the underlying physical snapshot on the storage cluster is retained even when the VolumeSnapshot object is deleted. To apply, execute:

```
kubectl apply -f snapshot_class.yaml
```
Note: In clusters that only install the v1beta1 version of VolumeSnapshotClass as the supported version (e.g. OpenShift (OCP) 4.5), the following error might be encountered:

```
no matches for kind "VolumeSnapshotClass" in version "snapshot.storage.k8s.io/v1"
```

In such cases, the apiVersion needs to be updated to snapshot.storage.k8s.io/v1beta1.
For creating the snapshot, you need to create a YAML specification and provide the required PVC name in it. The only prerequisite check to be performed is to ensure that there are no stale snapshot and snapshot data entries before creating a new snapshot. Copy the following YAML specification into a file called snapshot.yaml:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: cstor-pvc-snap
spec:
  volumeSnapshotClassName: csi-cstor-snapshotclass
  source:
    persistentVolumeClaimName: cstor-pvc
```
Run the following command to create the snapshot:

```
kubectl create -f snapshot.yaml
```
To list the snapshots, execute:

```
kubectl get volumesnapshots -n default
```
Sample Output:

```
NAME             AGE
cstor-pvc-snap   10s
```
A VolumeSnapshot is analogous to a PVC and is associated with a VolumeSnapshotContent object that represents the actual snapshot. To identify the VolumeSnapshotContent object for the VolumeSnapshot, execute:

```
kubectl describe volumesnapshots cstor-pvc-snap -n default
```
Sample Output:

```
Name:         cstor-pvc-snap
Namespace:    default
...
Spec:
  Snapshot Class Name:    cstor-csi-snapshotclass
  Snapshot Content Name:  snapcontent-e8d8a0ca-9826-11e9-9807-525400f3f660
  Source:
    API Group:
    Kind:   PersistentVolumeClaim
    Name:   cstor-pvc
Status:
  Creation Time:  2020-06-20T15:27:29Z
  Ready To Use:   true
  Restore Size:   5Gi
```
The Snapshot Content Name identifies the VolumeSnapshotContent object which serves this snapshot. The Ready To Use parameter indicates that the snapshot has been created successfully and can be used to create a new PVC.
Note: All cStor snapshots should be created in the same namespace as the source PVC.
Once the snapshot is created, you can use it to create a PVC. In order to restore a specific snapshot, you need to create a new PVC that refers to the snapshot. Below is an example of a YAML file that restores and creates a PVC from a snapshot.
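A minimal sketch of such a restore PVC, assuming the snapshot cstor-pvc-snap created above and a cStor CSI StorageClass named cstor-csi-disk (the StorageClass name is an assumption):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-cstor-pvc
  namespace: default
spec:
  storageClassName: cstor-csi-disk   # assumed cStor CSI StorageClass name
  dataSource:
    name: cstor-pvc-snap             # the VolumeSnapshot created earlier
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi                   # must be >= the snapshot's restore size
```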
The dataSource field shows that the PVC must be created using a VolumeSnapshot named cstor-pvc-snap as the source of its data. This instructs the cStor CSI driver to create a PVC from the snapshot. Once the PVC is created, it can be attached to a pod and used just like any other PVC.
To verify the creation of PVC execute:
OpenEBS cStor introduces support for expanding a PersistentVolume using the CSI provisioner. Provided cStor is configured to function as a CSI provisioner, you can expand PVs that have been created by cStor CSI Driver. This feature is supported with Kubernetes versions 1.16 and above.
For expanding a cStor PV, you must ensure the following items are taken care of:
- The StorageClass must support volume expansion. This can be done by editing the StorageClass definition to set allowVolumeExpansion: true.
- To resize a PV, edit the PVC definition and update the spec.resources.requests.storage to reflect the newly desired size, which must be greater than the original size.
- The PV must be attached to a pod for it to be resized. There are two scenarios when resizing a cStor PV:
- If the PV is attached to a pod, cStor CSI driver expands the volume on the storage backend, re-scans the device and resizes the filesystem.
- When attempting to resize an unattached PV, cStor CSI driver expands the volume on the storage backend. Once the PVC is bound to a pod, the driver re-scans the device and resizes the filesystem. Kubernetes then updates the PVC size after the expansion operation has successfully completed.
The example below shows how to expand a cStor volume and how it works. For an already existing StorageClass, you can edit the StorageClass to include the allowVolumeExpansion: true parameter.
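A sketch of such a StorageClass, assuming a typical cStor CSI setup; the CSPC name and replica count are placeholders:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cstor-csi-disk
provisioner: cstor.csi.openebs.io
allowVolumeExpansion: true            # enables PVC expansion for this class
parameters:
  cas-type: cstor
  cstorPoolCluster: cstor-disk-pool   # placeholder CSPC name
  replicaCount: "3"
```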
For example, suppose an application busybox pod is using the below PVC, which is associated with a PV. To get the status of the pod, execute:
The following is a Sample Output:
To list PVCs, execute:
To list PVs, execute:
To resize the PV that has been created from 5Gi to 10Gi, edit the PVC definition and update spec.resources.requests.storage to 10Gi. It may take a few seconds to update the actual size in the PVC resource; wait for the updated capacity to reflect in the PVC status (pvc.status.capacity.storage). Internally, it is a two-step process for volumes containing a file system:
- Volume expansion
- FileSystem expansion
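The edit itself only changes the requested size; a sketch of the relevant PVC fragment (PVC name taken from the example above):

```yaml
# kubectl edit pvc cstor-pvc — update only the requested storage
spec:
  resources:
    requests:
      storage: 10Gi   # previously 5Gi
```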
Now, we can validate that the resize has worked correctly by checking the size of the PVC and PV, or by describing the PVC to get all events.
NDM provides you with an ability to reserve block devices to be used for specific applications via adding tag(s) to your block device(s). This feature can be used by cStor operators to specify the block devices which should be consumed by cStor pools and conversely restrict anyone else from using those block devices. This helps in protecting against manual errors in specifying the block devices in the CSPC yaml by users.
- Consider the following block devices in a Kubernetes cluster; they will be used to provision a storage pool. List the labels added to these block devices:
- Now, to understand how block device tagging works, we will add the label openebs.io/block-device-tag=fast to the block device attached to worker-node-3 (i.e. blockdevice-00439dc464b785256242113bf0ef64b9):
Now, provision cStor pools using the following CSPC YAML. Note that openebs.io/allowed-bd-tags: is set to cstor, ssd, which ensures the CSPC will be created using block devices that either have the label set to cstor or ssd, or have no such label.
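A sketch of the relevant part of such a CSPC; the pool specs are elided and the metadata values are placeholders:

```yaml
apiVersion: cstor.openebs.io/v1
kind: CStorPoolCluster
metadata:
  name: cspc-disk-pool
  namespace: openebs
  annotations:
    # only untagged BDs, or BDs tagged cstor or ssd, may be used
    openebs.io/allowed-bd-tags: cstor,ssd
spec:
  pools:
    # ... node selectors and dataRaidGroups as in the earlier examples ...
```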
Apply the above CSPC file so that the CSPIs get created, and check the CSPI status.
Note that CSPI for node worker-node-3 is not created because:
- The CSPC YAML created above has openebs.io/allowed-bd-tags: cstor,ssd in its annotations. This means that the CSPC operator will only consider those block devices for provisioning that either do not have the BD tag, openebs.io/block-device-tag, on the block device, or have the tag with the value cstor or ssd.
- In this case, blockdevice-022674b5f97f06195fe962a7a61fcb64 (on node worker-node-1) and blockdevice-241fb162b8d0eafc640ed89588a832df (on node worker-node-2) do not have the label. Hence, no restrictions are applied on them, and they can be used by the CSPC operator for pool provisioning.
- For blockdevice-00439dc464b785256242113bf0ef64b9 (on node worker-node-3), the label openebs.io/block-device-tag has the value fast, but the CSPC annotation openebs.io/allowed-bd-tags has the values cstor and ssd. Since fast is not present among the annotation values, this block device cannot be used.
- To allow multiple tag values, the bd tag annotation can be written in the following comma-separated manner:
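For instance, a sketch with illustrative tag values:

```yaml
metadata:
  annotations:
    # multiple allowed tag values, comma-separated
    openebs.io/allowed-bd-tags: fast,ssd,nvme
```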
- A BD tag can only have one value on the block device CR, for example, openebs.io/block-device-tag: fast. Block devices should not be tagged in a comma-separated format. One reason for this is that the cStor allowed-bd-tags annotation takes comma-separated values, so a value like fast,ssd can never be interpreted as a single word by cStor; hence BDs tagged in that format cannot be utilised by cStor.
- If any block device mentioned in the CSPC has an empty value for openebs.io/block-device-tag, that block device will not be considered for pool provisioning and other operations. Block devices with an empty tag value are implicitly disallowed by the CSPC operator.
cStor allows users to apply the available performance tunings to cStor pools based on their workload. cStor pool(s) can be tuned via the CSPC, which is the recommended way to do it. Below are the tunings that can be applied:
Resource requests and limits: This ensures high quality of service when applied for the pool manager containers.
Toleration for pool manager pod: This ensures scheduling of pool pods on the tainted nodes.
Set priority class: Sets the priority levels as required.
Compression: This helps in setting the compression for cStor pools.
ReadOnly threshold: Helps in specifying read only thresholds for cStor pools.
Example configuration for Resource and Limits:
The following CSPC YAML specifies resources and auxResources that will get applied to all pool manager pods for the CSPC. resources get applied to the cstor-pool container, and auxResources get applied to the sidecar containers, i.e. cstor-pool-mgmt and pool-exporter.
In the following CSPC YAML we have only one pool spec (@spec.pools). It is also possible to override the resource and limit value for a specific pool.
Following CSPC YAML explains how the resource and limits can be overridden. If you look at the CSPC YAML, there are no resources and auxResources specified at pool level for worker-node-1 and worker-node-2 but specified for worker-node-3. In this case, for worker-node-1 and worker-node-2 the resources and auxResources will be applied from @spec.resources and @spec.auxResources respectively but for worker-node-3 these will be applied from @spec.pools.poolConfig.resources and @spec.pools.poolConfig.auxResources respectively.
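A sketch of that layout, with placeholder resource values and the worker-node-2 spec elided for brevity:

```yaml
apiVersion: cstor.openebs.io/v1
kind: CStorPoolCluster
metadata:
  name: cstor-disk-pool
  namespace: openebs
spec:
  resources:                    # default for cstor-pool containers
    requests: { memory: 2Gi, cpu: 250m }
    limits:   { memory: 4Gi, cpu: 500m }
  auxResources:                 # default for sidecar containers
    requests: { memory: 500Mi, cpu: 100m }
    limits:   { memory: 1Gi, cpu: 200m }
  pools:
    - nodeSelector:
        kubernetes.io/hostname: worker-node-1
      dataRaidGroups:
        - blockDevices:
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f36
      poolConfig:
        dataRaidGroupType: stripe
    # ... worker-node-2 spec elided; it inherits spec.resources/auxResources ...
    - nodeSelector:
        kubernetes.io/hostname: worker-node-3
      dataRaidGroups:
        - blockDevices:
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f42
      poolConfig:
        dataRaidGroupType: stripe
        resources:              # per-pool override for worker-node-3
          requests: { memory: 4Gi, cpu: 500m }
          limits:   { memory: 8Gi, cpu: 1000m }
        auxResources:
          requests: { memory: 1Gi, cpu: 200m }
          limits:   { memory: 2Gi, cpu: 400m }
```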
Example configuration for Tolerations:
Tolerations are applied in a similar manner to resources and auxResources. The following is a sample CSPC YAML that has tolerations specified. For worker-node-1 and worker-node-2, tolerations are applied from @spec.tolerations, but for worker-node-3 they are applied from @spec.pools.poolConfig.tolerations.
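A sketch of such a CSPC fragment; the taint keys and values are placeholders:

```yaml
spec:
  tolerations:                  # default toleration for all pool manager pods
    - key: data-plane-node      # placeholder taint key
      operator: Equal
      value: "true"
      effect: NoSchedule
  pools:
    - nodeSelector:
        kubernetes.io/hostname: worker-node-1
      # ... dataRaidGroups elided ...
      poolConfig:
        dataRaidGroupType: stripe
    - nodeSelector:
        kubernetes.io/hostname: worker-node-3
      # ... dataRaidGroups elided ...
      poolConfig:
        dataRaidGroupType: stripe
        tolerations:            # per-pool override for worker-node-3
          - key: storage-node   # placeholder taint key
            operator: Exists
            effect: NoSchedule
```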
Example configuration for Priority Class:
Priority classes are also applied in a similar manner to resources and auxResources. The following is a sample CSPC YAML that has a priority class specified. For worker-node-1 and worker-node-2, priority classes are applied from @spec.priorityClassName, but for worker-node-3 it is applied from @spec.pools.poolConfig.priorityClassName. See the Kubernetes documentation for more information about PriorityClass.
Priority class needs to be created beforehand. In this case, high-priority and ultra-priority priority classes should exist.
The index starts from 0 for the @.spec.pools list.

```yaml
apiVersion: cstor.openebs.io/v1
kind: CStorPoolCluster
metadata:
  name: cstor-disk-pool
  namespace: openebs
spec:
  priorityClassName: high-priority
  pools:
    - nodeSelector:
        kubernetes.io/hostname: worker-node-1
      dataRaidGroups:
        - blockDevices:
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f36
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f37
      poolConfig:
        dataRaidGroupType: mirror
    - nodeSelector:
        kubernetes.io/hostname: worker-node-2
      dataRaidGroups:
        - blockDevices:
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f39
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f40
      poolConfig:
        dataRaidGroupType: mirror
    - nodeSelector:
        kubernetes.io/hostname: worker-node-3
      dataRaidGroups:
        - blockDevices:
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f42
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f43
      poolConfig:
        dataRaidGroupType: mirror
        priorityClassName: ultra-priority
```
Example configuration for Compression:
Compression values can be set at the pool level only; there is no override mechanism as there is for tolerations, resources, auxResources and priorityClass. The compression value must be a supported value, such as on, off, lzjb, gzip, zle or lz4.
Note: lz4 is the default compression algorithm, used if the compression field is left unspecified on the CSPC. Below is a sample YAML which has compression specified.
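A sketch of the relevant pool spec fragment with compression set explicitly (block device name is a placeholder):

```yaml
spec:
  pools:
    - nodeSelector:
        kubernetes.io/hostname: worker-node-1
      dataRaidGroups:
        - blockDevices:
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f36
      poolConfig:
        dataRaidGroupType: stripe
        compression: lz4        # per-pool compression setting
```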
Example configuration for Read Only Threshold:
The RO threshold can be set in a similar manner to compression. roThresholdLimit is the threshold limit (as a percentage) for pool read-only mode: if roThresholdLimit percent of the pool storage is consumed, the pool is set to read-only. If roThresholdLimit is set to 100, the entire pool storage will be used. By default, i.e. when unspecified on the CSPC, it is set to 85%. The roThresholdLimit value must satisfy 0 < roThresholdLimit <= 100. The following CSPC YAML has the read-only threshold percentage specified.
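A sketch of the relevant pool spec fragment; the 70% value is illustrative:

```yaml
spec:
  pools:
    - nodeSelector:
        kubernetes.io/hostname: worker-node-1
      dataRaidGroups:
        - blockDevices:
            - blockDeviceName: blockdevice-ada8ef910929513c1ad650c08fbe3f36
      poolConfig:
        dataRaidGroupType: stripe
        roThresholdLimit: 70    # pool turns read-only at 70% usage
```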
Similar to tuning of the cStor pool cluster, there are ways to tune cStor volumes. cStor volumes can be provisioned using different policy configurations. However, a CStorVolumePolicy needs to be created first. It must be created prior to the creation of the StorageClass, as the CStorVolumePolicy name needs to be specified in order to provision a cStor volume based on the configured policy. A sample StorageClass YAML that utilises cstorVolumePolicy is given below for reference:
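A sketch of such a StorageClass, assuming a policy named csi-volume-policy and a CSPC named cstor-disk-pool (both placeholders):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cstor-csi-disk
provisioner: cstor.csi.openebs.io
allowVolumeExpansion: true
parameters:
  cas-type: cstor
  cstorPoolCluster: cstor-disk-pool      # placeholder CSPC name
  replicaCount: "1"
  cstorVolumePolicy: csi-volume-policy   # name of the pre-created CStorVolumePolicy
```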
If the volume policy is not created before volume provisioning and needs to be modified later, it can be changed by editing the CStorVolumeConfig (CVC) resource on a per-volume basis; the change will be reconciled by the CVC controller to the respective volume resources. Each PVC creation request creates a CStorVolumeConfig (CVC) resource, which can be used to manage the volume, its policies, and any supported operations (like scale up/down) on a per-volume basis. To edit, execute:
The list of policies that can be configured are as follows:
For StatefulSet applications, to distribute single-replica volumes on specific cStor pools, we can use replicaAffinity-enabled scheduling. This feature should be used with delayed volume binding, i.e. volumeBindingMode: WaitForFirstConsumer in the StorageClass. When volumeBindingMode is set to WaitForFirstConsumer, the csi-provisioner waits for the scheduler to select a node. The topology of the selected node is then set as the first entry in the preferred list and is used by the volume controller to create the volume replica on the cStor pool scheduled on the preferred node.
The replicaAffinity spec needs to be enabled via the volume policy before provisioning the volume:
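A minimal sketch of such a policy (the policy name is a placeholder):

```yaml
apiVersion: cstor.openebs.io/v1
kind: CStorVolumePolicy
metadata:
  name: csi-volume-policy
  namespace: openebs
spec:
  provision:
    replicaAffinity: true   # pin the single replica to the scheduled pool
```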
The Stateful workloads access the OpenEBS storage volume by connecting to the Volume Target Pod. Target Pod Affinity policy can be used to co-locate volume target pod on the same node as the workload. This feature makes use of the Kubernetes Pod Affinity feature that is dependent on the Pod labels.
For this, labels need to be added to both the application and the volume policy.
Given below is a sample YAML of CStorVolumePolicy having the target-affinity label, using kubernetes.io/hostname as the topologyKey:
Set the label configured in the volume policy, openebs.io/target-affinity: fio-cstor, on the app pod; it will be used to find pods, by label, within the domain defined by the topologyKey.
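A sketch of the pairing, assuming an application pod named fio-cstor in the default namespace (the names and image are placeholders):

```yaml
apiVersion: cstor.openebs.io/v1
kind: CStorVolumePolicy
metadata:
  name: csi-volume-policy
  namespace: openebs
spec:
  target:
    affinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: openebs.io/target-affinity
                operator: In
                values:
                  - fio-cstor
          topologyKey: kubernetes.io/hostname
          namespaces: ["default"]        # namespace of the application pod
---
apiVersion: v1
kind: Pod
metadata:
  name: fio-cstor                        # placeholder application pod
  namespace: default
  labels:
    openebs.io/target-affinity: fio-cstor
spec:
  containers:
    - name: fio
      image: openebs/tests-fio           # placeholder image
```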
Performance tunings based on the workload can be set using Volume Policy. The list of tunings that can be configured are given below:
This limits the outstanding IO count from the iSCSI client on the node to the cStor target pod. The default value for this parameter is 32.
cStor target IO worker threads: this sets the number of threads that work on the QueueDepth queue. The default value for this parameter is 6. On machines with a higher number of cores and more RAM, this value can be set to 16, which means 16 threads will be running for each volume.
cStor volume replica IO worker threads: this defaults to the number of cores on the machine. On machines with a higher number of cores and more RAM, this value can be set to 16.
Given below is a sample YAML that has the above parameters configured.
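A sketch of a policy with these tunables set; the field names (queueDepth, luWorkers under target, and zvolWorkers under replica) and the quoting of values follow our reading of the CStorVolumePolicy spec, and the values are illustrative:

```yaml
apiVersion: cstor.openebs.io/v1
kind: CStorVolumePolicy
metadata:
  name: csi-volume-policy
  namespace: openebs
spec:
  target:
    queueDepth: "32"    # iSCSI queue depth at the target
    luWorkers: 6        # target IO worker threads
  replica:
    zvolWorkers: "4"    # replica IO worker threads
```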
Note: These Policy tunable configurations can be changed for already provisioned volumes by editing the corresponding volume CStorVolumeConfig resources.
CStorVolumePolicy can also be used to configure the volume Target pod resource requests and limits to ensure QoS. Given below is a sample YAML that configures the target container's resource requests and limits, and auxResources configuration for the sidecar containers.
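A sketch with illustrative resource values (the policy name is a placeholder):

```yaml
apiVersion: cstor.openebs.io/v1
kind: CStorVolumePolicy
metadata:
  name: csi-volume-policy
  namespace: openebs
spec:
  target:
    resources:              # for the target container
      requests: { memory: 500Mi, cpu: 250m }
      limits:   { memory: 1Gi, cpu: 500m }
    auxResources:           # for the sidecar containers
      requests: { memory: 250Mi, cpu: 100m }
      limits:   { memory: 500Mi, cpu: 250m }
```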
To know more about Resource configuration in Kubernetes, click here.
Note: These resource configuration(s) can be changed, for provisioned volumes, by editing the CStorVolumeConfig resource on per volume level.
An example to patch an already existing CStorVolumeConfig resource is given below.
Create a file, say patch-resources-cvc.yaml, that contains the changes and apply the patch on the resource.
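A sketch of such a patch file; the resource values are illustrative and the CVC name in the comment is a placeholder:

```yaml
# patch-resources-cvc.yaml — merge patch for the volume's target resources.
# Apply with, e.g.:
#   kubectl patch cvc <cvc-name> -n openebs --type merge --patch-file patch-resources-cvc.yaml
spec:
  policy:
    target:
      resources:
        requests: { memory: 500Mi, cpu: 250m }
        limits:   { memory: 1Gi, cpu: 500m }
```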
To apply the patch,
This Kubernetes feature allows users to taint a node, which ensures that no pods are scheduled to it unless a pod explicitly tolerates the taint. The feature can be used to reserve nodes for specific pods by adding taints to the desired node(s).
One such scenario where the above tunable can be used is: all the volume specific pods, to operate flawlessly, have to be scheduled on nodes that are reserved for storage.
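Assuming the volume policy supports tolerations on the target pod (a sketch, with a placeholder taint key matching whatever taint is applied to the reserved storage nodes):

```yaml
apiVersion: cstor.openebs.io/v1
kind: CStorVolumePolicy
metadata:
  name: csi-volume-policy
  namespace: openebs
spec:
  target:
    tolerations:
      - key: storage-node    # placeholder taint key on the reserved nodes
        operator: Exists
        effect: NoSchedule
```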
Priority classes can help in controlling the Kubernetes scheduler's decisions to favor higher-priority pods over lower-priority pods. The Kubernetes scheduler can even preempt running lower-priority pods so that pending higher-priority pods can be scheduled. Setting pod priority also prevents lower-priority workloads from impacting critical workloads in the cluster, especially in cases where the cluster starts to reach its resource capacity. To know more about PriorityClasses in Kubernetes, click here.
Note: Priority class needs to be created before volume provisioning.
Given below is a sample CStorVolumePolicy YAML which utilises priority class.
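A sketch of such a policy, assuming a pre-created priority class named storage-critical (a placeholder):

```yaml
apiVersion: cstor.openebs.io/v1
kind: CStorVolumePolicy
metadata:
  name: csi-volume-policy
  namespace: openebs
spec:
  target:
    priorityClassName: storage-critical   # must exist before volume provisioning
```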