Kubernetes is increasingly used to run production-grade stateful services. Organizations are steadily containerizing their production workloads to run them on Kubernetes, and solutions already exist for the stateful pieces of the stack, including networking and storage.
## Handling node down / cluster scaling on Local PV backed workloads
OpenEBS is one of the leading containerized storage solutions for Kubernetes and a rapidly growing CNCF Sandbox project. MayaData is the primary maintainer and contributor of OpenEBS, along with other companies. MayaData also contributed another open source project, Litmus, to the CNCF. Litmus brings chaos engineering to Kubernetes, helping SREs and developers run all kinds of tests against their applications and components in Kubernetes before going to production.
A persistent storage solution is a hard requirement for running a stateful application, whether it is a Deployment or a StatefulSet. OpenEBS provides several storage engines, and each engine suits specific applications or workloads: some offer storage-level synchronous replication, snapshot and clone support, backup and restore, volume expansion, CSI compliance, or a focus on raw performance. Choosing the engine based on the workload's requirements is therefore an important exercise.
OpenEBS provides dynamic provisioning of a Local PV backed by an external device, where the entire device is allocated to a single application. You can also use a partitioned disk with OpenEBS Local PV via the `openebs-hostpath` storage class. In this article, we dynamically provision a MySQL Deployment on an OpenEBS Local PV device.
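For orientation, the device-backed Local PV storage class used later in this article (`openebs-device`) typically looks like the following sketch. This is modeled on what a default OpenEBS install creates; treat the exact annotation values as an assumption to verify against your own installation.

```yaml
# Sketch of the device-backed Local PV StorageClass, modeled on the
# `openebs-device` class created by a default OpenEBS install.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-device
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: "device"
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```

`WaitForFirstConsumer` matters here: the PV is only bound once the pod is scheduled, which is what ties the volume to a particular node in the first place.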
This article is a step-by-step walkthrough of how a MySQL Deployment running on an OpenEBS Local PV device volume is handled when a node goes down or the cluster is scaled down in a GKE cluster. In GKE, and in some other managed clusters such as EKS, node names change when the cluster goes through a scale-down followed by a scale-up. An application running on an OpenEBS Local PV device therefore cannot be scheduled onto the new node, because the corresponding PV carries volume node affinity for the old node name. We need to update the PV with the name of the new node to which the disk is attached.
In this article, we discuss the steps needed to bring the application back to a running state after a node down or cluster scale-down scenario. This procedure is typically required on managed clusters, where node names change in such scenarios. As stated earlier, the approach works for both Deployments and StatefulSets, provided the following steps are carried out correctly. Let's start with the step-by-step instructions, assuming you have scaled the cluster back up after a scale-down.
1. Verify all nodes are now in the `Ready` state.
```
$ kubectl get node
NAME                                           STATUS   ROLES    AGE   VERSION
gke-openebs-mysql-default-pool-d55297a7-bjjp   Ready    <none>   74s   v1.16.13-gke.1
gke-openebs-mysql-default-pool-d55297a7-j1vm   Ready    <none>   80s   v1.16.13-gke.1
gke-openebs-mysql-default-pool-d55297a7-pvg4   Ready    <none>   85s   v1.16.13-gke.1
```
2. Ensure the OpenEBS pods are in the `Running` state.
```
$ kubectl get pod -n openebs
NAME                                           READY   STATUS    RESTARTS   AGE
maya-apiserver-76cb4df9b8-wpbf6                1/1     Running   0          22m
openebs-admission-server-5cf696b8d5-d97bn      1/1     Running   0          22m
openebs-localpv-provisioner-7654f6dbd9-hskq8   1/1     Running   0          22m
openebs-ndm-7dtts                              1/1     Running   0          2m19s
openebs-ndm-c4r4m                              1/1     Running   0          2m23s
openebs-ndm-lnb5c                              1/1     Running   0          2m12s
openebs-ndm-operator-6cfc59b69b-684nx          1/1     Running   0          22m
openebs-provisioner-7d9884d4ff-tfcxj           1/1     Running   0          22m
openebs-snapshot-operator-7ff577c889-kfttj     2/2     Running   0          22m
```
3. Check the status of the application pod. It will be in the `Pending` state.
```
$ kubectl get pod
NAME                      READY   STATUS    RESTARTS   AGE
percona-9fbdb8678-lncd5   0/1     Pending   0          17m
```
4. Label all the nodes with the same custom label used in the `nodeSelector` field of the StatefulSet application. In our case, no custom node label is used in the application Deployment, so we skip this step.
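If your application did use a custom `nodeSelector`, re-applying the label to the new node would look roughly like this; the `app=mysql` key/value below is hypothetical, so use whatever label your spec actually expects.

```shell
# Hypothetical label; match the key/value used in your app's nodeSelector.
kubectl label node gke-openebs-mysql-default-pool-d55297a7-bjjp app=mysql
```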
5. Attach each disk to any node in the same zone. Note down the device name and the node name where each disk is attached; this information will be needed in step 9.
```shell
$ gcloud compute instances attach-disk gke-openebs-mysql-default-pool-d55297a7-bjjp --disk mysql-disk1 --device-name mysql-disk1 --zone=us-central1-c
$ gcloud compute instances attach-disk gke-openebs-mysql-default-pool-d55297a7-j1vm --disk mysql-disk2 --device-name mysql-disk2 --zone=us-central1-c
$ gcloud compute instances attach-disk gke-openebs-mysql-default-pool-d55297a7-pvg4 --disk mysql-disk3 --device-name mysql-disk3 --zone=us-central1-c
```
6. Verify that the BDs (BlockDevices) are updated with the new node names.
```
$ kubectl get bd -n openebs
NAME                                           NODENAME                                       SIZE          CLAIMSTATE   STATUS   AGE
blockdevice-4f51859193d333687a873af7acf8ad78   gke-openebs-mysql-default-pool-d55297a7-j1vm   32212254720   Unclaimed    Active   37m
blockdevice-967d7816c2a2d73b91c8c6310dc70465   gke-openebs-mysql-default-pool-d55297a7-bjjp   32212254720   Claimed      Active   37m
blockdevice-ddfc782ea661fc9007a896438f483e3d   gke-openebs-mysql-default-pool-d55297a7-pvg4   32212254720   Unclaimed    Active   37m
```
7. Get the PV details of the associated application.
```
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS     REASON   AGE
pvc-5cd17649-efe4-46e1-a5f3-f779b0e03999   5G         RWO            Delete           Bound    default/demo-vol1-claim   openebs-device            33m
```
8. Create a directory and copy the YAML spec of every associated PV into it, as shown below.
```
$ mkdir mysql-restore
$ cd mysql-restore/
$ kubectl get pv pvc-5cd17649-efe4-46e1-a5f3-f779b0e03999 -o yaml --export > pv1.yaml
```
Note: If the application is a StatefulSet, take the YAML spec of all of its associated PVs.
9. Modify the copied PV YAML with the new hostname. The following snippet of the PV spec shows where the new node name, i.e. the node to which the local disk is now attached, must go.
```yaml
    path: /dev/disk/by-id/scsi-0Google_PersistentDisk_mysql-disk1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - gke-openebs-mysql-default-pool-d55297a7-bjjp
```
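Rather than editing each file by hand, the node-name substitution can also be scripted. Here is a minimal sketch with `sed`: the old node name is a placeholder you must fill in from your exported spec, and the new node name is the one recorded in step 5. The `cat` block at the top only creates a tiny stand-in file for demonstration; in practice `pv1.yaml` comes from the `kubectl get pv ... -o yaml` export above.

```shell
# Stand-in for the exported PV spec (demonstration only); in practice
# pv1.yaml is produced by `kubectl get pv <name> -o yaml --export`.
cat > pv1.yaml <<'EOF'
          values:
          - old-node-name
EOF

OLD_NODE="old-node-name"                                 # node name currently in the exported spec
NEW_NODE="gke-openebs-mysql-default-pool-d55297a7-bjjp"  # node the disk is now attached to (step 5)

# Swap the old node name for the new one in every exported PV spec.
for f in pv*.yaml; do
  sed -i "s/${OLD_NODE}/${NEW_NODE}/g" "$f"
done
```

Note that `sed -i` edits in place with GNU sed (as on a GKE node or a Linux workstation); on macOS the flag needs a backup suffix argument.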
10. Now get the PV, and then delete it.
```
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS     REASON   AGE
pvc-5cd17649-efe4-46e1-a5f3-f779b0e03999   5G         RWO            Delete           Bound    default/demo-vol1-claim   openebs-device            36m
```
```
$ kubectl delete pv pvc-5cd17649-efe4-46e1-a5f3-f779b0e03999
persistentvolume "pvc-5cd17649-efe4-46e1-a5f3-f779b0e03999" deleted
```
The deletion of the PV will not complete, because the PV has a finalizer set on it. So we need to cancel the ongoing delete operation, then edit the PV and remove the finalizers. Once the finalizers are removed, the volume is deleted automatically.
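One way to clear the finalizers without an interactive edit is `kubectl patch`; the sketch below uses the PV name from the example output above, so substitute your own.

```shell
# Clear the finalizers on the stuck PV so the pending delete completes.
# PV name taken from the example above; substitute your own.
kubectl patch pv pvc-5cd17649-efe4-46e1-a5f3-f779b0e03999 \
  -p '{"metadata":{"finalizers":null}}'
```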
11. Verify that the PV of the application has been removed successfully.
```
$ kubectl get pv
No resources were found in the default namespace.
```
12. Now, apply the updated YAML files of the PV.
```
$ kubectl apply -f pv1.yaml
```

Note: Perform the same for the other PVs as well if the application is a StatefulSet.
13. Verify that the pods have moved from the `Pending` to the `Running` state.
```
$ kubectl get pod -o wide
NAME                      READY   STATUS    RESTARTS   AGE   IP          NODE                                           NOMINATED NODE   READINESS GATES
percona-9fbdb8678-lncd5   1/1     Running   0          29m   10.16.0.2   gke-openebs-mysql-default-pool-d55297a7-bjjp   <none>           <none>
```
14. Log in to the application and verify that you are able to access the data.
```
$ kubectl exec -it percona-9fbdb8678-lncd5 sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
sh-4.2$ mysql -uroot -pk8sDem0;
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.30-33 Percona Server (GPL), Release 33, Revision 6517692

Copyright (c) 2009-2020 Percona LLC and/or its affiliates
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| pets               |
| sys                |
+--------------------+
5 rows in set (0.07 sec)

mysql> use pets;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> SELECT * FROM cats;
+----+---------+--------+------------+
| id | name    | owner  | birth      |
+----+---------+--------+------------+
|  1 | Sandy   | Lennon | 2015-01-03 |
|  2 | Cookie  | Casey  | 2013-11-13 |
|  3 | Charlie | River  | 2016-05-21 |
+----+---------+--------+------------+
3 rows in set (0.00 sec)
```