Frequently asked questions
Popular topics
Why would you use OpenEBS on EBS?
There are at least four common reasons given for running OpenEBS on Amazon EBS:

- Attach/detach - The attach/detach process can slow the operation of environments dependent on EBS.
- No volume management needed - OpenEBS removes the need for volume management, enabling the combination of multiple underlying EBS volumes without the user needing to run LVM or other volume managers. This saves time and reduces operational complexity.
- Expansion and inclusion of NVMe - OpenEBS allows users to add capacity without experiencing downtime. This online addition of capacity can include NVMe and SSD instances from cloud providers or drives deployed in physical servers. As performance requirements increase or decrease, Kubernetes storage policies can instruct OpenEBS to adjust capacity accordingly.
- Other enterprise capabilities - OpenEBS adds capabilities such as extremely efficient snapshots and clones, as well as forthcoming capabilities such as encryption. Snapshots and clones make CI/CD workflows much more efficient, because zero-space copies of databases and other stateful workloads can be used in these workflows without incurring additional storage space or administrative effort. The snapshot capabilities can also be used for replication; as of February 2018, these replication capabilities are under development.
Where is my data stored? How can I see that?
OpenEBS stores data in a configurable number of replicas. These replicas are placed to maximize resiliency; for example, they can be placed in different racks or availability zones.
To determine exactly where your data is physically stored, you can run the following kubectl commands:
a. Run kubectl get pvc to fetch the volume name. The volume name looks like pvc-ee171da3-07d5-11e8-a5be-42010a8001be.
b. For each volume, you will notice one IO controller pod and one or more replica pods (as per the storage class configuration). For the above PVC, run the following command to get the IO controller and replica pods:
kubectl get pods --all-namespaces | grep pvc-ee171da3-07d5-11e8-a5be-42010a8001be
The output will show the following pods:
- IO controller: pvc-ee171da3-07d5-11e8-a5be-42010a8001be-ctrl-6798475d8c-7dcqd
- Replica 1: pvc-ee171da3-07d5-11e8-a5be-42010a8001be-rep-86f8b8c758-hls6s
- Replica 2: pvc-ee171da3-07d5-11e8-a5be-42010a8001be-rep-86f8b8c758-tr28f
c. To check where the data is stored, get the details of the replica pod. For Replica 1 above, use the following command:
kubectl get pod -o yaml pvc-ee171da3-07d5-11e8-a5be-42010a8001be-rep-86f8b8c758-hls6s
Check for the volumes section:
- hostPath:
    path: /var/openebs/pvc-ee171da3-07d5-11e8-a5be-42010a8001be
    type: ""
  name: openebs
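Steps (a) through (c) can be sketched as a small shell snippet. This is only an illustration: the volumes section from step (c) is embedded as a string so the extraction runs without a live cluster, and the pod names are the examples from this page. On a real cluster you would feed the output of kubectl get pod -o yaml into the same extraction.

```shell
# Steps (a)/(b): on a live cluster you would run, for example:
#   kubectl get pvc
#   kubectl get pods --all-namespaces | grep pvc-ee171da3-07d5-11e8-a5be-42010a8001be
# Controller pods carry "-ctrl-" in their name; replica pods carry "-rep-".

# Step (c): extract the backing directory from the replica pod's volumes
# section. The snippet below is the YAML shown above, embedded here so the
# extraction can run without a cluster.
volumes_section='- hostPath:
    path: /var/openebs/pvc-ee171da3-07d5-11e8-a5be-42010a8001be
    type: ""
  name: openebs'

# Print the host directory that stores this replica's data.
data_dir=$(printf '%s\n' "$volumes_section" | awk '/path:/ {print $2}')
echo "$data_dir"
```

The same directory can then be inspected directly on the node that hosts the replica pod.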
How is the data protected? What happens when a host fails, a client workload fails, or a data center fails?
Kubernetes provides many ways to enable resilience, and OpenEBS leverages these wherever possible. For example, if the IO container that hosts the iSCSI target fails, it is spun back up by Kubernetes. The same applies to the underlying replica containers where the data is stored: they too are spun back up by Kubernetes.

The point of maintaining multiple replicas is that while one or more of them is being respun and then repopulated in the background by OpenEBS, the client applications continue to run. OpenEBS takes a simple approach to ensuring that multiple replicas can be accessed by an IO controller, using a configurable quorum or minimum number of required replicas.

In addition, the new cStor engine checks for silent data corruption and, in some cases, can fix it in the background. Silent data corruption can unfortunately arise from poorly engineered hardware and from other underlying conditions, including those that your cloud provider is unlikely to report or identify.
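As a concrete illustration of the quorum idea, the arithmetic below assumes a simple majority rule, which is one common policy; as noted above, the actual threshold in OpenEBS is configurable, so treat this as a sketch rather than the exact default.

```shell
# Illustrative majority-quorum arithmetic: with N replicas, the IO
# controller needs at least floor(N/2)+1 reachable replicas to keep
# serving IO. This is one common policy, not necessarily the OpenEBS
# default.
replicas=3
quorum=$(( replicas / 2 + 1 ))
echo "With $replicas replicas, quorum is $quorum; up to $(( replicas - quorum )) replicas can be lost."
```

With three replicas, one replica can be down (being respun and repopulated) while the application continues to run against the remaining two.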