Skip to main content

Replica Rebuilds#

With the previous versions, the control-plane ensured replica redundancy by monitoring all volume targets and checking for any volume targets that were in Degraded state, indicating that one or more replicas of that volume targets were faulty. When a matching volume targets is found, the faulty replica is removed. Then, a new replica is created and added to the volume targets object. As part of adding the new child data-plane, a full rebuild was initiated from one of the existing Online replicas. However, the drawback to the above approach was that even if a replica was inaccessible for a short period (e.g., due to a node restart), a full rebuild was triggered. This may not have a significant impact on replicas with small sizes, but it is not desirable for large replicas.

The partial rebuild feature overcomes the above problem and helps in achieving faster rebuild times. When volume target encounters I/O error on a child/replica, it marks the child as Faulted (removing it from the I/O path) and begins to maintain a write log for all subsequent writes. The Core agent starts a default 10 minute wait for the replica to come back. If the child's replica is online again within timeout, the control-plane requests the volume target to online the child and add it to the IO path along with a partial rebuild process using the aforementioned write log.

info

The control-plane waits for 10 minutes before initiating the full rebuild process, as the --faulted-child-wait-period is set to 10 minutes. To configure this parameter, edit values.yaml.

Replica Rebuild History#

The data-plane handles both full and partial replica rebuilds. To view history of the rebuilds that an existing volume target has undergone during its lifecycle until now, you can use the given kubectl command.

To get the output in table format:

Command

kubectl mayastor get rebuild-history {your_volume_UUID}

Output

DST SRC STATE TOTAL RECOVERED TRANSFERRED IS-PARTIAL START-TIME END-TIME
b5de71a6-055d-433a-a1c5-2b39ade05d86 0dafa450-7a19-4e21-a919-89c6f9bd2a97 Completed 7MiB 7MiB 0 B true 2023-07-04T05:45:47Z 2023-07-04T05:45:47Z
b5de71a6-055d-433a-a1c5-2b39ade05d86 0dafa450-7a19-4e21-a919-89c6f9bd2a97 Completed 7MiB 7MiB 0 B true 2023-07-04T05:45:46Z 2023-07-04T05:45:46Z

To get the output in JSON format:

Command

kubectl mayastor get rebuild-history {your_volume_UUID} -ojson

Output

{
"targetUuid": "c9eb4172-e90c-40ca-9db0-26b2ae372b28",
"records": [
{
"childUri": "nvmf://10.1.0.9:8420/nqn.2019-05.io.openebs:b5de71a6-055d-433a-a1c5-2b39ade05d86?uuid=b5de71a6-055d-433a-a1c5-2b39ade05d86",
"srcUri": "bdev:///0dafa450-7a19-4e21-a919-89c6f9bd2a97?uuid=0dafa450-7a19-4e21-a919-89c6f9bd2a97",
"rebuildJobState": "Completed",
"blocksTotal": 14302,
"blocksRecovered": 14302,
"blocksTransferred": 0,
"blocksRemaining": 0,
"blockSize": 512,
"isPartial": true,
"startTime": "2023-07-04T05:45:47.765932276Z",
"endTime": "2023-07-04T05:45:47.766825274Z"
},
{
"childUri": "nvmf://10.1.0.9:8420/nqn.2019-05.io.openebs:b5de71a6-055d-433a-a1c5-2b39ade05d86?uuid=b5de71a6-055d-433a-a1c5-2b39ade05d86",
"srcUri": "bdev:///0dafa450-7a19-4e21-a919-89c6f9bd2a97?uuid=0dafa450-7a19-4e21-a919-89c6f9bd2a97",
"rebuildJobState": "Completed",
"blocksTotal": 14302,
"blocksRecovered": 14302,
"blocksTransferred": 0,
"blocksRemaining": 0,
"blockSize": 512,
"isPartial": true,
"startTime": "2023-07-04T05:45:46.242015389Z",
"endTime": "2023-07-04T05:45:46.242927463Z"
}
]
}

For example: kubectl mayastor get rebuild-history e898106d-e735-4edf-aba2-932d42c3c58d -ojson

note

The volume's rebuild history records are stored and maintained as long as the volume target remains intact without any disruptions caused by node failures or recreation.

Was this page helpful? We appreciate your feedback