Restic-based backup
VolSync supports taking backups of PersistentVolume data using the Restic-based data mover. A ReplicationSource defines the backup policy (target, frequency, and retention), while a ReplicationDestination is used for restores.
The Restic mover differs from most of VolSync’s other movers in that it is not meant for synchronizing data between clusters. It is designed specifically for data backup.
Specifying a repository
Both backup and restore operations require a backup repository for Restic. The repository and connection information are defined in a restic-config Secret.
Below is an example showing how to use a repository stored on Minio.
apiVersion: v1
kind: Secret
metadata:
  name: restic-config
type: Opaque
stringData:
  # The repository url
  RESTIC_REPOSITORY: s3:http://minio.minio.svc.cluster.local:9000/restic-repo
  # The repository encryption key
  RESTIC_PASSWORD: my-secure-restic-password
  # ENV vars specific to the chosen back end
  # https://restic.readthedocs.io/en/stable/030_preparing_a_new_repo.html
  AWS_ACCESS_KEY_ID: access
  AWS_SECRET_ACCESS_KEY: password
This Secret will be referenced for both backup (ReplicationSource) and for restore (ReplicationDestination). The key names in this configuration Secret directly correspond to the environment variable names supported by Restic.
Note
When providing credentials for Google Cloud Storage, the GOOGLE_APPLICATION_CREDENTIALS key should contain the actual contents of the JSON credential file, not just the path to the file.
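As a sketch of the note above, a Secret for a Google Cloud Storage repository might look like the following. The bucket name, project ID, and the abbreviated credential JSON are placeholders rather than values from this guide. The gs: repository form and the GOOGLE_PROJECT_ID variable follow Restic's documentation for its Google Cloud Storage back end.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: restic-config
type: Opaque
stringData:
  # Placeholder bucket and path; Restic addresses GCS repositories with the gs: prefix
  RESTIC_REPOSITORY: gs:my-backup-bucket:/mydata
  # The repository encryption key
  RESTIC_PASSWORD: my-secure-restic-password
  # Placeholder project ID
  GOOGLE_PROJECT_ID: my-project-id
  # The contents of the JSON credential file itself, not a path to it
  GOOGLE_APPLICATION_CREDENTIALS: |
    {
      "type": "service_account",
      "project_id": "my-project-id",
      "private_key_id": "<placeholder>",
      "private_key": "<placeholder>",
      "client_email": "<placeholder>"
    }
```

The block scalar (`|`) keeps the multi-line JSON intact as a single string value.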
The path used in RESTIC_REPOSITORY is the S3 bucket, but it can optionally contain a folder name within the bucket as well. This is useful when multiple PVCs are to be backed up to the same S3 bucket.
For example, one restic-config Secret could use:
RESTIC_REPOSITORY: s3:http://minio.minio.svc.cluster.local:9000/restic-repo/pvc-1-backup
While another (saved in a separate restic-config secret) could use:
RESTIC_REPOSITORY: s3:http://minio.minio.svc.cluster.local:9000/restic-repo/pvc-2-backup
Note
If backing up multiple PVCs to the same S3 bucket, the path underneath the bucket must be unique for each PVC. Each PVC will be backed up with a separate ReplicationSource, and each should use its own separate restic-config Secret.
Note also that sharing the same S3 bucket means that write access to the bucket is granted to several ReplicationSources.
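A minimal sketch of this layout is shown below, with one Secret per PVC and each pointing at its own path under the shared bucket. The Secret names restic-config-pvc-1 and restic-config-pvc-2 are hypothetical, chosen only so that both Secrets can live in the same Namespace.

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  # Hypothetical name for the Secret used by the first PVC's backup
  name: restic-config-pvc-1
type: Opaque
stringData:
  RESTIC_REPOSITORY: s3:http://minio.minio.svc.cluster.local:9000/restic-repo/pvc-1-backup
  RESTIC_PASSWORD: my-secure-restic-password
  AWS_ACCESS_KEY_ID: access
  AWS_SECRET_ACCESS_KEY: password
---
apiVersion: v1
kind: Secret
metadata:
  # Hypothetical name for the Secret used by the second PVC's backup
  name: restic-config-pvc-2
type: Opaque
stringData:
  # Same bucket, different path underneath it
  RESTIC_REPOSITORY: s3:http://minio.minio.svc.cluster.local:9000/restic-repo/pvc-2-backup
  RESTIC_PASSWORD: my-secure-restic-password
  AWS_ACCESS_KEY_ID: access
  AWS_SECRET_ACCESS_KEY: password
```

Each ReplicationSource would then name its own Secret in its repository field.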
Note
If necessary, the repository will be automatically initialized (i.e., restic init) during the first backup.
Configuring backup
A backup policy is defined by a ReplicationSource object that uses the Restic replication method.
---
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: mydata-backup
spec:
  # The PVC to be backed up
  sourcePVC: mydata
  trigger:
    # Take a backup every 30 minutes
    schedule: "*/30 * * * *"
  restic:
    # Prune the repository (repack to free space) every 2 weeks
    pruneIntervalDays: 14
    # Name of the Secret with the connection information
    repository: restic-config
    # Retention policy for backups
    retain:
      hourly: 6
      daily: 5
      weekly: 4
      monthly: 2
      yearly: 1
    # Clone the source volume prior to taking a backup to ensure a
    # point-in-time image.
    copyMethod: Clone
    # The StorageClass to use when creating the PiT copy (same as source PVC if omitted)
    #storageClassName: my-sc-name
    # The VSC to use if the copy method is Snapshot (default if omitted)
    #volumeSnapshotClassName: my-vsc-name
Backup options
There are a number of additional configuration options not shown in the above example. VolSync’s Restic mover options closely follow those of Restic itself.
- accessModes
When using a copyMethod of Clone or Snapshot, this field allows overriding the access modes for the point-in-time (PiT) volume. The default is to use the access modes from the source PVC.
- capacity
When using a copyMethod of Clone or Snapshot, this allows overriding the capacity of the PiT volume. The default is to use the capacity of the source volume.
- copyMethod
This specifies the method used to create a PiT copy of the source volume. Valid values are:
Clone - Create a new volume by cloning the source PVC (i.e., use the source PVC as the volumeSource for the new volume).
Direct - Do not create a PiT copy. The VolSync data mover will directly use the source PVC.
Snapshot - Create a VolumeSnapshot of the source PVC, then use that snapshot to create the new volume. This option should be used for CSI drivers that support snapshots but not cloning.
- storageClassName
This specifies the name of the StorageClass to use when creating the PiT volume. The default is to use the same StorageClass as the source volume.
- volumeSnapshotClassName
When using a copyMethod of Snapshot, this specifies the name of the VolumeSnapshotClass to use. If not specified, the cluster default will be used.
- cacheCapacity
This determines the size of the Restic metadata cache volume. This volume contains cached metadata from the backup repository. It must be large enough to hold the non-pruned repository metadata. The default is 1 Gi.
- cacheStorageClassName
This is the name of the StorageClass that should be used when provisioning the cache volume. It defaults to .spec.storageClassName, then to the name of the StorageClass used by the source PVC.
- cacheAccessModes
This is the access mode(s) that should be used to provision the cache volume. It defaults to .spec.accessModes, then to the access modes used by the source PVC.
- customCA
This option allows a custom certificate authority to be used when making TLS (https) connections to the remote repository. It has two sub-fields:
  - key
  This is the name of the field within the Secret that holds the CA certificate.
  - secretName
  This is the name of a Secret containing the CA certificate.
- pruneIntervalDays
This determines the number of days between running restic prune on the repository. The prune operation repacks the data to free space, but it can also generate significant I/O traffic as a part of the process. Setting this option allows a trade-off between storage consumption (from no longer referenced data) and access costs.
- repository
This is the name of the Secret (in the same Namespace) that holds the connection information for the backup repository. The repository path should be unique for each PV. Shared backup repositories are not currently supported.
- retain
This has sub-fields for hourly, daily, weekly, monthly, and yearly that allow setting the number of each type of backup to retain. There is an additional field, within, that can be used to specify a time period during which all backups should be retained. See Restic’s documentation on --keep-within for more information. When more than the specified number of backups are present in the repository, they will be removed via Restic’s forget operation, and the space will be reclaimed during the next prune.
- unlock
This can be used to perform a restic unlock before the next backup. This is useful if the repository has a stale lock that prevents backups from being made. To run an unlock, set unlock to a string value. Note that this on its own will not schedule a replication (backup); the next replication will happen according to the trigger spec. Once a replication completes, status.restic.lastUnlocked will be set to the same string value as spec.restic.unlock. Unlock will not be performed again on subsequent replications unless spec.restic.unlock is set to a different value.
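To show how several of these options fit together, here is a sketch of a ReplicationSource that uses a Snapshot-based copy, a larger cache, a custom CA, a retain.within window, and an unlock request. The Secret name restic-ca, its key ca.crt, the unlock string, and the sizes are illustrative assumptions and do not come from this guide.

```yaml
---
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: mydata-backup
spec:
  sourcePVC: mydata
  trigger:
    schedule: "*/30 * * * *"
  restic:
    repository: restic-config
    pruneIntervalDays: 14
    retain:
      daily: 5
      # Keep every backup made within the last 3 days (Restic duration syntax)
      within: 3d
    # Snapshot-based PiT copy, for CSI drivers that support snapshots but not cloning
    copyMethod: Snapshot
    volumeSnapshotClassName: my-vsc-name
    # Illustrative cache size; it must hold the non-pruned repository metadata
    cacheCapacity: 2Gi
    # Hypothetical Secret "restic-ca" holding the CA certificate under the key "ca.crt"
    customCA:
      secretName: restic-ca
      key: ca.crt
    # Any string value; changing it requests a restic unlock before the next backup
    unlock: unlock-1
```

After the next scheduled backup completes, status.restic.lastUnlocked should report the same string, and changing the value again would request another unlock.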
Performing a restore
Data from a backup can be restored using the ReplicationDestination CR. In most cases, it is desirable to perform a single restore into an empty PersistentVolume.
For example, create a PVC to hold the restored data:
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: datavol
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
Restore the data into datavol:
---
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: datavol-dest
spec:
  trigger:
    manual: restore-once
  restic:
    # Name of the Secret with the connection information
    repository: restic-config
    # Use an existing PVC, don't provision a new one
    destinationPVC: datavol
    copyMethod: Direct
In the above example, the data will be written directly into the new PVC since it is specified via destinationPVC, and no snapshot will be created since a copyMethod of Direct is used.
The restore operation only needs to be performed once, so instead of using a cronspec-based schedule, a manual trigger is used. After the restore completes, the ReplicationDestination object can be deleted.
The restore example above will restore the data from the most recent backup. To restore an older version of the data, the previous and restoreAsOf fields can be used. See below for more information on their meaning.
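Because the restore uses a manual trigger, one way to tell that it has finished is to compare the trigger value with the object's status. The fragment below sketches what the datavol-dest object might look like afterward. It assumes VolSync's status.lastManualSync field, which this section does not describe, so check it against the API version in use.

```yaml
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: datavol-dest
spec:
  trigger:
    manual: restore-once
  # (restic section omitted here for brevity)
status:
  # Assumed field: expected to match spec.trigger.manual once the restore has completed
  lastManualSync: restore-once
```

Once the two values match, the ReplicationDestination can be deleted as described above.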
Restore options
There are a number of additional configuration options not shown in the above example.
- accessModes
When VolSync creates the destination volume, this specifies the accessModes for the PVC. The value should be ReadWriteOnce or ReadWriteMany.
- capacity
When VolSync creates the destination volume, this value is used to determine its size. This need not match the size of the source volume, but it must be large enough to hold the incoming data.
- copyMethod
This specifies how the data should be preserved at the end of each synchronization iteration. Valid values are:
Direct - Do not create a point-in-time copy of the data.
Snapshot - Create a VolumeSnapshot at the end of each iteration.
- destinationPVC
Instead of having VolSync automatically provision the destination volume (using capacity, accessModes, etc.), the name of a pre-existing PVC may be specified here.
- cleanupTempPVC
This optional boolean specifies whether a destination PVC dynamically provisioned by VolSync should be deleted at the end of a successful sync iteration. If destinationPVC is specified, this setting has no effect; VolSync only cleans up PVCs that it created. If this is set to true, every sync this ReplicationDestination makes will provision a new temporary destination PVC, and all data will need to be sent again during the sync. Dynamically provisioned destination PVCs will always be deleted if the owning ReplicationDestination is removed, even if this setting is false. The default is false.
- storageClassName
When VolSync creates the destination volume, this specifies the name of the StorageClass to use. If omitted, the system default StorageClass will be used.
- volumeSnapshotClassName
When using a copyMethod of Snapshot, this value specifies the name of the VolumeSnapshotClass to use when creating a snapshot. If omitted, the system default VolumeSnapshotClass will be used.
- cacheCapacity
This determines the size of the Restic metadata cache volume. This volume contains cached metadata from the backup repository. It must be large enough to hold the non-pruned repository metadata. The default is 1 Gi.
- cacheStorageClassName
This is the name of the StorageClass that should be used when provisioning the cache volume. It defaults to .spec.storageClassName, then to the name of the StorageClass used by the source PVC.
- cacheAccessModes
This is the access mode(s) that should be used to provision the cache volume. It defaults to .spec.accessModes, then to the access modes used by the source PVC.
- cleanupCachePVC
This optional boolean determines if the cache PVC should be cleaned up at the end of the restore. Cache PVCs will always be deleted if the owning ReplicationDestination is removed, even if this setting is false. Defaults to false.
- customCA
This option allows a custom certificate authority to be used when making TLS (https) connections to the remote repository. It has two sub-fields:
  - key
  This is the name of the field within the Secret that holds the CA certificate.
  - secretName
  This is the name of a Secret containing the CA certificate.
- previous
Non-negative integer which specifies an offset for how many snapshots ago we want to restore from. When restoreAsOf is provided, the behavior is the same; however, the starting snapshot considered will be the first one taken before restoreAsOf.
- repository
This is the name of the Secret (in the same Namespace) that holds the connection information for the backup repository. The repository path should be unique for each PV.
- restoreAsOf
An RFC-3339 timestamp which specifies an upper limit on the snapshots that we should be looking through when preparing to restore. Snapshots made after this timestamp will not be considered. Note: though this is an RFC-3339 timestamp, Kubernetes will only accept ones with the day and hour fields separated by a T. For example, 2022-08-10T20:01:03-04:00 will work but 2022-08-10 20:01:03-04:00 will fail.
- enableFileDeletion
A boolean indicating whether files and directories that exist on the PVC being restored to should be deleted if they do not exist in the Restic snapshot being restored. The default value is false.
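The options above can be combined to restore an older version of the data. The following is a sketch under stated assumptions: it reuses the datavol PVC and the restic-config Secret from earlier in this guide, and the trigger string restore-older and the chosen option values are illustrative only.

```yaml
---
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: datavol-dest
spec:
  trigger:
    # Illustrative manual trigger value
    manual: restore-older
  restic:
    repository: restic-config
    # Restore into the existing PVC without creating a snapshot afterward
    destinationPVC: datavol
    copyMethod: Direct
    # Ignore snapshots made after this time (note the required "T" separator)
    restoreAsOf: "2022-08-10T20:01:03-04:00"
    # Offset of one snapshot back from the starting snapshot selected by restoreAsOf
    previous: 1
    # Delete files on the PVC that do not exist in the snapshot being restored
    enableFileDeletion: true
    # Clean up the cache PVC at the end of the restore
    cleanupCachePVC: true
```

Omitting previous and restoreAsOf returns to the default behavior of restoring the most recent backup.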