Commit 574b274: Add design for repository maintenance job
Signed-off-by: Ming Qiu <[email protected]>
Committed by qiuming-best on Feb 23, 2024 (parent 270b1de)
File: design/repository-maintenance.md (336 additions)

# Design for repository maintenance job

## Abstract
This design proposal aims to decouple repository maintenance from the Velero server by launching a maintenance job when needed, to mitigate the impact on the Velero server during backups.

## Background
During backups, Velero performs periodic maintenance on the repository. This operation may consume significant CPU and memory resources in some cases, leading to potential issues such as the Velero server being killed by OOM. This proposal addresses these challenges by separating repository maintenance from the Velero server.

## Goals
1. **Independent Repository Maintenance**: Decouple maintenance from Velero's main logic to reduce the impact on the Velero server pod.

2. **Configurable Resources Usage**: Make the resources used by the maintenance job configurable.

3. **No API Changes**: Retain existing APIs and workflow in the backup repository controller.

## Non Goals
Parallel maintenance raises several concerns that would significantly increase the complexity of the current design:

- Non-blocking maintenance jobs: parallel maintenance jobs may conflict when updating the same `backuprepositories` CR.

- Maintenance job concurrency control: Kubernetes provides no suitable mechanism to control the concurrency of different jobs.

- Parallel maintenance: maintaining the same repository with multiple jobs at the same time raises compatibility issues that some providers may not support.

Because of the concerns above, parallel maintenance is not currently a priority, and improving maintenance efficiency is not the primary focus at this stage.

## High-Level Design
1. **Add Maintenance Subcommand**: Introduce a new Velero server subcommand for repository maintenance.

2. **Create Jobs by Repository Manager**: Modify the backup repository controller to create a maintenance job instead of directly invoking the chain of calls for Kopia or Restic maintenance.

3. **Update Maintenance Job Result in BackupRepository CR**: Retrieve the result of the maintenance job and update the status of the `BackupRepository` CR accordingly.

4. **Add Settings for Maintenance Jobs**: Introduce configuration options for maintenance jobs, including resource requests and limits (CPU and memory) and the number of latest maintenance jobs to keep for each repository.

## Detailed Design

### 1. Add Maintenance Subcommand

A new subcommand will be added to the Velero CLI. The command is designed to run inside the pod of a maintenance job, not to be invoked directly by users.

Our CLI command is designed as follows:
```shell
$ velero server repo-maintenance --repo-name $repo-name --repo-type $repo-type --backup-storage-location $bsl
```

Unlike other CLI commands, the maintenance command runs inside the pod of a maintenance job rather than being invoked by users, and the job should expose the result of the maintenance after it finishes.

To support this, the error message is written into a specific file that the maintenance job can read.

Overall, we record two kinds of logs:

- The log output of the intermediate maintenance process: this log, including error logs, can be retrieved via the Kubernetes API server.

- The result of the command, which indicates whether execution failed: this result is redirected to a file that the maintenance job itself can read, and the file contains only the error message.

To record error messages into a file, we can use a [logrus hook](https://pkg.go.dev/github.com/sirupsen/[email protected]/hooks/writer#section-readme).

Below we define the `FileHook` struct, which implements the `logrus.Hook` interface.

```golang
type FileHook struct {
	file *os.File
}

// Levels returns the log levels that trigger this hook
func (hook *FileHook) Levels() []logrus.Level {
	return []logrus.Level{logrus.ErrorLevel} // hook only error-level entries
}

// Fire is called when a logging function is invoked with the current hook
func (hook *FileHook) Fire(entry *logrus.Entry) error {
	_, err := hook.file.WriteString(entry.Message + "\n") // write the message into the file
	return err
}
```

The main maintenance logic would be using the repository provider to do the maintenance.

```golang
func checkError(err error, log logrus.FieldLogger) {
	if err != nil {
		if err != context.Canceled {
			log.Errorf("An error occurred: %v", err)
		}
		os.Exit(1)
	}
}

func (o *Options) Run(f veleroCli.Factory) {
	...
	log := logrus.New()
	log.SetFormatter(&logrus.TextFormatter{})
	errorFile, err := os.Create("/dev/termination-log")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Failed to create error.log file: %v\n", err)
		return
	}
	defer errorFile.Close()
	log.AddHook(&FileHook{errorFile})

	credentialFileStore, err := credentials.NewNamespacedFileStore(
		cli,
		f.Namespace(),
		"/tmp/credentials",
		filesystem.NewFileSystem(),
	)
	checkError(err, log)
	credentialSecretStore, err := credentials.NewNamespacedSecretStore(cli, f.Namespace())
	checkError(err, log)
	// Initialize the repo provider
	repoProvider := provider.NewUnifiedRepoProvider(
		credentials.CredentialGetter{
			FromFile:   credentialFileStore,
			FromSecret: credentialSecretStore,
		}, o.RepoType, cli, log)

	// Get the BackupRepository
	repo, err := repository.GetBackupRepository(context.Background(), cli, f.Namespace(),
		repository.BackupRepositoryKey{
			VolumeNamespace: o.VolumeNamespace,
			BackupLocation:  o.BackupStorageLocation,
			RepositoryType:  o.RepoType,
		}, true)
	checkError(err, log)
	// Get the BSL
	bsl := &velerov1api.BackupStorageLocation{}
	err = cli.Get(context.Background(), client.ObjectKey{Namespace: f.Namespace(), Name: repo.Spec.BackupStorageLocation}, bsl)
	checkError(err, log)

	para := provider.RepoParam{
		BackupRepo:     repo,
		BackupLocation: bsl,
	}
	// Connect to the repository
	err = repoProvider.BoostRepoConnect(context.Background(), para)
	checkError(err, log)
	// Prune the repository
	err = repoProvider.PruneRepo(context.Background(), para)
	checkError(err, log)
	...
}
```

### 2. Create Jobs by Repository Manager
Currently, the backup repository controller calls the repository manager's `PruneRepo`, which eventually invokes Kopia or Restic maintenance through a chain of calls.

We will keep using the `PruneRepo` function in the repository manager, but instead of following the chain of calls, it will create a maintenance job.

The job definition would be like below:
```yaml
apiVersion: v1
items:
- apiVersion: batch/v1
  kind: Job
  metadata:
    creationTimestamp: "2024-02-19T02:22:36Z"
    generation: 1
    # labels or affinity or topology settings would inherit from the velero deployment
    labels:
      batch.kubernetes.io/controller-uid: fc7ab8a9-dcd4-4719-b4a2-f2fbcd091cd6
      batch.kubernetes.io/job-name: nginx-example-default-kopia-pqz6c
      controller-uid: fc7ab8a9-dcd4-4719-b4a2-f2fbcd091cd6
      job-name: nginx-example-default-kopia-pqz6c
    name: nginx-example-default-kopia-pqz6c
    namespace: velero
  spec:
    backoffLimit: 6
    completionMode: NonIndexed
    completions: 1
    parallelism: 1
    selector:
      matchLabels:
        batch.kubernetes.io/controller-uid: fc7ab8a9-dcd4-4719-b4a2-f2fbcd091cd6
    suspend: false
    template:
      metadata:
        creationTimestamp: null
        labels:
          batch.kubernetes.io/controller-uid: fc7ab8a9-dcd4-4719-b4a2-f2fbcd091cd6
          batch.kubernetes.io/job-name: nginx-example-default-kopia-pqz6c
          controller-uid: fc7ab8a9-dcd4-4719-b4a2-f2fbcd091cd6
          job-name: nginx-example-default-kopia-pqz6c
        name: kopia-maintenance-job
      spec:
        containers:
        # arguments for repo maintenance job
        - args:
          - server
          - repo-maintenance
          - --repo-name=nginx-example
          - --repo-type=kopia
          - --backup-storage-location=default
          - --keep-latest-maintenance-jobs=3
          - --log-level=debug
          command:
          - /velero
          # inherit environment variables from the velero deployment
          env:
          - name: AZURE_CREDENTIALS_FILE
            value: /credentials/cloud
          # inherit image from the velero deployment
          image: velero/velero:main
          imagePullPolicy: IfNotPresent
          name: kopia-maintenance-container
          resources: {}
          # error message would be written to /dev/termination-log
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          # inherit volume mounts from the velero deployment
          volumeMounts:
          - mountPath: /credentials
            name: cloud-credentials
        dnsPolicy: ClusterFirst
        restartPolicy: Never
        schedulerName: default-scheduler
        securityContext: {}
        # inherit service account from the velero deployment
        serviceAccount: velero
        serviceAccountName: velero
        terminationGracePeriodSeconds: 30
        volumes:
        # inherit cloud credentials from the velero deployment
        - name: cloud-credentials
          secret:
            defaultMode: 420
            secretName: cloud-credentials
  status:
    active: 1
    ready: 0
    startTime: "2024-02-19T02:22:36Z"
```
Now, the backup repository controller will call the repository manager to create one maintenance job and wait for it to complete; the chain of calls into Kopia or Restic maintenance happens inside the job.

### 3. Update the Result of the Maintenance Job into BackupRepository CR
The backup repository controller will update the result of the maintenance job in the `BackupRepository` CR.
For how to read the result of the maintenance job, refer to [writing and reading a termination message](https://kubernetes.io/docs/tasks/debug/debug-application/determine-reason-pod-failure/#writing-and-reading-a-termination-message).
After the maintenance job finishes, we can get the result by reading the termination message from the related pod:
```golang
func GetContainerTerminatedMessage(pod *v1.Pod) string {
	...
	for _, containerStatus := range pod.Status.ContainerStatuses {
		if containerStatus.LastTerminationState.Terminated != nil {
			return containerStatus.LastTerminationState.Terminated.Message
		}
	}
	...
	return ""
}
```
Then we can update the status of the `BackupRepository` CR with the message.

### 4. Add Setting for Resource Usage of Maintenance
Add configuration options for setting the resource requests and limits of maintenance jobs:
```shell
velero server --maintenance-job-cpu-request $cpu-request --maintenance-job-mem-request $mem-request --maintenance-job-cpu-limit $cpu-limit --maintenance-job-mem-limit $mem-limit
```
The default value is 0, which means the resources are not limited.
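As an illustration of how such flags might be parsed, here is a minimal stdlib sketch. `parseMaintenanceResources` is a hypothetical helper, not the real Velero server wiring; only the flag names come from the design text, and "0" is preserved as the sentinel meaning "leave the request/limit unset".

```go
package main

import (
	"flag"
	"fmt"
)

// parseMaintenanceResources parses the proposed maintenance-job resource flags
// from args. A value of "0" means the corresponding request/limit stays unset.
func parseMaintenanceResources(args []string) (map[string]string, error) {
	fs := flag.NewFlagSet("server", flag.ContinueOnError)
	vals := map[string]*string{
		"maintenance-job-cpu-request": fs.String("maintenance-job-cpu-request", "0", "CPU request for maintenance jobs"),
		"maintenance-job-mem-request": fs.String("maintenance-job-mem-request", "0", "memory request for maintenance jobs"),
		"maintenance-job-cpu-limit":   fs.String("maintenance-job-cpu-limit", "0", "CPU limit for maintenance jobs"),
		"maintenance-job-mem-limit":   fs.String("maintenance-job-mem-limit", "0", "memory limit for maintenance jobs"),
	}
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	out := map[string]string{}
	for name, v := range vals {
		out[name] = *v
	}
	return out, nil
}

func main() {
	got, err := parseMaintenanceResources([]string{
		"--maintenance-job-cpu-limit=1",
		"--maintenance-job-mem-limit=1Gi",
	})
	if err != nil {
		panic(err)
	}
	// Unset flags keep the "0" (no limit) default.
	fmt.Println(got["maintenance-job-cpu-limit"], got["maintenance-job-mem-limit"], got["maintenance-job-cpu-request"])
}
```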

### 5. Clean up Maintenance Jobs
Add a configuration option for cleaning up maintenance jobs:

- `keep-latest-maintenance-jobs`: the number of latest maintenance jobs to keep for each repository

```shell
velero server --keep-latest-maintenance-jobs $num
```

We would check and keep the latest N jobs after a new job is finished.

```golang
// deleteOldMaintenanceJobs deletes old maintenance jobs and keeps the latest N jobs for each repository
func deleteOldMaintenanceJobs(cli client.Client, repo string, keep int) error {
	// Get the maintenance job list by label
	jobList := &batchv1.JobList{}
	err := cli.List(context.TODO(), jobList, client.MatchingLabels(map[string]string{RepositoryNameLabel: repo}))
	if err != nil {
		return err
	}

	// Delete old maintenance jobs
	if len(jobList.Items) > keep {
		sort.Slice(jobList.Items, func(i, j int) bool {
			return jobList.Items[i].CreationTimestamp.Before(&jobList.Items[j].CreationTimestamp)
		})
		for i := 0; i < len(jobList.Items)-keep; i++ {
			err = cli.Delete(context.TODO(), &jobList.Items[i], client.PropagationPolicy(metav1.DeletePropagationBackground))
			if err != nil {
				return err
			}
		}
	}

	return nil
}
```

### 6. Observability and Debuggability
Some monitoring metrics are added for backup repository maintenance:
- repo_maintenance_total
- repo_maintenance_success_total
- repo_maintenance_failed_total
- repo_maintenance_duration_seconds

We will keep the latest N maintenance jobs for each repository, and users can get logs from those jobs; the job log level is inherited from the Velero server setting.

Roughly, the process is as follows:
1. The backup repository controller periodically checks the BackupRepository requests in its queue.

2. If `runMaintenanceIfDue` in `Reconcile` finds that the repository's maintenance period is due, the backup repository controller calls the repository manager to execute `PruneRepo`.

3. The repository manager's `PruneRepo` creates one maintenance job; the job's resource usage follows the Velero server settings, and the environment variables, service account, image, etc. are inherited from the Velero server pod.

4. The maintenance job executes the Velero maintenance command and writes the result to the terminationMessagePath of the related pod.

5. After maintenance finishes, the backup repository controller waits for the job, reads its termination message, and updates the message field and phase in the status of the `backuprepositories` CR accordingly.

6. The backup repository controller cleans up old jobs, keeping only the latest N for each repository.

## Prospects for Future Work
Future work may focus on improving the efficiency of Velero maintenance through non-blocking parallel modes. Potential areas for enhancement include:

**Non-blocking Mode**: Explore the implementation of a non-blocking mode for parallel maintenance to enhance overall efficiency.

**Concurrency Control**: Investigate mechanisms for better concurrency control of different maintenance jobs.

**Provider Support for Parallel Maintenance**: Evaluate the feasibility of parallel maintenance for different providers and address any compatibility issues.

**Efficiency Improvements**: Investigate strategies to optimize maintenance efficiency without compromising reliability.

By considering these areas, future iterations of Velero may benefit from enhanced parallelization and improved resource utilization during repository maintenance.
