Kubernetes backup with Velero and Ceph

Velero is a tool that enables you to back up and restore Kubernetes cluster resources and persistent volumes. It simplifies taking backups and restores, migrating resources to other clusters, and replicating clusters.

  • Stores Kubernetes resources in highly available object stores (S3, GCS, Blob Storage, etc.)
  • Backs up PVs/PVCs using the cloud providers' disk snapshot mechanisms
  • Schedules backups with cron syntax
  • Rotates backups automatically with a TTL (Time to Live)
  • Supports community-enhanced plugins
  1. The Velero client makes a call to the Kubernetes API server to create a Backup object.
  2. The BackupController notices the new Backup object and performs validation.
  3. The BackupController begins the backup process. It collects the data to back up by querying the API server for resources.
  4. The BackupController makes a call to the object storage service – for example, S3 – to upload the backup file.
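The cron-style scheduling and TTL rotation listed above are driven from the same CLI; a minimal sketch (the schedule name daily-cluster-backup is just an example):

```shell
# Back up the whole cluster every day at 01:00, keep each backup for 72 hours
velero schedule create daily-cluster-backup \
    --schedule="0 1 * * *" \
    --ttl 72h0m0s
```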

1. Object Store environment

Velero supports many options for object storage and, since I need to test this in an on-premises environment, I will use Ceph with the Rados Gateway.

Check all supported object stores in this link.

Create a S3 user

[root@ceph-mon1 ~]# sudo radosgw-admin user create --subuser=velero:s3 --display-name="Velero Kubernetes Backup" --key-type=s3 --access=full
{
    "user_id": "velero",
    "display_name": "Velero Kubernetes Backup",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [
        {
            "id": "velero:s3",
            "permissions": "full-control"
        }
    ],
    "keys": [
        {
            "user": "velero:s3",
            "access_key": "AOTBA6CUYR4P2WD7Q7ZK",
            "secret_key": "d4ZY0cmAQcsmviwcpshE0bjWfyT5RDUROUE0BmJ6"
        },
        {
            "user": "velero:s3",
            "access_key": "RKF0CW7T2XA16BMJI8FW",
            "secret_key": "Z8BF56cAsuj5KFSSjMOD0At1nGfVmTjPx3sOFpWZ"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}
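If the keys scroll by, they can be re-read at any time from the user record; a small sketch that extracts the first key pair (python3 is used here only because it is usually present, jq would work as well):

```shell
# Re-read the velero user's S3 keys from the Rados Gateway user record
radosgw-admin user info --uid=velero \
  | python3 -c 'import json,sys; k=json.load(sys.stdin)["keys"][0]; print(k["access_key"], k["secret_key"])'
```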

Install s3cmd to create a bucket from the CLI

[root@ceph-mon1 ~]# yum install s3cmd

Configure s3cmd to use my Rados GW endpoint

[root@ceph-mon1 ~]# s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key [AOTBA6CUYR4P2WD7Q7ZK]:
Secret Key [d4ZY0cmAQcsmviwcpshE0bjWfyT5RDUROUE0BmJ6]:
Default Region [US]:

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [radosgw.local.lab:80]:

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.radosgw.local.lab]: radosgw.local.lab:80

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [No]:

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

New settings:
  Access Key: AOTBA6CUYR4P2WD7Q7ZK
  Secret Key: d4ZY0cmAQcsmviwcpshE0bjWfyT5RDUROUE0BmJ6
  Default Region: US
  S3 Endpoint: radosgw.local.lab:80
  DNS-style bucket+hostname:port template for accessing a bucket: radosgw.local.lab:80
  Encryption password:
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: False
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

Now verifying that encryption works...
Not configured. Never mind.

Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'

Create a bucket for Velero

[root@ceph-mon1 ~]# s3cmd mb s3://velero
Bucket 's3://velero/' created

2. Velero Install

CLI Download

$ wget
$ tar -xzf velero-v1.2.0-linux-amd64.tar.gz
$ sudo cp velero-v1.2.0-linux-amd64/velero /usr/local/sbin

Create a file with the S3 credentials. Velero expects an AWS-style profile header, and the file name must match the --secret-file argument used below.

$ vi credentials-velero
[default]
aws_access_key_id = AOTBA6CUYR4P2WD7Q7ZK
aws_secret_access_key = d4ZY0cmAQcsmviwcpshE0bjWfyT5RDUROUE0BmJ6
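If you prefer to create the file non-interactively, a heredoc produces the same result; note that the AWS plugin expects a [default] profile header in the AWS shared-credentials format:

```shell
# Write the S3 credentials in the AWS shared-credentials format
cat > ./credentials-velero <<'EOF'
[default]
aws_access_key_id = AOTBA6CUYR4P2WD7Q7ZK
aws_secret_access_key = d4ZY0cmAQcsmviwcpshE0bjWfyT5RDUROUE0BmJ6
EOF
```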

Velero deployment with Rados Gateway

We use the S3 credentials to access the velero bucket through the AWS S3 plugin, so set the s3Url in the backup location config to the Rados Gateway endpoint – in our example, http://radosgw.local.lab.

$ velero install --provider aws --bucket velero \
--plugins velero/velero-plugin-for-aws:v1.0.0 \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--backup-location-config region=default,s3ForcePathStyle="true",s3Url=http://radosgw.local.lab
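Once the install finishes, it is worth confirming that the server pod came up and that the backup storage location was accepted (this assumes your kubectl context points at the target cluster):

```shell
# Check that the Velero server pod is running
kubectl get pods -n velero

# Inspect the server logs for connection errors against the Rados Gateway
kubectl logs deployment/velero -n velero

# List the configured backup storage locations
velero backup-location get
```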

3. Velero Backup and Restore

Create a single Nginx deployment for the test

$ kubectl apply -f velero-v1.2.0-linux-amd64/examples/nginx-app/base.yaml

Create a backup from this Nginx app

$ velero backup create nginx-backup --selector app=nginx --snapshot-volumes=true
$ velero backup describe nginx-backup --details

List the backups

$ velero get backup
NAME           STATUS      CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
nginx-backup   Completed   2020-06-05 13:56:40 +0000 UTC   29d       default            app=nginx

Check the namespace and resources created for this example

# Get Namespaces
$ kubectl get namespaces
NAME              STATUS   AGE
default           Active   139m
kube-node-lease   Active   139m
kube-public       Active   139m
kube-system       Active   139m
metallb-system    Active   74m
nginx-example     Active   27m
velero            Active   132m

# Get Pods
$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-7cd5ddccc7-ctvg8   1/1     Running   0          27m
nginx-deployment-7cd5ddccc7-kpfz5   1/1     Running   0          27m

Delete the namespace

$ kubectl delete namespace nginx-example

Restore backup for nginx-example

$ velero restore create --from-backup nginx-backup
Restore request "nginx-backup-20200605135949" submitted successfully.
Run `velero restore describe nginx-backup-20200605135949` or `velero restore logs nginx-backup-20200605135949` for more details.
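You can watch the restore until it reaches the Completed phase; the restore name comes from the output of the previous command:

```shell
# List all restores and their current phase
velero restore get

# Show the details of this specific restore
velero restore describe nginx-backup-20200605135949
```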

Check the restore

$ kubectl get namespaces | grep nginx
nginx-example   Active   3m44s

$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-7cd5ddccc7-ctvg8   1/1     Running   0          3m52s
nginx-deployment-7cd5ddccc7-kpfz5   1/1     Running   0          3m51s

The content can be viewed by checking the bucket with s3cmd

$ s3cmd ls s3://velero
DIR s3://velero/backups/
DIR s3://velero/restores/
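To drill down into what Velero actually stored under those prefixes, the objects can be listed recursively (the exact object names will vary per backup):

```shell
# Recursively list every object Velero uploaded for its backups
s3cmd ls --recursive s3://velero/backups/
```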

Kubernetes : Using Ceph RBD as Container Storage Interface (CSI)

Kubernetes Pods with persistent volumes are important for data persistence because containers are being created and destroyed, depending on the load and on the specifications of the developers. Pods and containers can self-heal and replicate. They are, in essence, ephemeral.

Before starting this post, I published some hands-on and conceptual docs to help you understand this solution from the beginning.

Continue reading “Kubernetes : Using Ceph RBD as Container Storage Interface (CSI)”

Ceph : An overview

Ceph is open-source, software-defined storage maintained by Red Hat. It's capable of block, object, and file storage. Ceph clusters are designed to run on any hardware with the help of an algorithm called CRUSH (Controlled Replication Under Scalable Hashing), which ensures that all the data is properly distributed across the cluster and can be accessed quickly without any constraints. Replication, thin provisioning, and snapshots are the key features of Ceph storage.

There are good reasons for using Ceph as IaaS and PaaS storage :

  • Scale your operations and move to market faster.
  • Bridge the gaps between application development and data science.
  • Gain deeper insights into your data.
  • File, block, and object storage in the same wrapper.
  • Better transfer speed and lower latency.
  • Easily accessible storage that can quickly scale up or down.

Site :

Continue reading “Ceph : An overview”

Ceph : Cluster deployment

This post presents how to deploy the environment with ceph-ansible.

An Ansible deployment is the most standardized and official format among the main vendors using Ceph, e.g. SUSE, Oracle, and Red Hat.

The installation presented in this document will use the following deployment flow:

If you would like to get an overview of Ceph, this document could help you.

Continue reading “Ceph : Cluster deployment”

Kubernetes : Deploy a production ready cluster with Ansible.

Kubespray is a group of Ansible playbooks for deploying a Kubernetes cluster on most bare-metal and cloud environments.

Official site:

As described in the project's Git repository:

  • Can be deployed on AWS, GCE, Azure, OpenStack, vSphere, Packet (bare metal), Oracle Cloud Infrastructure (Experimental), or Baremetal
  • Highly available cluster
  • Composable (Choice of the network plugin for instance)
  • Supports most popular Linux distributions
  • Continuous integration tests

My intention for this lab is to show an on-premises deployment, so if you found this post directly without a Kubernetes background, this overview may help you.

Continue reading “Kubernetes : Deploy a production ready cluster with Ansible.”

Kubernetes: An Overview

Kubernetes is a powerful open-source system, initially developed by Google, for managing containerized applications in a clustered environment. It aims to provide better ways of managing related, distributed components and services across varied infrastructure.

We talk about the system architecture and how it solves problems, with use cases for handling containerized deployments and scaling.

Kubernetes official site :

Continue reading “Kubernetes: An Overview”