Backup and Restore

Backup and Restore (Disaster Recovery)

Tellius supports two types of backup and restore operations. When the Tellius platform runs on-premises or in customer-controlled public cloud instances, the Customer IT Team (CIT) needs to plan for backup and restore.

Regular snapshots of volumes are recommended for backing up Tellius. The snapshots can be used to restore the data in a different instance or cluster in case of a disaster.
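As an illustration of what a regular snapshot job might look like on AWS, here is a minimal sketch. It assumes the volumes are EBS volumes and that `ec2_client` behaves like a boto3 EC2 client (for example `boto3.client("ec2")`). The function name, the volume IDs passed in, and the `purpose` tag are hypothetical and not part of Tellius.

```python
from datetime import datetime, timezone


def snapshot_volumes(ec2_client, volume_ids, now=None):
    """Request one snapshot per volume and return the new snapshot IDs.

    ec2_client is assumed to expose create_snapshot() like a boto3 EC2 client.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    snapshot_ids = []
    for volume_id in volume_ids:
        response = ec2_client.create_snapshot(
            VolumeId=volume_id,
            Description=f"Tellius backup of {volume_id} at {stamp}",
            TagSpecifications=[
                {
                    "ResourceType": "snapshot",
                    # Hypothetical tag, used only to find these snapshots later.
                    "Tags": [{"Key": "purpose", "Value": "tellius-backup"}],
                }
            ],
        )
        snapshot_ids.append(response["SnapshotId"])
    return snapshot_ids
```

A scheduler such as cron could call this function with the volume IDs identified for the deployment. Passing the client in as a parameter keeps the function easy to exercise with a stand-in object before it is pointed at a real account.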

Tellius DevOps (TDO) will be available to assist in identifying the components that need to be backed up as well as with deployment on new instances as needed.

Before jumping into specifics on backup/restore, let's understand the different types of Tellius deployment.

Types of Deployments

Standalone Deployment

In a standalone environment, all Tellius services write data to multiple directories on the same volume. The TDO team will identify the specific volume that needs to be snapshotted and share the details with the CIT team.

The CIT team will be responsible for regular snapshots of this volume. In case of a disaster, they should restore the latest snapshot onto a new volume or a new instance and hand it over to the TDO team.

The TDO team is available to assist with reinstalling the required services on the new instance and restoring all services from the snapshot data.

Multi-Node Deployment

In a multi-node environment, the Tellius services write data to different volumes mounted on the instances: EBS volumes on AWS and Azure Disks on Azure.

Two types of volumes are attached to services within Tellius:

  1. Temporary Data Volumes: used to write intermediate data, such as the temporary intermediate output of a Spark worker. Data loss from these volumes will not cause the user to lose any resources created within Tellius.
  2. Persistent Data Volumes: used to write persistent data for services such as Postgres, MongoDB, Spark, and Azkaban. Data loss from these volumes might cause the user to lose some or all of the resources created within Tellius. There can be around 13 volumes of this type.
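The distinction between the two volume types can be captured in a small inventory that marks which volumes need snapshots. The sketch below is only illustrative: the `Volume` type, the volume IDs, and the inventory contents are made-up placeholders, and the real list for a deployment would come from the TDO team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    volume_id: str    # placeholder ID, not a real volume
    service: str      # Tellius service that writes to the volume
    persistent: bool  # True if losing the volume can lose user resources


def volumes_to_snapshot(inventory):
    """Return the IDs of persistent volumes; temporary volumes are skipped."""
    return [v.volume_id for v in inventory if v.persistent]


# Illustrative inventory only; a real deployment can have around 13
# persistent volumes.
INVENTORY = [
    Volume("vol-postgres", "Postgres", True),
    Volume("vol-mongodb", "MongoDB", True),
    Volume("vol-azkaban", "Azkaban", True),
    Volume("vol-spark-worker-tmp", "Spark worker", False),
]
```

Calling `volumes_to_snapshot(INVENTORY)` returns only the three persistent volume IDs, because the temporary Spark worker volume holds intermediate output that does not need to be preserved.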

Let's now look into the backup/restore support in Tellius.

Types of Backup

Full Backup

As part of a full backup, every identified persistent data volume should be snapshotted. The restore process involves connecting the new cluster to the restored volumes.

The TDO team should identify all the volumes that need to be snapshotted and share the details with the CIT team.

The CIT team will be responsible for regular snapshots of all the persistent data volumes. In case of a disaster, they should restore the latest snapshot of each volume onto new volumes. The TDO team will use the new volumes for the recovery.
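Because a full backup spans several volumes, restoring "the latest snapshot" means picking the newest usable snapshot for each volume separately. A minimal sketch of that selection follows. It assumes snapshot records shaped like the entries returned by the EC2 DescribeSnapshots API (`VolumeId`, `SnapshotId`, `StartTime`, `State`); the function name is hypothetical.

```python
from datetime import datetime, timezone


def latest_snapshot_per_volume(snapshots):
    """Map each volume ID to its most recent completed snapshot record."""
    latest = {}
    for snap in snapshots:
        if snap["State"] != "completed":
            continue  # pending or failed snapshots cannot be restored from
        current = latest.get(snap["VolumeId"])
        if current is None or snap["StartTime"] > current["StartTime"]:
            latest[snap["VolumeId"]] = snap
    return latest
```

Snapshots that are still pending are ignored, so a volume whose newest snapshot has not finished falls back to its previous completed one.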

The TDO team is available to assist with reinstalling the required services on a new Kubernetes cluster and restoring all services from the snapshot data.

Pros

  • Simple Setup
  • Faster Restore

Cons

  • Multiple volume snapshots are needed.
  • Snapshot size can be large.

Metadata Backup

As part of a metadata backup, Tellius uses its Backup & Restore service to back up only the necessary metadata from all services into a single volume.

The restore process uses the Tellius Backup & Restore service to restore the metadata into the different services within Tellius and to recreate the data by pulling it from the configured data sources.

The TDO team will identify the volume used by the Backup & Restore service that needs to be snapshotted and share the details with the CIT team.

The CIT team will be responsible for regular snapshots of this volume. In case of a disaster, they should restore the latest snapshot onto a new volume and hand it over to the TDO team for recovery.
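Since recovery depends on the backup volume being snapshotted regularly, a simple freshness check can flag a schedule that has silently stopped. The sketch below is one possible way to do this. The function name and the 24-hour threshold in the usage note are assumptions, as no required snapshot interval is specified here.

```python
from datetime import datetime, timedelta, timezone


def is_snapshot_stale(last_snapshot_time, max_age, now=None):
    """Return True if no snapshot exists or the newest one is older than max_age."""
    if last_snapshot_time is None:
        return True  # the volume has never been snapshotted
    now = now or datetime.now(timezone.utc)
    return now - last_snapshot_time > max_age
```

For example, `is_snapshot_stale(last_time, timedelta(hours=24))` would report a problem once the newest snapshot of the backup volume is more than a day old.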

The TDO team is available to assist with reinstalling the required services on a new Kubernetes cluster and restoring all services and resources from the snapshot.

Pros

  • Single volume snapshot
  • Backs up only a single copy of the relevant data

Cons

  • Both regular backups within Tellius and regular snapshots of the backup volume in the cloud console need to be configured.
  • The restore process can be time-consuming (from a few hours up to 1 or 2 days).

Metadata backup/restore workflow

The Backup feature lets you back up your data to an AWS S3 bucket or to your local machine. You must be logged in as an admin.

To create the backup of your data:

  1.   Click the BACKUP button.
  2.   Select the location for backup, for example, S3 or local.

Note: Once the backup is completed, Tellius provides the S3 backup link or an option to download the backup file.

To restore your data:

The Restore feature lets you restore an instance from a previously taken backup, either through the S3 link of the backup file or by uploading the backup file from local storage. You can also restore the data of one instance onto another instance. The restore covers all your Search, Vizpads, Insights, AutoML Models, Metadata, and user-uploaded data.
