Backup and Restore

Backup and Restore (Disaster Recovery)

Tellius supports a couple of types of backup and restore operations. When the Tellius platform runs on-premises or in customer-controlled public cloud instances, the Customer IT Team (CIT) needs to plan for backup and restore.

Regular snapshots of volumes are recommended for backing up Tellius. The snapshots can be used to restore the data in a different instance or cluster in case of a disaster.

Tellius DevOps (TDO) will be available to assist in identifying the components that need to be backed up as well as with deployment on new instances as needed.

Before jumping into specifics on backup/restore, let's understand the different types of Tellius deployment.

Types of Deployments

Standalone Deployment

In a standalone environment, all Tellius services write data into multiple directories on the same volume. The TDO team will identify the specific volume that needs to be snapshotted and share the details with the CIT team.

The CIT team is responsible for taking regular snapshots of this volume. In case of a disaster, they should restore the latest snapshot onto a new volume or a new instance and hand it over to the TDO team.
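
As an illustration of "restore the latest snapshot", the sketch below picks the newest finished snapshot of a volume from a list of snapshot records. The `Snapshot` record, its field names, and the `"completed"` state value are assumptions made for this example (modeled loosely on how cloud providers report snapshot state); they are not part of Tellius.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot record; field names are assumptions."""
    snapshot_id: str
    volume_id: str
    started_at: datetime
    state: str  # assumed values: "pending", "completed", "error"


def latest_completed(snapshots: Iterable[Snapshot], volume_id: str) -> Optional[Snapshot]:
    """Return the newest completed snapshot of volume_id, or None if there is none."""
    candidates = [
        s for s in snapshots
        if s.volume_id == volume_id and s.state == "completed"
    ]
    # Snapshots that are still pending or that failed are skipped,
    # because they cannot be restored reliably.
    return max(candidates, key=lambda s: s.started_at, default=None)
```

Filtering on state before comparing timestamps matters: the most recent snapshot may still be in progress when the disaster occurs, in which case the previous completed one is the right restore point.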

The TDO team is available to assist with reinstalling the required services on the new instance and restoring all services from the snapshot data.

Multi-Node Deployment

In a multi-node environment, the Tellius services write data into different volumes mounted onto the instance: EBS volumes in the case of AWS (EKS) and Azure Disks in the case of Azure.

There are two types of volumes attached to services within Tellius:

  1. Temporary Data Volumes: used to write intermediate data and temporary output, for instance by Spark workers. Data loss from these volumes will not cause the user to lose any resources created within Tellius.
  2. Persistent Data Volumes: used to write persistent data, for instance by Postgres, MongoDB, Spark, and Azkaban. Data loss from these volumes might cause the user to lose some or all of the resources created within Tellius. There can be around 13 volumes of this type.
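
The distinction between the two volume types can be captured in a small inventory that decides which volumes go into the snapshot schedule. The sketch below is only an illustration: the volume names and the `"persistent"` / `"temporary"` labels are hypothetical, and the real list of volumes is the one the TDO team provides.

```python
from typing import Dict, List

# Hypothetical inventory (volume name -> kind). The names are invented
# for this example; the actual list comes from the TDO team.
VOLUME_KINDS: Dict[str, str] = {
    "postgres-data": "persistent",
    "mongodb-data": "persistent",
    "azkaban-data": "persistent",
    "spark-worker-scratch": "temporary",
}


def volumes_to_snapshot(kinds: Dict[str, str]) -> List[str]:
    """Return the persistent volumes, sorted by name; temporary ones are skipped."""
    unknown = sorted(
        name for name, kind in kinds.items()
        if kind not in ("persistent", "temporary")
    )
    if unknown:
        # Refuse to guess: an unclassified volume could hold data that
        # must not be lost.
        raise ValueError(f"unclassified volumes: {unknown}")
    return sorted(name for name, kind in kinds.items() if kind == "persistent")
```

Raising an error on an unclassified volume, rather than silently skipping it, keeps a newly added persistent volume from being left out of the backup by accident.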

Let's now look into the backup/restore support in Tellius.

Full Backup

As part of a full backup, every identified persistent data volume should have a snapshot. The restore process involves connecting the new cluster to the restored volumes.

The TDO team should identify all the volumes that need to be snapshotted and share the details with the CIT team.

The CIT team is responsible for taking regular snapshots of all the persistent data volumes. In case of a disaster, they should restore the latest snapshots onto new volumes, which the TDO team will then use for the recovery.
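
For AWS deployments, taking one snapshot per persistent volume could look roughly like the sketch below. It assumes an EC2 client object such as the one returned by `boto3.client("ec2")` and uses its `create_snapshot` call; the function name, the `batch` label, the description text, and the `backup-batch` tag key are hypothetical choices for this example, not Tellius conventions.

```python
from typing import Any, Dict, Iterable


def snapshot_volumes(ec2: Any, volume_ids: Iterable[str], batch: str) -> Dict[str, str]:
    """Start one EBS snapshot per volume and return {volume_id: snapshot_id}.

    `ec2` is expected to behave like boto3's EC2 client. `batch` is a
    hypothetical label (for example a date) that ties together the
    snapshots taken in the same backup run.
    """
    started: Dict[str, str] = {}
    for volume_id in volume_ids:
        response = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"Tellius full backup {batch}",
            TagSpecifications=[{
                "ResourceType": "snapshot",
                # Hypothetical tag key, used to find all snapshots of one run.
                "Tags": [{"Key": "backup-batch", "Value": batch}],
            }],
        )
        started[volume_id] = response["SnapshotId"]
    return started
```

The call only starts each snapshot; completion happens asynchronously, so a snapshot should be confirmed as completed before it is relied on as a restore point. Tagging every snapshot of a run with the same label makes it easier to restore a consistent set across all volumes.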

The TDO team is available to assist with reinstalling the required services on a new Kubernetes cluster and restoring all services from the snapshot data.

Pros

  • Simple Setup
  • Faster Restore

Cons

  • Multiple volume snapshots are needed.
  • Snapshot size can be large.

To Restore Your Data

The Restore feature lets you restore an instance from a previously taken backup, either through the S3 link of the backup file or by uploading the backup file from local storage. You can also restore the data of one instance to another instance. A restore covers all your Search, Vizpads, Insights, AutoML Models, Metadata, and user-uploaded Data.
