Streamline Kubernetes Storage Upgrades

Introduction:

As technology advances, organizations are constantly seeking ways to optimize their IT infrastructure to enhance performance, reduce costs, and gain a competitive edge. One such approach involves migrating from traditional storage solutions to more advanced options that offer superior performance and cost-effectiveness. 

In this blog post, we’ll walk through a recent project on Azure in which we migrated our client’s applications from Premium SSD disks to Premium SSD v2. The migration improved performance and reduced costs for our client.

Prerequisites:

Before initiating this migration, ensure the following prerequisites are in place:

  1. Kubernetes Cluster: Ensure you have a working Kubernetes cluster to host your applications.
  2. Velero Backup Tool: Install Velero, a widely-used backup and restoration tool tailored for Kubernetes environments.

Overview of Velero:

Velero is an open-source tool for backing up, restoring, and migrating Kubernetes cluster resources and persistent volumes. In this migration it captures the existing volumes and recreates them on the new storage class, which keeps the data safe throughout the operation.

Refer to the article on Velero installation and configuration.

Strategic Plan Overview:

There are two methods for upgrading storage classes:

  • Migration via Velero and CSI Integration: 

This approach leverages Velero’s capabilities in conjunction with CSI integration to achieve a seamless and efficient migration.

  • Using Cloud Methods: 

This method involves leveraging cloud provider-specific procedures. It includes steps like taking a snapshot of the disk, creating a new disk from the snapshot, and then establishing a Kubernetes volume using disk referencing. 
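
For comparison, the cloud method could be sketched with the Azure CLI as shown here. This is a minimal sketch under stated assumptions, not a tested procedure: the resource group, disk, snapshot, and zone values are hypothetical placeholders, the function names are illustrative, and the application using the old disk is assumed to be stopped before the snapshot is taken.

```shell
# Hypothetical placeholders: replace with values from your own subscription.
RESOURCE_GROUP="my-aks-node-rg"
OLD_DISK="pvc-old-premium-ssd"
SNAPSHOT="pvc-old-premium-ssd-snap"
NEW_DISK="pvc-new-premium-ssd-v2"
ZONE="1"   # assumed availability zone for the new disk

# 1. Take a snapshot of the existing Premium SSD disk.
snapshot_old_disk() {
  az snapshot create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$SNAPSHOT" \
    --source "$OLD_DISK" \
    --incremental true
}

# 2. Create a Premium SSD v2 disk from that snapshot.
create_v2_disk() {
  az disk create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$NEW_DISK" \
    --source "$SNAPSHOT" \
    --sku PremiumV2_LRS \
    --zone "$ZONE"
}

# 3. Print the resource ID of the new disk.
new_disk_id() {
  az disk show \
    --resource-group "$RESOURCE_GROUP" \
    --name "$NEW_DISK" \
    --query id --output tsv
}
```

Calling snapshot_old_disk, create_v2_disk, and new_disk_id in that order yields the disk ID that a statically provisioned PersistentVolume would reference for the Azure Disk CSI driver.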

Step-by-Step Guide:

Migration via Velero and CSI Integration:

Step 1: Storage Class for Premium SSD v2

Define a new storage class that supports Azure Premium SSD v2 disks. This storage class will be used to provision new persistent volumes during the restore process.

# Example StorageClass for Azure Premium SSD v2

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: premium-ssd-v2
parameters:
  cachingMode: None
  skuName: PremiumV2_LRS # disk type
provisioner: disk.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
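
Once the manifest is applied, a quick check can confirm that the class exists with the expected provisioner and SKU. A minimal sketch, assuming the StorageClass manifest above is saved as premium-ssd-v2-sc.yaml (a hypothetical file name); the function name is illustrative.

```shell
# Hypothetical file name for the StorageClass manifest shown above.
SC_MANIFEST="premium-ssd-v2-sc.yaml"

# Apply the manifest, then print the provisioner and SKU of the new class.
apply_and_check_storage_class() {
  kubectl apply -f "$SC_MANIFEST" &&
  kubectl get storageclass premium-ssd-v2 \
    -o jsonpath='{.provisioner}{" "}{.parameters.skuName}{"\n"}'
}
```

Against a real cluster, the second command should print the provisioner and SKU defined in the manifest.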

Step 2: Volume Snapshot Class

Introduce a Volume Snapshot Class to enable snapshot creation for persistent volumes. This class will be utilized for capturing the current state of persistent volumes before restoring them using Premium SSD v2.

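apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: disk-snapshot-class
  labels:
    # Velero's CSI plugin picks the snapshot class for a driver by this label
    velero.io/csi-volumesnapshot-class: "true"
driver: disk.csi.azure.com
deletionPolicy: Delete
parameters:
  incremental: "false"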
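
Snapshot classes only work when the CSI snapshot CRDs are installed in the cluster. The sketch below checks for the CRD and then lists the snapshot classes with their drivers; the function name is illustrative.

```shell
# Confirm the VolumeSnapshotClass CRD exists, then list snapshot classes
# together with their CSI driver and deletion policy.
check_snapshot_support() {
  kubectl get crd volumesnapshotclasses.snapshot.storage.k8s.io >/dev/null || {
    echo "CSI snapshot CRDs are not installed" >&2
    return 1
  }
  kubectl get volumesnapshotclass \
    -o custom-columns=NAME:.metadata.name,DRIVER:.driver,POLICY:.deletionPolicy
}
```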

Step 3: Update Velero Deployment and Daemonset

Enable CSI (Container Storage Interface) support in both the Velero deployment and the node-agent daemonset. This allows Velero to work with the Azure Disk CSI driver when it snapshots and provisions persistent volumes. Then enable the same feature flag on the Velero client so that backup and restore commands use the CSI plugin.

# Enable server-side CSI features

$ kubectl -n velero edit deployment/velero
$ kubectl -n velero edit daemonset/node-agent

# Add the --features=EnableCSI flag to the container args of both resources

    spec:
      containers:
      - args:
        - server
        - --features=EnableCSI

# Enable client side features 

$ velero client config set features=EnableCSI
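
After editing, it is worth confirming that the flag actually reached each workload. A minimal sketch, assuming Velero runs in the velero namespace and that the flag sits on the first container of the pod template; the function name is illustrative.

```shell
# Read the container args of a Velero workload and report whether the
# EnableCSI feature flag is present.
# $1 is the workload, for example deployment/velero.
check_csi_flag() {
  args="$(kubectl -n velero get "$1" \
    -o jsonpath='{.spec.template.spec.containers[0].args}')"
  case "$args" in
    *EnableCSI*) echo "$1: EnableCSI is set" ;;
    *)           echo "$1: EnableCSI is missing" ;;
  esac
}
```

Run it once for the deployment and once for the daemonset, passing the daemonset name used in your cluster.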

Step 4: Take Velero Backup

Create a Velero backup of the namespaces whose persistent volumes are stored on Premium SSD disks. This backup serves as a safety net in case of unforeseen issues during the migration. The include and exclude flags of the velero backup command control which resources are backed up.

Reference article: https://velero.io/docs/v1.12/resource-filtering

# Run the command below to take the backup
$ velero backup create backup-name --include-namespaces namespace1
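
The restore should only start once the backup has finished successfully. The sketch below polls the phase of the Backup resource until it reaches a terminal state. It assumes Velero runs in the velero namespace; the function name, the attempt limit, and the POLL_SECONDS variable are illustrative.

```shell
# Poll a Velero Backup resource until it completes, fails, or times out.
# $1 is the backup name, $2 the maximum number of polls (default 60).
wait_for_backup() {
  name="$1"
  attempts="${2:-60}"
  while [ "$attempts" -gt 0 ]; do
    phase="$(kubectl -n velero get backups.velero.io "$name" \
      -o jsonpath='{.status.phase}')"
    case "$phase" in
      Completed)
        echo "backup $name completed"; return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "backup $name ended in phase $phase" >&2; return 1 ;;
    esac
    attempts=$((attempts - 1))
    sleep "${POLL_SECONDS:-10}"
  done
  echo "timed out waiting for backup $name" >&2
  return 1
}
```

Pass the name given to velero backup create. Any phase other than Completed is a reason to inspect the backup with velero backup describe and velero backup logs before continuing.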

Step 5: ConfigMap Deployment 

Deploy a ConfigMap in the Velero namespace. This ConfigMap defines the mapping between the old storage class (Premium SSD) and the new storage class (Premium SSD v2). During the restore process, Velero will use this mapping to recreate the persistent volumes using the new storage class.

apiVersion: v1
data:
  # <old storage class name>: <new storage class name>
  managed-premium: premium-ssd-v2
kind: ConfigMap
metadata:
  labels:
    velero.io/change-storage-class: RestoreItemAction
    velero.io/plugin-config: ""
  name: storage-class-config
  namespace: velero
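
The keys in this ConfigMap must match the storage class names that the existing claims actually use. A minimal sketch for listing them in a given namespace; the function name is illustrative.

```shell
# List the distinct storage classes used by the PVCs in a namespace.
# $1 is the namespace to inspect.
storage_classes_in_use() {
  kubectl get pvc -n "$1" \
    -o jsonpath='{range .items[*]}{.spec.storageClassName}{"\n"}{end}' |
    sort -u
}
```

Each name it prints needs its own entry in the ConfigMap if claims on that class should move to the new one.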

Step 6: Velero Restore Operation

Initiate the Velero restore process. Velero recreates the persistent volume claims from the backup, and the ConfigMap from Step 5 maps them to the new storage class, so the restored volumes are provisioned on Premium SSD v2 disks.

Reference article: https://velero.io/docs/v1.12/restore-reference 

# Run the command below to restore the backup into a different namespace
$ velero restore create restore-name --from-backup backup-name --namespace-mappings namespace1:namespace2
# verify the new restored resources in namespace2
$ kubectl get pvc,pv,pod -n namespace2

Step 7: Verification & Testing

Verify that all applications continue to function correctly after the restore process. Check for any performance improvements and cost savings as a result of the migration to Premium SSD v2.
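
One concrete check is that every restored claim ended up on the new storage class. The sketch below prints the claims that did not; the function name is illustrative.

```shell
# Print the names of PVCs in a namespace that are NOT on the expected class.
# $1 is the namespace, $2 the expected storage class name.
pvcs_not_on_class() {
  kubectl get pvc -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.storageClassName}{"\n"}{end}' |
    awk -v want="$2" '$2 != want { print $1 }'
}
```

Empty output means all claims in the namespace use the expected class, for example when called with the restore namespace and premium-ssd-v2.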

Step 8: Post-Migration Cleanup

Remove any temporary resources created during the migration, such as the Volume Snapshots and the custom Volume Snapshot Class. Then delete the old persistent volume claims (PVCs) that were bound to the Premium SSD disks. If the reclaim policy of the corresponding persistent volumes (PVs) is Delete, this also deletes the PVs and their Azure disks automatically.
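
Before deleting the old claims, it helps to know which volumes will be removed automatically. The sketch below lists the PVs on a given storage class with their reclaim policy; the function name is illustrative.

```shell
# Print each PV on the given storage class with its reclaim policy.
# $1 is the old storage class name.
pvs_on_class() {
  kubectl get pv \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.storageClassName}{" "}{.spec.persistentVolumeReclaimPolicy}{"\n"}{end}' |
    awk -v cls="$1" '$2 == cls { print $1, $3 }'
}
```

Volumes listed with Retain stay behind after their claim is deleted, so they and their Azure disks have to be removed by hand.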

Impact:

This approach carries little risk because every object is newly created while the originals and their snapshots are retained. When the new pods are scheduled, each Premium SSD v2 disk is provisioned in the same zone as the node that runs the pod, a result of the WaitForFirstConsumer binding mode. Expect some downtime while the contents of the new disks are restored from the snapshots. Its duration depends on the size of the disks being restored.

Conclusion:

Migrating from one storage class to a newer, more performant one using Velero can provide significant benefits for your organization. Whether you’re upgrading from Premium SSD to Premium SSD v2 or moving to a completely different storage provider, Velero’s backup and restore functionality lets you migrate your applications while maintaining data integrity and application functionality. With this approach, organizations can gain better performance, lower costs, and simpler storage management.
