Kubernetes CSI in Action: Explained with Features and Use Cases

Kubernetes volume plugins have long been the way for third-party storage providers to support a block or file storage system: they extend the Kubernetes volume interface and are “in-tree” in nature, meaning they ship as part of the Kubernetes codebase.

In this post, we will dig into the Kubernetes Container Storage Interface (CSI). We will use the Hostpath CSI driver locally on a single-node bare-metal cluster to get a conceptual understanding of the CSI workflow for provisioning a Persistent Volume and managing its lifecycle. We will also look at a cool feature: snapshotting a volume and restoring it.

Introduction

CSI is a standard for exposing arbitrary block and file storage systems to containerized workloads on container orchestrators (COs) such as Kubernetes, Mesos, and Cloud Foundry. It lets third-party storage providers expose new storage systems without touching the Kubernetes code. A single, independent implementation of a CSI driver by a storage provider is intended to work on any orchestrator that supports CSI.

This new plugin mechanism has been one of the most powerful features of Kubernetes. It enables the storage vendors to:

  1. Automatically create storage when required.
  2. Make storage available to containers wherever they’re scheduled.
  3. Automatically delete the storage when no longer needed.

This decoupling lets vendors maintain independent release and feature cycles and focus on implementing the API without worrying about backward incompatibility. Supporting their plugin becomes as easy as deploying a few pods.

 


Why CSI?

Prior to CSI, k8s volume plugins had to be “in-tree”: compiled and shipped with the core Kubernetes binaries. This meant storage providers had to check their code into the core k8s codebase if they wished to add support for a new storage system.

A plugin-based solution, FlexVolume, tried to address this issue by exposing an exec-based API for external plugins. Although it also aimed to be detached from the k8s binary, there were several major problems with that approach. Firstly, it needed root access to the host and master file systems to deploy the driver files.

Secondly, it came with a huge baggage of prerequisites and OS dependencies that were assumed to be available on the host. CSI implicitly solves these issues by being containerized and by using the k8s storage primitives.

CSI has evolved as the one-stop solution to all the above issues: it enables storage plugins to be out-of-tree and deployed via standard k8s primitives such as PVCs, PVs and StorageClasses.

The main aim of introducing CSI is to establish a standard mechanism for exposing any type of storage system to all container orchestrators.

Deploy the Driver Plugin

The CSI driver consists of a few main components: various sidecars, plus the vendor's implementation of the CSI services, which the COs understand. The CSI services are described later in this post. Let’s try deploying the hostpath CSI driver.

Prerequisites:

  • Kubernetes cluster (not Minikube or MicroK8s) running version 1.13 or later
  • Access to a terminal with kubectl installed

Deploying HostPath Driver Plugin:

  1. Clone the HostPath Driver Plugin repo locally, or just copy the deploy and examples folders from the root path
  2. Check out the master branch (if it is not already checked out)
  3. The hostpath driver comprises manifests for the following sidecars (in ./deploy/master/hostpath/):
    – csi-hostpath-attacher.yaml
    – csi-hostpath-provisioner.yaml
    – csi-hostpath-snapshotter.yaml
    – csi-hostpath-plugin.yaml:
    It deploys two containers: node-driver-registrar and hostpath-plugin
  4. The manifests also include a separate Service for each component, and each component's containers run in a StatefulSet
  5. It also deploys ClusterRoleBindings and RBAC rules for each component, which are maintained in separate repos
  6. Each component (sidecar) is managed in a separate repository
  7. The /deploy/util/ directory contains a shell script that handles the complete deployment process
  8. After copying the folders or cloning the repo, just run:
$ deploy/kubernetes-latest/deploy-hostpath.sh

  9. The output will be similar to:

applying RBAC rules
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-provisioner/v1.0.1/deploy/kubernetes/rbac.yaml
serviceaccount/csi-provisioner created
clusterrole.rbac.authorization.k8s.io/external-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/csi-provisioner-role created
role.rbac.authorization.k8s.io/external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/csi-provisioner-role-cfg created
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-attacher/v1.0.1/deploy/kubernetes/rbac.yaml
serviceaccount/csi-attacher created
clusterrole.rbac.authorization.k8s.io/external-attacher-runner created
clusterrolebinding.rbac.authorization.k8s.io/csi-attacher-role created
role.rbac.authorization.k8s.io/external-attacher-cfg created
rolebinding.rbac.authorization.k8s.io/csi-attacher-role-cfg created
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v1.0.1/deploy/kubernetes/rbac.yaml
serviceaccount/csi-snapshotter created
clusterrole.rbac.authorization.k8s.io/external-snapshotter-runner created
clusterrolebinding.rbac.authorization.k8s.io/csi-snapshotter-role created
deploying hostpath components
   deploy/kubernetes-1.13/hostpath/csi-hostpath-attacher.yaml
        using           image: quay.io/k8scsi/csi-attacher:v1.0.1
service/csi-hostpath-attacher created
statefulset.apps/csi-hostpath-attacher created
   deploy/kubernetes-1.13/hostpath/csi-hostpath-plugin.yaml
        using           image: quay.io/k8scsi/csi-node-driver-registrar:v1.0.2
        using           image: quay.io/k8scsi/hostpathplugin:v1.0.1
        using           image: quay.io/k8scsi/livenessprobe:v1.0.2
service/csi-hostpathplugin created
statefulset.apps/csi-hostpathplugin created
   deploy/kubernetes-1.13/hostpath/csi-hostpath-provisioner.yaml
        using           image: quay.io/k8scsi/csi-provisioner:v1.0.1
service/csi-hostpath-provisioner created
statefulset.apps/csi-hostpath-provisioner created
   deploy/kubernetes-1.13/hostpath/csi-hostpath-snapshotter.yaml
        using           image: quay.io/k8scsi/csi-snapshotter:v1.0.1
service/csi-hostpath-snapshotter created
statefulset.apps/csi-hostpath-snapshotter created
   deploy/kubernetes-1.13/hostpath/csi-hostpath-testing.yaml
        using           image: alpine/socat:1.0.3
service/hostpath-service created
statefulset.apps/csi-hostpath-socat created
11:43:06 waiting for hostpath deployment to complete, attempt #0
11:43:16 waiting for hostpath deployment to complete, attempt #1
11:43:26 waiting for hostpath deployment to complete, attempt #2
deploying snapshotclass
volumesnapshotclass.snapshot.storage.k8s.io/csi-hostpath-snapclass created

  10. The driver is deployed; we can check:

$ kubectl get pods

NAME                          READY   STATUS        RESTARTS    AGE
csi-hostpath-attacher-0       1/1     Running        0          1m06s
csi-hostpath-provisioner-0    1/1     Running        0          1m06s
csi-hostpath-snapshotter-0    1/1     Running        0          1m06s
csi-hostpathplugin-0          2/2     Running        0          1m06s

CSI API-Resources:

$ kubectl api-resources | grep -E "^NAME|csi|storage|PersistentVolume"

NAME                     APIGROUP                  NAMESPACED     KIND
persistentvolumeclaims                             true           PersistentVolumeClaim
persistentvolumes                                  false          PersistentVolume
csidrivers               csi.storage.k8s.io        false          CSIDriver
volumesnapshotclasses    snapshot.storage.k8s.io   false          VolumeSnapshotClass
volumesnapshotcontents   snapshot.storage.k8s.io   false          VolumeSnapshotContent
volumesnapshots          snapshot.storage.k8s.io   true           VolumeSnapshot
csidrivers               storage.k8s.io            false          CSIDriver
csinodes                 storage.k8s.io            false          CSINode
storageclasses           storage.k8s.io            false          StorageClass
volumeattachments        storage.k8s.io            false          VolumeAttachment

These include resources from the core API group, from storage.k8s.io, and resources created by the CRDs in snapshot.storage.k8s.io and csi.storage.k8s.io.

CSI SideCars

K8s CSI sidecar containers simplify the development and deployment of CSI drivers on a k8s cluster. Different drivers share similar logic for triggering the appropriate operations against the “CSI volume driver” container and updating the Kubernetes API as appropriate. The sidecars factor out that shared logic.

These common containers have to be bundled with the provider-specific containers.

Note: In the case of the Hostpath driver, only the ‘csi-hostpath-plugin’ container has driver-specific code. All the others are common CSI sidecar containers. These containers have a socket mounted in the socket-dir volume of type EmptyDir, which makes their communication over gRPC possible.

The Kubernetes community (SIG Storage) maintains the following basic skeleton containers for any CSI driver:

  1. External Provisioner:
    It is a sidecar container that watches Kubernetes PersistentVolumeClaim objects and triggers CSI CreateVolume and DeleteVolume operations against a driver endpoint.
    The CSI external-provisioner also supports the Snapshot DataSource. If a Snapshot CRD is specified as a data source on a PVC object, the sidecar container fetches the information about the snapshot from the SnapshotContent object and populates the data source field, indicating to the storage system that the new volume should be populated from the specified snapshot.
  2. External Attacher:
    It is a sidecar container that watches Kubernetes VolumeAttachment objects and triggers CSI ControllerPublish and ControllerUnpublish operations against a driver endpoint.
  3. Node-Driver Registrar:
    It is a sidecar container that registers the CSI driver with kubelet and adds the driver's custom NodeId to a label on the Kubernetes Node API object. This sidecar communicates with the driver through the ‘Identity-Service’ that the driver implements. The CSI driver is registered with the kubelet using its device-plugin mechanism.
  4. External Snapshotter:
    It is a sidecar container that watches the Kubernetes API server for VolumeSnapshot and VolumeSnapshotContent CRD objects. The creation of a new VolumeSnapshot object referencing a SnapshotClass CRD object corresponding to this driver causes the sidecar container to provision a new snapshot.
    This sidecar listens to the service which indicates the successful creation of the VolumeSnapshot, and immediately creates the VolumeSnapshotContent resource.
  5. Cluster-Driver Registrar:
    The CSI driver is registered with the cluster by the cluster-driver-registrar sidecar container, which creates a CSIDriver object. This CSIDriver object lets the driver customize how k8s interacts with it.
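
The watch-and-reconcile pattern that the external provisioner follows can be sketched in a few lines. This is a toy, in-memory model and not the sidecar's actual code: `FakeDriver`, `reconcile_claims` and the dictionary layout are hypothetical names chosen for illustration, and the provisioner name is stored directly on the claim here, whereas in a real cluster it comes from the StorageClass. The sketch shows that a claim for a matching provisioner leads to exactly one CreateVolume call, even if the loop runs repeatedly.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDriver:
    """Hypothetical stand-in for a CSI driver endpoint (no gRPC involved)."""
    volumes: dict = field(default_factory=dict)
    create_calls: int = 0

    def create_volume(self, name: str, capacity_bytes: int) -> dict:
        # CreateVolume is expected to be idempotent per volume name,
        # so a repeated request returns the existing volume.
        self.create_calls += 1
        if name not in self.volumes:
            self.volumes[name] = {
                "volume_id": f"vol-{len(self.volumes) + 1}",
                "capacity_bytes": capacity_bytes,
            }
        return self.volumes[name]


def reconcile_claims(claims: list, volumes: dict, driver: FakeDriver,
                     provisioner: str) -> None:
    """One pass of a simplified provisioner loop over PVC-like dicts."""
    for claim in claims:
        if claim["provisioner"] != provisioner:
            continue  # another provisioner owns this claim
        if claim.get("volume_name"):
            continue  # already bound, nothing to do
        pv_name = f"pvc-{claim['uid']}"
        created = driver.create_volume(pv_name, claim["capacity_bytes"])
        # Record a PV-like object and bind the claim to it.
        volumes[pv_name] = {"handle": created["volume_id"], "claim": claim["name"]}
        claim["volume_name"] = pv_name


driver = FakeDriver()
claims = [
    {"name": "pvc-fs", "uid": "1234", "provisioner": "hostpath.csi.k8s.io",
     "capacity_bytes": 1 << 30},
    {"name": "other", "uid": "5678", "provisioner": "example.com/other",
     "capacity_bytes": 1 << 30},
]
volumes = {}
reconcile_claims(claims, volumes, driver, "hostpath.csi.k8s.io")
reconcile_claims(claims, volumes, driver, "hostpath.csi.k8s.io")  # second pass is a no-op
print(volumes)
```

The second pass does nothing because the claim is already bound. That is the property which lets a sidecar restart and re-scan the cluster safely.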

Developing a CSI Driver

To start implementing a CSI driver, an application must implement the gRPC services described by the CSI specification.

The minimum services a CSI application should implement are the following:

  • CSI Identity service: Enables Kubernetes components and CSI containers to identify the driver
  • CSI Node service: Its required methods enable callers to make a volume available at a specified path

All the required services may be implemented independently or in the same driver application. The CSI driver application should be containerized to make it easy to deploy on Kubernetes. Once the driver-specific logic is containerized, it can be bundled with the sidecars and deployed in node and/or controller mode.
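
To make the two mandatory services more concrete, here is a rough outline of their shape. Real drivers implement gRPC services generated from the CSI specification's protobuf definitions; this sketch uses plain Python classes instead. The method names mirror RPCs from the specification (GetPluginInfo, GetPluginCapabilities, Probe, NodeGetInfo, NodePublishVolume, NodeUnpublishVolume), while the class names, return shapes and the in-memory bookkeeping are assumptions made for illustration.

```python
class IdentityService:
    """Mirrors the CSI Identity RPCs: GetPluginInfo, GetPluginCapabilities, Probe."""

    def __init__(self, name: str, version: str):
        self.name = name
        self.version = version

    def get_plugin_info(self) -> dict:
        return {"name": self.name, "vendor_version": self.version}

    def get_plugin_capabilities(self) -> list:
        # A node-only driver advertises no CONTROLLER_SERVICE capability.
        return []

    def probe(self) -> dict:
        return {"ready": True}


class NodeService:
    """Mirrors NodePublishVolume / NodeUnpublishVolume with an in-memory table."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.published = {}  # target_path -> volume_id

    def node_get_info(self) -> dict:
        return {"node_id": self.node_id}

    def node_publish_volume(self, volume_id: str, target_path: str) -> None:
        existing = self.published.get(target_path)
        if existing is not None and existing != volume_id:
            raise ValueError(f"{target_path} already used by {existing}")
        # A real driver would mount the volume at target_path here.
        self.published[target_path] = volume_id

    def node_unpublish_volume(self, volume_id: str, target_path: str) -> None:
        # Unpublishing an unknown path is treated as success (idempotent).
        self.published.pop(target_path, None)


identity = IdentityService("hostpath.csi.k8s.io", "0.0.0-sketch")
node = NodeService("node-1")
node.node_publish_volume("vol-1", "/tmp/example/mount")
node.node_publish_volume("vol-1", "/tmp/example/mount")  # repeat is harmless
print(identity.get_plugin_info(), node.published)
```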

Capabilities

CSI also has provisions that let a custom CSI driver support many additional features/services through “Capabilities”: a list of all the optional features the driver supports.
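
One way to picture capabilities is as a set of flags that the orchestrator consults before it calls an optional RPC. The sketch below is a simplified model under that assumption: the capability names are taken from the CSI specification, but the enums list only a few of them, and `planned_rpcs` is a hypothetical helper rather than anything Kubernetes exposes.

```python
from enum import Enum, auto


class ControllerCap(Enum):
    CREATE_DELETE_VOLUME = auto()
    PUBLISH_UNPUBLISH_VOLUME = auto()
    CREATE_DELETE_SNAPSHOT = auto()


class NodeCap(Enum):
    STAGE_UNSTAGE_VOLUME = auto()


def planned_rpcs(controller_caps: set, node_caps: set) -> list:
    """List the RPCs an orchestrator would issue to get a volume into a pod."""
    calls = []
    if ControllerCap.CREATE_DELETE_VOLUME in controller_caps:
        calls.append("CreateVolume")
    if ControllerCap.PUBLISH_UNPUBLISH_VOLUME in controller_caps:
        calls.append("ControllerPublishVolume")
    if NodeCap.STAGE_UNSTAGE_VOLUME in node_caps:
        calls.append("NodeStageVolume")
    calls.append("NodePublishVolume")  # part of the mandatory Node service
    return calls


minimal = planned_rpcs(set(), set())
full = planned_rpcs(
    {ControllerCap.CREATE_DELETE_VOLUME, ControllerCap.PUBLISH_UNPUBLISH_VOLUME},
    {NodeCap.STAGE_UNSTAGE_VOLUME},
)
print(minimal)
print(full)
```

A driver that advertises nothing optional still receives NodePublishVolume, which matches the idea that the Node service is part of the minimum a driver must implement.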

Note: Refer to the link for a detailed explanation of developing a CSI driver.

Try out provisioning the PV:

1. A storage class with volumeBindingMode: WaitForFirstConsumer:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: csi-hostpath-sc
provisioner: hostpath.csi.k8s.io
volumeBindingMode: WaitForFirstConsumer
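
The effect of the binding mode can be summarized as a small decision function. This is only a mental model, assuming the two modes Kubernetes defines (Immediate and WaitForFirstConsumer); `should_provision_now` is a hypothetical name and not part of any Kubernetes API.

```python
def should_provision_now(binding_mode: str, consumer_scheduled: bool) -> bool:
    """Decide whether a volume for a new PVC should be provisioned yet."""
    if binding_mode == "Immediate":
        # Provision as soon as the PVC exists.
        return True
    if binding_mode == "WaitForFirstConsumer":
        # Delay until a pod that uses the PVC is being scheduled.
        return consumer_scheduled
    raise ValueError(f"unknown volumeBindingMode: {binding_mode}")


print(should_provision_now("WaitForFirstConsumer", consumer_scheduled=False))
print(should_provision_now("WaitForFirstConsumer", consumer_scheduled=True))
```

With the StorageClass above, this means the PVC is expected to stay Pending until the sample pod that uses it is created.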

2. A PVC is also needed, to be consumed by the sample Pod.

A sample Pod is required as well, so that it can be bound to the PV created for the PVC from the step above.
These manifests are found in the ./examples directory and can be deployed using the kubectl create or apply commands.

The PVC manifest:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-fs
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-hostpath-sc # defined in csi-setup.yaml

3. The Pod to consume the PV

kind: Pod
apiVersion: v1
metadata:
  name: pod-fs
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - csi-hostpathplugin
        topologyKey: kubernetes.io/hostname
  containers:
    - name: my-frontend
      image: busybox
      volumeMounts:
      - mountPath: "/data"
        name: my-csi-volume
      command: [ "sleep", "1000000" ]
  volumes:
    - name: my-csi-volume
      persistentVolumeClaim:
        claimName: pvc-fs # defined in csi-pvc.yaml

Validate the deployed components:

$ kubectl get pv

NAME                                       CAPACITY   ACCESSMODES   STATUS   CLAIM            STORAGECLASS
pvc-58d5ec38-03e5-11e9-be51-000c29e88ff1   1Gi        RWO           Bound    default/pvc-fs   csi-hostpath-sc

$ kubectl get pvc

NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS
pvc-fs   Bound    pvc-58d5ec38-03e5-11e9-be51-000c29e88ff1   1Gi        RWO            csi-hostpath-sc

Brief on how it works:

  • csi-provisioner issues a CreateVolumeRequest call to the CSI socket; hostpath-plugin then performs CreateVolume and reports the volume's creation back over CSI
  • csi-provisioner creates the PV and updates the PVC to be bound, and the VolumeAttachment object is created by controller-manager
  • csi-attacher, which watches for VolumeAttachments, submits a ControllerPublishVolume RPC call to hostpath-plugin; hostpath-plugin receives ControllerPublishVolume and calls the hostpath AttachVolume, after which csi-attacher updates the VolumeAttachment status
  • All this time kubelet waits for the volume to be attached, and then submits NodeStageVolume (format and mount to the staging dir on the node) to csi-node.hostpath-plugin
  • csi-node.hostpath-plugin gets the NodeStageVolume call and mounts to `/var/lib/kubelet/plugins/kubernetes.io/csi/pv/<pv-name>/globalmount`, then responds to kubelet
  • kubelet calls NodePublishVolume (mount the volume to the pod’s dir)
  • csi-node.hostpath-plugin performs NodePublishVolume and mounts the volume to `/var/lib/kubelet/pods/<pod-uuid>/volumes/kubernetes.io~csi/<pvc-name>/mount`

Finally, kubelet starts the pod's container with the provisioned volume.
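
The order of calls and the two mount locations from the steps above can be written down as a small trace. The helper names (`staging_path`, `publish_path`, `mount_trace`) are made up for this sketch, and the paths simply follow the templates quoted in the list; nothing here talks to a cluster.

```python
from pathlib import PurePosixPath

KUBELET_ROOT = PurePosixPath("/var/lib/kubelet")


def staging_path(pv_name: str) -> PurePosixPath:
    """Node-wide staging directory targeted by NodeStageVolume."""
    return KUBELET_ROOT / "plugins" / "kubernetes.io" / "csi" / "pv" / pv_name / "globalmount"


def publish_path(pod_uid: str, volume_dir_name: str) -> PurePosixPath:
    """Per-pod directory targeted by NodePublishVolume.

    volume_dir_name is the directory shown as <pvc-name> in the list above.
    """
    return KUBELET_ROOT / "pods" / pod_uid / "volumes" / "kubernetes.io~csi" / volume_dir_name / "mount"


def mount_trace(pv_name: str, pod_uid: str, volume_dir_name: str) -> list:
    """Return (caller, rpc, target) tuples in the order described above."""
    return [
        ("csi-provisioner", "CreateVolume", pv_name),
        ("csi-attacher", "ControllerPublishVolume", pv_name),
        ("kubelet", "NodeStageVolume", str(staging_path(pv_name))),
        ("kubelet", "NodePublishVolume", str(publish_path(pod_uid, volume_dir_name))),
    ]


trace = mount_trace("pvc-abc", "pod-123", "pvc-abc")
for caller, rpc, target in trace:
    print(f"{caller:16} {rpc:24} {target}")
```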


Let’s confirm that the Hostpath CSI driver works:

The Hostpath driver is configured to create new volumes under the ‘/tmp’ directory inside the hostpath container of the plugin pod (csi-hostpathplugin-0 above). This path persists as long as the plugin pod is up and running.

A file written to a properly mounted Hostpath volume inside an application pod should show up inside the hostpath container.

1. To try this out, create a file in the application pod:

$ kubectl exec -it pod-fs -- /bin/sh

/ # touch /data/my-test
/ # exit

2. Then exec into the hostpath container and run the ‘find’ command to check:

$ kubectl exec -it $(kubectl get pods --selector app=csi-hostpathplugin \
    -o jsonpath='{.items[*].metadata.name}') -c hostpath -- /bin/sh

/ # find /tmp -name my-test
/tmp/057485ab-c714-11e8-bb16-000c2967769a/my-test
/ # exit

Note: A better way to verify is to inspect the VolumeAttachment API object that was created to represent the attached volume.

Support for Snapshot

Volume snapshotting was introduced as an alpha feature for Kubernetes persistent volumes in v1.12.

Being an alpha feature, it requires the ‘VolumeSnapshotDataSource’ feature gate to be enabled. This feature opens up a pool of use cases around keeping snapshots of data locally. The API objects used are VolumeSnapshot, VolumeSnapshotContent and VolumeSnapshotClass. They were designed with a notion and relationship similar to those of PVC, PV and StorageClass.

To create a snapshot, a VolumeSnapshot object needs to be created with a PVC as its source and a VolumeSnapshotClass; the CSI-Snapshotter container will then create a VolumeSnapshotContent.

Let’s try it out with an example:

Just as the provisioner creates a PV for us when a PVC is created, a VolumeSnapshotContent object is created when a VolumeSnapshot object is created.

apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshot
metadata:
  name: fs-pv-snapshot
spec:
  snapshotClassName: csi-hostpath-snapclass
  source:
    name: pvc-fs
    kind: PersistentVolumeClaim

The volumesnapshotcontent is created. The output will look like:

$ kubectl get volumesnapshotcontent
 
NAME                                                  AGE
snapcontent-f55db632-c716-11e8-8911-000c2967769a      14s
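
The relationship between the two snapshot objects can be pictured as a simple derivation. This is a sketch under assumptions: the content name is taken to be the `snapcontent-` prefix seen in the output above followed by the VolumeSnapshot's UID, the `snapshotter` value is assumed to match the driver name used in the StorageClass, and `build_snapshot_content` and the dictionary keys are hypothetical.

```python
def build_snapshot_content(snapshot: dict, snapshot_classes: dict) -> dict:
    """Derive a VolumeSnapshotContent-like dict from a VolumeSnapshot-like dict."""
    snapshot_class = snapshot_classes[snapshot["class_name"]]
    return {
        # Assumed naming scheme: fixed prefix plus the VolumeSnapshot UID.
        "name": f"snapcontent-{snapshot['uid']}",
        "snapshotter": snapshot_class["snapshotter"],
        "source_pvc": snapshot["source_pvc"],
        # Back-reference, similar to how a PV points at its claim.
        "snapshot_ref": snapshot["name"],
    }


# The snapshotter value below is an assumption for illustration.
classes = {"csi-hostpath-snapclass": {"snapshotter": "hostpath.csi.k8s.io"}}
snapshot = {
    "name": "fs-pv-snapshot",
    "uid": "f55db632-c716-11e8-8911-000c2967769a",
    "class_name": "csi-hostpath-snapclass",
    "source_pvc": "pvc-fs",
}
content = build_snapshot_content(snapshot, classes)
print(content["name"])
```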

Restore from the snapshot:

The dataSource field in the PVC can accept a source of kind: VolumeSnapshot, which will create a new PV from that volume snapshot when a Pod is bound to this PVC.

The new PV will have the same data as the PV from which the snapshot was taken, and it can be attached to any other pod. A new pod using that PV demonstrates the “Restore” and “Cloning” use cases.

The PVC manifest that restores from the snapshot:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fs-pvc-restore
spec:
  storageClassName: csi-hostpath-sc
  dataSource:
    name: fs-pv-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
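
The point-in-time semantics of snapshot and restore can be tried out locally, without a cluster. The sketch below treats a directory as a volume and a tar archive as its snapshot. That is a deliberately simple stand-in chosen for illustration, not a statement about how any particular driver must implement snapshots, and `take_snapshot` and `restore_snapshot` are hypothetical helpers.

```python
import tarfile
import tempfile
from pathlib import Path


def take_snapshot(volume_dir: Path, snapshot_file: Path) -> None:
    """Archive the current contents of a volume directory."""
    with tarfile.open(snapshot_file, "w:gz") as archive:
        archive.add(volume_dir, arcname=".")


def restore_snapshot(snapshot_file: Path, new_volume_dir: Path) -> None:
    """Populate a fresh volume directory from a snapshot archive."""
    new_volume_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(snapshot_file, "r:gz") as archive:
        archive.extractall(new_volume_dir)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "vol-source"
    source.mkdir()
    (source / "my-test").write_text("before snapshot")

    snapshot_file = root / "snapshot.tgz"
    take_snapshot(source, snapshot_file)

    # Written after the snapshot, so it should not appear in the restored copy.
    (source / "later-file").write_text("after snapshot")

    restored = root / "vol-restored"
    restore_snapshot(snapshot_file, restored)
    restored_files = sorted(p.name for p in restored.iterdir())
    restored_text = (restored / "my-test").read_text()
    print(restored_files)
```

The restored directory contains only what existed when the snapshot was taken, which is the behaviour to expect from a PVC whose dataSource points at a VolumeSnapshot.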

And we’re done here.

Conclusion

Since Kubernetes has started supporting the Raw Block Device type of Persistent Volume, the hostpath driver and any other driver may also support it, which will be explained in the next part of the blog. In this blog, we gained a deep understanding of CSI, its components and services, its new features and the problems it solves. The CSI Hostpath driver was used extensively in this blog to experiment with and understand the provisioner and snapshotter flows for PVs and VolumeSnapshots. The PV snapshot, restore and cloning use cases were also demonstrated.
