Category: Engineering blogs

  • Scalable Real-time Communication With Pusher

    What and why?

    Pusher is a hosted API service which makes adding real-time data and functionality to web and mobile applications seamless. 

    Pusher works as a real-time communication layer between the server and the client. It maintains persistent WebSocket connections to clients, so whenever new data arrives at your server, the server can push it to clients instantly through Pusher. It is highly flexible, scalable, and easy to integrate, and it exposes over 40 SDKs covering almost every tech stack.

    In the context of delivering real-time data, there are other hosted and self-hosted services available. The right choice depends on the use case: whether you need to broadcast data to all users, or something more complex targeting specific groups. Pusher was well-suited to our use case; the decision came down to ease of use, scalability, private and public channels, webhooks, and event-based automation. Other options we considered were Socket.IO, Firebase, and Ably.

    Pusher is particularly well-suited for communication and collaboration features built on WebSockets. The key difference with Pusher is that it is a hosted service/API: it takes less work to get started than alternatives where you manage the deployment yourself, and once set up, scaling is handled for you, which reduces future effort.

    Some of the most common use cases of Pusher are:

    1. Notifications: Pusher can inform users when there is a relevant change. A notification can also act as a form of signaling, where nothing is rendered in the UI but a reaction is still triggered within the application.

    2. Activity streams: Streams of activities published across channels when something changes on the server or someone publishes an update.

    3. Live Data Visualizations: Pusher allows you to broadcast continuously changing data when needed.

    4. Chats: You can use Pusher for peer-to-peer or peer-to-multichannel communication.

    In this blog, we will focus on Channels, Pusher's pub/sub messaging API, in a JavaScript-based application. Pusher also offers the Chatkit and Beams (push notifications) SDKs/APIs.

    • Chatkit is designed to make integrating chat into your app as simple as possible. It lets you add group chat and one-to-one chat to your app, along with file attachments and online indicators.
    • Beams is used for adding push notifications to your mobile app. It includes SDKs to seamlessly manage push tokens and send notifications.

    Step 1: Getting Started

    Set up your account on the Pusher dashboard and get your free API keys.

    Image Source: Pusher

    1. Click on Channels
    2. Create an App. Add details based on the project and the environment
    3. Click on the App Keys tab to get the app keys.
    4. You can also check the Getting Started page, which gives you code snippets to get started.

    Add Pusher to your project and set up an auth endpoint on your server:

    var express = require('express');
    var bodyParser = require('body-parser');
    var Pusher = require('pusher');
    
    // Instantiate Pusher with the app keys from the dashboard (Step 1).
    var pusher = new Pusher({
      appId: 'APP_ID',
      key: 'APP_KEY',
      secret: 'APP_SECRET',
      cluster: 'APP_CLUSTER'
    });
    
    var app = express();
    app.use(bodyParser.json());
    app.use(bodyParser.urlencoded({ extended: false }));
    
    app.post('/pusher/auth', function(req, res) {
      var socketId = req.body.socket_id;
      var channel = req.body.channel_name;
      var auth = pusher.authenticate(socketId, channel);
      res.send(auth);
    });
    
    var port = process.env.PORT || 5000;
    app.listen(port);


    or, using npm:

    npm i pusher


    Step 2: Subscribing to Channels

    There are three types of channels in Pusher: Public, Private, and Presence.

    • Public channels: These channels are public in nature, so anyone who knows the channel name can subscribe to the channel and start receiving messages from the channel. Public channels are commonly used to broadcast general/public information, which does not contain any secure information or user-specific data.
    • Private channels: These channels have an access-control mechanism that lets the server decide who can subscribe to the channel and receive data from it. All private channel names must be prefixed with private-. They are commonly used when the server needs to know who is subscribing and validate the subscribers.
    • Presence channels: An extension of private channels. In addition to the properties of private channels, they let the server ‘register’ user information on subscription to the channel, and enable other members to identify who is online.
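    The channel type is conveyed entirely by the name prefix, so a client can check it before subscribing. Below is a minimal illustrative sketch; `classifyChannel` is our own hypothetical helper, not part of the Pusher SDK:

```javascript
// Hypothetical helper: infer a Pusher channel's type from its name prefix.
// Pusher requires "private-" and "presence-" prefixes for those channel
// types; any other name is treated as a public channel.
function classifyChannel(name) {
  if (name.startsWith('presence-')) return 'presence';
  if (name.startsWith('private-')) return 'private';
  return 'public';
}

console.log(classifyChannel('private-orders'));  // private
console.log(classifyChannel('presence-room-1')); // presence
console.log(classifyChannel('my-channel'));      // public
```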

    In your application, you can create a subscription and start listening for events:

    // Here my-channel is the channel name.
    // All the events published to this channel will be available
    // once you subscribe to the channel and start listening to it.
    
    var channel = pusher.subscribe('my-channel');
    
    channel.bind('my-event', function(data) {
      alert('An event was triggered with message: ' + data.message);
    });


    Step 3: Creating Channels

    To create channels, you can use the dashboard or integrate Pusher with your server. For more details on how to integrate Pusher with your server, you can read the Server API docs. You need to create an app on your Pusher dashboard, which you can then use to trigger events to your application.

    or 

    Integrate Pusher with your server. Here is a sample snippet from our Node app:

    var Pusher = require('pusher');
    
    var pusher = new Pusher({
      appId: 'APP_ID',
      key: 'APP_KEY',
      secret: 'APP_SECRET',
      cluster: 'APP_CLUSTER'
    });
    
    // Logic which will then trigger events to a channel
    function trigger(){
    ...
    ...
    pusher.trigger('my-channel', 'my-event', {"message": "hello world"});
    ...
    ...
    }


    Step 4: Adding Security

    By default, anyone who knows your public app key can open a connection to your Channels app. This does not by itself add a security risk, as connections can only access data on channels.

    For more advanced use cases, you need the “Authorized Connections” feature. It authorizes every single connection to your Channels app and hence avoids unwanted/unauthorized connections. To enable authorization, set up an auth endpoint, then modify your client code to look like this:

    const channels = new Pusher(APP_KEY, {
      cluster: APP_CLUSTER,
      authEndpoint: '/your_auth_endpoint'
    });
    
    const channel = channels.subscribe('private-<channel-name>');


    For more details on how to create an auth endpoint for your server, read this. Here is a snippet from a Node.js app:

    var express = require('express');
    var bodyParser = require('body-parser');
    
    var app = express();
    app.use(bodyParser.json());
    app.use(bodyParser.urlencoded({ extended: false }));
    
    app.post('/pusher/auth', function(req, res) {
      var socketId = req.body.socket_id;
      var channel = req.body.channel_name;
      var auth = pusher.authenticate(socketId, channel);
      res.send(auth);
    });
    
    var port = process.env.PORT || 5000;
    app.listen(port);


    Step 5: Scale as You Grow

    Pusher comes with a wide range of plans that you can subscribe to based on your usage, so you can scale your application as it grows. Here is a snapshot of the available plans; for more details, you can refer to Pusher's plans page.

    Image Source: Pusher

    Conclusion

    This article has given a brief overview of Pusher, its use cases, and how you can use it to build a scalable real-time application. The best tool varies by use case, and there is no single right choice for everyone. Pusher's approach is simple and API-based, enabling developers to add real-time functionality to any application in very little time.

    If you want to get hands-on tutorials/blogs, please visit here.

  • The Ultimate Cheat Sheet on Splitting Dynamic Redux Reducers

    This post addresses the need for code splitting in React/Redux projects. When exploring ways to optimize an application, a common obstacle is the reducers. This article focuses specifically on how to split reducers so that they can be delivered in chunks.

    What are the benefits of splitting reducers in chunks?

    1) True code splitting is possible

    2) A good architecture can be maintained by keeping page/component-level reducers isolated from the rest of the application, minimizing cross-dependencies.

    Why Do We Need to Split Reducers?

    1. For fast page loads

    Splitting reducers has the advantage of loading only the required part of the web application, which in turn makes rendering the main pages much more efficient.

    2. Organization of code

    Splitting reducers at the page or component level gives better code organization than putting all reducers in one place. Since a reducer is loaded only when its page/component is loaded, pages become standalone and independent of other parts of the application. That makes development smoother, as it avoids cross-references between reducers and the complexity they bring.

    3. One page/component, one reducer

    Things are better written, read, and understood when they are modular. Dynamic reducers make this design pattern possible.

    4. SEO

    SEO is a vast topic, but rankings take a big hit if your website has long response times, which happens when code is not split. With reducer-level code splitting, reducers can be split at the component level, reducing the site's load time and thereby improving SEO rankings.

    What Exists Today?

    A little googling around the topic shows some options; various approaches have been discussed here.

    Dan Abramov’s answer is the one we follow in this post, and we will write a simple abstraction for dynamic reducers with more functionality.

    A lot of solutions already exist, so why create our own? The answer is simple and straightforward:

       1) The ease of use

    Every library out there has a catch in some way: some have complex APIs, while others require too much boilerplate. We will aim to stay close to the react-redux API.

       2) Limitation to add reducers at top level only

    This is a very common limitation in many existing libraries today, and it is what we target in this post. Solving it opens new doors for code splitting at the component level.

    A quick recap of redux facts:

    1) Redux gives us the following methods:
    – getState()
    – dispatch(action)
    – subscribe(listener)
    – replaceReducer(nextReducer)

    2) Reducers are plain functions returning the next state of the application.

    3) “replaceReducer” requires the entire root reducer.

    What Are We Going to Do?

    We will write an abstraction around “replaceReducer” to develop an API that allows us to inject a reducer at a given key dynamically.

    A simple Redux store definition goes like the following:

    Let’s simplify the store creation wrapper as:
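    The original wrapper was published as an embedded gist that is not reproduced here, so the sketch below is our own illustrative reconstruction of the shape the article describes. All names are ours: miniCreateStore is a tiny stand-in for Redux's real createStore, combineReducerMap plays the role of the dynamic root reducer, and validation via isValidReducer is omitted for brevity:

```javascript
// Minimal stand-in for Redux's createStore, so the sketch is self-contained.
// In a real app you would use createStore/replaceReducer from the redux package.
function miniCreateStore(reducer, initialState) {
  let state = initialState;
  let currentReducer = reducer;
  return {
    getState: () => state,
    dispatch: (action) => { state = currentReducer(state, action); return action; },
    replaceReducer: (next) => { currentReducer = next; },
  };
}

// Combine a map of reducers into one root reducer (like redux's combineReducers).
function combineReducerMap(reducerMap) {
  return (state = {}, action) => {
    const next = {};
    for (const key of Object.keys(reducerMap)) {
      next[key] = reducerMap[key](state[key], action);
    }
    return next;
  };
}

// The wrapper: keeps asyncReducers on the store and rebuilds the root
// reducer via replaceReducer whenever a new reducer is attached.
function createStoreWithDynamicReducers(staticReducers, initialState) {
  const store = miniCreateStore(combineReducerMap(staticReducers), initialState);
  store.asyncReducers = {};
  store.attachReducer = (key, reducer) => {
    store.asyncReducers[key] = reducer;
    store.replaceReducer(
      combineReducerMap({ ...staticReducers, ...store.asyncReducers })
    );
  };
  return store;
}

// Usage: attach a "home" reducer after the store has been created.
const store = createStoreWithDynamicReducers({
  app: (state = { booted: true }, action) => state,
});
store.attachReducer('home', (state = { items: [] }, action) =>
  action.type === 'ADD_ITEM' ? { items: [...state.items, action.item] } : state
);
store.dispatch({ type: 'ADD_ITEM', item: 'welcome' });
```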

    What Does It Do?

    “dynamicActionGenerator” and “isValidReducer” are helper functions used to determine whether a given reducer is valid.

    For example:

    isValidReducer((state = {}, action) => state) // should return true
    isValidReducer(1) // should return false
    isValidReducer(true) // should return false
    isValidReducer("example") // should return false
    This is an essential check to ensure all inputs to our abstraction layer over createStore should be valid reducers.
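    The article does not show isValidReducer's body, so here is one assumed implementation that satisfies the examples above. The probe-action approach and the action type string are our own choices:

```javascript
// Assumed sketch: a value counts as a valid reducer if it is a function
// that returns a defined state when probed with an undefined state and an
// init-like action (similar to how Redux itself sanity-checks reducers).
function isValidReducer(candidate) {
  if (typeof candidate !== 'function') return false;
  try {
    return candidate(undefined, { type: '@@dynamic/PROBE' }) !== undefined;
  } catch (err) {
    // A reducer that throws on the probe action is not usable either.
    return false;
  }
}

console.log(isValidReducer((state = {}, action) => state)); // true
console.log(isValidReducer(1));                             // false
console.log(isValidReducer('example'));                     // false
```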

    “createStore” takes initial Root reducer, initial state and enhancers that will be applicable to created store.

    In addition to that, we maintain “asyncReducers” and “attachReducer” on the store object.

    “asyncReducers” keeps the mapping of dynamically added reducers.

    “attachReducer” is partial in the above implementation; we will see the complete implementation below. Its basic use is to add a reducer from any part of the web application.

    Given that our store object now becomes like follows:

    Store:


    - getState: Func
    - dispatch(action): Func
    - subscribe(listener): Func
    - replaceReducer(RootReducer): Func
    - attachReducer(reducer): Func
    - asyncReducers: JSONObject

    Now here is an interesting problem: replaceReducer requires the final root reducer function. That means we would have to recreate the root reducer every time.
    So we will create a dynamicRootReducer function to simplify the process.

    So now our store object becomes as follows:
    Store:


    - getState: Func
    - dispatch(action) : Func
    - subscribe(listener) : Func
    - replaceReducer(RootReducer) : Func
    - attachReducer(reducer) : Func

    What does dynamicRootReducer do?
    1) Processes the initial root reducer passed to it.
    2) Executes the dynamic reducers to get the next state.

    So we now have an API exposed as:
    store.attachReducer("home", (state = {}, action) => { return state }); // adds a dynamic reducer after the store has been created

    store.attachReducer("home.grid", (state = {}, action) => { return state }); // adds a dynamic reducer at a given nested key in the store
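    Attaching at a nested key like "home.grid" means the parent slice's reducer must be rebuilt from its children. Here is a hedged, Redux-independent sketch of that nesting step; setAtPath and buildNestedReducer are our own illustrative names, not the article's code:

```javascript
// Store dynamic reducers in a tree keyed by dotted-path segments, then build
// one root reducer that recursively combines the tree. Leaves are reducers.
function setAtPath(tree, path, reducer) {
  const keys = path.split('.');
  let node = tree;
  for (const key of keys.slice(0, -1)) {
    node[key] = node[key] || {};
    node = node[key];
  }
  node[keys[keys.length - 1]] = reducer;
}

function buildNestedReducer(node) {
  if (typeof node === 'function') return node; // a leaf: an actual reducer
  const children = Object.keys(node);
  return (state = {}, action) => {
    const next = {};
    for (const key of children) {
      next[key] = buildNestedReducer(node[key])(state[key], action);
    }
    return next;
  };
}

// Usage: a reducer registered under the nested key "home.grid".
const tree = {};
setAtPath(tree, 'home.grid', (state = { rows: 0 }, action) =>
  action.type === 'ADD_ROW' ? { rows: state.rows + 1 } : state
);
const rootReducer = buildNestedReducer(tree);
const state = rootReducer(undefined, { type: 'ADD_ROW' });
// state is { home: { grid: { rows: 1 } } }
```

In a real store, buildNestedReducer's result would be handed to replaceReducer each time a reducer is attached; a production version would also memoize the child reducers instead of rebuilding them on every dispatch.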

    Final Implementation:

    Working Example:

    Further implementations based on simplified code:

    Based on it, I have simplified the implementations into two libraries:

    Conclusion

    In this way, we can achieve code splitting with reducers, a very common problem in almost every React-Redux application. With the above solution, you can split code at the page level or component level, and you can also create reusable stateful components that use Redux state. The simplified approach reduces your application boilerplate. Moreover, common complex components like grids, or even whole pages like login, can be exported from one project and imported into another, making development faster than ever!

  • Continuous Integration & Delivery (CI/CD) for Kubernetes Using CircleCI & Helm

    Introduction

    Kubernetes is being adopted rapidly across the software industry and is becoming the most preferred option for deploying and managing containerized applications. Once we have a fully functional Kubernetes cluster, we need an automated process to deploy our applications to it. In this blog post, we will create a fully automated “commit to deploy” pipeline for Kubernetes, using CircleCI and Helm.

    What is CircleCI?

    CircleCI is a fully managed SaaS offering that allows us to build, test, or deploy our code on every check-in. To get started with CircleCI, we log into their web console with our GitHub or Bitbucket credentials, add a project for the repository we want to build, and then add the CircleCI config file to that repository. The config file is a YAML file listing the steps we want to execute every time code is pushed to the repository.

    Some salient features of CircleCI are:

    1. Little or no operational overhead, as the infrastructure is managed completely by CircleCI.
    2. User authentication is done via GitHub or Bitbucket, so user management is quite simple.
    3. It automatically notifies the build status to the GitHub/Bitbucket email IDs of the users following the project on CircleCI.
    4. The UI is quite simple and gives a holistic view of builds.
    5. It can be integrated with Slack, HipChat, Jira, etc.

    What is Helm?

    Helm is a chart manager, where a chart is a package of Kubernetes resources. Helm allows us to bundle related Kubernetes objects into charts and treat them as a single unit of deployment, referred to as a release. For example, say you have an application, app1, which you want to run on Kubernetes. For app1 you create multiple Kubernetes resources: a deployment, service, ingress, horizontal pod autoscaler, etc. Without Helm, you would create each of those resources separately by applying its manifest file. Helm lets us group all those files into one chart and then deploy just the chart, which also makes deleting and upgrading the resources quite simple.

    Some other benefits of Helm are:

    1. It makes the deployment highly configurable. Just by changing the parameters, we can use the same chart for deploying on multiple environments, like staging/prod, or on multiple cloud providers.
    2. We can roll back to a previous release with a single Helm command.
    3. It makes managing and sharing Kubernetes-specific applications much simpler.

    Note: Helm is composed of two components: the Helm client and the Tiller server. Tiller runs inside the cluster as a deployment and serves the requests made by the Helm client. Tiller has potential security vulnerabilities, so we will use tillerless Helm in our pipeline, which runs Tiller only when we need it.

    Building the Pipeline

    Overview:

    We will create the pipeline for a Golang application. The pipeline will first build the binary, create a docker image from it, push the image to ECR, then deploy it on the Kubernetes cluster using its helm chart.

    We will use a simple app which just exposes a `hello` endpoint and returns the hello world message:

    package main
    
    import (
    	"encoding/json"
    	"net/http"
    	"log"
    	"github.com/gorilla/mux"
    )
    
    type Message struct {
    	Msg string
    }
    
    func helloWorldJSON(w http.ResponseWriter, r *http.Request) {
    	m := Message{"Hello World"}
    	response, _ := json.Marshal(m)
    	w.Header().Set("Content-Type", "application/json")
    	w.WriteHeader(http.StatusOK)
    	w.Write(response)
    }
    func main() {
    	r := mux.NewRouter()
    	r.HandleFunc("/hello", helloWorldJSON).Methods("GET")
    	if err := http.ListenAndServe(":8080", r); err != nil {
    		log.Fatal(err)
    	}
    }

    We will create a docker image for hello app using the following Dockerfile:

    FROM centos/systemd
    
    MAINTAINER "Akash Gautam" <akash.gautam@velotio.com>
    
    COPY hello-app  /
    
    ENTRYPOINT ["/hello-app"]

    Creating Helm Chart:

    Now we need to create the helm chart for hello app.

    First, we create the Kubernetes manifest files. We will create a deployment and a service file:

    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: helloapp
    spec:
      replicas: 1
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 1
      template:
        metadata:
          labels:
            app: helloapp
            env: {{ .Values.labels.env }}
            cluster: {{ .Values.labels.cluster }}
        spec:
          containers:
          - name: helloapp
            image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
            imagePullPolicy: {{ .Values.image.imagePullPolicy }}
            readinessProbe:
              httpGet:
                path: /hello
                port: 8080
              initialDelaySeconds: 5
              periodSeconds: 5
              successThreshold: 1

    apiVersion: v1
    kind: Service
    metadata:
      name: helloapp
    spec:
      type: {{ .Values.service.type }}
      ports:
      - name: helloapp
        port: {{ .Values.service.port }}
        protocol: TCP
        targetPort: {{ .Values.service.targetPort }}
      selector:
        app: helloapp

    In the above files, you may have noticed that we used the .Values object. All the values that we specify in the values.yaml file of our Helm chart can be accessed through the .Values object inside the templates.

    Let’s create the helm chart now:

    helm create helloapp

    The above command creates the Helm chart folder structure for us:

    helloapp/
    |
    |- .helmignore # Contains patterns to ignore when packaging Helm charts.
    |
    |- Chart.yaml # Information about your chart
    |
    |- values.yaml # The default values for your templates
    |
    |- charts/ # Charts that this chart depends on
    |
    |- templates/ # The template files

    We can remove the charts/ folder inside our helloapp chart, as our chart won’t have any sub-charts. Now we need to move our Kubernetes manifest files to the templates/ folder and update values.yaml and Chart.yaml.

    Our values.yaml looks like:

    image:
      tag: 0.0.1
      repository: 123456789870.dkr.ecr.us-east-1.amazonaws.com/helloapp
      imagePullPolicy: Always
    
    labels:
      env: "staging"
      cluster: "eks-cluster-blog"
    
    service:
      port: 80
      targetPort: 8080
      type: LoadBalancer

    This allows us to make our deployment more configurable. For example, here we have set our service type as LoadBalancer in values.yaml, but if we want to change it to NodePort, we just need to set it while installing the chart (--set service.type=NodePort). Similarly, we have set the image pull policy to Always, which is fine for a development/staging environment, but when we deploy to production we may want to set it to IfNotPresent. In our chart, we need to identify the parameters/values that may change from one environment to another and make them configurable. This keeps our deployment flexible and lets us reuse the same chart.
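    For example, the production overrides could also be pinned in a separate values file instead of individual --set flags. The file name values-production.yaml below is our own hypothetical example, applied with Helm's -f flag:

```yaml
# values-production.yaml (hypothetical example)
# Applied as: helm install helloapp charts/helloapp -f values-production.yaml
image:
  imagePullPolicy: IfNotPresent

service:
  type: NodePort
```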

    Finally, we need to update the Chart.yaml file. It mostly contains metadata about the chart, like the name, version, maintainer, etc.; name and version are the two mandatory fields for Chart.yaml.

    version: 1.0.0
    appVersion: 0.0.1
    name: helloapp
    description: Helm chart for helloapp
    sources:
      - https://github.com/akash-gautam/helloapp

    Now that our Helm chart is ready, we can start on the pipeline. We need to create a folder named .circleci in the root folder of our repository and create a file named config.yml in it. In our config.yml we define two jobs: build&pushImage and deploy.

    Configure the pipeline:

    build&pushImage:
        working_directory: /go/src/hello-app (1)
        docker:
          - image: circleci/golang:1.10 (2)
        steps:
          - checkout (3)
          - run: (4)
              name: build the binary
              command: go build -o hello-app
          - setup_remote_docker: (5)
              docker_layer_caching: true
          - run: (6)
              name: Set the tag for the image, we will concatenate the app version and circle build number with a `-` char in between
              command:  echo 'export TAG=$(cat VERSION)-$CIRCLE_BUILD_NUM' >> $BASH_ENV
          - run: (7)
              name: Build the docker image
              command: docker build . -t ${CIRCLE_PROJECT_REPONAME}:$TAG
          - run: (8)
              name: Install AWS cli
              command: export TZ=Europe/Minsk && sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone && sudo apt-get update && sudo apt-get install -y awscli
          - run: (9)
              name: Login to ECR
              command: $(aws ecr get-login --region $AWS_REGION | sed -e 's/-e none//g')
          - run: (10)
              name: Tag the image with ECR repo name 
              command: docker tag ${CIRCLE_PROJECT_REPONAME}:$TAG ${HELLOAPP_ECR_REPO}:$TAG    
          - run: (11)
              name: Push the image to the ECR repo
              command: docker push ${HELLOAPP_ECR_REPO}:$TAG

    1. We set the working directory for our job; we set it on the GOPATH so that we don’t need to do anything additional.
    2. We set the Docker image inside which we want the job to run; as our app is built using Golang, we use an image that already has Golang installed.
    3. This step checks out our repository into the working directory.
    4. In this step, we build the binary.
    5. Here we set up Docker with the help of the setup_remote_docker key provided by CircleCI.
    6. In this step we create the tag we will use while building the image: we take the app version from the VERSION file and append the $CIRCLE_BUILD_NUM value to it, separated by a dash (`-`).
    7. Here we build the image and tag it.
    8. We install the AWS CLI to interact with ECR later.
    9. Here we log into ECR.
    10. We tag the image built in step 7 with the ECR repository name.
    11. Finally, we push the image to ECR.
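    The tag scheme from step 6 can be tried out in isolation. The VERSION file contents and build number below are made up for illustration; in a real build, CircleCI sets $CIRCLE_BUILD_NUM for you:

```shell
# Reproduce the pipeline's tag computation outside CircleCI.
cd "$(mktemp -d)"
echo "0.0.1" > VERSION                 # the app version file in the repo
CIRCLE_BUILD_NUM=42                    # provided by CircleCI in a real build
TAG="$(cat VERSION)-$CIRCLE_BUILD_NUM" # version and build number joined by `-`
echo "$TAG"                            # 0.0.1-42
```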

    Now we will deploy our helm charts. For this, we have a separate job deploy.

    deploy:
        docker: (1)
            - image: circleci/golang:1.10
        steps: (2)
          - checkout
          - run: (3)
              name: Install AWS cli
              command: export TZ=Europe/Minsk && sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone && sudo apt-get update && sudo apt-get install -y awscli
          - run: (4)
              name: Set the tag for the image, we will concatenate the app version and circle build number with a `-` char in between
              command:  echo 'export TAG=$(cat VERSION)-$CIRCLE_PREVIOUS_BUILD_NUM' >> $BASH_ENV
          - run: (5)
              name: Install and configure kubectl
              command: sudo curl -L https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl && sudo chmod +x /usr/local/bin/kubectl  
          - run: (6)
              name: Install and configure aws-iam-authenticator
              command: curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/linux/amd64/aws-iam-authenticator && sudo chmod +x ./aws-iam-authenticator && sudo cp ./aws-iam-authenticator /bin/aws-iam-authenticator
          - run: (7)
              name: Install latest awscli version
              command: sudo apt install unzip && curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip" && unzip awscli-bundle.zip && ./awscli-bundle/install -b ~/bin/aws
          - run: (8)
              name: Get the kubeconfig file 
              command: export KUBECONFIG=$HOME/.kube/kubeconfig && /home/circleci/bin/aws eks --region $AWS_REGION update-kubeconfig --name $EKS_CLUSTER_NAME
          - run: (9)
              name: Install and configure helm
              command: sudo curl -L https://storage.googleapis.com/kubernetes-helm/helm-v2.11.0-linux-amd64.tar.gz | tar xz && sudo mv linux-amd64/helm /bin/helm && sudo rm -rf linux-amd64
          - run: (10)
              name: Initialize helm
              command:  helm init --client-only --kubeconfig=$HOME/.kube/kubeconfig
          - run: (11)
              name: Install tiller plugin
              command: helm plugin install https://github.com/rimusz/helm-tiller --kubeconfig=$HOME/.kube/kubeconfig        
          - run: (12)
              name: Release helloapp using helm chart 
              command: bash scripts/release-helloapp.sh $TAG

    1. Set the docker image inside which we want to execute the job.
    2. Check out the code using `checkout` key
    3. Install AWS CLI.
    4. Setting the value of the tag just like we did in the build&pushImage job. Note that here we use the CIRCLE_PREVIOUS_BUILD_NUM variable, which gives us the build number of the build&pushImage job and ensures the tag values are the same.
    5. Download kubectl and make it executable.
    6. Install aws-iam-authenticator; this is required because my k8s cluster is on EKS.
    7. Here we install the latest version of the AWS CLI; EKS is a relatively newer AWS service, and older versions of the AWS CLI don’t support it.
    8. Here we fetch the kubeconfig file. This step will vary depending on where the k8s cluster has been set up. As my cluster is on EKS, I am getting the kubeconfig file via the AWS CLI; similarly, if your cluster is on GKE, you need to configure gcloud and use `gcloud container clusters get-credentials <cluster-name> --zone=<zone-name>`. We could also keep the kubeconfig file on some other secure storage system and fetch it from there.
    9. Download Helm and make it executable
    10. Initializing helm, note that we are initializing helm in client only mode so that it doesn’t start the tiller server.
    11. Download the tillerless helm plugin
    12. Execute the release-helloapp.sh shell script and pass it TAG value from step 4.

    In the release-helloapp.sh script, we first start Tiller. Then we check whether the release is already present: if it is, we upgrade it; otherwise, we make a new release. We override the image tag value in the chart with the tag of the newly built image, and finally we stop the Tiller server.

    #!/bin/bash
    TAG=$1
    echo "start tiller"
    export KUBECONFIG=$HOME/.kube/kubeconfig
    helm tiller start-ci
    export HELM_HOST=127.0.0.1:44134
    result=$(eval helm ls | grep helloapp) 
    if [ $? -ne "0" ]; then 
       helm install --timeout 180 --name helloapp --set image.tag=$TAG charts/helloapp
    else 
       helm upgrade --timeout 180 helloapp --set image.tag=$TAG charts/helloapp
    fi
    echo "stop tiller"
    helm tiller stop 
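    The install-or-upgrade branch of the script can be exercised in isolation by stubbing out the `helm ls` output. The release list below is made up; the point is that grep's exit status drives the decision, just as in the script:

```shell
# Stand-in for `helm ls`: a stubbed list of existing release names.
existing_releases="otherapp
helloapp"

# Same decision the script makes: upgrade if the release exists, else install.
if echo "$existing_releases" | grep -q helloapp; then
  action="upgrade"
else
  action="install"
fi
echo "$action"   # upgrade
```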

    The complete CircleCI config.yml file looks like:

    version: 2
    
    jobs:
      build&pushImage:
        working_directory: /go/src/hello-app
        docker:
          - image: circleci/golang:1.10
        steps:
          - checkout
          - run:
              name: build the binary
              command: go build -o hello-app
          - setup_remote_docker:
              docker_layer_caching: true
          - run:
              name: Set the tag for the image, we will concatenate the app version and circle build number with a `-` char in between
              command:  echo 'export TAG=$(cat VERSION)-$CIRCLE_BUILD_NUM' >> $BASH_ENV
          - run:
              name: Build the docker image
              command: docker build . -t ${CIRCLE_PROJECT_REPONAME}:$TAG
          - run:
              name: Install AWS cli
              command: export TZ=Europe/Minsk && sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone && sudo apt-get update && sudo apt-get install -y awscli
          - run:
              name: Login to ECR
              command: $(aws ecr get-login --region $AWS_REGION | sed -e 's/-e none//g')
          - run: 
              name: Tag the image with ECR repo name 
              command: docker tag ${CIRCLE_PROJECT_REPONAME}:$TAG ${HELLOAPP_ECR_REPO}:$TAG    
          - run: 
              name: Push the image to the ECR repo
              command: docker push ${HELLOAPP_ECR_REPO}:$TAG
      deploy:
        docker:
            - image: circleci/golang:1.10
        steps:
          - attach_workspace:
              at: /tmp/workspace
          - checkout
          - run:
              name: Install AWS cli
              command: export TZ=Europe/Minsk && sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone && sudo apt-get update && sudo apt-get install -y awscli
          - run:
              name: Set the tag for the image, we will concatenate the app version and circle build number with a `-` char in between
              command:  echo 'export TAG=$(cat VERSION)-$CIRCLE_PREVIOUS_BUILD_NUM' >> $BASH_ENV
          - run:
              name: Install and configure kubectl
              command: sudo curl -L https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl && sudo chmod +x /usr/local/bin/kubectl  
          - run:
              name: Install and configure aws-iam-authenticator
              command: curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/linux/amd64/aws-iam-authenticator && sudo chmod +x ./aws-iam-authenticator && sudo cp ./aws-iam-authenticator /bin/aws-iam-authenticator
          - run:
              name: Install latest awscli version
              command: sudo apt install unzip && curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip" && unzip awscli-bundle.zip && ./awscli-bundle/install -b ~/bin/aws
          - run:
              name: Get the kubeconfig file 
              command: export KUBECONFIG=$HOME/.kube/kubeconfig && /home/circleci/bin/aws eks --region $AWS_REGION update-kubeconfig --name $EKS_CLUSTER_NAME
          - run:
              name: Install and configure helm
              command: sudo curl -L https://storage.googleapis.com/kubernetes-helm/helm-v2.11.0-linux-amd64.tar.gz | tar xz && sudo mv linux-amd64/helm /bin/helm && sudo rm -rf linux-amd64
          - run:
              name: Initialize helm
              command:  helm init --client-only --kubeconfig=$HOME/.kube/kubeconfig
          - run:
              name: Install tiller plugin
              command: helm plugin install https://github.com/rimusz/helm-tiller --kubeconfig=$HOME/.kube/kubeconfig        
          - run:
              name: Release helloapp using helm chart 
              command: bash scripts/release-helloapp.sh $TAG
    workflows:
      version: 2
      primary:
        jobs:
          - build&pushImage
          - deploy:
              requires:
                - build&pushImage

    At the end of the file, we see the workflows section. Workflows control the order in which the jobs specified in the file are executed and establish dependencies and conditions between jobs. For example, we may want our deploy job to trigger only after the build job is complete, so we added a dependency between them. Similarly, if we want to exclude jobs from running on a particular branch, we can specify those types of conditions as well.

    We have used a few environment variables in our pipeline configuration; some of them were created by us, and some are made available by CircleCI. We created the AWS_REGION, HELLOAPP_ECR_REPO, EKS_CLUSTER_NAME, AWS_ACCESS_KEY_ID & AWS_SECRET_ACCESS_KEY variables. These variables are set via the CircleCI web console by going to the project's settings. The other variables we used are made available by CircleCI as part of its environment setup process. The complete list of environment variables set by CircleCI can be found here.
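    The TAG value used in both jobs is simple string concatenation of the VERSION file contents and the build number. As a quick illustration of that scheme (a sketch, not part of the actual pipeline):

```python
# Sketch of the tag scheme used in the config above: the VERSION file
# contents joined to the CircleCI build number with a "-" in between.
def image_tag(version_file_contents: str, build_num: str) -> str:
    """Return a tag like '1.0.3-42' from '1.0.3\n' and '42'."""
    return f"{version_file_contents.strip()}-{build_num}"

if __name__ == "__main__":
    print(image_tag("1.0.3\n", "42"))  # 1.0.3-42
```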

    Verify that the pipeline works:

    Once everything is set up properly, our application will get deployed on the k8s cluster and should be available for access. Get the external IP of the helloapp service and make a curl request to the hello endpoint:

    $ curl http://a31e25e7553af11e994620aebe144c51-242977608.us-west-2.elb.amazonaws.com/hello && printf "\n"
    
    {"Msg":"Hello World"}

    Now update the code, change the message “Hello World” to “Hello World Returns”, and push your code. It will take a few minutes for the pipeline to complete execution; once it is complete, make the curl request again to see the change reflected.

    $ curl http://a31e25e7553af11e994620aebe144c51-242977608.us-west-2.elb.amazonaws.com/hello && printf "\n"
    
    {"Msg":"Hello World Returns"}

    Also, verify that a new tag is created for the helloapp Docker image on ECR.

    Conclusion

    In this blog post, we explored how we can set up a CI/CD pipeline for Kubernetes and got basic exposure to CircleCI and Helm. Although Helm is not absolutely necessary for building a pipeline, it has lots of benefits and is widely used across the industry. We can extend the pipeline to handle cases where we have multiple environments like dev, staging & production and make the pipeline deploy the application to any of them depending on some conditions. We can also add more jobs, like integration tests. All the code used in this blog post is available here.

    Related Reads:

    1. Continuous Deployment with Azure Kubernetes Service, Azure Container Registry & Jenkins
    2. Know Everything About Spinnaker & How to Deploy Using Kubernetes Engine
  • Automated Containerization and Migration of On-premise Applications to Cloud Platforms

    Containerized applications are becoming more popular with each passing year. All enterprise applications are adopting container technology as they modernize their IT systems. Migrating your applications from VMs or physical machines to containers comes with multiple advantages like optimal resource utilization, faster deployment times, replication, quick cloning, less lock-in, and so on. Various container orchestration platforms like Kubernetes, Google Container Engine (GKE), and Amazon EC2 Container Service (Amazon ECS) help in quick deployment and easy management of your containerized applications. But in order to use these platforms, you need to migrate your legacy applications to containers or rewrite/redeploy your applications from scratch with the containerization approach. Rearchitecting your applications using the containerization approach is preferable, but is that possible for complex legacy applications? Is your deployment team capable enough to list down each and every detail about the deployment process of your application? Do you have the patience to author a Dockerfile for each of the components of your complex application stack?

    Automated migrations!

    Velotio has been helping customers with automated migration of VMs and bare-metal servers to various container platforms. We have developed automation to convert these migrated applications as containers on various container deployment platforms like GKE, Amazon ECS and Kubernetes. In this blog post, we will cover one such migration tool developed at Velotio which will migrate your application running on a VM or physical machine to Google Container Engine (GKE) by running a single command.

    Migration tool details

    We have named our migration tool A2C (Anything to Container). It can migrate applications running on any Unix or Windows operating system.

    The migration tool requires the following information about the server to be migrated:

    • IP of the server
    • SSH User, SSH Key/Password of the application server
    • Configuration file containing data paths for application/database/components (more details below)
    • Required name of your docker image (The docker image that will get created for your application)
    • GKE Container Cluster details

    In order to store persistent data, volumes can be defined in the container definition. Data changes made on a volume path remain persistent even if the container is killed or crashes. Volumes are basically filesystem paths from the host machine on which your container is running, from NFS, or from cloud storage. The container mounts the filesystem path from the host machine, so data changes are written to the host machine's filesystem instead of the container's filesystem. Our migration tool supports data volumes, which can be defined in the configuration file. It will automatically create disks for the defined volumes and copy data from your application server to these disks in a consistent way.

    The configuration file we have been talking about is basically a YAML file containing filesystem level information about your application server. A sample of this file can be found below:

    includes:
    - /
    volumes:
    - var/log/httpd
    - var/log/mariadb
    - var/www/html
    - var/lib/mysql
    excludes:
    - mnt
    - var/tmp
    - etc/fstab
    - proc
    - tmp

    The configuration file contains 3 sections: includes, volumes and excludes:

    • The includes section contains filesystem paths on your application server which you want to add to your container image.
    • The volumes section contains filesystem paths on your application server which store your application data. Generally, filesystem paths containing database files, application code files, configuration files, and log files are good candidates for volumes.
    • The excludes section contains filesystem paths which you don’t want to make part of the container. This may include temporary filesystem paths like /proc and /tmp and also NFS-mounted paths. Ideally, you would include everything by giving “/” in the includes section and exclude specifics in the excludes section.
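    To make the precedence between the three sections concrete, here is an illustrative sketch (not the actual A2C code) of how a filesystem path could be classified against an includes/volumes/excludes config:

```python
# Illustrative sketch of how the includes/volumes/excludes sections of the
# config file partition filesystem paths. This is NOT the actual A2C code.
def classify(path: str, includes, volumes, excludes) -> str:
    """Return 'volume', 'excluded', 'image', or 'ignored' for a path."""
    norm = path.lstrip("/")
    # Volume paths win: their data goes to persistent disks, not the image.
    if any(norm == v or norm.startswith(v + "/") for v in volumes):
        return "volume"
    if any(norm == e or norm.startswith(e + "/") for e in excludes):
        return "excluded"
    # Everything under an include path goes into the container image.
    if any(i == "/" or norm.startswith(i.lstrip("/")) for i in includes):
        return "image"
    return "ignored"

includes = ["/"]
volumes = ["var/www/html", "var/lib/mysql"]
excludes = ["tmp", "proc"]
print(classify("/var/www/html/index.php", includes, volumes, excludes))  # volume
print(classify("/tmp/x", includes, volumes, excludes))                   # excluded
print(classify("/etc/httpd/conf", includes, volumes, excludes))          # image
```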

    The Docker image name given as input to the migration tool is the Docker registry path in which the image will be stored, followed by the name and tag of the image. A Docker registry is like a GitHub for Docker images, where you can store all your images. Different versions of the same image can be stored by giving a version-specific tag to the image. GKE also provides a Docker registry. Since in this demo we are migrating to GKE, we will also store our image in the GKE registry.
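    Putting the pieces of a fully qualified image reference together (registry path, image name, tag); the registry host and project names below are made up for illustration:

```python
# Sketch: a fully qualified image reference is <registry>/<project>/<name>:<tag>.
# The registry host and project here are illustrative, not real resources.
def image_ref(registry: str, project: str, name: str, tag: str) -> str:
    return f"{registry}/{project}/{name}:{tag}"

print(image_ref("us.gcr.io", "my-gcp-project", "migrate-lamp", "v1"))
# us.gcr.io/my-gcp-project/migrate-lamp:v1
```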

    The GKE container cluster details given as input to the migration tool contain GKE-specific details like the GKE project name, container cluster name, and region name. A container cluster can be created in GKE to host the container applications. We have a separate set of scripts to perform the cluster creation operation. Container cluster creation can also be done easily through the GKE UI. For now, we will assume that we have a 3-node cluster created in GKE, which we will use to host our application.

    Tasks performed under migration

    Our migration tool (A2C) performs the following set of activities to migrate an application running on a VM or physical machine to a GKE container cluster:

    1. Install the A2C migration tool with all its dependencies on the target application server

    2. Create a docker image of the application server, based on the filesystem level information given in the configuration file

    3. Capture metadata from the application server like configured services information, port usage information, network configuration, external services, etc.

    4. Push the docker image to the GKE container registry

    5. Create a disk in Google Cloud for each volume path defined in the configuration file and prepopulate the disks with data from the application server

    6. Create a deployment spec for the container application in the GKE container cluster, which will open the required ports, configure required services, add multi-container dependencies, attach the prepopulated disks to containers, etc.

    7. Deploy the application, after which you will have your application running as containers in GKE with the application software in a running state. The new application URLs will be given as output.

    8. Load balancing and HA will be configured for your application.

    Demo

    For demonstration purposes, we will deploy a LAMP stack (Apache + PHP + MySQL) on a CentOS 7 VM and run the migration utility for the VM, which will migrate the application to our GKE cluster. After the migration, we will show our application running on GKE, preconfigured with the same data as on our VM.

    Step 1

    We set up a LAMP stack using Apache, PHP, and MySQL on a CentOS 7 VM in GCP. The PHP application can be used to list, add, delete, or edit user data. The data is stored in a MySQL database. We added some data to the database using the application, and the UI would show the following:

    Step 2

    Now we run the A2C migration tool, which will migrate this application stack running on a VM into a container and auto-deploy it to GKE.

    # ./migrate.py -c lamp_data_handler.yml -d "tcp://35.202.201.247:4243" -i migrate-lamp -p glassy-chalice-XXXXX -u root -k ~/mykey -l a2c-host --gcecluster a2c-demo --gcezone us-central1-b 130.211.231.58

    Pushing converter binary to target machine
    Pushing data config to target machine
    Pushing installer script to target machine
    Running converter binary on target machine
    [130.211.231.58] out: creating docker image
    [130.211.231.58] out: image created with id 6dad12ba171eaa8615a9c353e2983f0f9130f3a25128708762228f293e82198d
    [130.211.231.58] out: Collecting metadata for image
    [130.211.231.58] out: Generating metadata for cent7
    [130.211.231.58] out: Building image from metadata
    Pushing the docker image to GCP container registry
    Initiate remote data copy
    Activated service account credentials for: [glassy-chaliceXXXXX@appspot.gserviceaccount.com]
    for volume var/log/httpd
    Creating disk migrate-lamp-0
    Disk Created Successfully
    transferring data from source
    for volume var/log/mariadb
    Creating disk migrate-lamp-1
    Disk Created Successfully
    transferring data from source
    for volume var/www/html
    Creating disk migrate-lamp-2
    Disk Created Successfully
    transferring data from source
    for volume var/lib/mysql
    Creating disk migrate-lamp-3
    Disk Created Successfully
    transferring data from source
    Connecting to GCP cluster for deployment
    Created service file /tmp/gcp-service.yaml
    Created deployment file /tmp/gcp-deployment.yaml

    Deploying to GKE

    $ kubectl get pod
    
    NAME                            READY   STATUS              RESTARTS   AGE
    migrate-lamp-3707510312-6dr5g   0/1     ContainerCreating   0          58s

    $ kubectl get deployment
    
    NAME           DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    migrate-lamp   1         1         1            0           1m

    $ kubectl get service
    
    NAME           CLUSTER-IP     EXTERNAL-IP     PORT(S)                                    AGE
    kubernetes     10.59.240.1    <none>          443/TCP                                    23h
    migrate-lamp   10.59.248.44   35.184.53.100   3306:31494/TCP,80:30909/TCP,22:31448/TCP   53s

    You can access your application using the above connection details!

    Step 3

    Access the LAMP stack on GKE using the IP 35.184.53.100 on the default port 80, as was done on the source machine.

    Here is the Docker image being created in GKE Container Registry:

    We can also see that disks named migrate-lamp-x were created as part of this automated migration.

    A load balancer also got provisioned in GCP as part of the migration process.

    The following service and deployment files were created by our migration tool to deploy the application on GKE:

    # cat /tmp/gcp-service.yaml
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: migrate-lamp
      name: migrate-lamp
    spec:
      ports:
      - name: migrate-lamp-3306
        port: 3306
      - name: migrate-lamp-80
        port: 80
      - name: migrate-lamp-22
        port: 22
      selector:
        app: migrate-lamp
      type: LoadBalancer

    # cat /tmp/gcp-deployment.yaml
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      labels:
        app: migrate-lamp
      name: migrate-lamp
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: migrate-lamp
      template:
        metadata:
          labels:
            app: migrate-lamp
        spec:
          containers:
          - image: us.gcr.io/glassy-chalice-129514/migrate-lamp
            name: migrate-lamp
            ports:
            - containerPort: 3306
            - containerPort: 80
            - containerPort: 22
            securityContext:
              privileged: true
            volumeMounts:
            - mountPath: /var/log/httpd
              name: migrate-lamp-var-log-httpd
            - mountPath: /var/www/html
              name: migrate-lamp-var-www-html
            - mountPath: /var/log/mariadb
              name: migrate-lamp-var-log-mariadb
            - mountPath: /var/lib/mysql
              name: migrate-lamp-var-lib-mysql
          volumes:
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-0
            name: migrate-lamp-var-log-httpd
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-2
            name: migrate-lamp-var-www-html
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-1
            name: migrate-lamp-var-log-mariadb
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-3
            name: migrate-lamp-var-lib-mysql

    Conclusion

    Migrations are always hard for IT and development teams. At Velotio, we have been helping customers migrate to cloud and container platforms using streamlined processes and automation. Feel free to reach out to us at contact@rsystems.com to learn more about our cloud and container adoption/migration offerings.

  • How to Avoid Screwing Up CI/CD: Best Practices for DevOps Team

    Basic Fundamentals (one-line definitions):

    CI/CD is defined as continuous integration, continuous delivery, and/or continuous deployment. 

    Continuous Integration: 

    Continuous integration is defined as a practice where a developer’s changes are merged back to the main branch as soon as possible to avoid facing integration challenges.

    Continuous Delivery:

    Continuous delivery is basically the ability to get all the types of changes deployed to production or delivered to the customer in a safe, quick, and sustainable way.

    An oversimplified CI/CD pipeline

    Why CI/CD?

    • Avoid integration hell

    In most modern application development scenarios, multiple developers work on different features simultaneously. However, if all the source code is to be merged on the same day, the result can be a manual, tedious process of resolving conflicts between branches, as well as a lot of rework.  

    Continuous integration (CI) is the process of merging code changes frequently (daily or even multiple times a day) into a shared branch (aka the master or trunk branch). The CI process makes it easier and quicker to identify bugs, saving a lot of developer time and effort.

    • Faster time to market

    Less time is spent on solving integration problems and reworking, allowing faster time to market for products.

    • Have a better and more reliable code

    The changes are small and thus easier to test. Each change goes through a rigorous cycle of unit tests, integration/regression tests, and performance tests before being pushed to prod, ensuring better-quality code.

    • Lower costs 

    As we have a faster time to market and fewer integration problems,  a lot of developer time and development cycles are saved, leading to a lower cost of development.

    Enough theory; now let’s dive into “How do I get started?”

    Basic Overview of CI/CD

    Decide on your branching strategy

    A good branching strategy should have the following characteristics:

    • Defines a clear development process from initial commit to production deployment
    • Enables parallel development
    • Optimizes developer productivity
    • Enables faster time to market for products and services
    • Facilitates integration with all DevOps practices and tools, such as different version control systems

    Types of branching strategies (please refer to the references for more details):

    • Git flow – Ideal when handling multiple versions of the production code and for enterprise customers who have to adhere to release plans and workflows 
    • Trunk-based development – Ideal for simpler workflows and if automated testing is available, leading to a faster development time
    • Other branching strategies that you can read about are Github flow, Gitlab flow, and Forking flow.

    Build or compile your code 

    The next step is to build/compile your code, and if it is interpreted code, go ahead and package it.

    Build best practices:

    • Build once – build a single artifact and promote it across environments; building a separate artifact per environment is inadvisable.
    • Exact versions of third-party dependencies should be used.
    • Libraries used for debugging, etc., should be removed from the product package.
    • Have a feedback loop so that the team is made aware of the status of the build step.
    • Make sure your builds are versioned correctly using semver 2.0 (https://semver.org/).
    • Commit early, commit often.
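    On the SemVer point above: version components must be compared numerically, which is where naive string comparison breaks. A small illustration (the pre-release and build-metadata rules of the full SemVer 2.0 spec are omitted here):

```python
# Sketch: SemVer versions compare numerically field by field, not as strings.
# Pre-release/build-metadata rules from the full spec are omitted for brevity.
def semver_key(version: str):
    major, minor, patch = version.split(".")
    return (int(major), int(minor), int(patch))

# "1.10.0" is newer than "1.9.3", even though it sorts lower as a string.
assert semver_key("1.10.0") > semver_key("1.9.3")
assert "1.10.0" < "1.9.3"  # naive string comparison gets it wrong
```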

    Select tool for stitching the pipeline together

    • You can choose from GitHub actions, Jenkins, circleci, GitLab, etc.
    • Tool selection will not affect the quality of your CI/CD pipeline, but it can affect maintenance effort: a self-hosted service like Jenkins deployed on-prem requires more upkeep than a managed CI/CD service.

    Tools and strategy for SAST

    Instead of just DevOps, we should think of DevSecOps. To make the code more secure and reliable, we can introduce a step for SAST (static application security testing).

    SAST, or static analysis, is a testing procedure that analyzes source code to find security vulnerabilities. SAST scans the application code before the code is compiled. It’s also known as white-box testing, and it helps shift towards a security-first mindset as the code is scanned right at the start of SDLC.

    Problems SAST solves:

    • SAST tools give developers real-time feedback as they code, helping them fix issues before they pass the code to the next phase of the SDLC. 
    • This prevents security-related issues from being considered an afterthought. 

    Deployment strategies

    How will you deploy your code with zero downtime so that the customer has the best experience? Try and implement one of the strategies below automatically via CI/CD. This will help in keeping the blast radius to the minimum in case something goes wrong. 

    • Ramped (also known as rolling update or incremental): The new version is slowly rolled out to replace the older version of the product.
    • Blue/Green: The new version is released alongside the older version, then the traffic is switched to the newer version.
    • Canary: The new version is released to a selected group of users before doing a full rollout. This can be achieved by feature flagging as well. For more information, read about tools like LaunchDarkly (https://launchdarkly.com/) and Unleash (https://github.com/Unleash/unleash).
    • A/B testing: The new version is released to a subset of users under specific conditions.
    • Shadow: The new version receives real-world traffic alongside the older version and doesn’t impact the response.

    Config and Secret Management

    According to the 12-factor app methodology, application configs should be exposed to the application via environment variables. However, it does not restrict where these configurations are stored and sourced from.

    A few things to keep in mind while storing configs:

    • Versioning of configs always helps, but storing secrets in VCS is strongly discouraged.
    • For an enterprise, it is beneficial to use a cloud-agnostic solution.

    Solution:

    • Store your configuration secrets outside of the version control system.
    • You can use AWS Secrets Manager, Vault, or even S3 for storing your configs (e.g., S3 with KMS). There are other services available as well, so choose the one which suits your use case best.
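    Following the 12-factor approach, the application itself only ever reads configuration from environment variables; which backing store populates them is the pipeline's concern. A minimal sketch (the variable names are illustrative):

```python
import os

# 12-factor style: read config from the environment and fail fast when a
# required secret is missing. Variable names here are illustrative.
def require_env(name: str) -> str:
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value

os.environ["DB_PASSWORD"] = "example-only"  # would be injected by the pipeline
db_password = require_env("DB_PASSWORD")
db_host = os.environ.get("DB_HOST", "localhost")  # optional, with a default
```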

    Automate versioning and release notes generation

    All the releases should be tagged in the version control system. Versions can be automatically updated by looking at the git commit history and searching for keywords.

    There are many modules available for release notes generation. Try and automate these as well as a part of your CI/CD process. If this is done, you can successfully eliminate human intervention from the release process.
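    The keyword-based bump can be sketched in a few lines. This mirrors what Conventional-Commits-style tooling does, but it is only an illustration, not any particular module's implementation:

```python
# Sketch: decide the SemVer bump from commit-message keywords
# (breaking > feat > fix), Conventional-Commits style.
def bump_type(commit_messages) -> str:
    bump = "patch"
    for msg in commit_messages:
        if "BREAKING CHANGE" in msg or msg.startswith("major:"):
            return "major"
        if msg.startswith(("feat:", "minor:")):
            bump = "minor"
    return bump

def bump(version: str, kind: str) -> str:
    major, minor, patch = map(int, version.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"

print(bump("1.4.2", bump_type(["feat: add login", "fix: typo"])))  # 1.5.0
```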

    Example from a GitHub Actions workflow:

    - name: Automated Version Bump
      id: version-bump
      uses: 'phips28/gh-action-bump-version@v9.0.16'
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      with:
        commit-message: 'CI: Bump version to v{{version}}'

    Have a rollback strategy

    If regression, performance, or smoke tests fail after deployment to an environment, feedback should be given and the version should be rolled back automatically as part of the CI/CD process. This makes sure that the environment stays up and also reduces the MTTR (mean time to recovery) and MTTD (mean time to detection) in case there is a production outage due to a code deployment.

    GitOps tools like Argo CD and Flux make it easy to do things like this, but even if you are not using any GitOps tool, this can be easily managed using scripts or whatever tool you are using for deployment.
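    Whatever tool performs the actual revert, the gate logic looks the same. A sketch where run_smoke_checks and rollback are illustrative stand-ins for real tooling (e.g., a smoke-test suite and a kubectl/Helm rollback):

```python
# Sketch of an automated post-deploy gate: run smoke checks, roll back on
# failure. run_smoke_checks / rollback stand in for real tooling.
def post_deploy_gate(run_smoke_checks, rollback) -> str:
    if run_smoke_checks():
        return "release kept"
    rollback()  # e.g., `kubectl rollout undo` or a Helm rollback
    return "rolled back"

print(post_deploy_gate(lambda: False, lambda: None))  # rolled back
```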

    Include DB changes as a part of your CI/CD

    Databases are often created manually and frequently evolve through manual changes, informal processes, and even testing in production. Manual changes often lack documentation and are harder to review, test, and coordinate with software releases. This makes the system more fragile with a higher risk of failure.

    The correct way to do this is to include the database in source control and CI/CD pipeline. This lets the team document each change, follow the code review process, test it thoroughly before release, make rollbacks easier, and coordinate with software releases. 

    For a more enterprise or structured solution, we could use a tool such as Liquibase, Alembic, or Flyway.

    How it should ideally be done:

    • We can have a migration-based strategy where, for each DB change, an additional migration script is added and executed as part of CI/CD.
    • Things to keep in mind are that the CI/CD process should be the same across all the environments. Also, the amount of data on prod and other environments might vary drastically, so batching and limits should be used so that we don’t end up using all the memory of our database server.
    • As far as possible, DB migrations should be backward compatible. This makes it easier for rollbacks. This is the reason some companies only allow additive changes as a part of db migration scripts. 
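    The migration-based strategy above boils down to: apply pending scripts in order and record each one so it runs exactly once. A toy sketch using SQLite and a tracking table (as noted, real projects would reach for Liquibase, Alembic, or Flyway):

```python
import sqlite3

# Minimal sketch of a migration runner: scripts are applied in version order
# and recorded in a tracking table, so re-running the pipeline is a no-op.
MIGRATIONS = [  # (version, SQL) pairs; additive changes only
    ("001", "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    ("002", "ALTER TABLE users ADD COLUMN email TEXT"),
]

def migrate(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    for version, sql in sorted(MIGRATIONS):
        if version not in applied:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_migrations VALUES (?)", (version,))
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # second run applies nothing
print([r[0] for r in conn.execute("SELECT version FROM schema_migrations ORDER BY version")])
# ['001', '002']
```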

    Real-world scenarios

    • Gated approach 

    It is not always possible to have a fully automated CI/CD pipeline because the team may have just started the development of a product and might not have automated testing yet.

    So, in cases like these, we have manual gates that can be approved by the responsible teams. For example, we will deploy to the development environment and then wait for testers to test the code and approve the manual gate, then the pipeline can go forward.

    Most tools support these kinds of gates. Make sure that you are not holding any resources during this step; otherwise, you will end up blocking resources for the other pipelines.

    Example:

    https://www.jenkins.io/doc/pipeline/steps/pipeline-input-step/#input-wait-for-interactive-input

    def LABEL_ID = "yourappname-${UUID.randomUUID().toString()}"
    def BRANCH_NAME = "<Your branch name>"
    def GIT_URL = "<Your git url>"
    // Start Agent
    node(LABEL_ID) {
        stage('Checkout') {
            doCheckout(BRANCH_NAME, GIT_URL)
        }
        stage('Build') {
            ...
        }
        stage('Tests') {
            ...
        }    
    }
    // Kill Agent
    // Input Step
    timeout(time: 15, unit: "MINUTES") {
        input message: 'Do you want to approve the deploy in production?', ok: 'Yes'
    }
    // Start Agent Again
    node(LABEL_ID) {
        doCheckout(BRANCH_NAME, GIT_URL) 
        stage('Deploy') {
            ...
        }
    }
    def doCheckout(branchName, gitUrl){
        checkout([$class: 'GitSCM',
            branches: [[name: branchName]],
            doGenerateSubmoduleConfigurations: false,
            extensions:[[$class: 'CloneOption', noTags: true, reference: '', shallow: true]],
            userRemoteConfigs: [[credentialsId: '<Your credentials id>', url: gitUrl]]])
    }

    Observability of releases 

    Whenever we are debugging the root cause of issues in production, we might need the information below. As the system gets more complex with multiple upstream and downstream dependencies, it becomes imperative that we have this information all in one place for efficient debugging and support by the operations team.

    • When was the last deployment? What version was deployed?
    • The deployment history as to which version was deployed when along with the code changes that went in.

    Below are the two ways organizations generally follow to achieve this:

    • Have a release workflow that is tracked using a Change request or Service request on Jira or any other tracking tool.
    • For GitOps applications using tools like Argo CD and Flux, all this information is available as part of the version control system and can be derived from there.

    DORA metrics 

    The DevOps maturity of a team is measured mainly using the four metrics defined below, and CI/CD helps in improving all of them. So, teams and organizations should try to achieve Elite status on the DORA metrics.

    • Deployment Frequency: How often an org successfully releases to production
    • Lead Time for Changes: The amount of time a commit takes to get into prod
    • Change Failure Rate: The percentage of deployments causing a failure in prod
    • Time to Restore Service: How long an org takes to recover from a failure in prod
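    The first two of these metrics are straightforward to compute from a deployment log. A sketch with made-up records (the record shape is purely illustrative):

```python
from datetime import date

# Illustrative deployment records: (deploy date, caused a production failure?)
deployments = [
    (date(2023, 1, 2), False),
    (date(2023, 1, 9), True),
    (date(2023, 1, 16), False),
    (date(2023, 1, 23), False),
]

# Deployment frequency: deploys per day over the observed window.
days = (deployments[-1][0] - deployments[0][0]).days or 1
deploy_frequency = len(deployments) / days

# Change failure rate: fraction of deploys that caused a prod failure.
change_failure_rate = sum(failed for _, failed in deployments) / len(deployments)

print(f"{deploy_frequency:.2f} deploys/day, {change_failure_rate:.0%} change failure rate")
```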

    Conclusion 

    CI/CD forms an integral part of DevOps and SRE practices, and if done correctly, it can hugely impact the team’s and organization’s productivity.

    So, try and implement the above principles and get one step closer to having a highly productive team and a better product.

  • Creating a Frictionless SignUp Experience with Auth0 for your Application

    What is Auth0 and frictionless signup?

    Auth0 is a service that handles your application’s authentication and authorization needs with simple drop-in solutions. It can save time and risk compared to building your own authentication/authorization system. Auth0 even has its own universal login/signup page that can be customized through the dashboard, and it also provides APIs to create/manage users.

    A frictionless signup flow allows the user to use a core feature of the application without forcing the user to sign up first. Many companies use this flow, such as Bookmyshow, Redbus, Makemytrip, and Goibibo.

    So, as an example, we will see how an application like Bookmyshow looks with this frictionless flow. First, let’s assume the user is a first-time user of this application. The user lands on the landing page, selects a movie, selects the theater, selects the number of seats, and then lands on the payment page, where they fill in their contact details (email and mobile number) and complete the booking flow by paying for the ticket. At this point, the user has accessed the website and made a booking without even signing up.

    Later on, when the user signs up using the same contact details that were provided during booking, they will notice their previous bookings and other details waiting for them on the app’s account page.
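    The core of this flow is that guest bookings are keyed by contact details, and at signup the backend claims any matching guest records. A back-of-the-napkin sketch of that logic (data structures and names are illustrative, not the actual service code, and shown in Python rather than the React/Nest stack used below):

```python
# Sketch: guest bookings are stored keyed by contact email; when the same
# email later signs up, existing bookings are attached to the new account.
bookings = []  # each: {"email": ..., "movie": ..., "user_id": ...}
users = {}     # email -> user_id

def book_as_guest(email: str, movie: str):
    bookings.append({"email": email, "movie": movie, "user_id": None})

def sign_up(email: str, user_id: str):
    users[email] = user_id
    # Claim any guest bookings made with the same contact details.
    for b in bookings:
        if b["email"] == email and b["user_id"] is None:
            b["user_id"] = user_id

book_as_guest("jane@example.com", "Dune")
sign_up("jane@example.com", "auth0|abc123")
print([b["movie"] for b in bookings if b["user_id"] == "auth0|abc123"])  # ['Dune']
```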

    What will we be doing in this blog?

    In this blog, we will implement Auth0 and replicate a feature similar to the one mentioned above. In this code sample, we will use React.js for the frontend and Nest.js for the backend.

    To keep the blog short, we will only focus on the logic related to the frictionless signup with Auth0. We will not be going through other aspects, like payment service/integration, nest.js, ORM, etc.

    Setup for the Auth0 dashboard:

    Auth0’s documentation is pretty straightforward and easy to understand; we’ll link the sections for this setup, and you can easily sign up and continue your setup with the help of their documentation.

    Do note that you will have to create two applications for this flow: one is a Single-Page Application for your frontend, so that you can initiate login from your frontend app, and the other is for your server, so that you can use the Management API v2 to create users.

    After registering, you will get the client ID and client secret on the application details page. You will need these keys to plug into Auth0’s SDK so you can use its APIs in your application.

    Setup for your single-page application:

    To use Auth0’s API, we first have to install its SDK. For single-page applications, Auth0 has rewritten its auth0-js SDK as auth0-spa-js. But if you are using Angular, React, or Vue, Auth0 already provides a framework-specific wrapper for us to use.

    So, we will move on to installing its React wrapper and continuing with the setup:

    npm install @auth0/auth0-react

    Then, we will wrap our app with Auth0Provider and provide the keys from the Auth0 application settings dashboard:

    <Auth0Provider
         domain={process.env.NEXT_PUBLIC_AUTH0_DOMAIN}
         clientId={process.env.NEXT_PUBLIC_AUTH0_CLIENT_ID}
         redirectUri={
           typeof window !== 'undefined' &&
           `${window.location.origin}/auth-callback`
         }
         onRedirectCallback={onRedirectCallback}
         audience={process.env.NEXT_PUBLIC_AUTH0_AUDIENCE}
         // Safari uses ITP, which prevents silent auth.
         // See https://www.py4u.net/discuss/353302
         useRefreshTokens={true}
         cacheLocation="localstorage"
       >
         <App />
       </Auth0Provider>

    You will find the explanation of the above props, and more on Auth0’s React API, in their GitHub repository: https://github.com/auth0/auth0-react.

    But we do want to cover one issue with the authenticated state and redirection. We noticed that when Auth0 redirects back to our application, the isAuthenticated flag is not updated immediately. The states update sequentially, like so:

    • isLoading: false
      isAuthenticated: false
    • isLoading: true
      isAuthenticated: false
    • isLoading: false
      isAuthenticated: false
    • isLoading: false
      isAuthenticated: true

    This can be a pain if you have some common redirection logic based on the user‘s authentication state and user type. 

    What we found out from Auth0’s community forum is that Auth0 does take some time to parse and update its state, and only after those updates does it call the onRedirectCallback function. So it is safe to put your redirection logic in onRedirectCallback, but there is another issue with that.

    The function doesn’t have access to Auth0’s context, so you can’t read the user object or any other state inside it. Instead, you would want to redirect to a page that contains your redirection logic when onRedirectCallback is called.

    So, in place of the actual page set in redirectUri, you would want to use a buffer page, like an /auth-callback route, that just shows a progress bar and nothing else.
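    To illustrate, here is a minimal sketch of the decision logic such a buffer page could run once Auth0’s state has settled. The route names and the decideRedirect helper are hypothetical — they are not part of Auth0’s SDK, just one way to structure the redirect:

```javascript
// Hypothetical helper used by an /auth-callback buffer page.
// It inspects Auth0's settled state and picks the next route.
function decideRedirect({ isLoading, isAuthenticated, user }) {
  // While Auth0 is still parsing the redirect, stay on the buffer page.
  if (isLoading) return '/auth-callback';

  // Unauthenticated users go back to the landing page.
  if (!isAuthenticated) return '/';

  // Authenticated users land on their account page, where any
  // bookings made before signup are already linked to them.
  return user && user.email ? '/account' : '/';
}

// The intermediate states observed during the redirect all resolve
// to "stay put" until isLoading is false and isAuthenticated is true.
console.log(decideRedirect({ isLoading: true, isAuthenticated: false }));
console.log(
  decideRedirect({
    isLoading: false,
    isAuthenticated: true,
    user: { email: 'jane@example.com' },
  })
);
```

    Because the function is pure, it can be called from a useEffect hook on the buffer page whenever Auth0’s state changes, without caring about the order in which the intermediate states arrive.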

    Implementation:

    For login/signup, since we are using the Universal Login page, we don’t have to do much. We just initiate login with the loginWithRedirect() function from the UI, and Auth0 handles the rest.

    Now, for the core part of the blog, we will create a createBooking API on our NestJS backend, which accepts an email, a mobile number, and booking details (movie, theater location, number of seats), and tries to create a booking.

    In this frictionless flow, the application internally creates a user for the booking to refer to; otherwise, it would be difficult to show the bookings once the user signs up and tries to access them. 

    So, the logic goes as follows: first, check whether a user with the provided email exists in our DB. If not, create the user in Auth0 through its Management API with a temporary password, and link the newly created Auth0 user in our users table. Then, use this user to create the booking.

    Here is an overview of how the createBooking API will look:

    @Post('/bookings/create')
      async createBooking(
        @Body() createBookingDto: CreateBookingDto
      ): Promise<BookingResponseDto> {
        const { email } = createBookingDto;
        // Check whether the email exists. If it doesn't, create an account;
        // otherwise, use the existing user to create a booking.
        let user = await this.userRepository.findByEmail(email);

        if (!user) {
          // We use a random password here to create the user on Auth0
          const password = Utilities.generatePassword(16);
          const { auth0Response } = await this.createUserOnAuth0(
            email,
            password,
          );
          this.logger.debug(auth0Response, 'Created Auth0 User');

          const userData: CreateUserDto = {
            email,
            auth0UserId: auth0Response['_id'],
          };
          // Create and link the Auth0 user with our DB
          user = await this.userRepository.addUser(userData);
        }

        const booking = {
          userId: user.id,
          transactionId: createBookingDto.transaction.id, // Assuming the payment was done before this API call in a different service
          showId: createBookingDto.show.id,
          theaterId: createBookingDto.theater.id,
          seatNumbers: createBookingDto.seats,
        };
        // Create a booking
        const bookingObject = await this.bookingRepository.bookTicket(booking);

        return new BookingResponseDto(bookingObject);
      }

    As for creating the user on Auth0, we will use Auth0’s /dbconnections/signup endpoint.
    Apart from the config details that the API requires (client_id, client_secret, and connection), it also requires an email and a password. For the password, we will use a randomly generated one.

    After the user has been created, we will send a forgotten password email to that email address so that the user can set the password and access the account.

    Do note you will have to use the client_id, client_secret, and connection of the ManagementV2 application that was created in the Auth0 dashboard.

    private async createUserOnAuth0(
        email: string,
        password: string,
        createdBy: string,
        retryCount = 0,
      ): Promise<Record<string, string>> {
        try {
          const axiosResponse = await this.httpService
            .post(
              `https://${configService.getAuth0Domain()}/dbconnections/signup`,
              {
                client_id: configService.getAuth0ClientId(),
                client_secret: configService.getAuth0ClientSecret(),
                connection: configService.getAuth0Connection(),
                email,
                password,
              },
            )
            .toPromise();
     
          this.logger.log(
            axiosResponse.data.email,
            'Auth0 user created with email',
          );
     
          // Send password reset email
          this.sendPasswordResetEmail(email);
     
          return { auth0Response: axiosResponse.data, password };
        } catch (err) {
          this.logger.error(err);
          /**
           * {@link https://auth0.com/docs/connections/database/password-strength}
           * Auth0 does not send a specific error response, so we call createUserOnAuth0
           * again, assuming the password failed to meet Auth0's strength requirements.
           * We retry at most ERROR_RETRY_COUNT times so that we don't
           * end up in an infinite loop.
           */
          if (retryCount < ERROR_RETRY_COUNT) {
            return this.createUserOnAuth0(
              email,
              Utilities.generatePassword(16),
              createdBy,
              retryCount + 1,
            );
          }
     
          throw new HttpException(err, HttpStatus.BAD_REQUEST);
        }
      }

    To send the forgot-password email, we will use Auth0’s /dbconnections/change_password endpoint. The code is pretty straightforward.

    This way, the user can set a new password and access their account.

    private async sendPasswordResetEmail(email: string): Promise<void> {
        try {
          await this.httpService
            .post(
              `https://${configService.getAuth0Domain()}/dbconnections/change_password`,
              {
                client_id: configService.getAuth0ClientId(),
                client_secret: configService.getAuth0ClientSecret(),
                connection: configService.getAuth0Connection(),
                email,
              },
            )
            .toPromise();

          this.logger.log(email, 'Password reset email sent to');
        } catch (err) {
          this.logger.error(err);
        }
      }

    With this, the user can now make a booking without signing up, and a corresponding user is created in Auth0, so when they log in later using the Universal Login page, Auth0 will have a reference to them.

    Conclusion:

    Auth0 is a great platform for managing your application’s authentication and authorization needs if you have a simple enough login/signup flow. It can get a bit tricky when you are trying to implement a non-traditional or custom flow that Auth0 does not support out of the box. In such a scenario, you would need to add some custom code, as explained in the example above. 

  • UI Automation and API Testing with Cypress – A Step-by-step Guide

    These days, most web applications are driven by JavaScript frameworks spanning front-end and back-end development. So, we need a robust QA automation framework that covers APIs as well as end-to-end (E2E) tests. These tests check the user flows of a web application and confirm whether they meet the requirements. 

    Full-stack QA testing is critical in stabilizing APIs and UI, ensuring a high-quality product that satisfies user needs.

    To test UI and APIs independently, we can use several tools and frameworks, like Selenium, Postman, Rest-Assured, Nightwatch, Katalon Studio, and Jest, but this article will be focusing on Cypress.

    We will cover how we can do full stack QA testing using Cypress. 

    What exactly is Cypress?

    Cypress is a free, open-source, locally installed Test Runner and Dashboard Service for recording your tests. It is a frontend and backend test automation tool built for the next generation of modern web applications.

    It is useful for developers as well as QA engineers to test real-life applications developed in React.js, Angular.js, Node.js, Vue.js, and other front-end technologies.

    How does Cypress Work Functionally?

    Cypress is executed in the same run loop as your application. Behind Cypress is a Node.js server process.

    Most testing tools operate by running outside of the browser and executing remote commands across the network. Cypress does the opposite, while at the same time working outside of the browser for tasks that require higher privileges.

    Cypress takes snapshots of your application and enables you to time travel back to the state it was in when commands ran. 

    Why Use Cypress Over Other Automation Frameworks?

    Cypress is a JavaScript test automation solution for web applications.

    This all-in-one testing framework provides a chai assertion library with mocking and stubbing all without Selenium. Moreover, it supports the Mocha test framework, which can be used to develop web test automations.

    Key Features of Cypress:

    • Mocking – By mocking the server response, it has the ability to test edge cases.
    • Time Travel – It takes snapshots as your tests run, allowing users to go back and forth in time during test scenarios.
    • Flake Resistant – It automatically waits for commands and assertions before moving on.
    • Spies, Stubs, and Clocks – It can verify and control the behavior of functions, server responses, or timers.
    • Real Time Reloads – It automatically reloads whenever you make changes to your tests.
    • Consistent Results – It gives consistent and reliable tests that aren’t flaky.
    • Network Traffic Control – Easily control, stub, and test edge cases without involving your server.
    • Automatic Waiting – It automatically waits for commands and assertions without ever adding waits or sleeps to your tests. No more async hell. 
    • Screenshots and Videos – View screenshots taken automatically on failure, or videos of your entire test suite when it has run smoothly.
    • Debuggability – Readable error messages help you to debug quickly.

       


    Fig:- How Cypress works 

     

     Installation and Configuration of the Cypress Framework

    • To set up the project, initialize it with npm and install Cypress as a dev dependency:
    $ npm init
    $ npm install cypress --save-dev

    This will also create a package.json file for the test settings and project dependencies.

    The test naming convention should be test_name.spec.js 

    • To run the Cypress test, use the following command:
    $ npx cypress run --spec "cypress/integration/examples/tests/e2e_test.spec.js"

    • This is how the folder structure will look: 
    Fig:- Cypress Framework Outline

    REST API Testing Using Cypress

    It’s important to test APIs along with E2E UI tests, and it can also be helpful to stabilize APIs and prepare data to interact with third-party servers.

    Cypress provides the functionality to make an HTTP request.

    Using Cypress’s Request() method, we can validate GET, POST, PUT, and DELETE API Endpoints.

    Here are some examples: 

    describe('Testing API Endpoints Using Cypress', () => {

      it('Test GET Request', () => {
        cy.request('http://localhost:3000/api/posts/1')
          .then((response) => {
            expect(response.body).to.have.property('code', 200);
          });
      });

      it('Test POST Request', () => {
        cy.request({
          method: 'POST',
          url: 'http://localhost:3000/api/posts',
          body: {
            id: 2,
            title: 'Automation',
          },
        }).then((response) => {
          expect(response.body).to.have.property('title', 'Automation');
        });
      });

      it('Test PUT Request', () => {
        cy.request({
          method: 'PUT',
          url: 'http://localhost:3000/api/posts/2',
          body: {
            id: 2,
            title: 'Test Automation',
          },
        }).then((response) => {
          expect(response.body).to.have.property('title', 'Test Automation');
        });
      });

      it('Test DELETE Request', () => {
        cy.request({
          method: 'DELETE',
          url: 'http://localhost:3000/api/posts/2',
        }).then((response) => {
          expect(response.body).to.be.empty;
        });
      });

    });

    How to Write End-to-End UI Tests Using Cypress

    With Cypress end-to-end testing, you can replicate user behaviour on your application and cross-check whether everything is working as expected. In this section, we’ll check useful ways to write E2E tests on the front-end using Cypress. 

    Here is an example of how to write E2E test in Cypress: 

    How to Pass Test Case Using Cypress

    1. Navigate to the Google website
    2. Click on the search input field 
    3. Type Cypress and press enter  
    4. The search results should contain Cypress

    How to Fail Test Case Using Cypress

    1. Navigate to the wrong URL http://localhost:8080
    2. Click on the search input field 
    3. Type Cypress and press enter
    4. The search results should contain Cypress  
    describe('Testing Google Search', () => {

      // To pass Test Case 1
      it('I can search for Valid Content on Google', () => {
        cy.visit('https://www.google.com');
        cy.get("input[title='Search']").type('Cypress').type('{enter}');
        cy.contains('https://www.cypress.io');
      });

      // To fail Test Case 2
      it('I can navigate to Wrong URL', () => {
        cy.visit('http://localhost:8080');
        cy.get("input[title='Search']").type('Cypress').type('{enter}');
        cy.contains('https://www.cypress.io');
      });

    });

    Cross Browser Testing Using Cypress 

    Cypress can run tests across the latest releases of multiple browsers. It currently has support for Chrome and Firefox (beta). 

    Cypress supports the following browsers:

    • Google Chrome
    • Firefox (beta)
    • Chromium
    • Edge
    • Electron

    Browsers can be specified via the --browser flag when using the run command to launch Cypress. npm scripts can be used as shortcuts in package.json to launch Cypress with a specific browser more conveniently. 

    To run tests on browsers:

    $ npx cypress run --browser chrome --spec "cypress/integration/examples/tests/e2e_test.spec.js"

    Here is an example of a package.json file to show how to define the npm script:

    "scripts": {
      "cy:run:chrome": "cypress run --browser chrome",
      "cy:run:firefox": "cypress run --browser firefox"
    }

    Cypress Reporting

    Reporter options can be specified in the cypress.json configuration file or via CLI options. Cypress supports the following reporting capabilities:

    • Mocha Built-in Reporting – As Cypress is built on top of Mocha, all of Mocha’s built-in reporters are supported by default.
    • JUnit and TeamCity – These third-party Mocha reporters are bundled with Cypress.

    To install additional dependencies for report generation: 

    Installing Mochawesome:

    $ npm install mochawesome

    Or installing the JUnit reporter:

    $ npm install mocha-junit-reporter

    Examples of a config file and CLI for the Mochawesome report 

    • Cypress.json config file:
    {
        "reporter": "mochawesome",
        "reporterOptions":
        {
            "reportDir": "cypress/results",
            "overwrite": false,
            "html": true,
            "json": true
        }
    }

    • CLI Reporting:
    $ npx cypress run --reporter mochawesome --spec "cypress/integration/examples/tests/e2e_test.spec.js"

    Examples of a config File and CLI for the JUnit Report: 

    • Cypress.json config file for JUnit: 
    {
        "reporter": "junit",
        "reporterOptions": 
        {
            "reportDir": "cypress/results",
            "mochaFile": "results/my-test-output.xml",
            "toConsole": true
        }
    }

    • CLI Reporting:
    $ npx cypress run --reporter junit --reporter-options "mochaFile=results/my-test-output.xml,toConsole=true"

    Fig:- Collapsed View of Mochawesome Report

     

    Fig:- Expanded View of Mochawesome Report

     

    Fig:- Mochawesome Report Settings

    Additional Possibilities of Using Cypress 

    There are several other things you can do with Cypress that we could not cover in this article, although we’ve covered the most important aspects of the tool.

    Here are some other usages of Cypress that we could not explore here:

    • Continuous integration and continuous deployment with Jenkins 
    • Behavior-driven development (BDD) using Cucumber
    • Automating applications with XHR
    • Test retries and retry ability
    • Custom commands
    • Environment variables
    • Plugins
    • Visual testing
    • Slack integration 
    • Model-based testing
    • GraphQL API Testing 

    Limitations with Cypress

    Cypress is a great tool with a great community supporting it. Although it is still young, it is being continuously developed and is quickly catching up with the other full-stack automation tools on the market.

    So, before you decide to use Cypress, we would like to touch upon some of its limitations. These limitations are for version 5.2.0, the latest version of Cypress at the time of this article’s publishing.

    Here are the current limitations of using Cypress:

    • It can’t use two browsers at the same time.
    • It doesn’t provide support for multi-tabs.
    • It only supports the JavaScript language for creating test cases.
    • It doesn’t currently provide support for browsers like Safari and IE.
    • It has limited support for iFrames.

    Conclusion

    Cypress is a great tool with a growing feature-set. It makes setting up, writing, running, and debugging tests easy for QA automation engineers. It also has a quicker learning cycle with a good, baked-in execution environment.

    It is fully JavaScript/MochaJS-oriented with specific new APIs to make scripting easier. It also provides a flexible test execution plan that can implement significant and unexpected changes.

    In this blog, we talked about how Cypress works functionally, performed end-to-end UI testing, and touched upon its limitations. We hope you learned more about using Cypress as a full-stack test automation tool.

    Related QA Articles

    1. Building a scalable API testing framework with Jest and SuperTest
    2. Automation testing with Nightwatch JS and Cucumber: Everything you need to know
    3. API testing using Postman and Newman
  • Kubernetes CSI in Action: Explained with Features and Use Cases

    Kubernetes volume plugins have been a great way for third-party storage providers to support block or file storage systems by extending the Kubernetes volume interface, but they are “in-tree” in nature.

    In this post, we will dig into the Kubernetes Container Storage Interface (CSI). We will use the HostPath CSI driver locally on a single-node bare-metal cluster to get a conceptual understanding of the CSI workflow for provisioning a Persistent Volume and its lifecycle. We will also explain a cool feature: snapshotting a volume and restoring it.

    Introduction

    CSI is a standard for exposing arbitrary block and file storage systems to containerized workloads on container orchestrators like Kubernetes, Mesos, and Cloud Foundry. It makes it very easy for third-party storage providers to expose new storage systems using CSI, without actually touching the Kubernetes code. A single independent implementation of a CSI driver by a storage provider will work on any orchestrator.

    This new plugin mechanism has been one of the most powerful features of Kubernetes. It enables the storage vendors to:

    1. Automatically create storage when required.
    2. Make storage available to containers wherever they’re scheduled.
    3. Automatically delete the storage when no longer needed.

    This decoupling helps vendors maintain independent release and feature cycles and focus on the API implementation without worrying about backward incompatibility, and it makes supporting their plugin as easy as deploying a few pods.

     

    Image Source: Weekly Geekly

    Why CSI?

    Prior to CSI, k8s volume plugins had to be “in-tree”, compiled and shipped with the core Kubernetes binaries. This meant storage providers had to check their plugin code into the core k8s codebase if they wished to add support for a new storage system.

    A plugin-based solution, flex-volume, tried to address this issue by exposing an exec-based API for external plugins. Although it also tried to work on the similar notion of being detached from the k8s binary, there were several major problems with that approach. Firstly, it needed root access to the host and master file system to deploy the driver files. 

    Secondly, it came with a huge baggage of prerequisites and OS dependencies that were assumed to be available on the host. CSI implicitly solves all these issues by being containerized and using the k8s storage primitives.

    CSI has evolved as the one-stop solution addressing all the above issues, enabling storage plugins to be out-of-tree and deployed via standard k8s primitives such as PVCs, PVs, and StorageClasses.

    The main aim of introducing CSI is to establish a standard mechanism of exposing any type of storage system under-the-hood for all the container orchestrators.

    Deploy the Driver Plugin

    The CSI driver comprises a few main components: various side-cars, along with the vendor’s implementation of the CSI services, which is understood by the container orchestrators (COs). The CSI services will be described later in the blog. Let’s try deploying the HostPath CSI driver.

    Prerequisites:

    • Kubernetes cluster (not Minikube or Microk8s): Running version 1.13 or later
    • Access to the terminal with Kubectl installed

    Deploying HostPath Driver Plugin:

    1. Clone the repo of the HostPath driver plugin locally, or just copy the deploy and example folders from the root path
    2. Check out the master branch (if not already on it)
    3. The hostpath driver comprises manifests for the following side-cars (in ./deploy/master/hostpath/):
      – csi-hostpath-attacher.yaml
      – csi-hostpath-provisioner.yaml
      – csi-hostpath-snapshotter.yaml
      – csi-hostpath-plugin.yaml:
      This deploys 2 containers: a node-driver-registrar and a hostpath-plugin
    4. The driver also includes a separate Service for each component, and the deployment files use StatefulSets for the containers
    5. It also deploys ClusterRoleBindings and RBAC rules for each component, maintained in a separate repo
    6. Each component (side-car) is managed in a separate repository
    7. The /deploy/util/ folder contains a shell script which handles the complete deployment process
    8. After copying the folder or cloning the repo, just run:    
    $ deploy/kubernetes-latest/deploy-hostpath.sh

         9. The output will be similar to:

    applying RBAC rules
    kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-provisioner/v1.0.1/deploy/kubernetes/rbac.yaml
    serviceaccount/csi-provisioner created
    clusterrole.rbac.authorization.k8s.io/external-provisioner-runner created
    clusterrolebinding.rbac.authorization.k8s.io/csi-provisioner-role created
    role.rbac.authorization.k8s.io/external-provisioner-cfg created
    rolebinding.rbac.authorization.k8s.io/csi-provisioner-role-cfg created
    kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-attacher/v1.0.1/deploy/kubernetes/rbac.yaml
    serviceaccount/csi-attacher created
    clusterrole.rbac.authorization.k8s.io/external-attacher-runner created
    clusterrolebinding.rbac.authorization.k8s.io/csi-attacher-role created
    role.rbac.authorization.k8s.io/external-attacher-cfg created
    rolebinding.rbac.authorization.k8s.io/csi-attacher-role-cfg created
    kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v1.0.1/deploy/kubernetes/rbac.yaml
    serviceaccount/csi-snapshotter created
    clusterrole.rbac.authorization.k8s.io/external-snapshotter-runner created
    clusterrolebinding.rbac.authorization.k8s.io/csi-snapshotter-role created
    deploying hostpath components
       deploy/kubernetes-1.13/hostpath/csi-hostpath-attacher.yaml
            using           image: quay.io/k8scsi/csi-attacher:v1.0.1
    service/csi-hostpath-attacher created
    statefulset.apps/csi-hostpath-attacher created
       deploy/kubernetes-1.13/hostpath/csi-hostpath-plugin.yaml
            using           image: quay.io/k8scsi/csi-node-driver-registrar:v1.0.2
            using           image: quay.io/k8scsi/hostpathplugin:v1.0.1
            using           image: quay.io/k8scsi/livenessprobe:v1.0.2
    service/csi-hostpathplugin created
    statefulset.apps/csi-hostpathplugin created
       deploy/kubernetes-1.13/hostpath/csi-hostpath-provisioner.yaml
            using           image: quay.io/k8scsi/csi-provisioner:v1.0.1
    service/csi-hostpath-provisioner created
    statefulset.apps/csi-hostpath-provisioner created
       deploy/kubernetes-1.13/hostpath/csi-hostpath-snapshotter.yaml
            using           image: quay.io/k8scsi/csi-snapshotter:v1.0.1
    service/csi-hostpath-snapshotter created
    statefulset.apps/csi-hostpath-snapshotter created
       deploy/kubernetes-1.13/hostpath/csi-hostpath-testing.yaml
            using           image: alpine/socat:1.0.3
    service/hostpath-service created
    statefulset.apps/csi-hostpath-socat created
    11:43:06 waiting for hostpath deployment to complete, attempt #0
    11:43:16 waiting for hostpath deployment to complete, attempt #1
    11:43:26 waiting for hostpath deployment to complete, attempt #2
    deploying snapshotclass
    volumesnapshotclass.snapshot.storage.k8s.io/csi-hostpath-snapclass created

         10. Once the driver is deployed, we can check:

    $ kubectl get pods
    
    NAME                          READY   STATUS        RESTARTS    AGE
    csi-hostpath-attacher-0       1/1     Running        0          1m06s
    csi-hostpath-provisioner-0    1/1     Running        0          1m06s
    csi-hostpath-snapshotter-0    1/1     Running        0          1m06s
    csi-hostpathplugin-0          2/2     Running        0          1m06s

    CSI API-Resources:

    $ kubectl api-resources | grep -E "^Name|csi|storage|PersistentVolume"
    
    NAME                     APIGROUP                  NAMESPACED     KIND
    persistentvolumeclaims                             true           PersistentVolumeClaim
    persistentvolumes                                  false          PersistentVolume
    csidrivers               csi.storage.k8s.io        false          CSIDriver
    volumesnapshotclasses    snapshot.storage.k8s.io   false          VolumeSnapshotClass
    volumesnapshotcontents   snapshot.storage.k8s.io   false          VolumeSnapshotContent
    volumesnapshots          snapshot.storage.k8s.io   true           VolumeSnapshot
    csidrivers               storage.k8s.io            false          CSIDriver
    csinodes                 storage.k8s.io            false          CSINode
    storageclasses           storage.k8s.io            false          StorageClass
    volumeattachments        storage.k8s.io            false          VolumeAttachment

    There are resources from the core API group, from storage.k8s.io, and resources created by the CRDs snapshot.storage.k8s.io and csi.storage.k8s.io.

    CSI SideCars

    K8s CSI containers are sidecars that simplify the development and deployment of CSI drivers on a k8s cluster. Different drivers share similar logic to trigger the appropriate operations against the “CSI volume driver” container and update the Kubernetes API as appropriate.

    The common controller (common containers) has to be bundled with the provider-specific containers.

    The official sig-k8s contributors maintain the following basic skeleton containers for any CSI Driver:

    Note: In the case of the HostPath driver, only the ‘csi-hostpath-plugin’ container contains driver-specific code. All the others are common CSI sidecar containers. These containers share a socket mounted in the socket-dir volume of type EmptyDir, which makes their communication possible using gRPC.

    1. External Provisioner:
      It  is a sidecar container that watches Kubernetes PersistentVolumeClaim objects and triggers CSI CreateVolume and DeleteVolume operations against a driver endpoint.
      The CSI external-attacher also supports the Snapshot DataSource. If a Snapshot CRD is specified as a data source on a PVC object, the sidecar container fetches the information about the snapshot by fetching the SnapshotContent object and populates the data source field indicating to the storage system that new volume should be populated using specified snapshot.
    2. External Attacher :
      It  is a sidecar container that watches Kubernetes VolumeAttachment objects and triggers CSI ControllerPublish and ControllerUnpublish operations against a driver endpoint
    3. Node-Driver Registrar:
      It is a sidecar container that registers the CSI driver with kubelet, and adds the drivers custom NodeId to a label on the Kubernetes Node API Object. The communication of this sidecar is handled by the ‘Identity-Service’ implemented by the driver. The CSI Driver is registered with the kubelet using its device–plugin mechanisms
    4. External Snapshotter:
      It is a sidecar container that watches the Kubernetes API server for VolumeSnapshot and VolumeSnapshotContent CRD objects. The creation of a new VolumeSnapshot object referencing a SnapshotClass CRD object corresponding to this driver causes the sidecar container to provision a new snapshot. The sidecar also listens for the signal that indicates the successful creation of a VolumeSnapshot, and immediately creates the corresponding VolumeSnapshotContent resource.
    5. Cluster-driver Registrar:
      It is a sidecar container that registers the CSI driver with the cluster by creating a CSIDriver object, which enables the driver to customize the way Kubernetes interacts with it.

    Developing a CSI Driver

    To start implementing a CSI driver, an application must implement the gRPC services described by the CSI specification.

    The minimum services a CSI application must implement are the following:

    • CSI Identity service: enables Kubernetes components and CSI sidecar containers to identify the driver
    • CSI Node service: its required methods enable callers to make a volume available at a specified path.

    All the required services may be implemented independently or in the same driver application. The CSI driver application should be containerised to make it easy to deploy on Kubernetes. Once the driver-specific logic is containerised, it can be bundled with the sidecars and deployed in node and/or controller mode.
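    To get a feel for the shape of these services, here is a minimal, dependency-free sketch of an Identity-service-like implementation. A real driver would implement the generated csi.IdentityServer interface from the CSI spec's Go package and register it on a gRPC server listening on the plugin's Unix socket; the types below are simplified, hypothetical stand-ins for illustration only.

```go
package main

import "fmt"

// Simplified stand-ins for the CSI Identity service request/response
// messages (the real ones are protobuf-generated types).
type GetPluginInfoRequest struct{}

type GetPluginInfoResponse struct {
	Name          string
	VendorVersion string
}

// identityServer mirrors the role of the CSI Identity service: it lets
// Kubernetes components and sidecars identify the driver.
type identityServer struct {
	name    string
	version string
}

// GetPluginInfo reports the driver's name and version.
func (s *identityServer) GetPluginInfo(req *GetPluginInfoRequest) (*GetPluginInfoResponse, error) {
	return &GetPluginInfoResponse{Name: s.name, VendorVersion: s.version}, nil
}

func main() {
	ids := &identityServer{name: "hostpath.csi.k8s.io", version: "v1.0.0"}
	resp, _ := ids.GetPluginInfo(&GetPluginInfoRequest{})
	fmt.Println(resp.Name, resp.VendorVersion)
}
```

    The Node and Controller services follow the same pattern: a struct implementing the generated gRPC interface, with each method mapping to one of the volume operations described above.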

    Capabilities

    CSI also has provisions for a custom driver to support many additional features/services via “Capabilities”: a list of all the features the driver supports.

    Note: Refer to the link for a detailed explanation of developing a CSI driver.

    Try out provisioning the PV:

    1. A storage class with `volumeBindingMode: WaitForFirstConsumer`:
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: csi-hostpath-sc
    provisioner: hostpath.csi.k8s.io
    volumeBindingMode: WaitForFirstConsumer

    2. Now, a PVC is also needed, to be consumed by the sample Pod.

    And a sample Pod is also required, so that it can be bound to the PV created from the PVC in the step above.
    These files can be found in the ./examples directory and can be deployed using the kubectl create or apply commands.

    The PVC manifest:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: pvc-fs
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
      storageClassName: csi-hostpath-sc # defined in csi-setup.yaml

    3. The Pod to consume the PV

    kind: Pod
    apiVersion: v1
    metadata:
      name: pod-fs
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - csi-hostpathplugin
            topologyKey: kubernetes.io/hostname
      containers:
        - name: my-frontend
          image: busybox
          volumeMounts:
          - mountPath: "/data"
            name: my-csi-volume
          command: [ "sleep", "1000000" ]
      volumes:
        - name: my-csi-volume
          persistentVolumeClaim:
            claimName: pvc-fs # defined in csi-pvc.yaml

    Validate the deployed components:

    $ kubectl get pv
    
    NAME                                    CAPACITY ACCESSMODES STATUS CLAIM         STORAGECLASS     
    pvc-58d5ec38-03e5-11e9-be51-000c29e88ff1  1Gi       RWO      Bound  default/pvc-fs csi-hostpath-sc
    $ kubectl get pvc
    
    NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS
    pvc-fs   Bound    pvc-58d5ec38-03e5-11e9-be51-000c29e88ff1   1Gi        RWO            csi-hostpath-sc

    Brief on how it works:

    • csi-provisioner issues a CreateVolumeRequest call on the CSI socket; hostpath-plugin handles CreateVolume and informs the caller of the volume's creation
    • csi-provisioner creates the PV and updates the PVC to Bound, and the VolumeAttachment object is created by the controller-manager
    • csi-attacher, which watches for VolumeAttachments, submits a ControllerPublishVolume RPC call to hostpath-plugin; hostpath-plugin handles ControllerPublishVolume and calls hostpath AttachVolume, and csi-attacher updates the VolumeAttachment status
    • All this time, kubelet waits for the volume to be attached and submits NodeStageVolume (format and mount on the node into the staging dir) to csi-node.hostpath-plugin
    • csi-node.hostpath-plugin gets the NodeStageVolume call and mounts to `/var/lib/kubelet/plugins/kubernetes.io/csi/pv/<pv-name>/globalmount`, then responds to kubelet
    • kubelet calls NodePublishVolume (mount the volume into the pod’s dir)
    • csi-node.hostpath-plugin performs NodePublishVolume and mounts the volume to `/var/lib/kubelet/pods/<pod-uuid>/volumes/kubernetes.io~csi/<pvc-name>/mount`

      Finally, kubelet starts the container of the pod with the provisioned volume.


    Let’s confirm that the Hostpath CSI driver works:

    The Hostpath driver is configured to create new volumes under the ‘/tmp’ directory inside the hostpath container of the plugin DaemonSet. This path persists as long as the DaemonSet pod is up and running.

    A file written to a properly mounted Hostpath volume inside an application pod should therefore show up inside the hostpath container.

    1. To try out the statement above, create a file in the application pod:

    $ kubectl exec -it pod-fs /bin/sh
    
    / # touch /data/my-test
    / # exit

    2. Then exec into the hostpath container and run a ‘find’ command to check:

    $ kubectl exec -it $(kubectl get pods --selector app=csi-hostpathplugin 
    -o jsonpath='{.items[*].metadata.name}') -c hostpath /bin/sh
    
    / # find /tmp -name my-test
    /tmp/057485ab-c714-11e8-bb16-000c2967769a/my-test
    / # exit

    Note: A better way to verify is to inspect the VolumeAttachment API object that was created to represent the attached volume.

    Support for Snapshot

    Volume snapshotting was introduced as an alpha feature for Kubernetes persistent volumes in v1.12.

    Being an alpha feature, the ‘VolumeSnapshotDataSource’ feature gate needs to be enabled. This feature opens up a pool of use cases around keeping snapshots of data locally. The API objects used are VolumeSnapshot, VolumeSnapshotContent, and VolumeSnapshotClass, developed with a notion and relationship similar to that of PV, PVC, and StorageClass.

    To create a snapshot, a VolumeSnapshot object needs to be created with a PVC as its source and a VolumeSnapshotClass, and the csi-snapshotter container will create a VolumeSnapshotContent.

    Let’s try out with an example:

    Just like the provisioner creates a PV for us when a PVC is created, a VolumeSnapshotContent object will be created when a VolumeSnapshot object is created.

    apiVersion: snapshot.storage.k8s.io/v1alpha1
    kind: VolumeSnapshot
    metadata:
      name: fs-pv-snapshot
    spec:
      snapshotClassName: csi-hostpath-snapclass
      source:
        name: pvc-fs
        kind: PersistentVolumeClaim

    The VolumeSnapshotContent is created. The output will look like:

    $ kubectl get volumesnapshotcontent
     
    NAME                                                  AGE
    snapcontent-f55db632-c716-11e8-8911-000c2967769a      14s

    Restore from the snapshot:

    The dataSource field in the PVC can accept a source of kind VolumeSnapshot, which will create a new PV from that volume snapshot when a Pod is bound to this PVC.

    The new PV will have the same data as the PV from which the snapshot was taken, and it can be attached to any other pod. A new pod consuming that PV demonstrates the possible “Restore” and “Cloning” use cases.

    The PVC to restore from the snapshot:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: fs-pvc-restore
    spec:
      storageClassName: csi-hostpath-sc
      dataSource:
        name: fs-pv-snapshot
        kind: VolumeSnapshot
        apiGroup: snapshot.storage.k8s.io
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi

    And we’re done here.

    Conclusion

    Since Kubernetes has started supporting the Raw Block Device type of Persistent Volume, the hostpath driver and any other driver may also support it, which will be explained in the next part of this blog. In this blog, we got a deep understanding of CSI, its components and services, the new features of CSI, and the problems it solves. The CSI Hostpath driver was used throughout to experiment with and understand the provisioner and snapshotter flows for PVs and VolumeSnapshots. The PV snapshot, restore, and cloning use cases were also demonstrated.

  • Game of Hackathon 2019: A Glimpse of Our First Hackathon

    Hackathons for technology startups are like picking up a good book: it may take a long time before you start, but once you do, you wonder why you didn’t do it sooner. Last Friday, on 31st May 2019, we conducted our first Hackathon at Velotio, and it was a grand success!

    Although challenging projects from our clients are always pushing us to learn new things, we saw a whole new level of excitement and enthusiasm among our employees to bring their own ideas to life during the event. The 12-hour Hackathon saw participation from 15 teams, a lot of whom came well-prepared in advance with frameworks and ideas to start building immediately.

    Here are some pictures from the event:

    The intense coding session was then followed by a series of presentations where all the teams showcased their solutions.

    The first prize was bagged by Team Mitron who worked on a performance review app and was awarded a cash prize of 75,000 Rs.

    The second prize of 50,000 Rs. was awarded to Team WireQ. Their solution was an easy sync up platform that would serve as a single source of truth for the designers, developers, and testers to work seamlessly on projects together — a problem we have often struggled with in-house as well.

    Our QA Team put together a complete test suite framework that would perform all functional and non-functional testing activities, including maintaining consistency in testing, minimal code usage, improvement in test structuring and so on. They won the third prize worth 25,000 Rs.

    Our heartiest congratulations to all the winners!

    This Hackathon has definitely injected a lot of positive energy and innovation into our work culture and got so many of us to collaborate more effectively and learn from each other. We cannot wait to do our next Hackathon and share more with you all.

    Until then, stay tuned!

  • Introduction to the Modern Server-side Stack – Golang, Protobuf, and gRPC

    There are some new players in town for server programming and this time it’s all about Google. Golang has rapidly been gaining popularity ever since Google started using it for their own production systems. And since the inception of Microservice Architecture, people have been focusing on modern data communication solutions like gRPC along with Protobuf. In this post, I will walk you through each of these briefly.

    Golang

    Golang, or Go, is an open-source, general-purpose programming language by Google. It has been gaining popularity recently, for all the good reasons. It may come as a surprise to most people that the language is almost 10 years old and has been production-ready for almost 7 years, according to Google.

    Golang is designed to be simple, modern, easy to understand, and quick to grasp. The creators designed it so that an average programmer can gain a working knowledge of the language over a weekend, and I can attest that they definitely succeeded. Speaking of the creators, they are experts who were involved with the original C language, so we can be assured that they know what they are doing.

    That’s all good but why do we need another language?

    For most of the use cases, we actually don’t. In fact, Go doesn’t solve any new problems that haven’t been solved by some other language/tool before. But it does try to solve a specific set of relevant problems that people generally face in an efficient, elegant, and intuitive manner. Go’s primary focus is the following:

    • First class support for concurrency
    • An elegant, modern language that is very simple to its core
    • Very good performance
    • First hand support for the tools required for modern software development

    I’m going to briefly explain how Go provides all of the above. You can read more about the language and its features in detail from Go’s official website.

    Concurrency

    Concurrency is one of the primary concerns of most server applications, and it should be a primary concern of the language, considering modern microprocessors. Go introduces a concept called a ‘goroutine’, which is analogous to a ‘lightweight user-space thread’. The reality is more complicated, as several goroutines multiplex onto a single OS thread, but that expression should give you a general idea. Goroutines are light enough that you can actually spin up a million of them simultaneously, as they start with a very tiny stack; in fact, that’s recommended. Any function/method in Go can be used to spawn a goroutine: you can just do ‘go myAsyncTask()’ to spawn a goroutine from the ‘myAsyncTask’ function. The following is an example:

    // This function performs the given tasks concurrently by spawning a goroutine
    // for each of them.
    
    func performAsyncTasks(tasks []Task) {
      for _, task := range tasks {
        // This will spawn a separate goroutine to carry out this task.
        // This call is non-blocking.
        go task.Execute()
      }
    }

    Yes, it’s that easy, and it is meant to be that way: Go is a simple language, and you are expected to spawn a goroutine for every independent async task without much concern. Go’s runtime automatically takes care of running goroutines in parallel if multiple cores are available. But how do these goroutines communicate? The answer is channels.

    A ‘channel’ is also a language primitive, meant for communication among goroutines. You can pass anything over a channel to another goroutine (a primitive Go type, a Go struct, or even another channel). A channel is essentially a blocking FIFO queue (buffered or unbuffered). If you want goroutines to wait for a certain condition to be met before continuing, you can implement cooperative blocking of goroutines with the help of channels.
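    To make the channel primitive concrete, here is a small, self-contained sketch (names are illustrative): one goroutine publishes results on a channel, while the main goroutine blocks on receives until the channel is closed.

```go
package main

import "fmt"

// worker squares each input and sends the results on the 'results'
// channel. Closing the channel signals that no more values will come.
func worker(inputs []int, results chan<- int) {
	for _, n := range inputs {
		results <- n * n // blocks if the buffer is full
	}
	close(results)
}

func main() {
	results := make(chan int, 4) // a buffered channel

	go worker([]int{1, 2, 3, 4}, results)

	// Receiving blocks until a value (or the close) arrives, so this
	// loop cooperatively waits on the goroutine's progress.
	sum := 0
	for r := range results {
		sum += r
	}
	fmt.Println(sum) // 1 + 4 + 9 + 16 = 30
}
```

    Note that no mutexes or condition variables are needed: the channel itself provides both the data transfer and the synchronization.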

    These two primitives give a lot of flexibility and simplicity in writing asynchronous or parallel code. Other helper libraries like a goroutine pool can be easily created from the above primitives. One basic example is:

    package executor
    
    import (
    	"log"
    	"sync/atomic"
    )
    
    // The Executor struct is the main executor for tasks.
    // 'maxWorkers' represents the maximum number of simultaneous goroutines.
    // 'ActiveWorkers' tells the number of active goroutines spawned by the Executor at a given time.
    // 'Tasks' is the channel on which the Executor receives the tasks.
    // 'Reports' is the channel on which the Executor publishes every task's report.
    // 'signals' is a channel that can be used to control the Executor. Right now, only the termination
    // signal is supported, which is essentially the client sending '1' on this channel.
    type Executor struct {
    	maxWorkers    int64
    	ActiveWorkers int64
    
    	Tasks   chan Task
    	Reports chan Report
    	signals chan int
    }
    
    // NewExecutor creates a new Executor.
    // 'maxWorkers' tells the maximum number of simultaneous goroutines.
    // 'signals' channel can be used to control the Executor.
    func NewExecutor(maxWorkers int, signals chan int) *Executor {
    	chanSize := 1000
    
    	if maxWorkers > chanSize {
    		chanSize = maxWorkers
    	}
    
    	executor := Executor{
    		maxWorkers: int64(maxWorkers),
    		Tasks:      make(chan Task, chanSize),
    		Reports:    make(chan Report, chanSize),
    		signals:    signals,
    	}
    
    	go executor.launch()
    
    	return &executor
    }
    
    // launch starts the main loop, polling on all the relevant channels and handling the different
    // messages.
    func (executor *Executor) launch() int {
    	reports := make(chan Report, executor.maxWorkers)
    
    	for {
    		select {
    		case signal := <-executor.signals:
    			if executor.handleSignals(signal) == 0 {
    				return 0
    			}
    
    		case r := <-reports:
    			executor.addReport(r)
    
    		default:
    			if executor.ActiveWorkers < executor.maxWorkers && len(executor.Tasks) > 0 {
    				task := <-executor.Tasks
    				atomic.AddInt64(&executor.ActiveWorkers, 1)
    				go executor.launchWorker(task, reports)
    			}
    		}
    	}
    }
    
    // handleSignals is called whenever anything is received on the 'signals' channel.
    // It performs the relevant task according to the received signal(request) and then responds either
    // with 0 or 1 indicating whether the request was respected(0) or rejected(1).
    func (executor *Executor) handleSignals(signal int) int {
    	if signal == 1 {
    		log.Println("Received termination request...")
    
    		if executor.Inactive() {
    			log.Println("No active workers, exiting...")
    			executor.signals <- 0
    			return 0
    		}
    
    		executor.signals <- 1
    		log.Println("Some tasks are still active...")
    	}
    
    	return 1
    }
    
    // launchWorker is called whenever a new Task is received and the Executor is allowed to spawn
    // a new worker.
    // Each worker is launched on a new goroutine. It performs the given task and publishes the report on
    // the Executor's internal reports channel.
    func (executor *Executor) launchWorker(task Task, reports chan<- Report) {
    	report := task.Execute()
    
    	if len(reports) < cap(reports) {
    		reports <- report
    	} else {
    		log.Println("Executor's report channel is full...")
    	}
    
    	atomic.AddInt64(&executor.ActiveWorkers, -1)
    }
    
    // AddTask is used to submit a new task to the Executor in a non-blocking way. The client can
    // submit a new task using the Executor's Tasks channel directly, but that will block if the
    // Tasks channel is full.
    // Note that this method doesn't add the given task if the Tasks channel is full,
    // and it is up to the client to try again later.
    func (executor *Executor) AddTask(task Task) bool {
    	if len(executor.Tasks) == cap(executor.Tasks) {
    		return false
    	}
    
    	executor.Tasks <- task
    	return true
    }
    
    // addReport is used by the Executor to publish reports in a non-blocking way. If the client is
    // not reading the Reports channel, or is slower than the Executor publishing the reports, the
    // Executor's Reports channel is going to get full. In that case, this method will not block,
    // and that report will not be added.
    func (executor *Executor) addReport(report Report) bool {
    	if len(executor.Reports) == cap(executor.Reports) {
    		return false
    	}
    
    	executor.Reports <- report
    	return true
    }
    
    // Inactive checks if the Executor is idle. This happens when there are no pending tasks, active
    // workers and reports to publish.
    func (executor *Executor) Inactive() bool {
    	return executor.ActiveWorkers == 0 && len(executor.Tasks) == 0 && len(executor.Reports) == 0
    }

    Simple Language

    Unlike a lot of other modern languages, Golang doesn’t have a lot of features. In fact, a compelling case can be made that the language is too restrictive in its feature set, and that’s intended. It is not designed around a programming paradigm like Java, nor to support multiple programming paradigms like Python. It’s just bare-bones structural programming: only the essential features made it into the language, and not a single thing more.

    After looking at the language, you may feel that it doesn’t follow any particular philosophy or direction; every feature seems to be included to solve a specific problem and nothing more. For example, it has methods and interfaces but no classes; the compiler produces a statically linked binary, yet there is a garbage collector; it has strict static typing but doesn’t support generics. The language has a thin runtime but doesn’t support exceptions.
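    As one illustration of that “methods and interfaces but no classes” point, methods can attach to any named type, and interfaces are satisfied implicitly, with no ‘implements’ declaration. A small, hypothetical example:

```go
package main

import "fmt"

// Describable is satisfied by any type with a Describe() method --
// implicitly, with no 'implements' keyword.
type Describable interface {
	Describe() string
}

// Methods can hang off any named type, not just structs.
type Celsius float64

func (c Celsius) Describe() string {
	return fmt.Sprintf("%.1f degrees Celsius", float64(c))
}

type Server struct {
	Host string
	Port int
}

func (s Server) Describe() string {
	return fmt.Sprintf("server at %s:%d", s.Host, s.Port)
}

func main() {
	// Both types satisfy Describable, so they can share an interface slice.
	items := []Describable{Celsius(36.6), Server{Host: "localhost", Port: 8080}}
	for _, it := range items {
		fmt.Println(it.Describe())
	}
}
```

    This is the essence of Go's type system: composition and implicit interface satisfaction in place of class hierarchies.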

    The main idea here is that the developer should spend the least amount of time expressing an idea or algorithm as code, without wondering “What’s the best way to do this in language X?”, and the result should be easy for others to understand. It’s still not perfect; it does feel limiting from time to time, and some essential features like generics and exceptions are being considered for ‘Go 2’.

    Performance

    Single-threaded execution performance is NOT a good metric to judge a language by, especially when the language is focused on concurrency and parallelism. But still, Golang sports impressive benchmark numbers, beaten only by hardcore systems programming languages like C, C++, and Rust, and it is still improving. The performance is actually very impressive considering it’s a garbage-collected language, and it is good enough for almost every use case.

    (Image Source: Medium)

    Developer Tooling

    The adoption of a new tool/language directly depends on its developer experience, and the adoption of Go does speak for its tooling. Here, too, the same ideas apply: the tooling is minimal but sufficient. It’s all achieved by the ‘go’ command and its subcommands, entirely on the command line.

    There is no separate package manager for the language like pip or npm. But you can get any community package by just doing:

    go get github.com/xlab/pocketsphinx-go/sphinx


    Yes, it works. You can pull packages directly from GitHub or anywhere else; they are just source files.

    But what about package.json? I don’t see any equivalent for `go get`. That’s because there isn’t one. You don’t need to specify all your dependencies in a single file. You can directly use:

    import "github.com/xlab/pocketsphinx-go/sphinx"

    in your source file itself, and when you do `go build`, it will automatically `go get` it for you. You can see the full source file here:

    package main
    
    import (
    	"encoding/binary"
    	"bytes"
    	"log"
    	"os/exec"
    
    	"github.com/xlab/pocketsphinx-go/sphinx"
    	pulse "github.com/mesilliac/pulse-simple" // pulse-simple
    )
    
    var buffSize int
    
    func readInt16(buf []byte) (val int16) {
    	binary.Read(bytes.NewBuffer(buf), binary.LittleEndian, &val)
    	return
    }
    
    func createStream() *pulse.Stream {
    	ss := pulse.SampleSpec{pulse.SAMPLE_S16LE, 16000, 1}
    	buffSize = int(ss.UsecToBytes(1 * 1000000))
    	stream, err := pulse.Capture("pulse-simple test", "capture test", &ss)
    	if err != nil {
    		log.Panicln(err)
    	}
    	return stream
    }
    
    func listen(decoder *sphinx.Decoder) {
    	stream := createStream()
    	defer stream.Free()
    	defer decoder.Destroy()
    	buf := make([]byte, buffSize)
    	var bits []int16
    
    	log.Println("Listening...")
    
    	for {
    		_, err := stream.Read(buf)
    		if err != nil {
    			log.Panicln(err)
    		}
    
    		for i := 0; i < buffSize; i += 2 {
    			bits = append(bits, readInt16(buf[i:i+2]))
    		}
    
    		process(decoder, bits)
    		bits = nil
    	}
    }
    
    func process(dec *sphinx.Decoder, bits []int16) {
    	if !dec.StartUtt() {
    		panic("Decoder failed to start Utt")
    	}
    	
    	dec.ProcessRaw(bits, false, false)
    	dec.EndUtt()
    	hyp, score := dec.Hypothesis()
    	
    	if score > -2500 {
    		log.Println("Predicted:", hyp, score)
    		handleAction(hyp)
    	}
    }
    
    func executeCommand(commands ...string) {
    	cmd := exec.Command(commands[0], commands[1:]...)
    	cmd.Run()
    }
    
    func handleAction(hyp string) {
    	switch hyp {
    		case "SLEEP":
    		executeCommand("loginctl", "lock-session")
    		
    		case "WAKE UP":
    		executeCommand("loginctl", "unlock-session")
    
    		case "POWEROFF":
    		executeCommand("poweroff")
    	}
    }
    
    func main() {
    	cfg := sphinx.NewConfig(
    		sphinx.HMMDirOption("/usr/local/share/pocketsphinx/model/en-us/en-us"),
    		sphinx.DictFileOption("6129.dic"),
    		sphinx.LMFileOption("6129.lm"),
    		sphinx.LogFileOption("commander.log"),
    	)
    	
    	dec, err := sphinx.NewDecoder(cfg)
    	if err != nil {
    		panic(err)
    	}
    
    	listen(dec)
    }

    This binds the dependency declaration with source itself.

    As you can see by now, it’s simple, minimal, and yet sufficient and elegant. There is first-hand support for both unit tests and benchmarks, with flame charts too. Just like the feature set, the tooling also has its downsides. For example, `go get` doesn’t support versions, and you are locked to the import URL used in your source file. The ecosystem is evolving, and other tools have come up for dependency management.

    Golang was originally designed to solve the problems Google had with their massive code bases and their imperative need to write efficient concurrent applications. It makes it very easy to code applications/libraries that utilize the multicore nature of modern processors, and it never gets in a developer’s way. It’s a simple, modern language that never tries to become anything more than that.

    Protobuf (Protocol Buffers)

    Protobuf, or Protocol Buffers, is a binary communication format by Google, used to serialize structured data. A communication format, kind of like JSON? Yes. It’s more than 10 years old, and Google has been using it for a while now.

    But don’t we already have JSON, and isn’t it so ubiquitous…

    Just like Golang, Protobuf doesn’t really solve anything new. It just solves existing problems more efficiently and in a modern way. Unlike Golang, it is not necessarily more elegant than the existing solutions. Here are the focus points of Protobuf:

    • It’s a binary format, unlike JSON and XML, which are text-based; hence it’s vastly more space-efficient.
    • First-hand and sophisticated support for schemas.
    • First-hand support for generating parsing and consumer code in various languages.

    Binary format and speed

    So is Protobuf really that fast? The short answer is yes. According to Google Developers, Protobuf messages are 3 to 10 times smaller and 20 to 100 times faster than XML. This is no surprise: being a binary format, the serialized data is compact, though not human-readable.

    (Image Source: Beating JSON performance with Protobuf)

    Protobuf takes a more planned approach. You define `.proto` files, which are kind of like schema files but much more powerful. You essentially define how you want your messages to be structured, which fields are optional or required, their data types, etc. After that, the protobuf compiler will generate the data access classes for you, which you can use in your business logic to facilitate communication.

    Looking at a `.proto` file related to a service will also give you a very clear idea of the specifics of the communication and the features that are exposed. A typical .proto file looks like this:

    message Person {
      required string name = 1;
      required int32 id = 2;
      optional string email = 3;
    
      enum PhoneType {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
      }
    
      message PhoneNumber {
        required string number = 1;
        optional PhoneType type = 2 [default = HOME];
      }
    
      repeated PhoneNumber phone = 4;
    }

    Fun fact: Jon Skeet, the king of Stack Overflow, is one of the main contributors to the project.

    gRPC

    gRPC, as you guessed, is a modern RPC (Remote Procedure Call) framework. It is a batteries-included framework with built-in support for load balancing, tracing, health checking, and authentication. It was open-sourced by Google in 2015 and has been gaining popularity ever since.

    An RPC framework…? What about REST…?

    SOAP with WSDL was long used for communication between different systems in a Service-Oriented Architecture. At the time, contracts were strictly defined, and systems were big and monolithic, exposing a large number of such interfaces.

    Then came the concept of ‘browsing’, where the server and client don’t need to be tightly coupled: a client should be able to browse service offerings even if the two were coded independently. If the client asks for information about a book, the service may, along with what’s requested, also offer a list of related books so that the client can browse. The REST paradigm was essential to this, as it allows the server and client to communicate freely, without strict restrictions, using a few primitive verbs.

    As you can see above, the service behaves like a monolithic system which, along with what is required, also does any number of other things to provide the client with the intended ‘browsing’ experience. But this is not always the use case, is it?

    Enter the Microservices

    There are many reasons to adopt a Microservices Architecture, the prominent one being that it is very hard to scale a monolithic system. When designing a big system with a Microservices Architecture, each business or technical requirement is intended to be carried out as a cooperative composition of several primitive ‘micro’ services.

    These services don’t need to be comprehensive in their responses. They should perform specific duties with expected responses. Ideally, they should behave like pure functions for seamless composability.

    Now, using REST as the communication paradigm for such services doesn’t provide much benefit. Exposing a REST API does give a service a lot of expressive capability, but if such expressive power is neither required nor intended, we can use a paradigm that focuses on other factors instead.

    gRPC intends to improve upon the following technical aspects over traditional HTTP requests:

    • HTTP/2 by default with all its goodies.
    • Protobuf as machines are talking.
    • Dedicated support for streaming calls thanks to HTTP/2.
    • Pluggable auth, tracing, load balancing and health checking because you always need these.

    As it’s an RPC framework, we again have concepts like Service Definitions and an Interface Description Language, which may feel alien to people who weren’t around before REST, but this time it feels a lot less clumsy, as gRPC uses Protobuf for both.

    Protobuf is designed in such a way that it can be used as a communication format as well as a protocol specification tool without introducing anything new. A typical gRPC service definition looks like this:

    service HelloService {
      rpc SayHello (HelloRequest) returns (HelloResponse);
    }
    
    message HelloRequest {
      string greeting = 1;
    }
    
    message HelloResponse {
      string reply = 1;
    }

    You just write a `.proto` file for your service, describing the interface: its name, what it expects, and what it returns, as Protobuf messages. The Protobuf compiler will then generate both the client- and server-side code. Clients can call it directly, and the server side implements these APIs to fill in the business logic.
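    The generated Go code follows a predictable shape. The sketch below mimics it with plain, dependency-free stand-ins (the real protoc-gen-go output uses protobuf message types, context arguments, and gRPC server registration); it only illustrates the division of labor: the client calls the interface, and the server fills in the business logic. All type and method names here are illustrative.

```go
package main

import "fmt"

// Stand-ins for the protobuf messages generated from the .proto file.
type HelloRequest struct {
	Greeting string
}

type HelloResponse struct {
	Reply string
}

// HelloServiceServer mirrors the shape of a generated server interface:
// the server side implements it to provide the business logic.
type HelloServiceServer interface {
	SayHello(req *HelloRequest) (*HelloResponse, error)
}

// helloServer is our implementation of the service.
type helloServer struct{}

func (s *helloServer) SayHello(req *HelloRequest) (*HelloResponse, error) {
	return &HelloResponse{Reply: "Hello, " + req.Greeting + "!"}, nil
}

func main() {
	// In a real gRPC program, this would be registered on a grpc.Server
	// and invoked by a generated client stub over HTTP/2.
	var svc HelloServiceServer = &helloServer{}
	resp, _ := svc.SayHello(&HelloRequest{Greeting: "world"})
	fmt.Println(resp.Reply)
}
```

    The point is that neither side hand-writes any wire-format code: the `.proto` file is the single source of truth, and both the stub and the interface fall out of it.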

    Conclusion

    Golang, along with gRPC and Protobuf, is an emerging stack for modern server programming. Golang simplifies building concurrent/parallel applications, and gRPC with Protobuf enables efficient communication with a pleasing developer experience.