Blog

  • Demystifying High Availability in Kubernetes Using Kubeadm

    Introduction

    The rise of containers has reshaped the way we develop, deploy and maintain software. Containers allow us to package the different services that constitute an application into separate containers, and to deploy those containers across a set of virtual and physical machines. This gave rise to container orchestration tools, which automate the deployment, management, scaling and availability of container-based applications. Kubernetes is one such tool: it allows deployment and management of container-based applications at scale.

    One of the main advantages of Kubernetes is how it brings greater reliability and stability to the container-based distributed application, through the use of dynamic scheduling of containers. But, how do you make sure Kubernetes itself stays up when a component or its master node goes down?
     

     

    Why do we need Kubernetes High Availability?

    Kubernetes High Availability is about setting up Kubernetes, along with its supporting components, in such a way that there is no single point of failure. A single-master cluster can easily fail, while a multi-master cluster uses multiple master nodes, each of which has access to the same worker nodes. In a single-master cluster, important components like the API server and controller manager run only on the single master node, and if it fails you cannot create more services, pods, etc. In a Kubernetes HA environment, however, these components are replicated on multiple masters (usually three), and if any one of the masters fails, the others keep the cluster up and running.

    Advantages of multi-master

    In a Kubernetes cluster, the master node runs the etcd database, API server, controller manager and scheduler, and manages all the worker nodes. With only a single master node, a failure of that node means nothing can be scheduled or changed any more: pods that are already running on the workers are no longer managed, and the cluster is effectively lost.

    A multi-master setup, by contrast, provides high availability for a single cluster by running multiple instances of the API server, etcd, controller manager and scheduler. This not only provides redundancy but can also improve performance, because API requests are spread across the masters.

    A multi-master setup protects against a wide range of failure modes, from a loss of a single worker node to the failure of the master node’s etcd service. By providing redundancy, a multi-master cluster serves as a highly available system for your end-users.
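
    To see why three masters is the usual choice, recall that etcd only accepts writes while a majority (quorum) of its members is reachable. The following is a minimal Python sketch of that arithmetic; the function names are illustrative and not part of any etcd or Kubernetes tooling:

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while the cluster still has quorum."""
    return members - quorum(members)


# Compare a few cluster sizes.
for n in (1, 2, 3, 5):
    print(f"{n} member(s): quorum={quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

    Two masters tolerate no more failures than one, which is why odd cluster sizes such as three or five are typical.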

    Steps to Achieve Kubernetes HA

    Before moving to steps to achieve high-availability, let us understand what we are trying to achieve through a diagram:

    (Image Source: Kubernetes Official Documentation)

    Master Node: Each master node in a multi-master environment runs its own copy of the Kubernetes API server, which allows API requests to be load balanced among the master nodes. Each master node also runs its own copy of the etcd database, which stores all the data of the cluster. In addition to the API server and etcd, the master node runs the controller manager, which handles replication, and the scheduler, which schedules pods to nodes.

    Worker Node: As in a single-master cluster, the worker nodes in a multi-master cluster run their own components, mainly for running the pods in the Kubernetes cluster. We need 3 machines that satisfy the Kubernetes master requirements and 3 machines that satisfy the Kubernetes worker requirements.

    On each master that has been provisioned, follow the installation guide to install kubeadm and its dependencies. In this blog we will use k8s 1.10.4 to implement HA.

    Note: The cgroup driver for docker and the kubelet differs in some versions of k8s. Make sure both docker and the kubelet use the same cgroup driver (cgroupfs here); if the drivers differ, the master doesn’t come up when rebooted.

    Setup etcd cluster

    1. Install cfssl and cfssljson

    $ curl -o /usr/local/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 
    $ curl -o /usr/local/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
    $ chmod +x /usr/local/bin/cfssl*
    $ export PATH=$PATH:/usr/local/bin

    2. Generate certificates on master-0

    $ mkdir -p /etc/kubernetes/pki/etcd
    $ cd /etc/kubernetes/pki/etcd

    3. Create a ca-config.json file in the /etc/kubernetes/pki/etcd folder with the following content. (The cfssl commands in the later steps read it through -config=ca-config.json.)

    {
        "signing": {
            "default": {
                "expiry": "43800h"
            },
            "profiles": {
                "server": {
                    "expiry": "43800h",
                    "usages": [
                        "signing",
                        "key encipherment",
                        "server auth",
                        "client auth"
                    ]
                },
                "client": {
                    "expiry": "43800h",
                    "usages": [
                        "signing",
                        "key encipherment",
                        "client auth"
                    ]
                },
                "peer": {
                    "expiry": "43800h",
                    "usages": [
                        "signing",
                        "key encipherment",
                        "server auth",
                        "client auth"
                    ]
                }
            }
        }
    }

    4. Create a ca-csr.json file in the /etc/kubernetes/pki/etcd folder with the following content.

    {
        "CN": "etcd",
        "key": {
            "algo": "rsa",
            "size": 2048
        }
    }

    5. Create a client.json file in the /etc/kubernetes/pki/etcd folder with the following content.

    {
        "CN": "client",
        "key": {
            "algo": "ecdsa",
            "size": 256
        }
    }

    Then generate the CA certificate and the client certificate:

    $ cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
    $ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client.json | cfssljson -bare client

    6. Create the directory /etc/kubernetes/pki/etcd on master-1 and master-2 and copy all the generated certificates into it.

    7. On all masters, generate the server and peer certificates in /etc/kubernetes/pki/etcd. This requires the CA certificate and key from the previous steps to be present on every master.

    $ export PEER_NAME=$(hostname)
    $ export PRIVATE_IP=$(ip addr show eth0 | grep -Po 'inet \K[\d.]+')
    
    $ cfssl print-defaults csr > config.json
    $ sed -i 's/www.example.net/'"$PRIVATE_IP"'/' config.json
    $ sed -i 's/example.net/'"$PEER_NAME"'/' config.json
    $ sed -i '0,/CN/{s/example.net/'"$PEER_NAME"'/}' config.json
    
    $ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server config.json | cfssljson -bare server
    $ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer config.json | cfssljson -bare peer

    These commands replace the defaults in config.json with your machine’s hostname and IP address, so if you encounter any problem, check that the hostname and IP address are correct and rerun the cfssl commands.

    8. On all masters, install etcd and set its environment file.

    $ yum install etcd -y
    $ touch /etc/etcd.env
    $ echo "PEER_NAME=$PEER_NAME" >> /etc/etcd.env
    $ echo "PRIVATE_IP=$PRIVATE_IP" >> /etc/etcd.env

    9. Now we will create a three-node etcd cluster across the 3 master nodes, running etcd as a systemd service on each. Create the file /etc/systemd/system/etcd.service on all masters.

    [Unit]
    Description=etcd
    Documentation=https://github.com/coreos/etcd
    Conflicts=etcd.service
    Conflicts=etcd2.service
    
    [Service]
    EnvironmentFile=/etc/etcd.env
    Type=notify
    Restart=always
    RestartSec=5s
    LimitNOFILE=40000
    TimeoutStartSec=0
    
    ExecStart=/bin/etcd --name <host_name>  --data-dir /var/lib/etcd --listen-client-urls http://<host_private_ip>:2379,http://127.0.0.1:2379 --advertise-client-urls http://<host_private_ip>:2379 --listen-peer-urls http://<host_private_ip>:2380 --initial-advertise-peer-urls http://<host_private_ip>:2380 --cert-file=/etc/kubernetes/pki/etcd/server.pem --key-file=/etc/kubernetes/pki/etcd/server-key.pem --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem --peer-cert-file=/etc/kubernetes/pki/etcd/peer.pem --peer-key-file=/etc/kubernetes/pki/etcd/peer-key.pem --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem --initial-cluster master-0=http://<master0_private_ip>:2380,master-1=http://<master1_private_ip>:2380,master-2=http://<master2_private_ip>:2380 --initial-cluster-token my-etcd-token --initial-cluster-state new --client-cert-auth=false --peer-client-cert-auth=false
    
    [Install]
    WantedBy=multi-user.target

    10. Make sure you replace the following placeholders:

    • <host_name>: the current master’s hostname
    • <host_private_ip>: the current host’s private IP
    • <master0_private_ip>: the private IP of master-0
    • <master1_private_ip>: the private IP of master-1
    • <master2_private_ip>: the private IP of master-2
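
    Filling in the --initial-cluster value by hand on three machines is error-prone. As a rough sketch (in Python, with made-up private IPs that you would replace with your own), the value can be rendered from a single name-to-IP mapping so that it is identical on every master:

```python
from typing import Dict


def initial_cluster(peers: Dict[str, str], port: int = 2380, scheme: str = "http") -> str:
    """Render the value of etcd's --initial-cluster flag from a name -> private IP mapping."""
    return ",".join(f"{name}={scheme}://{ip}:{port}" for name, ip in peers.items())


# Hypothetical private IPs; substitute the addresses of your own masters.
masters = {
    "master-0": "10.0.1.10",
    "master-1": "10.0.1.11",
    "master-2": "10.0.1.12",
}

print(initial_cluster(masters))
```

    The names in the mapping must match the --name each node is started with, otherwise the members will not recognise each other.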

    11. Start the etcd service on all three master nodes and check the etcd cluster health:

    $ systemctl daemon-reload
    $ systemctl enable etcd
    $ systemctl start etcd
    
    $ etcdctl cluster-health

    This will show the cluster healthy and connected to all three nodes.

    Setup load balancer

    Cloud providers offer load balancing solutions such as AWS Elastic Load Balancer, GCE load balancing, etc. When no such load balancer (or a physical one) is available, we can set up a virtual IP that always points to a healthy master node. We are using keepalived for this; install keepalived on all master nodes:

    $ yum install keepalived -y

    Create the following configuration file /etc/keepalived/keepalived.conf on all master nodes:

    ! Configuration File for keepalived
    global_defs {
      router_id LVS_DEVEL
    }
    
    vrrp_script check_apiserver {
      script "/etc/keepalived/check_apiserver.sh"
      interval 3
      weight -2
      fall 10
      rise 2
    }
    
    vrrp_instance VI_1 {
        state <state>
        interface <interface>
        virtual_router_id 51
        priority <priority>
        authentication {
            auth_type PASS
            auth_pass velotiotechnologies
        }
        virtual_ipaddress {
            <virtual ip>
        }
        track_script {
            check_apiserver
        }
    }

    • <state> is MASTER on the first master node and BACKUP on the other master nodes.
    • <interface> is generally the primary network interface; in my case it is eth0.
    • <priority> should be higher on the MASTER node (e.g. 101) and lower on the others (e.g. 100).
    • <virtual ip> is the virtual IP shared by the master nodes.
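
    To make the failover behaviour concrete, here is a small Python sketch of how the tracked script changes the priority each node advertises. It assumes keepalived’s documented behaviour for a negative weight (the priority is reduced by that amount once the script has failed `fall` times in a row), and it uses priorities 101, 100 and 99 purely for illustration so that there are no ties. It models only a failing API server check, not a node that is completely down:

```python
from typing import Dict, Tuple

# Values taken from the vrrp_script block in keepalived.conf.
WEIGHT = -2      # priority change applied while the check is failing
INTERVAL_S = 3   # seconds between runs of check_apiserver.sh
FALL = 10        # consecutive failures before the check counts as failed


def effective_priority(base: int, script_ok: bool, weight: int = WEIGHT) -> int:
    """Priority a node advertises, given its configured priority and check state."""
    return base if script_ok else base + weight


def vip_holder(nodes: Dict[str, Tuple[int, bool]]) -> str:
    """Name of the node with the highest effective priority, which holds the virtual IP."""
    return max(nodes, key=lambda name: effective_priority(*nodes[name]))


# Illustrative priorities (assumption): distinct values so the sketch has no ties.
healthy = {"master-0": (101, True), "master-1": (100, True), "master-2": (99, True)}
api_down = {"master-0": (101, False), "master-1": (100, True), "master-2": (99, True)}

print(vip_holder(healthy))    # master-0
print(vip_holder(api_down))   # master-1, because 101 - 2 = 99 is below 100
print(FALL * INTERVAL_S)      # 30: roughly the seconds until the failure is acted on
```

    This is why the weight has to be larger in magnitude than the gap between the MASTER and BACKUP priorities: with a gap of 1 and a weight of -2, a failed check is enough to hand the virtual IP over.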

    Install the following health check script to /etc/keepalived/check_apiserver.sh on all master nodes:

    #!/bin/sh
    
    errorExit() {
        echo "*** $*" 1>&2
        exit 1
    }
    
    curl --silent --max-time 2 --insecure https://localhost:6443/ -o /dev/null || errorExit "Error GET https://localhost:6443/"
    if ip addr | grep -q <VIRTUAL-IP>; then
        curl --silent --max-time 2 --insecure https://<VIRTUAL-IP>:6443/ -o /dev/null || errorExit "Error GET https://<VIRTUAL-IP>:6443/"
    fi

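    Make the script executable and restart keepalived on all master nodes:

    $ chmod +x /etc/keepalived/check_apiserver.sh
    $ systemctl restart keepalived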

    Setup three master node cluster

    Run kubeadm init on master0:

    Create a config.yaml file with the following content:

    apiVersion: kubeadm.k8s.io/v1alpha1
    kind: MasterConfiguration
    api:
      advertiseAddress: <master-private-ip>
    etcd:
      endpoints:
      - http://<master0-ip-address>:2379
      - http://<master1-ip-address>:2379
      - http://<master2-ip-address>:2379
      caFile: /etc/kubernetes/pki/etcd/ca.pem
      certFile: /etc/kubernetes/pki/etcd/client.pem
      keyFile: /etc/kubernetes/pki/etcd/client-key.pem
    networking:
      podSubnet: <podCIDR>
    apiServerCertSANs:
    - <load-balancer-ip>
    apiServerExtraArgs:
      endpoint-reconciler-type: lease

    Please ensure that the following placeholders are replaced:

    • <master-private-ip> with the private IPv4 address of the master server on which the config file resides.
    • <master0-ip-address>, <master1-ip-address> and <master2-ip-address> with the IP addresses of your three master nodes.
    • <podCIDR> with your Pod CIDR. Please read the CNI network section of the docs for more information. Some CNI providers do not require a value to be set. I am using weave-net as the pod network, hence podCIDR will be 10.32.0.0/12.
    • <load-balancer-ip> with the virtual IP set up in the load balancer in the previous section.

    Then initialize the first master:

    $ kubeadm init --config=config.yaml

    10. Run kubeadm init on master1 and master2:

    First of all, copy /etc/kubernetes/pki/ca.crt, /etc/kubernetes/pki/ca.key, /etc/kubernetes/pki/sa.key and /etc/kubernetes/pki/sa.pub to the /etc/kubernetes/pki folder on master1 and master2.

    Note: Copying these files is crucial; otherwise the other two master nodes won’t go into the Ready state.

    Copy the config file config.yaml from master0 to master1 and master2, and change <master-private-ip> to the current master host’s private IP.

    $ kubeadm init --config=config.yaml

    11. Now you can install the pod network to bring all three masters into the Ready state. I am using the weave-net pod network; to apply it, run:

    $ export kubever=$(kubectl version | base64 | tr -d '\n')
    $ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"

    12. By default, k8s doesn’t schedule any workload on the masters. If you want to schedule workloads on the master nodes as well, remove the master taint from all of them with the following command (the trailing - removes the taint):

    $ kubectl taint nodes --all node-role.kubernetes.io/master-

    13. Now that we have functional master nodes, we can join some worker nodes.

    Use the join command printed at the end of kubeadm init:

    $ kubeadm join 10.0.1.234:6443 --token llb1kx.azsbunpbg13tgc8k --discovery-token-ca-cert-hash sha256:1ad2a436ce0c277d0c5bd3826091e72badbd8417ffdbbd4f6584a2de588bf522

    High Availability in action

    After one master node fails (master-0 in the output below), the Kubernetes cluster is still accessible:

    [root@master-0 centos]# kubectl get nodes
    NAME       STATUS     ROLES     AGE       VERSION
    master-0   NotReady   master    4h        v1.10.4
    master-1   Ready      master    4h        v1.10.4
    master-2   Ready      master    4h        v1.10.4

    [root@master-0 centos]# kubectl get pods -n kube-system
    NAME                              READY     STATUS     RESTARTS   AGE
    kube-apiserver-master-0            1/1       Unknown    0          4h
    kube-apiserver-master-1            1/1       Running    0          4h
    kube-apiserver-master-2            1/1       Running    0          4h
    kube-controller-manager-master-0   1/1       Unknown    0          4h
    kube-controller-manager-master-1   1/1       Running    0          4h
    kube-controller-manager-master-2   1/1       Running    0          4h
    kube-dns-86f4d74b45-wh795          3/3       Running    0          4h
    kube-proxy-9ts6r                   1/1       Running    0          4h
    kube-proxy-hkbn7                   1/1       NodeLost   0          4h
    kube-proxy-sq6l6                   1/1       Running    0          4h
    kube-scheduler-master-0            1/1       Unknown    0          4h
    kube-scheduler-master-1            1/1       Running    0          4h
    kube-scheduler-master-2            1/1       Running    0          4h
    weave-net-6nzbq                    2/2       NodeLost   0          4h
    weave-net-ndx2q                    2/2       Running    0          4h
    weave-net-w2mfz                    2/2       Running    0          4h

    Even after one node has failed, all the important components are up and running. The cluster is still accessible, and you can create more pods, deployments, services, etc.

    [root@master-1 centos]# kubectl create -f nginx.yaml 
    deployment.apps "nginx-deployment" created
    [root@master-1 centos]# kubectl get pods -o wide
    NAME                                READY     STATUS    RESTARTS   AGE       IP              NODE
    nginx-deployment-75675f5897-884kc   1/1       Running   0          10s       10.117.113.98   master-2
    nginx-deployment-75675f5897-crgxt   1/1       Running   0          10s       10.117.113.2    master-1

    Conclusion

    High availability is an important part of reliability engineering, focused on making a system reliable and avoiding any single point of failure. At first glance its implementation might seem quite complex, but high availability brings tremendous advantages to systems that require increased stability and reliability. Using a highly available cluster is one of the most important aspects of building a solid infrastructure.

  • Creating GraphQL APIs Using Elixir Phoenix and Absinthe

    Introduction

    GraphQL is the new hype in the field of API technologies. We have been building and using REST APIs for quite some time now, and only recently started hearing about GraphQL. GraphQL is usually described as a frontend-directed API technology, as it allows front-end developers to request data in a much simpler way than before. The objective of this query language is to let client applications describe their data requirements and interactions in an intuitive and flexible format.

    The Phoenix Framework runs on Elixir, which is built on top of Erlang. Elixir’s core strengths are scaling and concurrency. Phoenix is a powerful and productive web framework that does not compromise on speed or maintainability. Phoenix comes with built-in support for WebSockets, enabling you to build real-time apps.

    Prerequisites:

    1. Elixir & Erlang: Phoenix is built on top of these.
    2. Phoenix Web Framework: Used for writing the server application. (It’s a well-known and lightweight framework in Elixir.)
    3. Absinthe: GraphQL library written for Elixir, used for writing queries and mutations.
    4. GraphiQL: Browser-based GraphQL IDE for testing your queries. Consider it similar to what Postman is for testing REST APIs.

    Overview:

    The application we will be developing is a simple blog application written using the Phoenix Framework, with two schemas, User and Post, defined in Accounts and Blog respectively. We will design the application to support APIs related to blog creation and management. We assume you have Erlang, Elixir and mix installed.

    Where to Start:

    At first, we have to create a Phoenix web application using the following command:

    mix phx.new  --no-brunch --no-html

    • --no-brunch: do not generate brunch files for static asset building. When choosing this option, you will need to manually handle JavaScript dependencies if building HTML apps.

    • --no-html: do not generate HTML views.

    Note: As we are going to work mostly with the API, we don’t need any web pages or HTML views, hence the --no-brunch and --no-html arguments.

    Dependencies:

    After we create the project, we need to add dependencies in mix.exs to make GraphQL available for the Phoenix application.

    defp deps do
      [
        {:absinthe, "~> 1.3.1"},
        {:absinthe_plug, "~> 1.3.0"},
        {:absinthe_ecto, "~> 0.1.3"}
      ]
    end

    Structuring the Application:

    We can use the following components to structure our GraphQL application:

    1. GraphQL schema: This goes inside lib/graphql_web/schema/schema.ex. The schema defines your queries and mutations.
    2. Custom types: Your schema may include some custom types, which should be defined inside lib/graphql_web/schema/types.ex.
    3. Resolvers: We have to write resolver functions that handle the business logic and map each of them to its query or mutation. Resolvers should be defined in their own files. We defined them in lib/graphql/accounts/user_resolver.ex and lib/graphql/blog/post_resolver.ex.

    We also need to update the router in lib/graphql_web/router.ex so that we can make queries using the GraphQL client, and create a GraphQL pipeline in the same file to route the API requests:

    pipeline :graphql do
      plug Graphql.Context  # custom plug defined in lib/graphql_web/plug/context.ex
    end
    
    scope "/api" do
      pipe_through(:graphql)  # pipeline through which the requests are routed
    
      # Routes are matched in order, so the more specific /graphiql path is forwarded first.
      forward("/graphiql", Absinthe.Plug.GraphiQL, schema: GraphqlWeb.Schema)
      forward("/", Absinthe.Plug, schema: GraphqlWeb.Schema)
    end

    Writing GraphQL Queries:

    Let’s write some GraphQL queries, which can be considered equivalent to GET requests in REST. But before getting into the queries, let’s take a look at the GraphQL schema we defined and its resolver mapping:

    defmodule GraphqlWeb.Schema do
      use Absinthe.Schema
      import_types(GraphqlWeb.Schema.Types)
    
      query do
        field :blog_posts, list_of(:blog_post) do
          resolve(&Graphql.Blog.PostResolver.all/2)
        end
    
        field :blog_post, type: :blog_post do
          arg(:id, non_null(:id))
          resolve(&Graphql.Blog.PostResolver.find/2)
        end
    
        field :accounts_users, list_of(:accounts_user) do
          resolve(&Graphql.Accounts.UserResolver.all/2)
        end
    
        field :accounts_user, :accounts_user do
          arg(:email, non_null(:string))
          resolve(&Graphql.Accounts.UserResolver.find/2)
        end
      end
    end

    You can see above that we have defined four queries in the schema. Let’s pick one query and see what goes into it:

    field :accounts_user, :accounts_user do
      arg(:email, non_null(:string))
      resolve(&Graphql.Accounts.UserResolver.find/2)
    end

    This query retrieves a particular user by email address.

    1. arg(:email, non_null(:string)): defines a non-null incoming string argument, i.e. the user’s email.
    2. Graphql.Accounts.UserResolver.find/2: the resolver function mapped via the schema, which contains the core business logic for retrieving a user.
    3. :accounts_user: the custom type, which is defined inside lib/graphql_web/schema/types.ex as follows:

    object :accounts_user do
      field(:id, :id)
      field(:name, :string)
      field(:email, :string)
      field(:posts, list_of(:blog_post), resolve: assoc(:blog_posts))
    end

    We need to write a separate resolver function for every query we define. Let’s go over the resolver functions for accounts_user, which are in the lib/graphql/accounts/user_resolver.ex file:

    defmodule Graphql.Accounts.UserResolver do
      alias Graphql.Accounts                    #import lib/graphql/accounts/accounts.ex as Accounts
    
      def all(_args, _info) do
        {:ok, Accounts.list_users()}
      end
    
      def find(%{email: email}, _info) do
        case Accounts.get_user_by_email(email) do
          nil -> {:error, "User email #{email} not found!"}
          user -> {:ok, user}
        end
      end
    end

    These functions list all users or retrieve a particular user by email address. Let’s run them now using the GraphiQL browser. You need to have the server running on port 4000. To start the Phoenix server use:

    mix deps.get      # fetches all the dependencies
    mix deps.compile  # compiles the dependencies
    mix phx.server    # starts the Phoenix server

    Let’s retrieve a user by email address via a query:

    Here we retrieve the id, email and name fields by executing the accountsUser query with an email address. GraphQL also allows us to define variables, which we will show later when writing mutations.
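
    As a rough sketch of what GraphiQL sends for such a query, the Python snippet below builds the JSON request body. It only constructs the body and does not contact a server. The email address is a made-up example, and sending the body as a POST to http://localhost:4000/api with a Content-Type of application/json is an assumption based on the router scope and port used in this post:

```python
import json
from typing import Optional

# The accountsUser query with the argument written inline (no variables yet).
ACCOUNTS_USER_QUERY = """
{
  accountsUser(email: "jane@example.com") {
    id
    email
    name
  }
}
"""


def graphql_body(query: str, variables: Optional[dict] = None) -> str:
    """Serialize a GraphQL request in the usual {"query": ..., "variables": ...} JSON shape."""
    payload = {"query": query}
    if variables is not None:
        payload["variables"] = variables
    return json.dumps(payload)


body = graphql_body(ACCOUNTS_USER_QUERY)
print(body)
```

    Note that the snake_case field :accounts_user defined in the schema is queried as accountsUser.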

    Let’s execute another query to list all blog posts that we have defined:

    Writing GraphQL Mutations:

    Let’s write some GraphQL mutations. If you have understood the way GraphQL queries are written, mutations are very similar and easy to understand: they are defined in the same form as queries, each with a resolver function. The mutations we are going to write are as follows:

    1. create_post: create a new blog post
    2. update_post: update an existing blog post
    3. delete_post: delete an existing blog post

    The mutations look as follows:

    defmodule GraphqlWeb.Schema do
      use Absinthe.Schema
      import_types(GraphqlWeb.Schema.Types)
    
      query do
        # query fields as defined earlier
      end
    
      # mutation is a top-level block in the schema, a sibling of query
      mutation do
        field :create_post, type: :blog_post do
          arg(:title, non_null(:string))
          arg(:body, non_null(:string))
          arg(:accounts_user_id, non_null(:id))
    
          resolve(&Graphql.Blog.PostResolver.create/2)
        end
    
        field :update_post, type: :blog_post do
          arg(:id, non_null(:id))
          arg(:post, :update_post_params)
    
          resolve(&Graphql.Blog.PostResolver.update/2)
        end
    
        field :delete_post, type: :blog_post do
          arg(:id, non_null(:id))
          resolve(&Graphql.Blog.PostResolver.delete/2)
        end
      end
    end

    Let’s run a mutation to create a post in GraphQL:

    Notice the method is POST and not GET over here.
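
    As a sketch of the variables mentioned earlier, the create_post mutation can be written with GraphQL variables instead of inline values. The snippet below (Python, standard library only) just builds the request body. The variable values are invented, the camelCase names assume Absinthe’s default naming adapter, and the selected fields (id, title, body) assume the :blog_post type exposes them, since its definition is not shown in this post:

```python
import json

# Mutation document: the $-prefixed names are GraphQL variables.
CREATE_POST_MUTATION = """
mutation CreatePost($title: String!, $body: String!, $accountsUserId: ID!) {
  createPost(title: $title, body: $body, accountsUserId: $accountsUserId) {
    id
    title
    body
  }
}
"""

# Hypothetical values; the variables travel separately from the mutation text.
variables = {
    "title": "Hello GraphQL",
    "body": "First post created through a mutation.",
    "accountsUserId": "1",
}

body = json.dumps({"query": CREATE_POST_MUTATION, "variables": variables})
print(body)
```

    Keeping the values in variables means the mutation text stays the same for every post, and only the variables object changes between requests.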

    Let’s dig into the update mutation:

    field :update_post, type: :blog_post do
      arg(:id, non_null(:id))
      arg(:post, :update_post_params)
    
      resolve(&Graphql.Blog.PostResolver.update/2)
    end

    Here, update_post takes two arguments as input: a non-null id and a post parameter of type update_post_params that holds the values to update. The mutation is defined in lib/graphql_web/schema/schema.ex, while the input parameters are defined in lib/graphql_web/schema/types.ex:

    input_object :update_post_params do
      field(:title, :string)
      field(:body, :string)
      field(:accounts_user_id, :id)
    end

    The difference from the previous type definitions is that it is defined as an input_object instead of an object.

    The corresponding resolver function is defined as follows:

    def update(%{id: id, post: post_params}, info) do
      # info is passed on to find/2, so it must not be an underscore-prefixed (ignored) variable
      case find(%{id: id}, info) do
        {:ok, post} -> post |> Blog.update_post(post_params)
        {:error, _} -> {:error, "Post id #{id} not found"}
      end
    end

         

    Here we have defined a query parameter to specify the id of the blog post to be updated.

    Conclusion

    This is all you need to write a basic GraphQL server for a Phoenix application using Absinthe.

  • Creating a Frictionless SignUp Experience with Auth0 for your Application

    What is Auth0 and frictionless signup?

    Auth0 is a service that handles your application’s authentication and authorization needs with simple drop-in solutions. It can save time and risk compared to building your own authentication/authorization system. Auth0 even has its own universal login/signup page that can be customized through the dashboard, and it also provides APIs to create/manage users.

    A frictionless signup flow allows the user to use a core feature of the application without forcing the user to sign up first. Many companies use this flow, namely Bookmyshow, Redbus, Makemytrip, and Goibibo. 

    So, as an example, we will see how an application like Bookmyshow looks with this frictionless flow. First, let’s assume the user is a first-time user for this application; the user lands on the landing page, selects a movie, selects the theater, selects the number of seats, and then lands on the payment page where they will fill in their contact details (email and mobile number) and proceed to complete the booking flow by paying for the ticket. At this point, the user has accessed the website and made a booking without even signing up.

    Later on, when the user signs up using the same contact details that were provided during booking, they will find their previous bookings and other details waiting for them on the app’s account page.

    What will we be doing in this blog?

    In this blog, we will be implementing Auth0 and replicating a similar feature as mentioned above using Auth0. In this code sample, we will be using react.js for the frontend and nest.js for the backend. 

    To keep the blog short, we will only focus on the logic related to the frictionless signup with Auth0. We will not be going through other aspects, like payment service/integration, nest.js, ORM, etc.

    Setup for the Auth0 dashboard:

    Auth0’s documentation is pretty straightforward and easy to understand; we’ll link the sections for this setup, and you can easily sign up and continue your setup with the help of their documentation.

    Do note that you will have to create two applications for this flow. One is a Single-Page Application for your frontend so that you can initiate login from your frontend app and the other is ManagementV2 for your server so that you can use their management APIs to create a user.

    After registering, you will get the client ID and client secret on the application details page. You will need these keys to plug into Auth0’s SDK so that you can use its APIs in your application.

    Setup for your single-page application:

    To use Auth0’s API, we have to install its SDK. For single-page applications, Auth0 has rewritten auth0-js as a new SDK called auth0-spa-js. If you are using Angular, React, or Vue, Auth0 has also created framework/library-specific wrappers for us to use.

    So, we will move on to installing its React wrapper and continuing with the setup:

    npm install @auth0/auth0-react

    Then, we will wrap our app with Auth0Provider and provide the keys from the Auth0 application settings dashboard:

    <Auth0Provider
         domain={process.env.NEXT_PUBLIC_AUTH0_DOMAIN}
         clientId={process.env.NEXT_PUBLIC_AUTH0_CLIENT_ID}
         redirectUri={
           typeof window !== 'undefined' &&
           `${window.location.origin}/auth-callback`
         }
         onRedirectCallback={onRedirectCallback}
         audience={process.env.NEXT_PUBLIC_AUTH0_AUDIENCE}
         //Safari uses ITP which prevents silent auth.Please refer https://www.py4u.net/discuss/353302
         useRefreshTokens={true}
         cacheLocation="localstorage"
       >
         <App />
       </Auth0Provider>

    You will find the explanation of the above props and more on Auth0’s React APIs through their GitHub link https://github.com/auth0/auth0-react.

    But we do want to cover one issue with their authenticated state and redirection. We noticed that when Auth0 redirects to our application, the isAuthenticated flag doesn’t get reflected immediately. The states get sequentially updated like so:

    • isLoading: false
      isAuthenticated: false
    • isLoading: true
      isAuthenticated: false
    • isLoading: false
      isAuthenticated: false
    • isLoading: false
      isAuthenticated: true

    This can be a pain if you have some common redirection logic based on the user’s authentication state and user type.

    What we found out from Auth0’s community forum is that Auth0 takes some time to parse and update its states, and only after these updates does it call the onRedirectCallback function. So it is safe to put your redirection logic in onRedirectCallback, but there is another issue with that.

    The function doesn’t have access to Auth0’s context, so you can’t use the user object or any other state in your redirection logic. Instead, when onRedirectCallback is called, you would want to redirect to a page that holds your redirection logic.

    So, in place of the actual page set in redirectUri, you would want to use a buffer page like the /auth-callback route, which just shows a progress bar and nothing else.
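
    As a rough illustration of the redirection logic described above, here is a minimal sketch of a pure helper that a page with access to Auth0’s context could call. The helper name, the callbackFired flag, the userType field, and the route names are all assumptions made for this example; they are not part of Auth0’s SDK.

```javascript
// Hypothetical helper: decides where to send the user once Auth0 has settled.
// `callbackFired` is an assumed flag that the app sets inside onRedirectCallback;
// `userType` and the route names are illustrative only.
function resolveRedirect({ callbackFired, isLoading, isAuthenticated, user }) {
  // Auth0 is still updating its state: stay on the buffer page (progress bar).
  if (!callbackFired || isLoading) return null;
  // State has settled but nobody is logged in: send the visitor to login.
  if (!isAuthenticated || !user) return "/login";
  // Authenticated: branch on the (assumed) user type.
  return user.userType === "admin" ? "/admin" : "/account";
}
```

    Returning null while the state is still in flux keeps the intermediate isLoading/isAuthenticated combinations listed earlier from triggering a premature redirect.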

    Implementation:

    For login/signup, since we are using the universal login page, we don’t have to do much. We just have to initiate the login with the loginWithRedirect() function from the UI, and Auth0 will handle the rest.

    Now, for the core part of the blog, we will create a createBooking API on our nest.js backend, which will accept the email, mobile number, and booking details (movie, theater location, number of seats), and try to create a booking.

    In this frictionless flow, the application internally creates a user for the booking to refer to; otherwise, it would be difficult to show the bookings once the user signs up and tries to access their bookings.

    So, the logic would go as follows: first, it will check if a user exists with the provided email in our DB. If not, then we will create the user in Auth0 through its management API with a temporary password, and then we will link the newly created Auth0 user in our users table. Then, by using this, we will create a booking.

    Here is an overview of how the createBooking API will look:

    @Post('/bookings/create')
    async createBooking(
      @Body() createBookingDto: CreateBookingDto,
    ): Promise<BookingResponseDto> {
      const { email } = createBookingDto;
      // Check whether a user with this email already exists. If not, create an
      // account first; otherwise, reuse the existing user for the booking.
      let user = await this.userRepository.findByEmail(email);

      if (!user) {
        // We use a random password here to create the user on Auth0
        const password = Utilities.generatePassword(16);
        const { auth0Response } = await this.createUserOnAuth0(email, password);
        this.logger.debug(auth0Response, 'Created Auth0 User');

        const userData: CreateUserDto = {
          email,
          auth0UserId: auth0Response['_id'],
        };
        // Creates the user in our DB and links it to the Auth0 user
        user = await this.userRepository.addUser(userData);
      }

      const booking = {
        userId: user.id,
        // Assuming the payment was done before this API call in a different service
        transactionId: createBookingDto.transaction.id,
        showId: createBookingDto.show.id,
        theaterId: createBookingDto.theater.id,
        seatNumbers: createBookingDto.seats,
      };
      // Creates a booking
      const bookingObject = await this.bookingRepository.bookTicket(booking);

      return new BookingResponseDto(bookingObject);
    }

    As for creating the user on Auth0, we will call Auth0’s /dbconnections/signup endpoint, which Auth0 documents under its Authentication API.
    Apart from the config details that we send with the request (client_id, client_secret and connection), it also requires email and password. For the password, we will use a randomly generated one.

    After the user has been created, we will send a password reset email to that address so that the user can set their own password and access the account.

    Do note you will have to use the client_id, client_secret, and connection of the ManagementV2 application that was created in the Auth0 dashboard.

    private async createUserOnAuth0(
      email: string,
      password: string,
      retryCount = 0,
    ): Promise<{ auth0Response: Record<string, any>; password: string }> {
      try {
        const axiosResponse = await this.httpService
          .post(
            `https://${configService.getAuth0Domain()}/dbconnections/signup`,
            {
              client_id: configService.getAuth0ClientId(),
              client_secret: configService.getAuth0ClientSecret(),
              connection: configService.getAuth0Connection(),
              email,
              password,
            },
          )
          .toPromise();

        this.logger.log(
          axiosResponse.data.email,
          'Auth0 user created with email',
        );

        // Send password reset email (errors are logged inside the helper)
        this.sendPasswordResetEmail(email);

        return { auth0Response: axiosResponse.data, password };
      } catch (err) {
        this.logger.error(err);
        /**
         * {@link https://auth0.com/docs/connections/database/password-strength}
         * Auth0 does not send any specific response, so we call create user again,
         * assuming the password failed to meet Auth0's requirements.
         * We retry at most ERROR_RETRY_COUNT times and then stop,
         * so we don't end up in an infinite loop.
         */
        if (retryCount < ERROR_RETRY_COUNT) {
          return this.createUserOnAuth0(
            email,
            Utilities.generatePassword(16),
            retryCount + 1,
          );
        }

        throw new HttpException(err, HttpStatus.BAD_REQUEST);
      }
    }
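
    Since the retry above assumes that the random password failed the connection’s strength policy, one way to reduce those round trips is to check the password locally before calling Auth0. The following is a minimal sketch under an assumed policy (minimum length plus a lowercase letter, an uppercase letter, a digit, and a special character). Both function names are hypothetical, and the rules should be adjusted to whatever your connection is actually configured with.

```javascript
// Assumed policy (not taken from the original post): at least `minLength`
// characters with one lowercase letter, one uppercase letter, one digit and
// one special character.
function meetsPasswordPolicy(password, minLength = 8) {
  return (
    password.length >= minLength &&
    /[a-z]/.test(password) &&
    /[A-Z]/.test(password) &&
    /[0-9]/.test(password) &&
    /[^A-Za-z0-9]/.test(password)
  );
}

// Asks `generate` (a stand-in for a helper such as Utilities.generatePassword)
// for candidates until one passes the policy, giving up after `maxAttempts`.
function generateCompliantPassword(generate, length = 16, maxAttempts = 10) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const candidate = generate(length);
    if (meetsPasswordPolicy(candidate)) return candidate;
  }
  throw new Error(`No compliant password after ${maxAttempts} attempts`);
}
```

    The bounded loop mirrors the ERROR_RETRY_COUNT idea from the code above, but it runs locally instead of making a network call per attempt.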

    To send the password reset email, we will use the /dbconnections/change_password endpoint, which is part of Auth0’s Authentication API. The code is pretty straightforward.

    This way, the user can set a new password and access their account.

    private async sendPasswordResetEmail(email: string): Promise<void> {
        try {
          const axiosResponse = await this.httpService
            .post(
              `https://${configService.getAuth0Domain()}/dbconnections/change_password`,
              {
                client_id: configService.getAuth0ClientId(),
                client_secret: configService.getAuth0ClientSecret(),
                connection: configService.getAuth0Connection(),
                email,
              },
            )
            .toPromise();
     
          this.logger.log(email, 'Password reset email sent to');
     
          return axiosResponse.data;
        } catch (err) {
          this.logger.error(err);
        }
      }

    With this, the user can now make a booking without signing up, and a matching user is created in Auth0. When they log in later through the universal login page, Auth0 will already have a record for them.

    Conclusion:

    Auth0 is a great platform for managing your application’s authentication and authorization needs if you have a simple enough login/signup flow. It can get a bit tricky when you are trying to implement a non-traditional login/signup flow or a custom flow, which is not supported by Auth0. In such a scenario, you would need to add some custom code as explained in the example above. 

  • Creating Faster and High Performing User Interfaces in Web Apps With Web Workers

    The data we render on a UI originates from different sources like databases, APIs, files, and more. In React applications, when the data is received, we first store it in state and then pass it to the other components in multiple ways for rendering.

    But most of the time, the format of the data is inconvenient for the rendering component. So, we have to format data and perform some prior calculations before we give it to the rendering component.

    Sending data directly to the rendering component and processing the data inside that component is not recommended. This applies not only to data processing but also to any heavy background jobs that we previously had to depend on the backend for, which can now be done on the client side because React allows business logic to be held on the front end.

    A good practice is to create a separate function for processing that data which is isolated from the rendering logic, so that data processing and data representation will be done separately.

    Why? There are two possible reasons:

    – The processed data can be shared/used by other components, too.

    – The main reason to avoid this is: if the data processing is a time-consuming task, you will see some lag on the UI, or in the worst-case scenario, sometimes the page may become unresponsive.

    As JavaScript is a single-threaded environment, it has only one call stack to execute scripts (put simply, you cannot run more than one script at the same time).

    For example, suppose you have to do some DOM manipulations and, at the same time, want to do some complex calculations. You cannot perform these two operations in parallel. If the JavaScript engine is busy with the complex computation, then all the other tasks like event listeners and rendering callbacks will be blocked for that amount of time, and the page may become unresponsive.

    How can you solve this problem?

    Though JavaScript is single-threaded, many developers mimic concurrency with the help of timer functions and event handlers. For instance, by breaking a heavy (time-consuming) task into tiny chunks and using timers, you can split its execution. Let’s take a look at the following example.

    Here, the processDataArray function uses a timer function to split the execution. Internally, it uses the setTimeout method to process some items of the array, waits for a set delay, and then processes more items. Once all the array elements have been processed, it sends the processed result back through finishCallback.

    const processDataArray = (dataArray, finishCallback) => {
      // take a new copy of the array so the original is not mutated
      const todo = dataArray.concat();
      // to store each processed data item
      const result = [];
      // timer function
      const timedProcessing = () => {
        const start = Date.now();
        // process items for roughly 50 ms at most, then yield to the event loop
        while (todo.length > 0 && Date.now() - start < 50) {
          // process each data item and store its result
          result.push(processSingleData(todo.shift()));
        }

        // check for remaining items to process
        if (todo.length > 0) {
          setTimeout(timedProcessing, 25);
        } else {
          // finished with all the items, initiate finish callback
          finishCallback(result);
        }
      };
      setTimeout(timedProcessing, 25);
    };

    const processSingleData = (data) => {
      // placeholder: replace this with the real processing logic
      const processedData = data;
      return processedData;
    };

    You can find more about how JavaScript timers work internally here.

    The problem is not fully solved, though: the work still runs on the main thread, so you may still see delays in UI events like button clicks or mouse scrolls. This makes for a bad user experience when a large array computation meets an impatient web user.

    The better way to solve this problem, with real multithreading that runs multiple scripts in parallel, is to use Web Workers.

    What are Web Workers?

    Web Workers provide a mechanism to spawn a separate script in the background, where you can do any type of calculation without disturbing the UI. Web Workers run outside the context of the HTML document’s script, which makes it easy to execute JavaScript programs concurrently. You can experience multithreading behavior while using Web Workers.

    Communication between the page (main thread) and the worker happens using a simple mechanism. They can send messages to each other using the postMessage method, and they can receive the messages using onmessage callback function. Let’s take a look at a simple example:

    In this example, we will delegate the work of multiplying all the numbers in an array to a Web Worker, and the Web Worker returns the result back to the main thread.

    // App.js
    import "./App.css";
    import { useEffect, useState } from "react";

    function App() {
      // Lazy initializer: worker.js is loaded and executed in the background
      // once, instead of a new worker being created on every render.
      const [webworker] = useState(() => new window.Worker("worker.js"));
      const [result, setResult] = useState("Calculating....");

      useEffect(() => {
        // Attach the handlers before sending the message
        webworker.onerror = () => {
          setResult("Error");
        };

        webworker.onmessage = (e) => {
          if (e.data) {
            setResult(e.data.result);
          } else {
            setResult("Error");
          }
        };

        const message = { multiply: { array: new Array(1000).fill(2) } };
        webworker.postMessage(message);

        // Terminate the worker when the component unmounts
        return () => {
          webworker.terminate();
        };
      }, [webworker]);

      return (
        <div className="App">
          <h1>Webworker Example In React</h1>
          <header className="App-header">
            <h1>Multiplication Of large array</h1>
            <h2>Result: {result}</h2>
          </header>
        </div>
      );
    }

    export default App;

    // worker.js
    onmessage = (e) => {
     const { multiply } = e.data;
     // check data is correctly framed
     if (multiply && multiply.array.length) {
       // intentionally delay the execution
       setTimeout(() => {
         // this post back the result to the page
         postMessage({
           result: multiply.array.reduce(
             (firstItem, secondItem) => firstItem * secondItem
           ),
         });
       }, 2000);
     } else {
       postMessage({ result: 0 });
     }
    };

    If the worker script throws an exception, you can handle it by attaching a callback function to the onerror property of the worker in the App.js script.

    From the main thread, you can terminate the worker immediately by calling the worker’s terminate method. A terminated worker cannot be restarted or reused, so you need to create another instance if you need one again.

    You can find a working example here.
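
    Because a terminated worker has to be replaced by a new instance, it can help to hide that bookkeeping behind a small wrapper. Below is a minimal sketch; createWorkerHandle and its spawn parameter are names invented for this example, and spawn is assumed to be any function that returns a new worker, such as () => new Worker("worker.js") in the browser.

```javascript
// Hypothetical wrapper that recreates the worker after it has been terminated.
// `spawn` is assumed to return a fresh worker-like object with terminate().
function createWorkerHandle(spawn) {
  let worker = null;
  return {
    // Lazily creates the worker on first use, or again after stop().
    get() {
      if (worker === null) worker = spawn();
      return worker;
    },
    // Terminates the current worker and forgets it, so the next get()
    // spawns a new instance.
    stop() {
      if (worker !== null) {
        worker.terminate();
        worker = null;
      }
    },
  };
}
```

    Passing the factory in, rather than hard-coding new Worker(...), also makes the wrapper easy to exercise with a fake worker in tests.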

    Use cases of Web Workers:

    Charting middleware – Suppose you have to design a dashboard that presents business engagement analytics for a business retention application by means of a pivot table, pie charts, and bar charts. Converting the data to the format each of these expects involves heavy processing. This may result in the UI failing to update, freezing, or the page becoming unresponsive because of the single-threaded behavior. Here, we can use Web Workers and delegate the processing logic to them so that the main thread is always available to handle other UI events.

    Emulating Excel functionality – For example, if you have thousands of rows in the spreadsheet and each one of them needs some lengthy calculations, you can write custom functions containing the processing logic and put them in the Web Worker’s script.

    Real-time text analyzer – This is another good example where we can use a Web Worker to show the word count, character count, repeated word count, etc., by analyzing the text typed by the user in real time. With a traditional implementation, you may experience performance issues as the text size grows, but this can be optimized by using Web Workers.
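
    To make the text analyzer use case a little more concrete, here is a minimal sketch of the kind of function that could live in the worker script. The function name and the shape of the returned object are assumptions for this example, and the word matching is deliberately simple (ASCII letters, digits, and apostrophes only).

```javascript
// Hypothetical analysis routine for a text-analyzer worker.
// Words are matched case-insensitively using a simple ASCII pattern.
function analyzeText(text) {
  const words = text.toLowerCase().match(/[a-z0-9']+/g) || [];
  // Count how many times each word occurs.
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  // Keep only the words that occur more than once, in first-seen order.
  const repeatedWords = [...counts]
    .filter(([, count]) => count > 1)
    .map(([word]) => word);
  return { characters: text.length, words: words.length, repeatedWords };
}

// Inside worker.js, the handler could simply delegate to analyzeText
// (left commented out so the sketch also runs outside a worker):
// onmessage = (e) => postMessage(analyzeText(e.data.text));
```

    The main thread would then post the current text on each change and render whatever the worker sends back, keeping typing responsive even for large inputs.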

    Web Worker limitations:

    Yes, Web Workers are amazing and quite simple to use, but as a Web Worker runs in a separate thread, it does not have access to the window object, document object, and parent object. We also cannot pass functions through postMessage.

    But Web Workers have access to:

    – Navigator object

    – Location object (read-only)

    – XMLHttpRequest

    – setTimeout, setInterval, clearTimeout, clearInterval

    – You can import other scripts in WebWorker using the importScripts() method
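
    Since functions cannot be passed through postMessage, a common workaround is to send the name of a task plus plain data and let the worker look up the function on its side. The sketch below shows that idea; the task names and the message shape ({ task, payload }) are assumptions made for this example.

```javascript
// Named tasks the worker knows how to run. The names are illustrative.
const tasks = new Map([
  ["multiply", (numbers) => numbers.reduce((product, n) => product * n, 1)],
  ["sum", (numbers) => numbers.reduce((total, n) => total + n, 0)],
]);

// Looks up the requested task and runs it on the payload. Messages carry
// only plain data (a string and an array), never a function.
function handleMessage({ task, payload }) {
  const run = tasks.get(task);
  if (run === undefined) return { error: `Unknown task: ${task}` };
  return { result: run(payload) };
}

// In worker.js this could be wired up as follows (left commented out so the
// sketch also runs outside a worker):
// onmessage = (e) => postMessage(handleMessage(e.data));
```

    From the page, a call such as worker.postMessage({ task: "sum", payload: [1, 2, 3] }) would then select the function by name instead of trying to send it.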

    Here are some other types of workers:
    Shared Worker
    Service Worker
    Audio Worklet

    Conclusion:

    Web Workers make our life easier by doing jobs in parallel in the background, but Web Workers are relatively heavy-weight and come with high startup performance cost and a high per-instance memory cost, so as per the WHATWG community, they are not intended to be used in large numbers.

  • Create CI/CD Pipeline in GitLab in under 10 mins

    Why Choose GitLab Over Other CI Tools?

    With so many tools available in the market, like CircleCI, GitHub Actions, Travis CI, etc., what makes GitLab CI so special? The easiest way to decide if GitLab CI is right for you is to take a look at the following use case:

    GitLab knocks it out of the park when it comes to code collaboration and version control. Monitoring the entire code repository along with all branches becomes manageable. With other popular tools like Jenkins, you can only monitor some branches. If your development teams are spread across multiple locations globally, GitLab serves a good purpose. Regarding price, while Jenkins is free, you need to have a subscription to use all of GitLab’s features.

    In GitLab, every branch can contain its own .gitlab-ci.yml file, which makes it easy to modify the workflows. For example, if you want to run unit tests on branch A and perform functional testing on branch B, you can simply modify the YAML configuration for CI/CD, and the runner will take care of running the job for you. Here is a comprehensive list of pros and cons of GitLab to help you make a better decision.

    Intro

    GitLab is an open-source collaboration platform that provides powerful features beyond hosting a code repository. You can track issues, host packages and registries, maintain Wikis, set up continuous integration (CI) and continuous deployment (CD) pipelines, and more.

    In this tutorial, you will configure a pipeline with three stages: build, deploy, test. The pipeline will run for each commit pushed to the repository.

    GitLab and CI/CD

    As we all are aware, a fully-fledged CI/CD pipeline primarily includes the following stages:

    • Build
    • Test
    • Deploy

    Here is a pictorial representation of how GitLab covers CI and CD:

    Source: gitlab.com

    Let’s take a look at an example of an automation testing pipeline. Here, CI empowers test automation and CD automates the release process to various environments. The below image perfectly demonstrates the entire flow.

    Source: xenonstack.com

    Let’s create the basic 3-stage pipeline

    Step 1: Create a project > Create a blank project

    Visit gitlab.com and create your account if you don’t have one already. Once done, click “New Project,” and on the following screen, click “Create Blank Project.” Name it My First Project, leave other settings to default for now, and click Create.
    Alternatively, if you already have your codebase in GitLab, proceed to Step 2.

    Step 2: Create a GitLab YAML

    To create a pipeline in GitLab, we need to define it in a YAML file. This YAML file should reside in the root directory of your project and should be named .gitlab-ci.yml. GitLab provides a set of predefined keywords that are used to define a pipeline.

    In order to design a basic pipeline, let’s understand the structure of a pipeline. If you are already familiar with the basic structure given below, you may want to jump below to the advanced pipeline outline for various environments.

    The hierarchy in GitLab has Pipeline > Stages > Jobs as shown below. The Source or SRC  is often a git commit or a CRON job, which triggers the pipeline on a defined branch.

    Now, let’s understand the commonly used keywords to design a pipeline:

    1. stages: This is used to define stages in the pipeline.
    2. variables: Here you can define the environment variables that can be accessed in all the jobs.
    3. before_script: This is a list of commands to be executed before each job. For example: creating specific directories, logging, etc.
    4. artifacts: If your job creates any artifacts, you can mention the path to find them here.
    5. after_script: This is a list of commands to be executed after each job. For example: cleanup.
    6. tags: This is a tag/label to identify the runner or a GitLab agent to assign your jobs to. If the tags are not specified, the jobs run on shared runners.
    7. needs: If you want your jobs to be executed in a certain order or you want a particular job to be executed before the current job, then you can set this value to the specific job name.
    8. only/except: These keywords are used to control when the job should be added to the pipeline. Use ‘only’ to define when a job should be added, whereas ‘except’ is used to define when a job should not be added. Alternatively, the ‘rules’ keyword is also used to add/exclude jobs based on conditions.

    You can find more keywords here.
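
    To build some intuition for how the needs keyword shapes execution order, here is a small conceptual model. It is only a sketch and not how GitLab’s scheduler is implemented; the resolveJobOrder function and the job names in the usage example are assumptions made for illustration.

```javascript
// Conceptual model only, not GitLab's implementation.
// `jobs` maps a job name to an object with an optional `needs` array.
// A job becomes runnable once every job it needs has been placed in the order.
function resolveJobOrder(jobs) {
  const order = [];
  const done = new Set();
  let remaining = Object.keys(jobs);
  while (remaining.length > 0) {
    const ready = remaining.filter((name) =>
      (jobs[name].needs || []).every((dep) => done.has(dep))
    );
    // Nothing is runnable: the needs are circular or point at a missing job.
    if (ready.length === 0) {
      throw new Error(`Unsatisfiable needs among: ${remaining.join(", ")}`);
    }
    for (const name of ready) {
      order.push(name);
      done.add(name);
    }
    remaining = remaining.filter((name) => !done.has(name));
  }
  return order;
}

// Usage with illustrative job names:
const order = resolveJobOrder({
  "test-code": { needs: ["deploy-code"] },
  "deploy-code": { needs: ["build-job"] },
  "build-job": {},
});
console.log(order.join(" -> ")); // build-job -> deploy-code -> test-code
```

    Even though test-code is listed first, it ends up last because its dependency chain has to complete before it can start.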

    Let’s create a sample YAML file.

    stages:
        - build
        - deploy
        - test
    
    variables:
      RAILS_ENV: "test"
      NODE_ENV: "test"
      GIT_STRATEGY: "clone"
      CHROME_VERSION: "103"
      DOCKER_VERSION: "20.10.14"
    
    build-job:
      stage: build
      script:
        - echo "Check node version and build your binary or docker image."
        - node -v
        - bash buildScript.sh
    
    deploy-code:
      stage: deploy
      needs: ["build-job"]
      script:
        - echo "Deploy your code "
        - cd to/your/desired/folder
        - bash deployScript.sh
    
    test-code:
      stage: test
      needs: ["deploy-code"]
      script:
        - echo "Run your tests here."
        - cd to/your/desired/folder
        - npm run test

    As you can see, if you have your scripts in a bash file, you can run them from here providing the correct path. 

    Once your YAML is ready, commit the file. 

    Step 3: Check Pipeline Status

    Navigate to CI/CD > Pipelines from the left navigation bar. You can check the status of the pipeline on this page.

    Here, you can check the commit ID, branch, the user who triggered the pipeline, stages, and their status.

    If you click on the status, you will get a detailed view of pipeline execution.

    If you click on a job under any stage, you can check console logs in detail.

    If you have any artifacts created in your pipeline jobs, you can find them by clicking on the 3 dots for the pipeline instance.

    Advanced Pipeline Outline

    For an advanced pipeline that consists of various environments, you can refer to the below YAML. Simply remove the echo statements and replace them with your set of commands.

    image: your-repo:tag
    variables:
      DOCKER_DRIVER: overlay2
      DOCKER_TLS_CERTDIR: ""
      DOCKER_HOST: tcp://localhost:2375
      SAST_DISABLE_DIND: "true"
      DS_DISABLE_DIND: "false"
      GOCACHE: "$CI_PROJECT_DIR/.cache"
    cache: # this section is used to cache libraries etc. between pipeline runs, reducing the time required for the pipeline to run
      key: ${CI_PROJECT_NAME}
      paths:
        - cache-path/
    #include:
      #- #echo "You can add other projects here."
      #- #project: "some/other/important/project"
        #ref: main
        #file: "src/project.yml"
    default:
      tags:
        - your-common-instance-tag
    stages:
      - build
      - test
      - deploy_dev
      - dev_tests
      - deploy_qa
      - qa_tests
      - rollback_qa
      - prod_gate
      - deploy_prod
      - rollback_prod
      - cleanup
    build:
      stage: build
      services:
        - docker:19.03.0-dind
      before_script:
        - echo "Run your pre-build commands here"
        - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
      script:
        - docker build -t $CI_REGISTRY/repo:$DOCKER_IMAGE_TAG --build-arg GITLAB_USER=$GITLAB_USER --build-arg GITLAB_PASSWORD=$GITLAB_PASSWORD -f ./Dockerfile .
        - docker push $CI_REGISTRY/repo:$DOCKER_IMAGE_TAG
        - echo "Run your builds here"
    unit_test:
      stage: test
      image: your-repo:tag
      script:
        - echo "Run your unit tests here"
    linting:
      stage: test
      image: your-repo:tag
      script:
        - echo "Run your linting tests here"
    sast:
      stage: test
      image: your-repo:tag
      script:
        - echo "Run your Static application security testing here "
    deploy_dev:
      stage: deploy_dev
      image: your-repo:tag
      before_script:
        - source file.sh
        - export VARIABLE="$VALUE"
        - echo "deploy on dev"
      script:
        - echo "deploy on dev"
      after_script:
        #if deployment fails run rollback on dev
        - echo "Things to do after deployment is run"
      only:
        - master #Depends on your branching strategy
    integration_test_dev:
      stage: dev_tests
      image: your-repo:tag
      script:
        - echo "run test on dev"
      only:
        - master
      allow_failure: true # In case failures are allowed
    deploy_qa:
      stage: deploy_qa
      image: your-repo:tag
      before_script:
        - source file.sh
        - export VARIABLE="$VALUE"
        - echo "deploy on qa"
      script:
        - echo "deploy on qa"
      after_script:
        #if deployment fails run rollback on qa
        - echo "Things to do after deployment script is complete "
      only:
        - master
      needs: ["integration_test_dev", "deploy_dev"]
      allow_failure: false
    integration_test_qa:
      stage: qa_tests
      image: your-repo:tag
      script:
        - echo "run test on qa"
      only:
        - master
      allow_failure: true # in case you want to allow failures

    rollback_qa:
      stage: rollback_qa
      image: your-repo:tag
      before_script:
        - echo "Things to rollback after qa integration failure"
      script:
        - echo "Steps to rollback"
      after_script:
        - echo "Things to do after rollback"
      only:
        - master
      needs: ["deploy_qa"]
      when: on_failure #This will run in case the qa deploy job fails
      allow_failure: false

    prod_gate: # this is manual gate for prod approval
      stage: prod_gate
      script:
        - echo "your commands here"
      only:
        - master
      needs:
        - deploy_qa
      when: manual

    deploy_prod:
      stage: deploy_prod
      image: your-repo:tag
      tags:
        - some-tag
      before_script:
        - source file.sh
        - echo "your commands here"
      script:
        - echo "your commands here"
      after_script:
        #if deployment fails
        - echo "your commands here"
      only:
        - master
      needs: ["deploy_qa"]
      allow_failure: false
    rollback_prod: # This stage should be run only when prod deployment fails
      stage: rollback_prod
      image: your-repo:tag
      before_script:
        - export VARIABLE="$VALUE"
        - echo "your commands here"
      script:
        - echo "your commands here"
      only:
        - master
      needs: ["deploy_prod"]
      allow_failure: false
      when: on_failure
    cleanup:
      stage: cleanup
      script:
        - echo "run cleanup"
        - rm -rf .cache/
      when: always

    Conclusion

    If you have worked with Jenkins, you know the pain points of working with Groovy code. GitLab CI’s YAML-based configuration, by contrast, makes it easy to design, understand, and maintain the pipeline code.

    Here are some pros and cons of using GitLab CI that will help you decide if this is the right tool for you!

  • Continuous Deployment with Azure Kubernetes Service, Azure Container Registry & Jenkins

    Introduction

    Containerization has taken the application development world by storm. Kubernetes has become the standard way of deploying new containerized distributed applications, and it is used by the largest enterprises in a wide range of industries for mission-critical tasks. It has become one of the biggest open-source success stories.

    Although Google Cloud has been providing Kubernetes as a service since November 2014 (note that it started as a beta project), Microsoft with AKS (Azure Kubernetes Service) and Amazon with EKS (Elastic Kubernetes Service) only jumped onto the scene in the second half of 2017.

    Wrapper tools that helped a user create a Kubernetes cluster were available before these services, but management and maintenance (like monitoring and upgrades) still took effort.

    Example:

    AWS had KOPS

    Azure had Azure Container Service.

    Azure Container Registry:

    With container demand growing, there is always a need in the market for storing and protecting container images. Microsoft provides a private registry as a service, with a geo-replication feature, named Azure Container Registry.

    Azure Container Registry is a registry offering from Microsoft for hosting container images privately. It integrates well with orchestrators like Azure Container Service, including Docker Swarm, DC/OS, and the new Azure Kubernetes service. Moreover, ACR provides capabilities such as Azure Active Directory-based authentication, webhook support, and delete operations.

    The coolest feature provided is Geo-Replication. It creates multiple copies of your image and distributes them across the globe, so that a container, when spawned, has access to the nearest copy of the image.

    Although Microsoft has good documentation on how to set up ACR in your Azure Subscription, we did encounter some issues and hence decided to write a blog on the precautions and steps required to configure the Registry in the correct manner.

    Note: We tried this using a free trial account. You can set it up by referring to the following link

    Prerequisites:

    • Make sure you have resource groups created in the supported region.
      Supported Regions: eastus, westeurope, centralus, canadacentral, canadaeast
    • If you are using Azure CLI for operations please make sure you use the version: 2.0.23 or 2.0.25 (This was the latest version at the time of writing this blog)

    Steps to install Azure CLI 2.0.23 or 2.0.25 (ubuntu 16.04 workstation):

    echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ wheezy main" | sudo tee /etc/apt/sources.list.d/azure-cli.list
    sudo apt-key adv --keyserver packages.microsoft.com --recv-keys 52E16F86FEE04B979B07E28DB02C46DF417A0893
    sudo apt-get install apt-transport-https
    sudo apt-get update && sudo apt-get install azure-cli
    
    Install a specific version:
    
    sudo apt install azure-cli=2.0.23-1
    sudo apt install azure-cli=2.0.25-1

    Steps for Container Registry Setup:

    • Login to your Azure Account:
    az login --username <USERNAME> --password <PASSWORD>

    • Create a resource group:
    az group create --name <RESOURCE-GROUP-NAME>  --location eastus
    Example : az group create --name acr-rg  --location eastus

    • Create a Container Registry:
    az acr create --resource-group <RESOURCE-GROUP-NAME> --name <CONTAINER-REGISTRY-NAME> --sku Basic --admin-enabled true
    Example : az acr create --resource-group acr-rg --name testacr --sku Basic --admin-enabled true

    Note: The SKU defines the storage available for the registry. For the Basic SKU, the available storage is 10 GB with 1 webhook, and the billing amount is 11 Rs/day.

    For detailed information on the different SKU available visit the following link

    • Login to the registry :
    az acr login --name <CONTAINER-REGISTRY-NAME>
    Example :az acr login --name testacr

    • Sample docker file for a node application :
    FROM node:carbon
    # Create app directory
    WORKDIR /usr/src/app
    COPY package*.json ./
    # RUN npm install
    EXPOSE 8080
    CMD [ "npm", "start" ]

    • Build the docker image :
    docker build -t <image-name>:<tag> .
    Example : docker build -t base:node8 .

    • Get the login server value for your ACR :
    az acr list --resource-group acr-rg --query "[].{acrLoginServer:loginServer}" --output table
    Output  :testacr.azurecr.io

    • Tag the image with the login server value:
      Note: Get the image ID from the docker images command.

    Example:

    docker tag <image-id> testacr.azurecr.io/base:node8

    • Push the image to the Azure Container Registry:

    Example:

    docker push testacr.azurecr.io/base:node8

    Microsoft does provide a GUI option to create the ACR.

    • List Images in the Registry:

    Example:

    az acr repository list --name testacr --output table

    • List tags for the Images:

    Example:

    az acr repository show-tags --name testacr --repository <name> --output table

    • How to use the ACR image in Kubernetes deployment: Use the login Server Name + the image name

    Example :

    containers:
    - name: demo
      image: testacr.azurecr.io/base:node8
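
    The rule above (login server name plus image name) can be sketched in Python. This is a minimal illustration, not part of the original setup: acr_image_ref and deployment_manifest are hypothetical helper names, and the manifest assumes the apps/v1 Deployment API with only the minimum fields filled in. It relies on the fact that kubectl accepts JSON manifests as well as YAML, so only the standard library is needed.

```python
import json


def acr_image_ref(login_server: str, repository: str, tag: str) -> str:
    """Join the ACR login server, repository, and tag into a full image reference."""
    return f"{login_server}/{repository}:{tag}"


def deployment_manifest(name: str, image: str, replicas: int = 1) -> dict:
    """Build a minimal apps/v1 Deployment (assumed API version) running one container."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Values taken from the examples in this post.
image = acr_image_ref("testacr.azurecr.io", "base", "node8")
manifest = deployment_manifest("demo", image)
print(json.dumps(manifest, indent=2))
```

    Saving the printed JSON to a file and passing it to kubectl with -f should behave like the equivalent YAML, provided the cluster is allowed to pull from the registry.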

    Azure Kubernetes Service

    Microsoft released the public preview of Managed Kubernetes for Azure Container Service (AKS) on October 24, 2017. This service simplifies the deployment, management, and operations of Kubernetes. It features an Azure-hosted control plane, automated upgrades, self-healing, and easy scaling.

    Similar to Google GKE and Amazon EKS, this new service gives you access to the nodes only, while the master is managed by the cloud provider. For more information visit the following link.

    Let’s now get our hands dirty and deploy an AKS infrastructure to play with:

    • Enable AKS preview for your Azure Subscription: At the time of writing this blog, AKS is in preview mode and requires a feature flag on your subscription.
    az provider register -n Microsoft.ContainerService

    • Kubernetes Cluster Creation Command:
      Note: Create a new, separate resource group for the Kubernetes service. Since the service is in preview, it is available only in certain regions.

    Make sure you create the resource group in one of the following regions:

    eastus, westeurope, centralus, canadacentral, canadaeast

    az group create --name <RESOURCE-GROUP-NAME> --location eastus
    Example : az group create --name aks-rg --location eastus

    az aks create --resource-group <RESOURCE-GROUP-NAME> --name <CLUSTER-NAME> --node-count 2 --generate-ssh-keys
    Example : az aks create --resource-group aks-rg --name akscluster --node-count 2 --generate-ssh-keys

    Example with different arguments :

    Create a Kubernetes cluster with a specific version.

    az aks create -g MyResourceGroup -n MyManagedCluster --kubernetes-version 1.8.1

    Create a Kubernetes cluster with a larger node pool.

    az aks create -g MyResourceGroup -n MyManagedCluster --node-count 7

    Install the kubectl CLI:

    To connect to the Kubernetes cluster from a client computer, the kubectl command-line client is required.

    sudo az aks install-cli

    Note: If you’re using Azure Cloud Shell, kubectl is already installed. Run the above command only if you want to install it locally.

    • To configure kubectl to connect to your Kubernetes cluster :
    az aks get-credentials --resource-group=<RESOURCE-GROUP-NAME> --name=<CLUSTER-NAME>

    Example : https://gist.github.com/velotiotech/ac40b6014a435271f49ca0e3779e800f

    • Verify the connection to the cluster :
    kubectl get nodes -o wide 

    • For all the command line features available for Azure check the link: https://docs.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest

    We encountered a few issues while setting up the AKS cluster at the time of writing this blog. They are listed below along with the workaround/fix:

    az aks create --resource-group aks-rg --name akscluster --node-count 2 --generate-ssh-keys

    Error: Operation failed with status: ‘Bad Request’.

    Details: The resource provider registrations Microsoft.Compute, Microsoft.Storage, and Microsoft.Network are required and must be enabled.

    Fix: If you are using a trial account, click on Subscriptions and check whether the following providers are registered:

    • Microsoft.Compute
    • Microsoft.Storage
    • Microsoft.Network
    • Microsoft.ContainerRegistry
    • Microsoft.ContainerService
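
    The same check can be scripted instead of clicking through the portal. The sketch below assumes the JSON printed by az provider list is a list of objects with namespace and registrationState fields; the sample data is made up for illustration, and missing_providers is a hypothetical helper name.

```python
import json

REQUIRED_PROVIDERS = [
    "Microsoft.Compute",
    "Microsoft.Storage",
    "Microsoft.Network",
    "Microsoft.ContainerRegistry",
    "Microsoft.ContainerService",
]


def missing_providers(provider_list_json: str, required=REQUIRED_PROVIDERS) -> list:
    """Return the required namespaces that are not in the 'Registered' state."""
    providers = json.loads(provider_list_json)
    state = {p["namespace"]: p["registrationState"] for p in providers}
    return [ns for ns in required if state.get(ns) != "Registered"]


# Hypothetical sample shaped like the assumed `az provider list --output json` result.
sample = json.dumps([
    {"namespace": "Microsoft.Compute", "registrationState": "Registered"},
    {"namespace": "Microsoft.Storage", "registrationState": "NotRegistered"},
    {"namespace": "Microsoft.Network", "registrationState": "Registered"},
    {"namespace": "Microsoft.ContainerService", "registrationState": "Registering"},
])

missing = missing_providers(sample)
for namespace in missing:
    # Print the registration command for each provider that still needs it.
    print(f"az provider register -n {namespace}")
```

    A provider that is absent from the list, or still in a transitional state, is treated as missing, which errs on the side of registering again.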

    Open issues: We also encountered the following open issues at the time of writing this blog.

    1. Issue-1
    2. Issue-2
    3. Issue-3

    Jenkins setup for CI/CD with ACR, AKS

    Microsoft provides a solution template which will install the latest stable Jenkins version on a Linux (Ubuntu 14.04 LTS) VM along with tools and plugins configured to work with Azure. This includes:

    • git for source control
    • Azure Credentials plugin for connecting securely
    • Azure VM Agents plugin for elastic build, test and continuous integration
    • Azure Storage plugin for storing artifacts
    • Azure CLI to deploy apps using scripts

    Refer to the link below to bring up the instance

    Pipeline plan for Spinning up a Nodejs Application using ACR – AKS – Jenkins

    What the pipeline accomplishes :

    Stage 1:

    Code is pushed to GitHub, which triggers the Jenkins job automatically. The Dockerfile is checked out from GitHub.

    Stage 2:

    Docker builds an image from the Dockerfile, and the image is tagged with the build number. Additionally, the latest tag is attached to the image for the containers to use.

    Stage 3:

    We have default deployment and service YAML files stored on the Jenkins server. Jenkins makes a copy of the default YAML files, makes the necessary changes according to the build, and puts them in a separate folder.

    Stage 4:

    kubectl was configured on the Jenkins server when AKS was set up. The YAML files are fed to the kubectl utility, which in turn creates the pods and services.
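
    Stage 3 is left open in the pipeline because it depends on the application's YAML files. As one possible approach, the copy-and-modify step could look like the Python sketch below. The template content, the ${app} and ${image} placeholders, the folder name, and render_build_yaml are all assumptions made for illustration; they are not the files used in our pipeline.

```python
from pathlib import Path
from string import Template
import tempfile

# Hypothetical default deployment template; ${app} and ${image} are filled in per build.
DEFAULT_DEPLOYMENT = Template("""\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${app}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ${app}
  template:
    metadata:
      labels:
        app: ${app}
    spec:
      containers:
      - name: ${app}
        image: ${image}
""")


def render_build_yaml(out_dir: Path, app: str, registry: str, build_number: str) -> Path:
    """Write a per-build copy of the template into out_dir and return its path."""
    image = f"{registry}/{app}:{build_number}"
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{app}-deployment.yaml"
    target.write_text(DEFAULT_DEPLOYMENT.substitute(app=app, image=image))
    return target


# Render into a throwaway folder so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = render_build_yaml(Path(tmp) / "build-42", "demo", "testacr.azurecr.io", "42")
    rendered = path.read_text()
    print(rendered)
```

    Pinning the image to the build number rather than to latest makes each rendered file describe exactly one build, which helps when rolling back.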

    Sample Jenkins pipeline code :

    node {
        // Check out the Dockerfile from source control
        stage('Checkout the Dockerfile from GitHub') {
            git branch: 'docker-file', credentialsId: 'git_credentials', url: 'https://gitlab.com/demo.git'
        }
        // Build the image and push it to ACR
        stage('Build the Image and Push to Azure Container Registry') {
            app = docker.build('testacr.azurecr.io/demo')
            withDockerRegistry([credentialsId: 'acr_credentials', url: 'https://testacr.azurecr.io']) {
                app.push("${env.BUILD_NUMBER}")
                app.push('latest')
            }
        }
        stage('Build the Kubernetes YAML Files for New App') {
            // The code here will differ depending on the YAMLs used for the application
        }
        stage('Deploying the App on Azure Kubernetes Service') {
            app = docker.image('testacr.azurecr.io/demo:latest')
            withDockerRegistry([credentialsId: 'acr_credentials', url: 'https://testacr.azurecr.io']) {
                app.pull()
                sh "kubectl create -f ."
            }
        }
    }

    What we achieved:

    • We managed to create a private Docker registry on Azure with ACR, using az-cli 2.0.25.
    • Secondly, we were able to spin up a private Kubernetes cluster on Azure with 2 nodes.
    • We set up Jenkins using a pre-cooked template that had all the plugins necessary for communication with ACR and AKS.
    • We orchestrated a Continuous Deployment pipeline in Jenkins that uses Docker features.
  • Container Security: Let’s Secure Your Enterprise Container Infrastructure!

    Introduction

    Containerized applications are becoming more popular with each passing year. A reason for this rise in popularity could be the pivotal role they play in Continuous Delivery by enabling fast and automated deployment of software services.

    Security still remains a major concern mainly because of the way container images are being used. In the world of VMs, infra/security team used to validate the OS images and installed packages for vulnerabilities. But with the adoption of containers, developers are building their own container images. Images are rarely built from scratch. They are typically built on some base image, which is itself built on top of other base images. When a developer builds a container image, he typically grabs a base image and other layers from public third party sources. These images and libraries may contain obsolete or vulnerable packages, thereby putting your infrastructure at risk. An added complexity is that many existing vulnerability-scanning tools may not work with containers, nor do they support container delivery workflows including registries and CI/CD pipelines. In addition, you can’t simply scan for vulnerabilities – you must scan, manage vulnerability fixes and enforce vulnerability-based policies.

    The Container Security Problem

    The table below shows the number of vulnerabilities found in the images available on dockerhub. Note that (as of April 2016) the worst offending community images contained almost 1,800 vulnerabilities! Official images were much better, but still contained 392 vulnerabilities in the worst case.

    If we look at the distribution of vulnerability severities, we see that pretty much all of them are high severity, for both official and community images. What we’re not told is the underlying distribution of vulnerability severities in the CVE database, so this could simply be a reflection of that distribution.

    Over 80% of the latest versions of official images contained at least one high severity vulnerability!

    • There are so many docker images readily available on dockerhub – are you sure the ones you are using are safe?
    • Do you know where your containers come from?
    • Are your developers downloading container images and libraries from unknown and potentially harmful sources?
    • Do the containers use third party library code that is obsolete or vulnerable?

    In this blog post, I will explain some of the solutions that can help with these challenges: ‘Docker scanning services’, ‘Twistlock Trust’ and ‘Clair’, an open-source solution from CoreOS. These tools help in scanning for and fixing vulnerabilities, making your container images more secure.

    Clair

    Clair is an open source project for the static analysis of vulnerabilities in application containers. It works as an API that analyzes every container layer to find known vulnerabilities, reading the package databases of existing package managers such as dpkg (Debian, Ubuntu) and rpm (CentOS). It can also be used from the command line. It provides a list of vulnerabilities that threaten a container, and can notify users when new vulnerabilities that affect existing containers become known. At regular intervals, Clair ingests vulnerability metadata from a configured set of sources and stores it in the database. Clients use the Clair API to index their container images; this parses a list of installed source packages and stores them in the database. Clients then use the Clair API to query the database; vulnerabilities are correlated in real time rather than served from a cached result that needs re-scanning.

    Clair identifies security issues that developers introduce in their container images. The vanilla process for using Clair is as follows:

    1. A developer programmatically submits their container image to Clair
    2. Clair analyzes the image, looking for security vulnerabilities
    3. Clair returns a detailed report of security vulnerabilities present in the image
    4. Developer acts based on the report

    How to use Clair

    Docker is required to follow along with this demonstration. Once Docker is installed, use the Dockerfile below to create an Ubuntu image that contains a version of SSL that is susceptible to Heartbleed attacks.

    #Dockerfile
    FROM ubuntu:precise-20160303
    #Install WGet
    RUN apt-get update
    RUN apt-get -f install
    RUN apt-get install -y wget
    #Install an OpenSSL vulnerable to Heartbleed (CVE-2014-0160)
    RUN wget --no-check-certificate https://launchpad.net/~ubuntu-security/+archive/ubuntu/ppa/+build/5436462/+files/openssl_1.0.1-4ubuntu5.11_amd64.deb
    RUN dpkg -i openssl_1.0.1-4ubuntu5.11_amd64.deb

    Build the image using the command below:

    $ docker build . -t madhurnawandar/heartbeat

    After creating the insecure Docker image, the next step is to download and install Clair from here. The installation choice used for this demonstration was the Docker Compose solution. Once Clair is installed, it can be used via querying its API or through the clairctl command line tool. Submit the insecure Docker image created above to Clair for analysis and it will catch the Heartbleed vulnerability.

    $ clairctl analyze --local madhurnawandar/heartbeat
    Image: /madhurnawandar/heartbeat:latest
    9 layers found 
    ➜ Analysis [f3ce93f27451] found 0 vulnerabilities. 
    ➜ Analysis [738d67d10278] found 0 vulnerabilities. 
    ➜ Analysis [14dfb8014dea] found 0 vulnerabilities. 
    ➜ Analysis [2ef560f052c7] found 0 vulnerabilities. 
    ➜ Analysis [69a7b8948d35] found 0 vulnerabilities. 
    ➜ Analysis [a246ec1b6259] found 0 vulnerabilities. 
    ➜ Analysis [fc298ae7d587] found 0 vulnerabilities. 
    ➜ Analysis [7ebd44baf4ff] found 0 vulnerabilities. 
    ➜ Analysis [c7aacca5143d] found 52 vulnerabilities.
    $ clairctl report --local --format json madhurnawandar/heartbeat
    JSON report at reports/json/analysis-madhurnawandar-heartbeat-latest.json

    You can view the detailed report here.
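
    The per-layer counts printed by clairctl can be totalled with a few lines of Python, which is handy when the output is captured in a build log. This is a rough sketch that assumes the line format stays exactly as shown in the run above; the sample text is three of the nine analysis lines from that run, and vulnerabilities_per_layer is a hypothetical helper name.

```python
import re

# Three of the nine analysis lines from the clairctl run shown in this post.
CLAIRCTL_OUTPUT = """\
➜ Analysis [f3ce93f27451] found 0 vulnerabilities.
➜ Analysis [7ebd44baf4ff] found 0 vulnerabilities.
➜ Analysis [c7aacca5143d] found 52 vulnerabilities.
"""

# Captures the layer ID and the vulnerability count from each analysis line.
LINE = re.compile(r"Analysis \[([0-9a-f]+)\] found (\d+) vulnerabilities")


def vulnerabilities_per_layer(output: str) -> dict:
    """Map each layer ID to the number of vulnerabilities reported for it."""
    return {layer: int(count) for layer, count in LINE.findall(output)}


per_layer = vulnerabilities_per_layer(CLAIRCTL_OUTPUT)
affected = {layer: n for layer, n in per_layer.items() if n > 0}
total = sum(per_layer.values())
print(f"{total} vulnerabilities in {len(affected)} layer(s): {sorted(affected)}")
```

    Here all 52 findings sit in the last layer, which is the one that installs the vulnerable OpenSSL package.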

    Docker Security Scanning

    Docker Cloud and Docker Hub can scan images in private repositories to verify that they are free from known security vulnerabilities or exposures, and report the results of the scan for each image tag. Docker Security Scanning is available as an add-on to Docker hosted private repositories on both Docker Cloud and Docker Hub.

    Security scanning is enabled on a per-repository basis and is only available for private repositories. Scans run each time a build pushes a new image to your private repository. They also run when you add a new image or tag. The scan traverses each layer of the image, identifies the software components in each layer, and indexes the SHA of each component.

    The scan compares the SHA of each component against the Common Vulnerabilities and Exposures (CVE®) database. The CVE is a “dictionary” of known information security vulnerabilities. When the CVE database is updated, the service reviews the indexed components for any that match the new vulnerability. If the new vulnerability is detected in an image, the service sends an email alert to the maintainers of the image.
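
    The index-then-recheck idea described above can be illustrated with a toy model. This is not Docker's implementation: ComponentIndex, its methods, and the byte strings standing in for component contents are all hypothetical. It only shows why indexing a hash per component lets a service find affected images after a database update without rescanning them.

```python
import hashlib
from collections import defaultdict


class ComponentIndex:
    """Toy index from component hash to the image tags that contain it."""

    def __init__(self):
        self._images_by_hash = defaultdict(set)

    def index_image(self, image_tag: str, components: dict) -> None:
        """Record the SHA-256 of every component (name -> bytes) found in an image."""
        for content in components.values():
            digest = hashlib.sha256(content).hexdigest()
            self._images_by_hash[digest].add(image_tag)

    def affected_images(self, vulnerable_hashes: set) -> set:
        """Return image tags containing any component whose hash is known-vulnerable."""
        hits = set()
        for digest in vulnerable_hashes:
            hits |= self._images_by_hash.get(digest, set())
        return hits


index = ComponentIndex()
index.index_image("demo:1", {"libssl": b"openssl-1.0.1-bytes", "app": b"app-v1"})
index.index_image("demo:2", {"libssl": b"openssl-1.0.2-bytes", "app": b"app-v1"})

# Simulate a database update that flags the first libssl build as vulnerable.
new_entry = {hashlib.sha256(b"openssl-1.0.1-bytes").hexdigest()}
print(index.affected_images(new_entry))
```

    Only demo:1 is reported, and the maintainers of that image are the ones who would receive the alert.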

    A single component can contain multiple vulnerabilities or exposures and Docker Security Scanning reports on each one. You can click an individual vulnerability report from the scan results and navigate to the specific CVE report data to learn more about it.

    Twistlock

    Twistlock is a rule-based access control policy system for Docker and Kubernetes containers. It can be fully integrated with Docker and ships with out-of-the-box security policies that are ready to use.

    Security policies can set the conditions for users to, say, create new containers but not delete them; or, they can launch containers but aren’t allowed to push code to them. Twistlock features the same policy management rules as those on Kubernetes, wherein a user can modify management policies but cannot delete them.

    Twistlock also handles image scanning. Users can scan an entire container image, including any packaged Docker application. Twistlock has done its due-diligence in this area, correlating with Red Hat and Mirantis to ensure no container is left vulnerable while a scan is running.

    Twistlock also deals with image scanning of containers within the registries themselves. In runtime environments, Twistlock features a Docker proxy running on the same server with an application’s other containers. This is essentially traffic filtering, whereupon the application container calling the Docker daemon is then re-routed through Twistlock. This approach enforces access control, allowing for safer configuration where no containers are set to run as root. It’s also able to SSH into an instance, for example. In order to delve into these layers of security, Twistlock enforces the policy at runtime.

    When new code is written in images, it is then integrated into the Twistlock API to push an event, whereupon the new image is deposited into the registry along with its unique IDs. It is then pulled out by Twistlock and scanned to ensure it complies with the set security policies in place. Twistlock deposits the scan result into the CI process so that developers can view the result for debugging purposes.

    Integrating these vulnerability scanning tools into your CI/CD Pipeline:

    These tools become more interesting when paired with a CI server like Jenkins, TravisCI, etc. Given proper configuration, the process becomes:

    1. A developer submits application code to source control
    2. Source control triggers a Jenkins build
    3. Jenkins builds the software containers necessary for the application
    4. Jenkins submits the container images to vulnerability scanning tool
    5. Tool identifies security vulnerabilities in the container
    6. Jenkins receives the security report, identifies a high vulnerability in the report, and stops the build
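
    Step 6 comes down to a small gate that the CI job runs against the scanner's report. The sketch below assumes a simplified report format (a list of findings with id, severity, and package keys) and a five-level severity scale; real scanners each have their own schema, so the parsing would need to be adapted, and the finding IDs are placeholders.

```python
# Assumed severity scale, ordered from least to most severe.
SEVERITY_ORDER = ["Negligible", "Low", "Medium", "High", "Critical"]


def blocking_findings(report: list, threshold: str = "High") -> list:
    """Return findings whose severity is at or above the threshold."""
    minimum = SEVERITY_ORDER.index(threshold)
    return [f for f in report if SEVERITY_ORDER.index(f["severity"]) >= minimum]


def gate(report: list, threshold: str = "High") -> int:
    """Return a process exit code: 1 stops the build, 0 lets it continue."""
    blockers = blocking_findings(report, threshold)
    for finding in blockers:
        print(f"BLOCKING {finding['id']} ({finding['severity']}) in {finding['package']}")
    return 1 if blockers else 0


# Hypothetical report with placeholder finding IDs.
report = [
    {"id": "EXAMPLE-1", "severity": "High", "package": "openssl"},
    {"id": "EXAMPLE-2", "severity": "Low", "package": "examplelib"},
]
exit_code = gate(report)
print("exit code:", exit_code)
```

    Passing the returned value to sys.exit in a script run from a Jenkins sh step would fail the stage, because a non-zero exit status marks the step as failed.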

    Conclusion

    There are many solutions like ‘Docker scanning services’, ‘Twistlock Trust’, ‘Clair’, etc. to secure your containers. It’s critical for organizations to adopt such tools in their CI/CD pipelines. But this by itself is not going to make containers secure. There are a lot of guidelines available in the CIS Benchmark for containers, such as tuning kernel parameters, setting proper network configurations for inter-container connectivity, securing access to host-level directories, and others. I will cover these items in the next set of blogs. Stay tuned!

  • Building an Intelligent Recommendation Engine with Collaborative Filtering

    In this post, we will talk about building a collaborative recommendation system. For this, we will utilize patient ratings with a drug and medical condition dataset to generate treatment suggestions.

    Let’s take a practical scenario where multiple medical practitioners have treated patients with different medical conditions using the most suitable drugs available. For every prescribed drug, the patients were diagnosed and then given a treatment plan; these past treatments are our experiences.

    The purpose of the recommendation system is to understand and find patterns with the information provided by patients during the diagnosis, and then suggest a treatment plan, which most closely matches the pattern identified by the recommendation system. 

    By the end of this article, we will have gone deeper into how these recommendations work and how we can find one preferred suggestion and the next five closest suggestions for any treatment.

    Definitions

    A recommendation system suggests or predicts a user’s behaviour by observing patterns of their past behaviour compared to others.

    In simple terms, it is a filtering engine that picks more relevant information for specific users by using all the available information. It is often used by e-commerce and streaming platforms like Amazon, Flipkart, YouTube, and Netflix, and by personalized user products like Alexa and Google Home Mini.

    For the medical industry, where suggestions must be most accurate, a recommendation system will also take experiences into account. So, we must use all our experiences, and such applications will use every piece of information for any treatment. 

    Recommendation systems use information like various medical conditions and their effect on each patient. They compare these patterns to every new treatment to find the closest similarity.

    Concepts and Technology

    To design the recommendation system, we need a few concepts, which are listed below.

    1. Concepts: Pattern Recognition, Correlation, Cosine Similarity, Vector norms (L1, L2, L-Infinity)

    2. Language: Python (libraries: NumPy, pandas, SciPy and scikit-learn)

    As far as the prototype development is concerned, we have the support of libraries (SciPy and scikit-learn) that execute all the algorithms for us. All we need is a little Python and the library functions.
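
    To make the listed concepts concrete, here is a minimal pure-Python sketch of the three vector norms and of cosine similarity. The two rating vectors are made up for illustration; in practice the libraries named above do this work for us.

```python
import math


def l1_norm(v):
    """Sum of absolute values (Manhattan / cityblock length)."""
    return sum(abs(x) for x in v)


def l2_norm(v):
    """Euclidean length."""
    return math.sqrt(sum(x * x for x in v))


def linf_norm(v):
    """Largest absolute component."""
    return max(abs(x) for x in v)


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (l2_norm(a) * l2_norm(b))


# Hypothetical rating vectors: b is a scaled copy of a.
ratings_a = [3, 4, 0]
ratings_b = [6, 8, 0]
print(l1_norm(ratings_a), l2_norm(ratings_a), linf_norm(ratings_a))
print(cosine_similarity(ratings_a, ratings_b))
```

    The two vectors differ in magnitude but point in the same direction, so their cosine similarity is 1. The cosine distance used later in this post is one minus this similarity, which is why it compares rating patterns rather than rating magnitudes.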

    Different Approaches for Recommendation Systems

    Below I have listed a few filtering approaches and examples:

    • Collaborative filtering: It is based on review or response of users for any entity. Here, the suggestion is based on the highest rated item by most of the users. E.g., movie or mobile suggestions.
    • Content-based filtering: It is based on the pattern of each user’s past activity. Here, the suggestion is based on items similar to those the user has preferred before. E.g., food suggestions.
    • Popularity-based filtering: It is based on a pattern of popularity among all users. E.g., YouTube video suggestions  

    Based on these filtering approaches, there will be different approaches to recommender systems, which are explained below:

    • Multi-criteria recommender systems: Various conditions like age, gender, location, likes, and dislikes are used for categorization and then items are suggested. E.g., suggestion of apparel based on age and gender.
    • Risk-aware recommender systems: There is always uncertainty when users use Internet applications (website or mobile). Recommending any advertisement over the Internet must consider risk and users must be aware of this. E.g., advertisement display suggestion over Internet application. 
    • Mobile recommender systems: These are location-based suggestions that consist of users’ current location or future location and provide suggestions based on that. E.g., mostly preferred in traveling and tourism.
    • Hybrid recommender systems: These are the combination of multiple approaches for recommendations. E.g., suggestion of hotels and restaurants based on user preference and travel information.
    • Collaborative and content recommender systems: These are the combination of collaborative and content-based approaches. E.g., suggestion of the highest-rated movie of users’ preference along with their watch history.

    Practical Example with Implementation

    In this example, we have a sample dataset of drugs prescribed for various medical conditions and ratings given by patients. What we need here is for any medical condition we have to receive a suggestion for the most suitable prescribed drugs for treatment.

    Sample Dataset: 

    Below is the sample of the publicly available medical drug dataset used from the Winter 2018 Kaggle University Club Hackathon.

    drugName              | condition                    | rating | condition_id
    Mirtazapine           | Depression                   | 10     | 201
    Mesalamine            | Crohn’s Disease, Maintenance | 8      | 185
    Bactrim               | Urinary Tract Infection      | 9      | 657
    Contrave              | Weight Loss                  | 9      | 677
    Cyclafem 1 / 35       | Birth Control                | 9      | 122
    Zyclara               | Keratosis                    | 4      | 365
    Copper                | Birth Control                | 6      | 122
    Amitriptyline         | Migraine Prevention          | 9      | 403
    Methadone             | Opiate Withdrawal            | 7      | 460
    Levora                | Birth Control                | 2      | 122
    Paroxetine            | Hot Flashes                  | 1      | 310
    Miconazole            | Vaginal Yeast Infection      | 6      | 664
    Belviq                | Weight Loss                  | 1      | 677
    Seroquel              | Schizoaffective Disorde      | 10     | 575
    Ambien                | Insomnia                     | 2      | 347
    Nuvigil               | Narcolepsy                   | 9      | 424
    Chantix               | Smoking Cessation            | 10     | 597
    Microgestin Fe 1 / 20 | Acne                         | 3      | 49
    Klonopin              | Bipolar Disorde              | 6      | 121
    Ciprofloxacin         | Urinary Tract Infection      | 10     | 657
    Trazodone             | Insomnia                     | 1      | 347
    EnteraGam             | Irritable Bowel Syndrome     | 9      | 356
    Aripiprazole          | Bipolar Disorde              | 1      | 121
    Cyclosporine          | Keratoconjunctivitis Sicca   | 1      | 364

    Sample Code: 

    We will do this in 5 steps:

    1. Importing required libraries

    2. Reading the drugsComTest_raw.csv file and creating a pivot matrix.

    3. Creating a KNN model using the NearestNeighbors class with distance metric ‘cosine’ and algorithm ‘brute’. Other possible values for the distance metric include ‘cityblock’, ‘euclidean’, ‘l1’, ‘l2’ and ‘manhattan’. In scikit-learn, possible values for the algorithm are ‘auto’, ‘ball_tree’, ‘kd_tree’ and ‘brute’; the tree-based algorithms do not support the ‘cosine’ metric.

    4. Selecting one row of the pivot matrix at random as the sample for which we have to suggest 5 drugs. Because the pivot is indexed by drugName, each row is a drug’s rating pattern across medical conditions.

    5. Finding the 6 nearest neighbors for the sample by calling the kneighbors function of the KNN model fitted in step 3. The first neighbor is the sample itself, with a distance of 0. The next 5 neighbors are the drugs whose rating patterns are closest to the sample.

    #!/usr/bin/env python
    # coding: utf-8
    
    # Step 1
    import pandas as pd
    import numpy as np
    
    from scipy.sparse import csr_matrix
    from sklearn.neighbors import NearestNeighbors
    from sklearn.preprocessing import LabelEncoder
    encoder = LabelEncoder()
    
    
    # Step 2
    df = pd.read_csv('drugsComTest_raw.csv').fillna('NA')
    df['condition_id'] = pd.Series(encoder.fit_transform(df['condition'].values), index=df.index)
    df_medical = df.filter(['drugName', 'condition', 'rating', 'condition_id'], axis=1)
    df_medical_ratings_pivot=df_medical.pivot_table(index='drugName',columns='condition_id',values='rating').fillna(0)
    df_medical_ratings_pivot_matrix = csr_matrix(df_medical_ratings_pivot.values)
    
    
    # Step 3
    # metric = ['cityblock', 'cosine', 'euclidean', 'l1', 'l2', 'manhattan']
    # algorithm = ['auto', 'ball_tree', 'kd_tree', 'brute']
    model_knn = NearestNeighbors(metric = 'cosine', algorithm = 'brute')
    model_knn.fit(df_medical_ratings_pivot_matrix)
    
    
    # Step 4
    sample_index = np.random.choice(df_medical_ratings_pivot.shape[0])
    sample_condition = df_medical_ratings_pivot.iloc[sample_index,:].values.reshape(1, -1)
    
    
    # Step 5
    distances, indices = model_knn.kneighbors(sample_condition, n_neighbors = 6)
    for i in range(0, len(distances.flatten())):
        if i == 0:
            print('Recommendations for {0}:\n'.format(df_medical_ratings_pivot.index[sample_index]))
        else:
            recommendation = df_medical_ratings_pivot.index[indices.flatten()[i]]
            distanceFromSample = distances.flatten()[i]
            print('{0}: {1}, with distance of {2}:'.format(i, recommendation, distanceFromSample))

    Explanation:

    This is a collaborative recommendation system that uses the patients’ ratings of given drug treatments to find similarities. Here, we are matching the patterns of ratings given to drugs by patients. The system compares all the rating patterns and tries to find similarities (cosine similarity).
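
    To see what the model does under the hood, the same pivot-and-compare logic can be written out by hand on a toy dataset. This is a sketch under stated assumptions: the rows are invented and are not taken from the real dataset, the helper names are hypothetical, and unlike pivot_table (which averages duplicate ratings) this pivot simply keeps the last rating for a drug and condition pair.

```python
import math
from collections import defaultdict

# Hypothetical (drug, condition_id, rating) rows; not taken from the real dataset.
ROWS = [
    ("DrugA", 1, 9), ("DrugA", 2, 8),
    ("DrugB", 1, 8), ("DrugB", 2, 9),
    ("DrugC", 3, 10),
    ("DrugD", 1, 2), ("DrugD", 3, 9),
]


def pivot(rows):
    """Build drug -> rating vector over all condition IDs, filling gaps with 0."""
    conditions = sorted({c for _, c, _ in rows})
    table = defaultdict(lambda: [0.0] * len(conditions))
    for drug, condition, rating in rows:
        table[drug][conditions.index(condition)] = float(rating)
    return dict(table)


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means identical rating patterns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def nearest(table, sample, k):
    """Brute-force search: the k drugs closest to `sample`, excluding itself."""
    others = [(cosine_distance(table[sample], vec), drug)
              for drug, vec in table.items() if drug != sample]
    return sorted(others)[:k]


table = pivot(ROWS)
neighbours = nearest(table, "DrugA", k=2)
for distance, drug in neighbours:
    print(f"{drug}: distance {distance:.3f}")
```

    DrugB comes out closest to DrugA because both were rated highly for the same two conditions, while DrugD shares only one weakly rated condition and DrugC shares none.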

    Challenges of Recommendation System

    Any recommendation system requires a decent quantity of quality information to process. Before developing such a system, we must be aware of the challenges below. Acknowledging and handling them improves the accuracy of the recommendations.

    1. Cold Start: Recommending to a new user or a user without any previous behavior is a problem. We can recommend the most popular options to them. E.g., YouTube video suggestions for newly registered users.

    2. Not Enough Data: Having insufficient data provides recommendations with less certainty. E.g., suggestion of hotels or restaurants will not be accurate if systems are uncertain about users’ locations.

    3. Grey Sheep Problem: This problem occurs when the inconsistent behavior of a user makes it difficult to find a pattern. E.g., multiple users are using the same account, so user activity will be wide, and the system will have difficulty in mapping such patterns. 

    4. Similar items: In these cases, there is not enough data to separate similar items. For these situations, we can recommend all similar items randomly. E.g., apparel suggestions for users with color and sizes. All shirts are similar. 

    5. Shilling Attacks: Intentional negative behavior that leads to bad/unwanted recommendations. While immoral, we cannot deny the possibility of such attacks. E.g., user ratings and reviews over various social media platforms.
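
    The cold-start fallback from point 1 can be sketched as follows. The function names, the minimum-ratings cut-off, and the sample triples are assumptions for illustration; personalised stands in for whatever model serves users who already have history.

```python
from collections import defaultdict


def most_popular(ratings, top_n=3, min_ratings=2):
    """Rank items by mean rating, ignoring items with too few ratings."""
    by_item = defaultdict(list)
    for _user, item, rating in ratings:
        by_item[item].append(rating)
    means = {item: sum(r) / len(r) for item, r in by_item.items() if len(r) >= min_ratings}
    # Sort by descending mean, breaking ties alphabetically for stable output.
    return sorted(means, key=lambda item: (-means[item], item))[:top_n]


def recommend(user, ratings, personalised, top_n=3):
    """Use personalised results when the user has history, else fall back to popularity."""
    has_history = any(u == user for u, _item, _rating in ratings)
    if has_history:
        return personalised(user)[:top_n]
    return most_popular(ratings, top_n)


# Hypothetical (user, item, rating) triples.
RATINGS = [
    ("u1", "A", 9), ("u2", "A", 8),
    ("u1", "B", 4), ("u2", "B", 6),
    ("u1", "C", 10),
]
print(recommend("new_user", RATINGS, personalised=lambda user: []))
```

    Item C has the highest rating but only one vote, so the minimum-ratings cut-off keeps it out of the popular list. The same cut-off also gives some protection against the shilling attacks in point 5, since a single planted rating cannot promote an item.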

    Accuracy and Performance Measures

    Accuracy evaluation is important because we continually monitor and try to improve our algorithms. The most common approaches are user studies, online evaluations, and offline evaluations. Our recommendation models must be ready to learn from users’ activity daily. For online evaluations, we have to test our recommendation system regularly.

    If we understand the challenges of the recommendation system, we can prepare such testing datasets to test its accuracy. With these variations of datasets, we can improve our approach of user studies and offline evaluations.

    1. Online Evaluations: In online evaluations, prediction models are updated frequently with the unmonitored data, which leads to the possibility of unexpected accuracy. To verify this, the prediction models are exposed to the unmonitored data with less uncertainty and then the uncertainty of unmonitored data is gradually increased. 

    2. Offline Evaluations: In offline evaluations, the prediction models are trained with a sample dataset that consists of all possible uncertainty with expected outcomes. To verify this, the sample dataset will be gradually updated and prediction models will be verified with predicted and actual outcomes. E.g., creating multiple users with certain activity and expecting genuine suggestions for them.
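
    One common way to compare predicted and actual outcomes offline is a hit rate at k: hide one known item per user, ask the model for its top k suggestions, and count how often the hidden item is among them. The sketch below uses made-up users and items, and hit_rate_at_k is a hypothetical helper name; it is one possible measure, not the only one.

```python
def hit_rate_at_k(recommended, held_out, k=5):
    """Fraction of users whose held-out item appears in their top-k recommendations."""
    hits = sum(1 for user, item in held_out.items() if item in recommended.get(user, [])[:k])
    return hits / len(held_out)


# Hypothetical model output and held-out ground truth.
recommended = {"u1": ["A", "B", "C"], "u2": ["D", "E", "F"], "u3": ["G", "H", "I"]}
held_out = {"u1": "B", "u2": "Z", "u3": "I"}

rate_at_3 = hit_rate_at_k(recommended, held_out, k=3)
rate_at_2 = hit_rate_at_k(recommended, held_out, k=2)
print(rate_at_3, rate_at_2)
```

    With k=3, two of the three users get their hidden item back; with k=2 only one does. Tracking this number as the sample dataset is updated shows whether a change to the model actually helps.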

    Conclusion

    As a part of this article, we have learned about the approaches, challenges, and evaluation methods, and then we created a practical example of a collaborative recommendation system. We also explored various types and filtering approaches with real-world scenarios.

    We have also executed sample code with a publicly available medical drug dataset with patient ratings. We can choose among various options for the distance metric and algorithm for the NearestNeighbors calculation. We have also listed various challenges for this system and understood the accuracy evaluation measures and the things that affect and improve them.

  • An Introduction To Cloudflare Workers And Cloudflare KV store

    Cloudflare Workers

    This post gives a brief introduction to Cloudflare Workers and Cloudflare KV store. They address a fairly common set of problems around scaling an application globally. There are standard ways of doing this but they usually require a considerable amount of upfront engineering work and developers have to be aware of the ‘scalability’ issues to some degree. Serverless application tools target easy scalability and quick response times around the globe while keeping the developers focused on the application logic rather than infra nitty-gritties.

    Global responsiveness

    When an application is expected to be accessed around the globe, requests from users sitting in different time-zones should take a similar amount of time. There can be multiple ways of achieving this depending upon how data intensive the requests are and what those requests actually do.

    Data-intensive requests are harder and more expensive to globalize, but not all requests are the same. Static requests, such as getting a documentation page or a blog post, can be globalized by generating the markup at build time and deploying it on a CDN.

    Then there are semi-dynamic requests. They render static content either with a small amount of data, or with content that changes based on the time zone the request came from.
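    As a rough illustration of a semi-dynamic request, the sketch below picks a greeting from the caller’s time zone while the rest of the page could stay static. The helper names localHour and greetingFor are made up for this example, and it assumes the time zone arrives as an IANA name such as Asia/Tokyo (in a Worker it could be read from request.cf.timezone, if that property is populated).

```javascript
// Hour of day (0-23) for `date` as observed in the IANA time zone `timeZone`.
function localHour(date, timeZone) {
    const parts = new Intl.DateTimeFormat('en-US', {
        hour: 'numeric',
        hourCycle: 'h23',
        timeZone,
    }).formatToParts(date)
    const hour = parts.find(part => part.type === 'hour').value
    return Number(hour) % 24
}

// The only dynamic part of an otherwise static page (hypothetical helper).
function greetingFor(date, timeZone) {
    const hour = localHour(date, timeZone)
    if (hour < 12) return 'Good morning'
    if (hour < 18) return 'Good afternoon'
    return 'Good evening'
}

// The same instant produces different content depending on the caller's zone.
const noonUtc = new Date('2020-01-01T12:00:00Z')
console.log(greetingFor(noonUtc, 'America/New_York'))
console.log(greetingFor(noonUtc, 'UTC'))
console.log(greetingFor(noonUtc, 'Asia/Tokyo'))
```

    Because only the greeting depends on the request, everything else can still be generated at build time and cached.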

    The above is a loose classification of requests but there are exceptions, for example, not all the static requests are presentational.

    Serverless frameworks are particularly useful in scaling static and semi-dynamic requests.

    Cloudflare Workers Overview

    Cloudflare Workers is essentially a function deployment service. It provides a serverless execution environment which can be used to develop and deploy small (although not necessarily) and modular cloud functions with minimal effort.

    It is very easy to get started with workers. First, let’s install wrangler, a tool for managing Cloudflare Worker projects.

    npm i @cloudflare/wrangler -g

    Wrangler handles all the standard stuff for you like project generation from templates, build, config, publishing among other things.

    A worker primarily contains 2 parts: an event listener that invokes a worker and an event handler that returns a response object. Creating a worker is as easy as adding an event listener to a button.

    addEventListener('fetch', event => {
        event.respondWith(handleRequest(event.request))
    })
    
    async function handleRequest(request) {
        return new Response("hello world")
    }

    The above is a simple hello world example. Wrangler can build your worker and give you a live preview of it. To build the worker, run:

    wrangler build

    To get a live preview in the browser, run:

    wrangler preview

    The preview is only meant to be used for testing (either by you or by others). If you want the workers to be triggered by your own domain or a workers.dev subdomain, you need to publish it.

    Publishing is fairly straightforward and requires very little configuration in both wrangler and your project.

    Wrangler Configuration

    Just create an account on Cloudflare and get an API key. To configure wrangler, just run:

    wrangler config

    It will ask for the registered email and API key, and you are good to go.

    To publish your worker on a workers.dev subdomain, just fill in your account ID in wrangler.toml and run wrangler publish. The worker will be deployed and live at a generated workers.dev subdomain.

    Regarding Routes

    When you publish on a {script-name}.{subdomain}.workers.dev domain, the script or project associated with script-name will be invoked. There is no way to call a script just from {subdomain}.workers.dev.

    Workers KV

    Workers alone can’t be used to build anything complex without persistent storage, and that’s where Workers KV comes into the picture. Workers KV, as the name suggests, is a low-latency, high-volume key-value store that is designed for efficient reads.

    It optimizes read latency by dynamically spreading the most frequently read entries to the edges (replicated in several regions) and storing less frequently read entries centrally.

    Newly added keys (a CREATE) are immediately reflected in every region, while a change to the value of an existing key (an UPDATE) may take as long as 60 seconds to propagate, depending upon the region.
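    To make that read behaviour concrete, here is a toy in-memory model of it. This is not how Cloudflare implements KV: the class ToyEdgeStore, the injected clock, and the fixed 60-second cache lifetime are all assumptions, chosen only to reproduce the two rules described above (a new key is readable right away, while an updated key can be served stale until the cached copy expires).

```javascript
const MAX_STALENESS_MS = 60 * 1000 // assumed cache lifetime for this toy model

class ToyEdgeStore {
    // `central` is shared by all edges; `now` is an injectable clock for testing.
    constructor(central, now) {
        this.central = central
        this.now = now
        this.cache = new Map() // key -> { value, cachedAt }
    }

    // Writes always go straight to the central store.
    write(key, value) {
        this.central.set(key, value)
    }

    // Reads prefer the local cached copy while it is younger than the limit.
    read(key) {
        const hit = this.cache.get(key)
        if (hit && this.now() - hit.cachedAt < MAX_STALENESS_MS) {
            return hit.value // possibly stale
        }
        const value = this.central.get(key)
        if (value !== undefined) {
            this.cache.set(key, { value, cachedAt: this.now() })
        }
        return value
    }
}

let clock = 0
const central = new Map()
const writer = new ToyEdgeStore(central, () => clock)
const reader = new ToyEdgeStore(central, () => clock)

writer.write('US', '1')
const afterCreate = reader.read('US') // new key: nothing cached yet, read from central
writer.write('US', '2')
const afterUpdate = reader.read('US') // update: reader still serves its cached copy
clock += MAX_STALENESS_MS
const afterExpiry = reader.read('US') // cached copy expired: fresh value is read
console.log(afterCreate, afterUpdate, afterExpiry)
```

    The practical takeaway is that a value read shortly after an update may be the old one, so Workers KV suits data that is read often and changed rarely.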

    Workers KV is only available to paid users of Cloudflare.

    Writing Data in Workers KV

    curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces" \
    -X POST \
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
    -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" \
    -H "Content-Type: application/json" \
    --data '{"title": "Requests"}'

    The above HTTP request will create a namespace named Requests. The response should look something like this:

    {
        "result": {
            "id": "30b52f55aafb41d88546d01d5f69440a",
            "title": "Requests",
            "supports_url_encoding": true
        },
        "success": true,
        "errors": [],
        "messages": []
    }

    Now we can write key-value pairs to this namespace. The following HTTP request does that:

    curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/values/first-key" \
    -X PUT \
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
    -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" \
    --data 'My first value!'

    Here, NAMESPACE_ID is the ID that we received in the previous response, first-key is the key name, and My first value! is the value.
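    The same write can be prepared from JavaScript. The sketch below only builds the URL and the fetch options; it reuses the endpoint and headers from the curl command, while the function name buildKvWrite and all the placeholder credentials are assumptions made for this example.

```javascript
const API_BASE = 'https://api.cloudflare.com/client/v4'

// Build the URL and fetch() options for writing one key-value pair.
// buildKvWrite is a hypothetical helper, not part of any Cloudflare SDK.
function buildKvWrite({ accountId, namespaceId, key, value, email, authKey }) {
    const url = `${API_BASE}/accounts/${accountId}/storage/kv/namespaces/` +
        `${namespaceId}/values/${encodeURIComponent(key)}`
    return {
        url,
        options: {
            method: 'PUT',
            headers: { 'X-Auth-Email': email, 'X-Auth-Key': authKey },
            body: String(value),
        },
    }
}

const { url, options } = buildKvWrite({
    accountId: 'ACCOUNT_ID',        // placeholder
    namespaceId: 'NAMESPACE_ID',    // placeholder
    key: 'first-key',
    value: 'My first value!',
    email: 'CLOUDFLARE_EMAIL',      // placeholder
    authKey: 'CLOUDFLARE_AUTH_KEY', // placeholder
})
console.log(options.method, url)
```

    Sending it is then a single call, fetch(url, options), from any environment that provides fetch.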

    Let’s complicate things a little

    The overview above introduces Workers with a ‘hello world’ app and the basics of Workers KV. Now let’s make something more complicated: an app that tells you how many requests have been made from your country so far. For example, if you pinged the worker from the US, it will return the number of requests made so far from the US.

    We will need: 

    • Some place to store the count of requests for each country. 
    • A way to find which country the Worker was invoked from.

    For the first part, we will use Workers KV to store the request count for every country.

    Let’s start

    First, we will create a new project using wrangler: wrangler generate request-count.

    We will be making HTTP calls to write values in the Workers KV, so let’s add ‘node-fetch’ to the project:

    npm install node-fetch

    Now, how do we find which country each request is coming from? The answer is the cf object that is provided with each request to a worker.

    The cf object is a special object that is passed with each request and can be accessed as request.cf. It mainly contains region-specific information along with TLS and auth information. The details of what cf provides can be found here.

    As we can see from the documentation, we can get country from

    request.cf.country.

    The cf object is not correctly populated in the wrangler preview, so you will need to publish your worker in order to test code that uses cf. An open issue mentioning this can be found here.
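    Since cf may be missing in the preview, it can help to read the country defensively. The following is a minimal sketch; the helper name countryOf and the 'XX' fallback code are assumptions made for this example, not something the Workers runtime provides.

```javascript
// Return the country code from request.cf, or `fallback` when cf data is missing.
// Both the helper and the default fallback value are illustrative assumptions.
function countryOf(request, fallback = 'XX') {
    const cf = request.cf
    return cf && cf.country ? cf.country : fallback
}

// Plain objects stand in for real Request objects here.
console.log(countryOf({ cf: { country: 'US' } }))
console.log(countryOf({}))
```

    With a helper like this, the worker still responds in the preview instead of throwing when request.cf is undefined.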

    Now, the logic is pretty straightforward. When we get a request from a country for which we don’t have an entry in Workers KV, we create an entry with the value 1; otherwise, we increment the value stored under the country key.
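    That rule fits in a small pure function. The sketch below assumes the stored value comes back as a string, or as null when the key does not exist; the name nextCount is made up for this example, and treating an unparsable value as a missing one is a choice made here rather than anything KV requires.

```javascript
// Compute the new count from the value currently stored for a country.
// `stored` is a string such as '41', or null when there is no entry yet.
function nextCount(stored) {
    const current = Number.parseInt(stored, 10)
    // A missing or unparsable value starts the count at 1.
    return Number.isNaN(current) ? 1 : current + 1
}

console.log(nextCount(null))
console.log(nextCount('41'))
```

    Keeping this step separate from the storage calls makes it easy to test without a deployed worker.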

    To use Workers KV, we need to create a namespace. A namespace is just a collection of key-value pairs where all the keys have to be unique.

    A namespace can be created under the KV tab in the Cloudflare web UI by giving it a name, or by using the API call above. You can also browse all of your namespaces from the web UI. The following API call can be used to read the value of a key from a namespace:

    curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/values/first-key" \
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
    -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY"

    But it is neither the fastest nor the easiest way. Cloudflare provides a better and faster way to read data from your namespaces, called binding. Each KV namespace can be bound to a worker script, which makes it available in the script under a variable name. Any namespace can be bound to any worker. A KV namespace can be bound to a worker from the editing menu of the worker in the Cloudflare UI.

    Following steps show you how to bind a namespace to a worker:

    Go to the edit page of the worker in Cloudflare web UI and click on the KV tab:

    Then add a binding by clicking the ‘Add binding’ button.

    You can select the namespace name and the variable name by which it will be bound. More details can be found here. A binding that I’ve made can be seen in the above image.

    That’s all we need to get this to work. The following is the relevant part of the script:

    const fetch = require('node-fetch')

    addEventListener('fetch', event => {
        event.respondWith(handleRequest(event.request))
    })

    /**
     * Count the request against the caller's country and report the new total
     * @param {Request} request
     */
    async function handleRequest(request) {
        const country = request.cf.country

        // Replace account-id and namespace-id with your own values
        const url = `https://api.cloudflare.com/client/v4/accounts/account-id/storage/kv/namespaces/namespace-id/values/${country}`

        // `requests` is the KV namespace binding; get() resolves to null for a missing key
        let count = await requests.get(country)

        if (!count) {
            count = 1
        } else {
            count = parseInt(count, 10) + 1
        }

        try {
            // Write the new count back through the HTTP API
            const response = await fetch(url, {
                method: 'PUT',
                headers: {"X-Auth-Email": "email", "X-Auth-Key": "auth-key"},
                body: `${count}`
            })
            if (!response.ok) {
                return new Response(`KV write failed with status ${response.status}`, { status: 500 })
            }
        } catch (error) {
            return new Response(String(error), { status: 500 })
        }

        return new Response(`${country}: ${count}`, { status: 200 })
    }

    In the above code, the Requests namespace that we created is bound to the requests variable, which is resolved dynamically when we publish.

    The full source of this can be found here.

    This small application also demonstrates some of the practical aspects of workers. For example, you will notice that updates take some time to be reflected, and that the response time of the workers is quick, especially when they are deployed on a .workers.dev subdomain here.

    Side note: You will have to recreate the namespace-worker binding every time you deploy the worker or run wrangler publish.

    Workers vs. AWS Lambda

    AWS Lambda has been a major player in the serverless market for a while now. So, how does Cloudflare Workers compare to it? Let’s see.

    Architecture:

    Cloudflare Workers uses `Isolates` instead of a container-based underlying architecture. `Isolates` is the technology that allows V8 (Google Chrome’s JavaScript engine) to run thousands of processes on a single server in an efficient and secure manner. This effectively translates into faster code execution and lower memory usage. More details can be found here.

    Price:

    The above mentioned architectural difference allows Workers to be significantly cheaper than Lambda. While a Worker offering 50 milliseconds of CPU costs $0.50 per million requests, the equivalent Lambda costs $1.84 per million. A more detailed price comparison can be found here.
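    To see what those per-million prices mean for a given workload, the arithmetic can be sketched as follows. The two prices are the ones quoted above; the 10 million requests per month is a hypothetical workload, and the calculation ignores free tiers and any other charges.

```javascript
// Prices in dollars per million requests, as quoted in the text above.
const PRICE_PER_MILLION = { workers: 0.50, lambda: 1.84 }

// Cost in dollars for `requests` invocations, rounded to whole cents.
function costFor(platform, requests) {
    const dollars = (requests / 1e6) * PRICE_PER_MILLION[platform]
    return Math.round(dollars * 100) / 100
}

const monthlyRequests = 10e6 // hypothetical workload
console.log('Workers:', costFor('workers', monthlyRequests).toFixed(2))
console.log('Lambda: ', costFor('lambda', monthlyRequests).toFixed(2))
```

    At these prices the gap grows linearly with traffic, so the ratio between the two bills stays the same at any volume.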

    Speed:

    Workers also show significantly better performance numbers than Lambda and Lambda@Edge. Tests run by Cloudflare claim that they are 441% faster than Lambda and 192% faster than Lambda@Edge. A detailed performance comparison can be found here.

    This better performance is also confirmed by serverless-benchmark.

    Wrapping Up:

    As we have seen, Cloudflare Workers along with the KV store make it very easy to get started with a serverless application. They provide fantastic performance at a lower cost, along with intuitive deployment. These properties make them ideal for building globally accessible serverless applications.

  • Cloud Native Applications — The Why, The What & The How

    Cloud-native is an approach to building & running applications that can leverage the advantages of the cloud computing model: on-demand computing power & a pay-as-you-go pricing model. These applications are built and deployed at a rapid cadence to the cloud platform and offer organizations greater agility, resilience, and portability across clouds.

    This blog explains the importance, the benefits and how to go about building Cloud Native Applications.

    CLOUD NATIVE – The Why?

    Early technology adopters like FANG (Facebook, Amazon, Netflix & Google) have some common themes when it comes to shipping software. They have invested heavily in building capabilities that enable them to release new features regularly (weekly, daily or in some cases even hourly). They have achieved this rapid release cadence while supporting safe and reliable operation of their applications, in turn allowing them to respond more effectively to their customers’ needs.

    They have achieved this level of agility by moving beyond ad-hoc automation and by adopting cloud native practices that deliver these predictable capabilities. DevOps, Continuous Delivery, microservices & containers form the 4 main tenets of Cloud Native patterns. All of them have the same overarching goal of making application development and operations teams more efficient through automation.

    At this point though, these techniques have only been successfully proven at the aforementioned software-driven companies. Smaller, more agile companies are also realising the value here. However, as per Joe Beda (co-creator of Kubernetes & CTO at Heptio), there are very few examples of this philosophy being applied outside these technology-centric companies.

    Any team/company shipping products should seriously consider adopting Cloud Native practices if they want to ship software faster while reducing risk and in turn delighting their customers.

    CLOUD NATIVE – The What?

    Cloud Native practices comprise 4 main tenets.

     

    Cloud native — main tenets
    • DevOps is the collaboration between software developers and IT operations with the goal of automating the process of software delivery & infrastructure changes.
    • Continuous Delivery enables applications to be released quickly, reliably & frequently, with less risk.
    • Microservices is an architectural approach to building an application as a collection of small independent services that run on their own and communicate over HTTP APIs.
    • Containers provide light-weight virtualization by dynamically dividing a single server into one or more isolated containers. Containers offer both efficiency & speed compared to standard Virtual Machines (VMs). Containers provide the ability to manage and migrate the application dependencies along with the application, while abstracting away the OS and the underlying cloud platform in many cases.

    The benefits that can be reaped by adopting these methodologies include:

    1. Self-managing infrastructure through automation: The Cloud Native practice goes beyond ad-hoc automation built on top of virtualization platforms. Instead, it focuses on orchestration, management and automation of the entire infrastructure right up to the application tier.
    2. Reliable infrastructure & application: Cloud Native practice makes it much easier to handle churn, replace failed components and recover from unexpected events & failures.
    3. Deeper insights into complex applications: Cloud Native tooling provides visualization for health management, monitoring and notifications with audit logs, making applications easy to audit & debug.
    4. Security: Enable developers to build security into applications from the start rather than as an afterthought.
    5. More efficient use of resources: Containers are lighter in weight than full systems. Deploying applications in containers leads to increased resource utilization.

    Software teams have grown in size, and the number of applications and tools that a company needs to build has grown 10x over the last few years. Microservices break large complex applications into smaller pieces so that they can be developed, tested and managed independently. This enables a single microservice to be updated or rolled back without affecting other parts of the application. Also, software teams are nowadays distributed, and microservices enable each team to own a small piece, with service contracts acting as the communication layer.

    CLOUD NATIVE – The How?

    Now, let’s look at the various building blocks of the cloud native stack that help achieve the goals described above. Here, we have grouped tools & solutions by the problem they solve. We start with the infrastructure layer at the bottom, then the tools used to provision the infrastructure, followed by the container runtime environment. Above that we have tools to manage clusters of container environments, and at the very top we have the tools and frameworks used to develop the applications.

    1. Infrastructure: At the very bottom, we have the infrastructure layer which provides the compute, storage, network & operating system usually provided by the Cloud (AWS, GCP, Azure, OpenStack, VMware).

    2. Provisioning: The provisioning layer consists of automation tools that help in provisioning the infrastructure, managing images and deploying the application. Chef, Puppet & Ansible are the DevOps tools that give teams the ability to manage configuration & environments. Spinnaker, Terraform, CloudFormation provide workflows to provision the infrastructure. Twistlock, Clair provide the ability to harden container images.

    3. Runtime: The Runtime provides the environment in which the application runs. It consists of the Container Engines where the application runs along with the associated storage & networking. containerd & rkt are the most widely used Container engines. Flannel, OpenContrail provide the necessary overlay networking for containers to interact with each other and the outside world while Datera, Portworx, AppOrbit etc. provide the necessary persistent storage enabling easy movement of containers across clouds.

    4. Orchestration and Management: Tools like Kubernetes, Docker Swarm and Apache Mesos abstract the management of container clusters, allowing easy scheduling & orchestration of containers across multiple hosts. etcd, Consul provide service registries for discovery while AVI, Envoy provide proxy, load balancer etc. services.

    5. Application Definition & Development: We can build microservices for applications across multiple languages: Python, Spring/Java, Ruby, Node. Packer, Habitat & Bitnami provide image management for the application to run across all infrastructure, container or otherwise. Jenkins, TravisCI, CircleCI and other build automation servers provide the capability to set up continuous integration and delivery pipelines.

    6. Monitoring, Logging & Auditing: One of the key features of managing Cloud Native Infrastructure is the ability to monitor & audit the applications & underlying infrastructure.

    All modern monitoring platforms like Datadog, New Relic, AppDynamics support monitoring of containers & microservices.

    Splunk, Elasticsearch & fluentd help in log aggregation while OpenTracing and Zipkin help in debugging applications.

    7. Culture: Adopting cloud native practices needs a cultural change where teams no longer work in independent silos. End-to-end automation of software delivery pipelines is only possible when there is increased collaboration between development and IT operations teams with a shared responsibility.

    When we put all the pieces together we get the complete Cloud Native Landscape as shown below.

    Cloud Native Landscape

    I hope this post gives an idea of why Cloud Native is important and what the main benefits are. As you may have noticed in the above infographic, there are several projects, tools & companies trying to solve similar problems. The next questions on your mind will most likely be: How do I get started? Which tools are right for me? And so on. I will cover these topics and more in my following blog posts. Stay tuned!

    Please let us know what you think by adding comments to this blog or reaching out to chirag_jog or Velotio on Twitter.

    Learn more about what we do at Velotio here and how Velotio can get you started on your cloud native journey here.

    References: