Author: admin

  • Autoscaling in Kubernetes using HPA and VPA

    Autoscaling, a key feature of Kubernetes, lets you improve the resource utilization of your cluster by automatically adjusting the application’s resources or replicas depending on the load at that time.

    This blog talks about Pod Autoscaling in Kubernetes and how to set up and configure autoscalers to optimize the resource utilization of your application.

    Horizontal Pod Autoscaling

    What is the Horizontal Pod Autoscaler?

The Horizontal Pod Autoscaler (HPA) scales the number of pods of a ReplicaSet, Deployment, or StatefulSet based on per-pod metrics received from the resource metrics API (metrics.k8s.io) provided by metrics-server, the custom metrics API (custom.metrics.k8s.io), or the external metrics API (external.metrics.k8s.io).

    Fig:- Horizontal Pod Autoscaling

Prerequisite

    Verify that the metrics-server is already deployed and running using the command below, or deploy it using instructions here.

    kubectl get deployment metrics-server -n kube-system

HPA using Multiple Resource Metrics

    HPA fetches per-pod resource metrics (like CPU, memory) from the resource metrics API and calculates the current metric value based on the mean values of all targeted pods. It compares the current metric value with the target metric value specified in the HPA spec and produces a ratio used to scale the number of desired replicas.

    A. Setup: Create a Deployment and HPA resource

In this blog post, I have used the config below to create a deployment of 3 replicas, with some memory load defined by "--vm-bytes", "850M".

    apiVersion: apps/v1
    kind: Deployment
    metadata:
     name: autoscale-tester
    spec:
     replicas: 3
     selector:
       matchLabels:
         app: autoscale-tester
     template:
       metadata:
         labels:
           app: autoscale-tester
       spec:
         containers:
         - args: [ "--vm", "1", "--vm-bytes", "850M", "--vm-hang", "1"]
           command:
           - stress
           image: polinux/stress
           name: autoscale-tester
           resources:
             limits:
               cpu: "1"
               memory: 1000Mi
             requests:
               cpu: "1"
               memory: 1000Mi

    NOTE: It’s recommended not to use HPA and VPA on the same pods or deployments.

    kubectl top po
    NAME                            	CPU(cores)   MEMORY(bytes)   
    autoscale-tester-878b8c6c8-42gmk   326m     	853Mi      	 
    autoscale-tester-878b8c6c8-gp45f   410m     	852Mi      	 
    autoscale-tester-878b8c6c8-tz4mg   388m     	852Mi 

Let's create an HPA resource for this deployment with multiple metric blocks defined. The HPA will consider each metric one by one, calculate the desired replica count based on each of them, and then select the one with the highest replica count.

    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
     name: autoscale-tester
    spec:
     scaleTargetRef:
       apiVersion: apps/v1
       kind: Deployment
       name: autoscale-tester
     minReplicas: 1
     maxReplicas: 10
     metrics:
     - type: Resource
       resource:
         name: cpu
         target:
           type: Utilization
           averageUtilization: 50
     - type: Resource
       resource:
         name: memory
         target:
           type: AverageValue
           averageValue: 500Mi

• We have defined the minimum number of replicas the HPA can scale down to as 1 and the maximum it can scale up to as 10.
• The Target Average Utilization and Target Average Value imply that the HPA should scale the replicas up or down to keep the Current Metric Value equal (or as close as possible) to the Target Metric Value.

    B. Understanding the HPA Algorithm

    kubectl describe hpa autoscale-tester
    Name:       autoscale-tester
    Namespace:  autoscale-tester
    ...
    Metrics:                                           	( current / target )
      resource memory on pods:                         	894188202666m / 500Mi
      resource cpu on pods  (as a percentage of request):  36% (361m) / 50%
    Min replicas:                                      	1
    Max replicas:                                      	10
    Deployment pods:                                   	3 current / 6 desired
    Conditions:
      Type        	Status  Reason          	Message
      ----        	------  ------          	-------
      AbleToScale 	True	SucceededRescale	the HPA controller was able to update the target scale to 6
      ScalingActive   True	ValidMetricFound	the HPA was able to successfully calculate a replica count from memory resource
      ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
    Events:
      Type	Reason         	Age   From                   	Message
      ----	------         	----  ----                   	-------
      Normal  SuccessfulRescale  7s	horizontal-pod-autoscaler  New size: 6; reason: memory resource above target

• HPA calculates pod utilization as the total usage of all containers in the pod divided by the total request. It looks at each container individually and skips the metric if a container doesn't have a resource request set.
• The calculated Current Metric Value for memory, i.e., 894188202666m (~853Mi), is higher than the Target Average Value of 500Mi, so the replicas need to be scaled up.
• The calculated Current Metric Value for CPU, i.e., 36%, is lower than the Target Average Utilization of 50%, so by that metric alone the replicas could be scaled down.
• Replicas are calculated based on both metrics, and the highest replica count is selected. So, the replicas are scaled up to 6 in this case.
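
For each metric, the HPA uses the standard formula from the Kubernetes documentation to compute a desired replica count and then takes the highest result. Plugging in the numbers from the describe output above:

desiredReplicas = ceil[currentReplicas * (currentMetricValue / targetMetricValue)]

memory: ceil[3 * (894188202666m / 500Mi)] = ceil[3 * 1.71] = 6
cpu:    ceil[3 * (36% / 50%)]             = ceil[3 * 0.72] = 3

max(6, 3) = 6, so the deployment is scaled to 6 replicas.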

    HPA using Custom metrics

    We will use the prometheus-adapter resource to expose custom application metrics to custom.metrics.k8s.io/v1beta1, which are retrieved by HPA. By defining our own metrics through the adapter’s configuration, we can let HPA perform scaling based on our custom metrics.

    A. Setup: Install Prometheus Adapter

    Create prometheus-adapter.yaml with the content below:

prometheus:
 url: http://prometheus-server
 port: 0
image:
 tag: latest
rules:
 custom:
   - seriesQuery: 'container_network_receive_packets_total{namespace!="",pod!=""}'
     resources:
       overrides:
         namespace: {resource: "namespace"}
         pod: {resource: "pod"}
     name:
       matches: "container_network_receive_packets_total"
       as: "packets_in"
     metricsQuery: <<.Series>>{<<.LabelMatchers>>}

    helm install stable/prometheus -n prometheus --namespace prometheus
    helm install stable/prometheus-adapter -n prometheus-adapter --namespace prometheus -f prometheus-adapter.yaml

    Once the charts are deployed, verify the metrics are exposed at v1beta1.custom.metrics.k8s.io:

    kubectl get apiservice
    NAME                               	SERVICE                     	AVAILABLE   AGE
    v1beta1.custom.metrics.k8s.io      	prometheus/prometheus-adapter   True    	19m 
    
    
    kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/namespaces/autoscale-hpa/pods/*/packets_in | jq
    {
      "kind": "MetricValueList",
      "apiVersion": "custom.metrics.k8s.io/v1beta1",
      "metadata": {
    	"selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/autoscale-hpa/pods/%2A/packets_in"
      },
     "items": [
    	{
      	"describedObject": {
        	"kind": "Pod",
        	"namespace": "autoscale-hpa",
        	"name": "autoscale-tester-878b8c6c8-42gmk",
        	"apiVersion": "/v1"
      	},
      	"metricName": "packets_in",
      	"timestamp": "2020-07-31T05:59:33Z",
      	"value": "33",
      	"selector": null
    	},
    	{
      	"describedObject": {
        	"kind": "Pod",
        	"namespace": "autoscale-hpa",
        	"name": "autoscale-tester-878b8c6c8-hfts8",
        	"apiVersion": "/v1"
      	},
      	"metricName": "packets_in",
      	"timestamp": "2020-07-31T05:59:33Z",
      	"value": "11",
      	"selector": null
    	},
    	{
      	"describedObject": {
        	"kind": "Pod",
        	"namespace": "autoscale-hpa",
        	"name": "autoscale-tester-878b8c6c8-rb9v2",
        	"apiVersion": "/v1"
      	},
      	"metricName": "packets_in",
      	"timestamp": "2020-07-31T05:59:33Z",
      	"value": "10",
      	"selector": null
    	}
      ]
    }

You can see the metric values for all the replicas in the output.

    B. Understanding Prometheus Adapter Configuration

    The adapter considers metrics defined with the parameters below:

1. seriesQuery tells the adapter which Prometheus metric (series) to discover.

2. resources tells the adapter which Kubernetes resources each metric is associated with, i.e., which labels map to which resources, e.g., namespace, pod, etc.

3. metricsQuery is the actual Prometheus query that is performed to calculate the metric values.

4. name defines the name with which the metric is exposed to the custom metrics API (here, container_network_receive_packets_total is exposed as packets_in).

    For instance, if we want to calculate the rate of container_network_receive_packets_total, we will need to write this query in Prometheus UI:

sum(rate(container_network_receive_packets_total{namespace="autoscale-tester",pod=~"autoscale-tester.*"}[10m])) by (pod)

    This query is represented as below in the adapter configuration:

metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[10m])) by (<<.GroupBy>>)'

    C. Create an HPA resource

    Now, let’s create an HPA resource with the pod metric packets_in using the config below, and then describe the HPA resource.

    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
     name: autoscale-tester
    spec:
     scaleTargetRef:
       apiVersion: apps/v1
       kind: Deployment
       name: autoscale-tester
     minReplicas: 1
     maxReplicas: 10
     metrics:
     - type: Pods
       pods:
         metric:
           name: packets_in
         target:
           type: AverageValue
           averageValue: 50

    kubectl describe hpa autoscale-tester
    Name:                	autoscale-tester
    Namespace:           	autoscale-tester
    ...
    Metrics:             	( current / target )
      "packets_in" on pods:  18666m / 50
    Min replicas:        	1
    Max replicas:        	10
    Deployment pods:     	3 current / 3 desired
    Conditions:
      Type        	Status  Reason          	Message
      ----        	------  ------          	-------
      AbleToScale 	True	SucceededRescale	the HPA controller was able to update the target scale to 2
      ScalingActive   True	ValidMetricFound	the HPA was able to successfully calculate a replica count from pods metric packets_in
      ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
    Events:
      Type	Reason         	Age   From                   	Message
      ----	------         	----  ----                   	-------
      Normal  SuccessfulRescale  2s	horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
      Normal  SuccessfulRescale  2m51s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target 

Here, the current calculated metric value is 18666m. The m represents milli-units, so 18666m means 18.666, which is what we expect ((33 + 11 + 10)/3 = 18.666). Since it's less than the target average value (i.e., 50), the HPA scales down the replicas to bring the Current Metric Value : Target Metric Value ratio as close to 1 as possible. Hence, the replicas are scaled down to 2 and later to 1.
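
Applying the same formula as before:

desiredReplicas = ceil[3 * (18.666 / 50)] = ceil[1.12] = 2

which matches the first SuccessfulRescale event above.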

Fig:- container_network_receive_packets_total

Fig:- Ratio to Target value

Vertical Pod Autoscaling

What is the Vertical Pod Autoscaler?

The Vertical Pod Autoscaler (VPA) ensures that a container's resources are not under- or over-utilized. It recommends optimized values for CPU and memory requests/limits, and can also automatically update them for you so that cluster resources are used efficiently.

    Fig:- Vertical Pod Autoscaling

    Architecture

    VPA consists of 3 components:

    • VPA admission controller
  Once you deploy and enable the Vertical Pod Autoscaler in your cluster, every pod submitted to the cluster goes through this admission webhook, which checks whether a VPA object references it and, if so, rewrites the pod spec with the recommended resource values.
    • VPA recommender
      The recommender pulls the current and past resource consumption (CPU and memory) data for each container from metrics-server running in the cluster and provides optimal resource recommendations based on it, so that a container uses only what it needs.
    • VPA updater
  The updater checks at regular intervals whether a pod is running within the recommended range. If not, it accepts the pod for update and evicts it, so that the resource recommendation can be applied when the pod is recreated.

    Installation

    If you are on Google Cloud Platform, you can simply enable vertical-pod-autoscaling:

    gcloud container clusters update <cluster-name> --enable-vertical-pod-autoscaling

To install it manually, follow the steps below:

    • Verify that the metrics-server deployment is running, or deploy it using instructions here.
    kubectl get deployment metrics-server -n kube-system

    • Also, verify the API below is enabled:
    kubectl api-versions | grep admissionregistration
    admissionregistration.k8s.io/v1beta1

    • Clone the kubernetes/autoscaler GitHub repository, and then deploy the Vertical Pod Autoscaler with the following command.
    git clone https://github.com/kubernetes/autoscaler.git
    ./autoscaler/vertical-pod-autoscaler/hack/vpa-up.sh

    Verify that the Vertical Pod Autoscaler pods are up and running:

    kubectl get po -n kube-system
    NAME                                        READY   STATUS    RESTARTS   AGE
    vpa-admission-controller-68c748777d-ppspd   1/1     Running   0          7s
    vpa-recommender-6fc8c67d85-gljpl            1/1     Running   0          8s
    vpa-updater-786b96955c-bgp9d                1/1     Running   0          8s
    
    kubectl get crd
    verticalpodautoscalers.autoscaling.k8s.io 

    VPA using Resource Metrics

    A. Setup: Create a Deployment and VPA resource

Use the same deployment config to create a new deployment with "--vm-bytes", "850M". Then create a VPA resource in Recommendation Mode with updateMode: "Off".

    apiVersion: autoscaling.k8s.io/v1beta2
    kind: VerticalPodAutoscaler
    metadata:
     name: autoscale-tester-recommender
    spec:
     targetRef:
       apiVersion: "apps/v1"
       kind:       Deployment
       name:       autoscale-tester
     updatePolicy:
       updateMode: "Off"
     resourcePolicy:
       containerPolicies:
       - containerName: autoscale-tester
         minAllowed:
           cpu: "500m"
           memory: "500Mi"
         maxAllowed:
           cpu: "4"
           memory: "8Gi"

    • minAllowed is an optional parameter that specifies the minimum CPU request and memory request allowed for the container. 
    • maxAllowed is an optional parameter that specifies the maximum CPU request and memory request allowed for the container.
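
To create the VPA object and confirm it was registered, you can apply the manifest and list VPA resources (the filename here is just an example):

kubectl apply -f vpa-recommender.yaml
kubectl get vpa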

    B. Check the Pod’s Resource Utilization

Check the resource utilization of the pods. Below, you can see only ~50Mi of memory being used out of 1000Mi, and only ~30m of CPU out of 1000m. This clearly indicates that the pod resources are underutilized.

kubectl top po
    NAME                            	CPU(cores)   MEMORY(bytes)   
    autoscale-tester-5d6b48d64f-8zgb9   39m      	51Mi       	 
    autoscale-tester-5d6b48d64f-npts4   32m      	50Mi       	 
    autoscale-tester-5d6b48d64f-vctx5   35m      	50Mi 

    If you describe the VPA resource, you can see the Recommendations provided. (It may take some time to show them.)

    kubectl describe vpa autoscale-tester-recommender
    Name:     	autoscale-tester-recommender
    Namespace:	autoscale-tester
    ...
      Recommendation:
    	Container Recommendations:
      	Container Name:  autoscale-tester
      	Lower Bound:
        	Cpu: 	500m
        	Memory:  500Mi
      	Target:
        	Cpu: 	500m
        	Memory:  500Mi
      	Uncapped Target:
        	Cpu: 	93m
        	Memory:  262144k
      	Upper Bound:
        	Cpu: 	4
        	Memory:  4Gi

    C. Understand the VPA recommendations

    Target: The recommended CPU request and memory request for the container that will be applied to the pod by VPA.

    Uncapped Target: The recommended CPU request and memory request for the container if you didn’t configure upper/lower limits in the VPA definition. These values will not be applied to the pod. They’re used only as a status indication.

Lower Bound: The minimum recommended CPU request and memory request for the container. The --pod-recommendation-min-memory-mb flag determines the minimum amount of memory the recommender will set; it defaults to 250MiB.

Upper Bound: The maximum recommended CPU request and memory request for the container. It helps the VPA updater avoid evicting pods that are close to the recommended target values. Eventually, the Upper Bound is expected to converge close to the Target recommendation.

     Recommendation:
    	Container Recommendations:
      	Container Name:  autoscale-tester
      	Lower Bound:
        	Cpu: 	500m
        	Memory:  500Mi
      	Target:
        	Cpu: 	500m
        	Memory:  500Mi
      	Uncapped Target:
        	Cpu: 	93m
        	Memory:  262144k
      	Upper Bound:
        	Cpu: 	500m
        	Memory:  1274858485 

    D. VPA processing with Update Mode Off/Auto

Now, if you check the logs of vpa-updater, you can see that it's not processing VPA objects, as the Update Mode is set to Off.

    kubectl logs -f vpa-updater-675d47464b-k7xbx
    1 updater.go:135] skipping VPA object autoscale-tester-recommender because its mode is not "Recreate" or "Auto"
    1 updater.go:151] no VPA objects to process

    VPA allows various Update Modes, detailed here.

    Let’s change the VPA updateMode to “Auto” to see the processing.
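
One way to make this change (kubectl edit vpa works just as well; the merge patch below is a sketch):

kubectl patch vpa autoscale-tester-recommender --type merge -p '{"spec":{"updatePolicy":{"updateMode":"Auto"}}}'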

    As soon as you do that, you can see vpa-updater has started processing objects, and it’s terminating all 3 pods.

    kubectl logs -f vpa-updater-675d47464b-k7xbx
    1 update_priority_calculator.go:147] pod accepted for update autoscale-tester/autoscale-tester-5d6b48d64f-8zgb9 with priority 1
    1 update_priority_calculator.go:147] pod accepted for update autoscale-tester/autoscale-tester-5d6b48d64f-npts4 with priority 1
    1 update_priority_calculator.go:147] pod accepted for update autoscale-tester/autoscale-tester-5d6b48d64f-vctx5 with priority 1
    1 updater.go:193] evicting pod autoscale-tester-5d6b48d64f-8zgb9
    1 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"autoscale-tester", Name:"autoscale-tester-5d6b48d64f-8zgb9", UID:"ed8c54c7-a87a-4c39-a000-0e74245f18c6", APIVersion:"v1", ResourceVersion:"378376", FieldPath:""}): 
    type: 'Normal' reason: 'EvictedByVPA' Pod was evicted by VPA Updater to apply resource recommendation.

    You can also check the logs of vpa-admission-controller:

    kubectl logs -f vpa-admission-controller-bbf4f4cc7-cb6pb
    Sending patches: [{add /metadata/annotations map[]} {add /spec/containers/0/resources/requests/cpu 500m} {add /spec/containers/0/resources/requests/memory 500Mi} {add /spec/containers/0/resources/limits/cpu 500m} {add /spec/containers/0/resources/limits/memory 500Mi} {add /metadata/annotations/vpaUpdates Pod resources updated by autoscale-tester-recommender: container 0: cpu request, memory request, cpu limit, memory limit} {add /metadata/annotations/vpaObservedContainers autoscale-tester}]

NOTE: Ensure that you have more than one running replica. Otherwise, the pods won't be restarted, and vpa-updater will give you this warning:

    1 pods_eviction_restriction.go:209] too few replicas for ReplicaSet autoscale-tester/autoscale-tester1-7698974f6. Found 1 live pods

    Now, describe the new pods created and check that the resources match the Target recommendations:

    kubectl get po
    NAME                            	READY   STATUS    	RESTARTS   AGE
    autoscale-tester-5d6b48d64f-5dlb7   1/1 	Running   	0      	77s
    autoscale-tester-5d6b48d64f-9wq4w   1/1 	Running   	0      	37s
    autoscale-tester-5d6b48d64f-qrlxn   1/1 	Running   	0      	17s
    
    
    kubectl describe po autoscale-tester-5d6b48d64f-5dlb7
    Name:     	autoscale-tester-5d6b48d64f-5dlb7
    Namespace:	autoscale-tester
    ...
    	Limits:
      	cpu: 	500m
      	memory:  500Mi
    	Requests:
      	cpu:    	500m
      	memory: 	500Mi
    	Environment:  <none>

The Target recommendation cannot go below the minAllowed defined in the VPA spec.

    Fig:- Prometheus: Memory Usage Ratio

    E. Stress Loading Pods

Let's recreate the deployment with the memory request and limit set to 2000Mi and "--vm-bytes", "500M".

Gradually stress load one of these pods to increase its memory utilization.
You can log in to the pod and run stress --vm 1 --vm-bytes 1400M --timeout 120000s.
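
For example, using kubectl exec (the pod name is taken from the listing below):

kubectl exec -it autoscale-tester-5d6b48d64f-5dlb7 -- stress --vm 1 --vm-bytes 1400M --timeout 120000s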

    
    kubectl top po
    NAME                            	CPU(cores)   MEMORY(bytes)   
    autoscale-tester-5d6b48d64f-5dlb7   1000m     	1836Mi       	 
    autoscale-tester-5d6b48d64f-9wq4w   252m      	501Mi       	 
    autoscale-tester-5d6b48d64f-qrlxn   252m      	501Mi 	

    Fig:- Prometheus memory utilized by each Replica

    You will notice that the VPA recommendation is also calculated accordingly and applied to all replicas.

    kubectl describe vpa autoscale-tester-recommender
    Name:     	autoscale-tester-recommender
    Namespace:	autoscale-tester
    ...
      Recommendation:
    	Container Recommendations:
      	Container Name:  autoscale-tester
      	Lower Bound:
        	Cpu: 	500m
        	Memory:  500Mi
      	Target:
        	Cpu: 	500m
        	Memory:  628694953
      	Uncapped Target:
        	Cpu: 	49m
        	Memory:  628694953
      	Upper Bound:
        	Cpu: 	500m
        	Memory:  1553712527

Limits vs. Requests
VPA always works with the requests defined for a container, not the limits. So, the VPA recommendations are applied to the container requests, and VPA maintains the limit-to-request ratio specified in the original container configuration.

For example, if the initial container configuration defines a 100Mi memory request and a 300Mi memory limit, then when the VPA target recommendation is 150Mi of memory, the container memory request will be updated to 150Mi and the memory limit to 450Mi.

    Selective Container Scaling

If you have a pod with multiple containers and you want to opt some of them out, you can use the "Off" mode to turn off recommendations for a container.

You can also set containerName: "*" to include all containers.

    spec:
     targetRef:
       apiVersion: "apps/v1"
       kind:       Deployment
       name:       autoscale-tester
     updatePolicy:
       updateMode: "Auto"
     resourcePolicy:
       containerPolicies:
       - containerName: autoscale-tester
         minAllowed:
           cpu: "500m"
           memory: "500Mi"
         maxAllowed:
           cpu: "4"
           memory: "4Gi"
       - containerName: opt-out-container
         mode: "Off"

    Conclusion

The Horizontal Pod Autoscaler and the Vertical Pod Autoscaler serve different purposes, and one can be more useful than the other depending on your application's requirements.

The HPA can be useful when, for example, your application serves a large number of lightweight (low resource-consuming) requests. In that case, scaling the number of replicas distributes the workload across the pods. The VPA, on the other hand, can be useful when your application serves heavyweight requests that require more resources.

    Related Articles:

    1. A Practical Guide to Deploying Multi-tier Applications on Google Container Engine (GKE)

    2. Know Everything About Spinnaker & How to Deploy Using Kubernetes Engine

  • Getting Started With Kubernetes Operators (Helm Based) – Part 1

    Introduction

The concept of operators was introduced by CoreOS in the last quarter of 2016, and since the introduction of the Operator Framework last year, operators are rapidly becoming the standard way of managing applications on Kubernetes, especially those that are stateful in nature. In this blog post, we will learn what an operator is, why operators are needed, and what problems they solve. We will also create a Helm-based operator as an example.

    This is the first part of our Kubernetes Operator Series. In the second part, getting started with Kubernetes operators (Ansible based), and the third part, getting started with Kubernetes operators (Golang based), you can learn how to build Ansible and Golang based operators.

    What is an Operator?

Whenever we deploy our application on Kubernetes, we leverage multiple Kubernetes objects like deployments, services, roles, ingresses, config maps, etc. As our application gets more complex and our requirements become non-generic, managing the application only with the help of native Kubernetes objects becomes difficult, and we often need to introduce manual intervention or some other form of automation to make up for it.

Operators solve this problem by making our application a first-class Kubernetes object: we no longer deploy our application as a set of native Kubernetes objects, but as a custom object/resource of its own kind with a more domain-specific schema, and we bake the “operational intelligence” or “domain-specific knowledge” into the controller responsible for maintaining the desired state of this object. For example, the etcd operator has made etcd-cluster a first-class object, and to deploy a cluster we create an object of the EtcdCluster kind. With operators, we are able to extend Kubernetes functionality for custom use cases and manage our applications in a Kubernetes-native way, allowing us to leverage Kubernetes APIs and kubectl tooling.

Operators combine CRDs (custom resource definitions) and custom controllers, and they intend to eliminate the need for manual intervention (a human operator) while performing tasks like upgrades, failure recovery, and scaling of complex (often stateful) applications, making them more resilient and self-sufficient.

How to Build Operators?

For building and managing operators, we mostly leverage the Operator Framework, an open-source toolkit that allows us to build operators in a highly automated, scalable, and effective way. The Operator Framework comprises three subcomponents:

1. Operator SDK: The Operator SDK is the most important component of the Operator Framework. It allows us to bootstrap our operator project in minutes. It exposes higher-level APIs and abstractions, saving developers from digging deep into Kubernetes APIs so they can focus on building the operational logic. It performs common tasks, like getting the controller to watch the custom resource (CR) for changes, as part of the project setup process.
2. Operator Lifecycle Manager: Operators run on the same Kubernetes clusters in which they manage applications, and more often than not, we create multiple operators for multiple applications. The Operator Lifecycle Manager (OLM) provides a declarative way to install, upgrade, and manage all the operators and their dependencies in our cluster.
3. Operator Metering: Operator Metering is currently an alpha project. It records historical cluster usage and can generate usage reports showing breakdowns by pod or namespace over arbitrary time periods.

    Types of Operators

Currently, there are three different types of operators we can build:

1. Helm-based operators: Helm-based operators allow us to use our existing Helm charts to build operators. They are quite easy to build and are preferred for deploying stateless applications using the operator pattern.
2. Ansible-based operators: Ansible-based operators allow us to use our existing Ansible playbooks and roles to build operators. These are also easy to build and are generally preferred for stateless applications.
3. Go-based operators: Go-based operators are built to solve the most complex use cases and are generally preferred for stateful applications. In the case of a Golang-based operator, we build the controller logic ourselves, providing it with all our custom requirements. This type of operator is also relatively complex to build.

Building a Helm-based Operator

1. Let's first install the Operator SDK

    go get -d github.com/operator-framework/operator-sdk
    cd $GOPATH/src/github.com/operator-framework/operator-sdk
    git checkout master
    make dep
    make install

    Now we will have the operator-sdk binary in the $GOPATH/bin folder.      

2. Set up the project

For building a Helm-based operator, we can use an existing Helm chart. We will be using the book-store Helm chart, which deploys a simple Python app and MongoDB instances. This app allows us to perform CRUD operations via REST endpoints.

    Now we will use the operator-sdk to create our Helm based bookstore-operator project.

operator-sdk new bookstore-operator --api-version=velotio.com/v1alpha1 --kind=BookStore --type=helm \
  --helm-chart=book-store --helm-chart-repo=https://akash-gautam.github.io/helmcharts/

In the above command, bookstore-operator is the name of our operator/project, --kind specifies the kind of objects this operator will watch, and --api-version is used for versioning this object. The Operator SDK takes only this much information and creates the custom resource definition (CRD) and a custom resource (CR) of its type for us (remember the high-level abstraction the Operator SDK provides). The above command bootstraps a project with the folder structure below.

    bookstore-operator/
    |
    |- build/ # Contains the Dockerfile to build the operator image
    |- deploy/ # Contains the crd,cr and manifest files for deploying operator
    |- helm-charts/ # Contains the helm chart we used while creating the project
    |- watches.yaml # Specifies the resource the operator watches (maintains the state of)

We discussed that the operator-sdk automates setting up operator projects, and that is exactly what we can observe here. Under the build folder, we have the Dockerfile to build our operator image. Under the deploy folder, we have a crds folder containing both the CRD and the CR. The deploy folder also has the operator.yaml file, which we will use to run the operator in our cluster, along with manifest files for the role, role binding, and service account to be used while deploying the operator. We have our book-store Helm chart under helm-charts. Finally, there is the watches.yaml file:

    ---
    - version: v1alpha1
      group: velotio.com
      kind: BookStore
      chart: /opt/helm/helm-charts/book-store

    We can see that the bookstore-operator watches events related to BookStore kind objects and executes the helm chart specified.

If we take a look at the CR file (velotio_v1alpha1_bookstore_cr.yaml) under the deploy/crds folder, we can see that it looks just like the values.yaml file of our book-store Helm chart.

    apiVersion: velotio.com/v1alpha1
    kind: BookStore
    metadata:
      name: example-bookstore
    spec:
      # Default values copied from <project_dir>/helm-charts/book-store/values.yaml
      
      # Default values for book-store.
      # This is a YAML-formatted file.
      # Declare variables to be passed into your templates.
      
      replicaCount: 1
      
      image:
        app:
          repository: akash125/pyapp
          tag: latest
          pullPolicy: IfNotPresent
        mongodb:
          repository: mongo
          tag: latest
          pullPolicy: IfNotPresent
          
      service:
        app:
          type: LoadBalancer
          port: 80
          targetPort: 3000
        mongodb:
          type: ClusterIP
          port: 27017
          targetPort: 27017
      
      
      resources: {}
        # We usually recommend not to specify default resources and to leave this as a conscious
        # choice for the user. This also increases chances charts run on environments with little
        # resources, such as Minikube. If you do want to specify resources, uncomment the following
        # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
        # limits:
        #  cpu: 100m
        #  memory: 128Mi
        # requests:
        #  cpu: 100m
        #  memory: 128Mi
      
      nodeSelector: {}
      
      tolerations: []
      
      affinity: {}

In the case of Helm charts, we use the values.yaml file to pass parameters to our Helm releases; a Helm-based operator converts all these configurable parameters into the spec of our custom resource. This allows us to express the values.yaml as a custom resource (CR), which, as a native Kubernetes object, gets the benefits of RBAC and an audit trail. Now, when we want to update our deployed app, we can simply modify the CR and apply it, and the operator will ensure that the changes we made are reflected in our app.

For each object of `BookStore` kind, the bookstore-operator will perform the following actions:

1. Create the bookstore app deployment if it doesn’t exist.
2. Create the bookstore app service if it doesn’t exist.
3. Create the mongodb deployment if it doesn’t exist.
4. Create the mongodb service if it doesn’t exist.
5. Ensure the deployments and services match their desired configurations, like the replica count, image tag, service port, etc.

    3. Build the Bookstore-operator Image

The Dockerfile for building the operator image is already in our build folder; we need to run the command below from the root folder of our operator project to build the image.

    operator-sdk build akash125/bookstore-operator:v0.0.1

    4. Run the Bookstore-operator

As we have our operator image ready, we can now go ahead and run it. The deployment file for the operator (operator.yaml under the deploy folder) was created as part of our project setup; we just need to set the image for this deployment to the one we built in the previous step.

After updating the image in operator.yaml, we are ready to deploy the operator.

    kubectl create -f deploy/service_account.yaml
    kubectl create -f deploy/role.yaml
    kubectl create -f deploy/role_binding.yaml
    kubectl create -f deploy/operator.yaml

Note: The role created might have more permissions than actually required for the operator, so it is always a good idea to review it and trim down the permissions in production setups.

Verify that the operator pod is in the Running state.

    5. Deploy the Bookstore App

Now that we have the bookstore-operator running in our cluster, we just need to create the custom resource to deploy our bookstore app.

Before we can create the bookstore CR, we need to register its CRD.

    kubectl apply -f deploy/crds/velotio_v1alpha1_bookstore_crd.yaml
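
For reference, the registered CRD looks roughly like this (a sketch reconstructed from the group, version, and kind we passed to the SDK; the generated file may differ in detail):

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: bookstores.velotio.com
spec:
  group: velotio.com
  names:
    kind: BookStore
    listKind: BookStoreList
    plural: bookstores
    singular: bookstore
  scope: Namespaced
  version: v1alpha1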

    Now we can create the bookstore object.

    kubectl apply -f deploy/crds/velotio_v1alpha1_bookstore_cr.yaml

Now we can see that our operator has deployed our book-store app.
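
You can verify the created resources with kubectl (the deployment and service names come from the chart and may differ in your setup):

kubectl get bookstores
kubectl get deployments,services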

    Now let’s grab the external IP of the app and make some requests to store details of books.

    Let’s hit the external IP on the browser and see if it lists the books we just stored:

    The bookstore operator build is available here.

    Conclusion

Since its early days, Kubernetes has been considered a great tool for managing stateless applications, but managing stateful applications on Kubernetes was always considered difficult. Operators are a big leap toward managing stateful applications and other complex, distributed, multi (poly) cloud workloads with the same ease with which we manage stateless applications. In this blog post, we learned the basics of Kubernetes operators and built a simple Helm-based operator. In the next installment of this blog series, we will build an Ansible-based Kubernetes operator, and then in the last blog, we will build a full-fledged Golang-based operator for managing stateful workloads.


  • Building an Intelligent Chatbot Using Botkit and Rasa NLU

    Introduction

Bots are the flavor of the season. Every day, we hear about a new bot being launched, catering to domains like travel, social, legal, support, and sales. Facebook Messenger alone had more than 11,000 bots when I last checked, and it has probably added thousands more as I write this article.

The first generation of bots were dumb, since they could understand only a limited set of queries based on keywords in the conversation. But the commoditization of NLP (Natural Language Processing) and machine learning by services like Wit.ai, API.ai, Luis.ai, Amazon Lex, IBM Watson, etc. has resulted in the growth of intelligent bots like DoNotPay and chatShopper. I don’t know if bots are just hype or the real deal. But I can say with certainty that building a bot is fun and challenging at the same time. In this article, I would like to introduce you to some of the tools used to build an intelligent chatbot.

The title of this blog makes it clear that we have used Botkit and Rasa NLU to build our bot. Before getting into the technicalities, I would like to share the reasons for choosing these two platforms and how they fit our use case. Also read: How to build a serverless chatbot with Amazon Lex.

Bot development framework — Howdy's Botkit and the Microsoft (MS) Bot Framework were good contenders for this. Both frameworks:
    – are open source
– have integrations with popular messaging platforms like Slack, Facebook Messenger, Twilio, etc.
    – have good documentation
    – have an active developer community

    Due to compliance issues, we had chosen AWS to deploy all our services and we wanted the same with the bot as well.

NLU (Natural Language Understanding) — API.ai (acquired by Google) and Wit.ai (acquired by Facebook) are two popular NLU tools in the bot industry, which we first considered for this task. Both solutions:
    – are hosted as a cloud service
    – have Nodejs, Python SDK and a REST interface
    – have good documentation
– support state or contextual intents, which makes it very easy to build a conversational platform on top of them.

As stated before, we couldn’t use any of these hosted solutions due to compliance, and that is when we came across an open-source NLU called Rasa, which was a perfect replacement for API.ai and Wit.ai; at the same time, we could host and manage it on AWS.

You might now be wondering why I used the term NLU for API.ai and Wit.ai and not NLP (Natural Language Processing).
* NLP refers to all the systems which handle interactions with humans in the way humans find natural. It means that we can converse with a system just the way we talk to other human beings. 
* NLU is a subfield of NLP which handles the narrow but complex challenge of converting unstructured inputs into a structured form which a machine can understand and act upon. So when you say “Book a hotel for me in San Francisco on 20th April 2017”, the bot uses NLU to extract
date=20th April 2017, location=San Francisco, and action=book hotel,
which the system can understand.

    RASA NLU

In this section, I would like to explain Rasa in detail, along with some terms used in NLP that you should be familiar with.
* Intent: This tells us what the user would like to do.
Ex: raise a complaint, request a refund, etc.

* Entities: These are the attributes which give details about the user’s task. Ex: complaint regarding service disruptions, refund cost, etc.

* Confidence Score: This is a distance metric which indicates how confidently the NLU could classify the input into one of the intents.

Here is an example to help you understand the above-mentioned terms:
Input: “My internet isn’t working since morning”.
– intent: “service_interruption”
– entities: “service=internet”, “duration=morning”
– confidence score: 0.84 (this could vary based on your training)

The NLU’s job (Rasa’s, in our case) is to accept a sentence/statement and give us the intent, entities, and a confidence score which can be used by our bot. Rasa basically provides a high-level API over various NLP and ML libraries that do intent classification and entity extraction. These NLP and ML libraries are called backends in Rasa, and they are what bring the intelligence to it. Here are some of the backends used with Rasa:

• MITIE — This is an all-inclusive library, meaning that it has an NLP library for entity extraction as well as an ML library for intent classification built into it.
• spaCy + sklearn — spaCy is an NLP library which only does entity extraction. sklearn is used with spaCy to add ML capabilities for intent classification.
• MITIE + sklearn — This uses the best of both worlds: the good entity recognition available in MITIE along with the fast and accurate intent classification in sklearn.

I have used the MITIE backend to train Rasa. For the demo, I’ve built a “Live Support ChatBot” which is trained for messages like these:
    * My phone isn’t working.
    * My phone isn’t turning on.
    * My phone crashed and isn’t working anymore.

    My training data looks like this:

    {
      "rasa_nlu_data": {
        "common_examples": [
        {
            "text": "hi",
            "intent": "greet",
            "entities": []
          },
          {
            "text": "my phone isn't turning on.",
            "intent": "device_failure",
            "entities": [
              {
                "start": 3,
                "end": 8,
                "value": "phone",
                "entity": "device"
              }
            ]
          },
          {
            "text": "my phone is not working.",
            "intent": "device_failure",
            "entities": [
              {
                "start": 3,
                "end": 8,
                "value": "phone",
                "entity": "device"
              }
            ]
          },
          {
            "text": "My phone crashed and isn’t working anymore.",
            "intent": "device_failure",
            "entities": [
              {
                "start": 3,
                "end": 8,
                "value": "phone",
                "entity": "device"
              }
            ]
          }
        ]
      }
    }
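
To sketch the rest of the workflow: with a 2017-era Rasa NLU install, training and serving the model typically looked like the commands below (the config file contents and names are assumptions; check the Rasa docs for your version):

# config.json (assumed): {"pipeline": "mitie", "path": "models", "data": "training_data.json"}
python -m rasa_nlu.train -c config.json       # train a model from the examples above
python -m rasa_nlu.server -c config.json      # serve it over HTTP (port 5000 by default)
curl 'http://localhost:5000/parse?q=my+phone+crashed'   # returns intent, entities, and confidence

Note that the Botkit middleware configuration shown later points at this same endpoint (http://localhost:5000).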

NOTE — We have observed that MITIE gives better accuracy than spaCy + sklearn for a small training set, but as you keep adding more intents, training on MITIE gets slower and slower. For a training set of 200+ examples with about 10–15 intents, MITIE takes about 35–45 minutes to train on a c4.4xlarge instance (16 cores, 30 GB RAM) on AWS.

    Botkit-Rasa Integration

    Botkit is an open source bot development framework designed by the creators of Howdy. It basically provides a set of tools for building bots on Facebook Messenger, Slack, Twilio, Kik and other popular platforms. They have also come up with an IDE for bot development called Botkit Studio. To summarize, Botkit is a tool which allows us to write the bot once and deploy it on multiple messaging platforms.

Botkit also has support for middleware, which can be used to extend its functionality. Integrations with databases, CRMs, NLUs, and statistical tools are provided via middleware, which makes the framework extensible. This design also allows us to easily add integrations with other tools and software by just writing middleware modules for them.

I’ve integrated Slack and Botkit for this demo. You can use this boilerplate template to set up Botkit for Slack. We have extended the Botkit-Rasa middleware, which you can find here.

The Botkit-Rasa middleware has two functions, receive and hears, which override the default Botkit behavior.
1. receive — This function is invoked when Botkit receives a message. It sends the user’s message to Rasa and stores the returned intent and entities in the Botkit message object.

2. hears — This function overrides the default Botkit hears method, i.e., controller.hears. The default hears method uses regexes to search for the given patterns in the user’s message, while the hears method from the Botkit-Rasa middleware matches on the intent.

    let Botkit = require('botkit');
    let rasa = require('./Middleware/rasa')({rasa_uri: 'http://localhost:5000'});
    
    let controller = Botkit.slackbot({
      clientId: process.env.clientId,
      clientSecret: process.env.clientSecret,
      scopes: ['bot'],
      json_file_store: __dirname + '/.db/'
    });
    
    // Override receive method in botkit
    controller.middleware.receive.use(rasa.receive);
    
    // Override hears method in botkit
    controller.changeEars(function (patterns, message) {
      return rasa.hears(patterns, message);
    });
    
    controller.setupWebserver(3000, function (err, webserver) {
      // Configure a route to receive webhooks from slack
      controller.createWebhookEndpoints(webserver);
    });

Let’s try an example — “my phone is not turning on”.
Rasa will return the following:
1. Intent — device_failure
2. Entities — device=phone

If you notice carefully, the input I gave, i.e., “my phone is not turning on”, is not present in my training file. Rasa has some intelligence built into it to identify the intent and entities correctly for such variations.

We need to add a hears handler listening for the intent “device_failure” to process this input. Remember that the intent and entities returned by Rasa are stored in the message object by the Botkit-Rasa middleware.

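A minimal sketch of such a handler (the reply text is illustrative, and the exact fields the middleware stores on the message object are an assumption):

// Reply when Rasa classifies the message with the "device_failure" intent.
// With the Botkit-Rasa middleware installed, the patterns passed to hears()
// are matched against the intent returned by Rasa instead of being treated as regexes.
controller.hears(['device_failure'], 'direct_message,direct_mention,mention', function (bot, message) {
  bot.reply(message, 'Sorry to hear about your device. A support agent will reach out to you shortly.');
});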

You should be able to run this bot with Slack and see the output as shown below (support_bot is the name of my bot).

    Conclusion

You are now familiar with the process of building chatbots with a bot development framework and an NLU. I hope this helps you get started on your bot quickly. If you have any suggestions, questions, or feedback, tweet me @harjun1601. Keep following our blog for more articles on bot development, ML, and AI.

• Benefits of Using Chatbots: How Companies Are Using Them to Their Advantage

Bots are the new black! The entire tech industry seems to be buzzing with “bot” fever. My co-founders and I often see a “bot” company and discuss its business model. Chirag Jog has always been enthusiastic about the bot wave, while I have been mostly pessimistic, especially about B2C bots. We should consider that there are many types of “bots” — chat bots, voice bots, AI assistants, robotic process automation (RPA) bots, conversational agents within apps or websites, etc.

Over the last year, we have been building some interesting chat- and voice-based bots, which has given me some interesting insights. I hope to lay down my thoughts on bots in some detail and with some structure.

    What are bots?

Bots are software programs which automate tasks that humans would otherwise do themselves. Bots are developed using machine learning software and are expected to aggregate data to make the interface more intelligent and intuitive. There have always been simple rule-based bots which provide a very specific service with low utility. In the last couple of years, we have seen the emergence of intelligent bots that can serve more complex use-cases.

    Why now?

Machine learning, NLP, and AI technologies have matured, enabling practical applications where bots can actually do intelligent work >75% of the time. Has general AI been solved? No. But is it good enough to do the simple things well and give hope for more complex things? Yes.

Secondly, there are billions of DAUs on WhatsApp & Facebook Messenger. There are tens of millions of users on enterprise messaging platforms like Slack, Skype & Microsoft Teams. Startups and enterprises want to use this distribution channel and will continue to experiment aggressively to find relevant use-cases. Millennials are very comfortable using chat and voice interfaces for a broad variety of use-cases, since they have used chat services ever since they came online. As millennials become a growing part of the workforce, the adoption of bots may increase.

    Thirdly, software is becoming more prevalent and more complex. Data is exploding and making sense of this data is getting harder and requiring more skill. Companies are experimenting with bots to provide an “easy to consume” interface to casual users. So non-experts can use the bot interface while experts can use the mobile or web application for the complex workflows. This is mostly true for B2B & enterprise. A good example is how Slack has become the system of engagement for many companies (including at @velotiotech). We require all the software we use (Gitlab, Asana, Jira, Google Docs, Zoho, Marketo, Zendesk, etc.) to provide notifications into Slack. Over time, we expect to start querying the respective Slack bots for information. Only domain experts will log into the actual SaaS applications.

    Types of Bots

    B2C Chat-Bots

    Consumer focused bots use popular messaging and social platforms like Facebook, Telegram, Kik, WeChat, etc. Some examples of consumer bots include weather, e-commerce, travel bookings, personal finance, fitness, news. These are mostly inspired by WeChat which owns the China market and is the default gateway to various internet services. These bots show up as “contacts” in these messenger platforms.

    Strategically, the B2C bots are basically trying to get around the distribution monopoly of Apple & Google Android. As many studies have indicated, getting mobile users to install apps is getting extremely hard. Facebook, Skype, Telegram hope to become the system of engagement and distribution for various apps thereby becoming an alternate “App Store” or “Bot Store”.

I believe that SMS is a great channel for basic chatbot functionality. Chatbots with an SMS interface can be used by all age groups and in remote parts of the world where data infrastructure is lacking. I do expect to see some interesting companies use SMS chatbots to build new business models. Also, mobile bots that sniff or integrate with as many of your mobile apps as possible to provide cross-platform and cross-app “intelligence” will succeed — Google Now is a good example.

An often-cited example is the DoNotPay chatbot, which helps people contest parking tickets in the UK. In my opinion, the novelty is in the service and its efficiency, not in the chatbot interface as such. Also, I have not met anyone who uses a B2C chatbot even on a weekly or monthly basis.

    B2B Bots

    Enterprise bots are available through platforms and interfaces like Slack, Skype, Microsoft Teams, website chat windows, email assistants, etc. They are focused on collaboration, replacing/augmenting emails, information assistants, support, and speeding up decision-making/communications.

    Most of the enterprise bots solve niche and specific problems. This is a great advantage considering the current state of AI/ML technologies. Many of these enterprise bot companies are also able to augment their intelligence with human agents thereby providing better experiences to users.

    Some of the interesting bots and services in the enterprise space include:

    • x.ai and Clara Labs which provide a virtual assistant to help you setup and manage your meetings.
    • Gong.io and Chorus provide a bot that listens in on sales calls and uses voice-to-text and other machine learning algorithms to help your sales teams get better and close more deals.
    • Astro is building an AI assisted email app which will have multiple interfaces including voice (Echo).
• Twyla is helping to make chatbots on websites more intelligent using ML. It integrates with your existing ZenDesk, LivePerson, or Salesforce support.
    • Clarke.ai is a bot which uses AI to take notes for your meeting so you can focus better.
    • Smacc provides AI assisted automated book-keeping for SMBs.
• Slack is one of the fastest growing SaaS companies and has the most popular enterprise bot store. Slack bots are great for pushing and pulling information & data. All SaaS services and apps should have bots that can emit useful updates, charts, data, links, etc. to a specific set of users. This is much better than sending emails to an email group. Simple decisions can be taken within a chat interface using something like Slack Buttons. Instead of receiving an email and opening a web page, most people would prefer approving a leave or an expense right within Slack. Slack, Skype, and the like will add the ability to embed “cards” or “webviews” or “interactive sections” within chats. This will enable some more complex use-cases to be served via bots. Most enterprise services have Slack bots and are allowing Slack to be a basic system of engagement.
    • Chatbots or even voice-based bots on websites will be a big deal. Imagine that each website has a virtual support rep or a sales rep available to you 24×7 in most popular languages. All business would want such “agents” or “bots” for greater sales conversions and better support.
• Automation of backoffice tasks can be a HUGE business. KPOs & BPOs are a huge market, so if you can build software or software-enabled processes to reduce costs, then you can build a significant-sized company. Some interesting examples here are Automation Anywhere and WorkFusion.

    Voice based Bots

Amazon had a surprise hit in the consumer electronics space with their Amazon Echo device, a voice-based assistant. Google recently released their own voice-enabled devices to compete with Echo/Alexa. Voice assistants provide weather, music, searches, and e-commerce ordering via an NLP voice interface. Apple’s Siri should have been leading this market, but as usual, Apple is following rather than leading.

Voice bots have one great advantage: with the miniaturization of devices (Apple Watch, Earpods, smaller wearables), the only practical interface is voice. The other option is pairing the device with your mobile phone, which is not a smooth and intuitive process. Echo is already a great device for listening to music with its Spotify integration — just this feature is enough of a reason for most families to buy it.

    Conclusion

    Bots are useful and here to stay. I am not sure about the form or the distribution channel through which bots will become prevalent. In my opinion, bots are an additional interface to intelligence and application workflows. They are not disrupting any process or industry. Consumers will not shop more due to chat or voice interface bots, employees will not collaborate as desired due to bots, information discovery within your company will not improve due to bots. Actually, existing software and SaaS services are getting more intelligent, predictive and prescriptive. So this move towards “intelligent interfaces” is the real disruption.

    So my concluding predictions:

    • B2C chatbots will turn out to be mostly hype and very few practical scalable use-cases will emerge.
    • Voice bots will see increasing adoption due to smaller device sizes. IoT, wearables and music are excellent use-cases for voice based interfaces. Amazon’s Alexa will become the dominant platform for voice controlled apps and devices. Google and Microsoft will invest aggressively to take on Alexa.
    • B2B bots can be intelligent interfaces on software platforms and SaaS products. Or they can be agents that solve very specific vertical use-cases. I am most bullish about these enterprise focused bots which are helping enterprises become more productive or to increase efficiency with intelligent assistants for specific job functions.

    If you’d like to chat about anything related to this article, what tools we use to build bots, or anything else, get in touch.

    PS: Velotio is helping to bring the power of machine learning to enterprise software businesses. Click here to learn more about Velotio.

  • Building Google Photos Alternative Using AWS Serverless

    Being an avid Google Photos user, I really love some of its features, such as album, face search, and unlimited storage. However, when Google announced the end of unlimited storage on June 1st, 2021, I started thinking about how I could create a cheaper solution that would meet my photo backup requirement.

    “Taking an image, freezing a moment, reveals how rich reality truly is.”

    – Anonymous

    Google offers 100 GB of storage for 130 INR. This storage can be used across various Google applications. However, I don’t use all the space in one go. I snap photos randomly; sometimes, I visit places and take random snaps with my DSLR and smartphone. So, in general, I upload approximately 200 photos monthly, varying in size from 4 MB to 30 MB. On average, I may be using 4 GB of monthly storage for backup, plus my external hard drive to keep raw photos, even the bad ones. Photos backed up on the cloud should be visually high-quality, and it’s good to have a raw copy available at the same time, so that you may do some Lightroom changes (although I never touch them 😛). So, here is my minimal requirement:

    • Should support social authentication (Google sign-in preferred).
    • Photos should be stored securely in raw format.
    • Storage should be scaled with usage.
    • Uploading and downloading photos should be easy.
    • Web view for preview would be a plus.
    • Should have almost no operations headache and solution should be as cheap as possible 😉.

    Selecting Tech Stack

    To avoid the operational headaches of servers going down, scaling, application crashes, and overall monitoring, I opted for a serverless solution on AWS. AWS S3 is infinitely scalable storage, and you only pay for the storage you use. On top of that, you can opt for an S3 storage class that is efficient and cost-effective.

    – Infrastructure Stack

    1. AWS API Gateway (HTTP API)
    2. AWS Lambda (for processing images and API gateway queries)
    3. DynamoDB (for storing image metadata)
    4. AWS Cognito (for authentication)
    5. AWS S3 Bucket (for storage and web application hosting)
    6. AWS Certificate Manager (to use an SSL certificate for a custom domain with the API gateway)

    – Software Stack

    1. Node.js
    2. ReactJS and Material-UI (front-end framework and UI)
    3. AWS Amplify (for simplifying the auth flow with Cognito)
    4. Sharp (high-speed Node.js library for converting images)
    5. Express and serverless-http
    6. Infinite Scroller (for gallery view)
    7. Serverless Framework (for ease of deployment and Infrastructure as Code)

    Create S3 Buckets:

    We will create three S3 buckets. The first one hosts the frontend application (refer to the architecture diagram; more on this later in the build and hosting part). The second one is for temporarily uploading images. The third one is for actual backup and storage (enable server-side encryption on this bucket). Uploaded images are pre-processed from the temporary bucket.

    During pre-processing, we will resize the original image into two different sizes: one for thumbnail purposes (400px width), and another for viewing purposes but with reduced quality (webp format). Once the images are resized, we upload all three versions (raw, thumbnail, and webview) to the third S3 bucket and create a record in DynamoDB. Set up an object expiry policy of 1 day on the temporary bucket so that uploaded objects are automatically deleted from it.
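
    A minimal sketch of that resize step, assuming the AWS SDK v2 and the Sharp library are available in the Lambda (the bucket names, key layout, and quality settings here are illustrative, not the exact code from the repository):

    const AWS = require("aws-sdk");
    const sharp = require("sharp");
    const s3 = new AWS.S3();

    // "key" is the object key from the S3 event, e.g. "<identity-id>/IMG_0042.jpg"
    async function processImage(tempBucket, photoBucket, key) {
      // download the raw upload from the temporary bucket
      const raw = (await s3.getObject({ Bucket: tempBucket, Key: key }).promise()).Body;

      // 400px-wide thumbnail for the gallery grid
      const thumbnail = await sharp(raw).resize({ width: 400 }).toBuffer();
      // reduced-quality webp for the lightbox view
      const webview = await sharp(raw).webp({ quality: 75 }).toBuffer();

      // upload raw, thumbnail, and webview to the photo bucket
      await Promise.all([
        s3.putObject({ Bucket: photoBucket, Key: key, Body: raw }).promise(),
        s3.putObject({ Bucket: photoBucket, Key: `thumbnails/${key}`, Body: thumbnail }).promise(),
        s3.putObject({ Bucket: photoBucket, Key: `webview/${key}.webp`, Body: webview }).promise(),
      ]);
    }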

    Setup trigger on the temporary bucket for uploaded images:

    We will need to set up an S3 PUT event, which will trigger our Lambda function to download and process images. We will filter on the suffixes .jpg (and .jpeg) for the event trigger, meaning that any file with the extension .jpg or .jpeg uploaded to our temporary bucket will automatically invoke the Lambda function with an event payload. Using that payload, the Lambda function will download the uploaded file and process it. Your serverless function definition would look like:

    functions:
     lambda:
       handler: index.handler
       memorySize: 512
       timeout: 60
       layers:
         - {Ref: PhotoParserLibsLambdaLayer}
       events:
         - s3:
             bucket: your-temporary-bucket-name
             event: s3:ObjectCreated:*
             rules:
               - suffix: .jpg
             existing: true
         - s3:
             bucket: your-temporary-bucket-name
             event: s3:ObjectCreated:*
             rules:
               - suffix: .jpeg
             existing: true

    Notice that in the YAML events section, we set “existing: true”. This ensures that the bucket will not be created during the serverless deployment. However, if you prefer not to create your S3 bucket manually, you can let the framework create it for you.

    DynamoDB as Metadata DB:

    AWS DynamoDB is a key-value document DB that suits our use case. DynamoDB will help us retrieve the list of photos as a time series. DynamoDB uses a primary key to uniquely identify each record. A primary key is composed of a hash key and an optional range key (also called a sort key). We will use the federated identity ID (discussed in the authorization setup) as the hash key (partition key), defining the attribute with the name username and type string. We will use a timestamp attribute as the range key, with type number. The range key will let us query results in time order (Unix epoch). We could also use DynamoDB secondary indexes to sort results more specifically, but to keep the application simple, we’re going to opt out of this feature for now. Your serverless resource definition would look like:

    resources:
     Resources:
       MetaDataDB:
         Type: AWS::DynamoDB::Table
         Properties:
           TableName: your-dynamodb-table-name
           AttributeDefinitions:
             - AttributeName: username
               AttributeType: S
             - AttributeName: timestamp
               AttributeType: N
           KeySchema:
             - AttributeName: username
               KeyType: HASH
             - AttributeName: timestamp
               KeyType: RANGE
           BillingMode: PAY_PER_REQUEST

    Finally, you also need to set up an IAM role so that the image-processing Lambda function has access to the S3 buckets and DynamoDB. Here is the serverless definition for the IAM role.

    # you can add statements to the Lambda function's IAM Role here
     iam:
       role:
         statements:
         - Effect: "Allow"
           Action:
             - "s3:ListBucket"
           Resource:
             - arn:aws:s3:::your-temporary-bucket-name
             - arn:aws:s3:::your-actual-photo-bucket-name
         - Effect: "Allow"
           Action:
             - "s3:GetObject"
             - "s3:DeleteObject"
           Resource: arn:aws:s3:::your-temporary-bucket-name/*
         - Effect: "Allow"
           Action:
             - "s3:PutObject"
           Resource: arn:aws:s3:::your-actual-photo-bucket-name/*
         - Effect: "Allow"
           Action:
             - "dynamodb:PutItem"
           Resource:
             - Fn::GetAtt: [ MetaDataDB, Arn ]

    Setup Authentication:

    Okay, to set up a Cognito user pool, head to the Cognito console and create a user pool with the config below:

    1. Pool Name: photobucket-users

    2. How do you want your end-users to sign in?

    • Select: Email Address or Phone Number
    • Select: Allow Email Addresses
    • Check: (Recommended) Enable case insensitivity for username input

    3. Which standard attributes are required?

    • email

    4. Keep the defaults for “Policies”

    5. MFA and Verification:

    • I opted to manually reset the password for each user (since this is an internal app)
    • Disabled user verification

    6. Keep the default for Message Customizations, tags, and devices.

    7. App Clients :

    • App client name: myappclient
    • Let the refresh token, access token, and id token be default
    • Check all “Auth flow configurations”
    • Check enable token revocation

    8. Skip Triggers

    9. Review and create the pool

    Once created, go to App Integration -> Domain Name. Create a Cognito subdomain of your choice and note it down. Next, I plan to use the Google sign-in feature with Cognito federated identity providers. Use this guide to set up a Google social identity with Cognito.

    Setup Authorization:

    Once the user identity is verified, we need to allow them to access the s3 bucket with limited permissions. Head to the Cognito console, select federated identities, and create a new identity pool. Follow these steps to configure:

    1. Identity pool name: photobucket_auth

    2. Keep Unauthenticated and Authentication flow settings unchecked.

    3. Authentication providers:

    • User Pool ID: Enter the user pool ID obtained during the authentication setup
    • App Client ID: Enter the app client ID generated during the authentication setup (Cognito user pool -> App Clients -> App client ID)

    4. Setup permissions:

    • Expand view details (Role Summary)
    • For authenticated identities: edit policy document and use the below JSON policy and skip unauthenticated identities with the default configuration.
    {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": [
                   "mobileanalytics:PutEvents",
                   "cognito-sync:*",
                   "cognito-identity:*"
               ],
               "Resource": [
                   "*"
               ]
           },
           {
               "Sid": "ListYourObjects",
               "Effect": "Allow",
               "Action": "s3:ListBucket",
               "Resource": [
                   "arn:aws:s3:::your-actual-photo-bucket-name"
               ],
               "Condition": {
                   "StringLike": {
                       "s3:prefix": [
                           "${cognito-identity.amazonaws.com:sub}/",
                           "${cognito-identity.amazonaws.com:sub}/*"
                       ]
                   }
               }
           },
           {
               "Sid": "ReadYourObjects",
               "Effect": "Allow",
               "Action": [
                   "s3:GetObject"
               ],
               "Resource": [
                   "arn:aws:s3:::your-actual-photo-bucket-name/${cognito-identity.amazonaws.com:sub}",
                   "arn:aws:s3:::your-actual-photo-bucket-name/${cognito-identity.amazonaws.com:sub}/*"
               ]
           }
       ]
    }

    ${cognito-identity.amazonaws.com:sub} is a special AWS policy variable. When a user authenticates with a federated identity, they are assigned a unique identity ID. The policy above means that any authenticated user has access only to objects prefixed by their own identity ID. This is how we authorize users within a limited area of the S3 bucket.

    Copy the Identity Pool ID (from sample code section). You will need this in your backend to get the identity id of the authenticated user via JWT token.
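
    For illustration, here is a minimal sketch of that exchange using the AWS SDK v2; the environment variable names are assumptions:

    const AWS = require("aws-sdk");
    const cognitoIdentity = new AWS.CognitoIdentity({ region: process.env.AWSREGION });

    async function getIdentityId(jwtToken) {
      const { IdentityId } = await cognitoIdentity.getId({
        IdentityPoolId: process.env.IDENTITY_POOL_ID,
        Logins: {
          // the key identifies the user pool that issued the token
          [`cognito-idp.${process.env.AWSREGION}.amazonaws.com/${process.env.USER_POOL_ID}`]: jwtToken,
        },
      }).promise();
      return IdentityId; // used as the S3 prefix and the DynamoDB hash key
    }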

    Amplify configuration for the frontend UI sign-in:

    This object helps you set up the minimal configuration for your application. This is all that we need to sign in via Cognito and access the S3 photo bucket.

    const awsconfig = {
       Auth : {
       identityPoolId: "identity pool id created during authorization setup",
           region : "your aws region",
           identityPoolRegion: "same as above if cognito is in same region",
           userPoolId : "cognito user pool id created during authentication setup",
           userPoolWebClientId : "cognito app client id",
           cookieStorage : {
               domain : "https://your-app-domain-name", //this is very important
               secure: true
           },
           oauth: {
               domain : "{cognito domain name}.auth.{cognito region name}.amazoncognito.com",
               scope : ["profile","email","openid"],
               redirectSignIn: 'https://your-app-domain-name',
               redirectSignOut: 'https://your-app-domain-name',
               responseType : "token"
           }
       },
       Storage: {
           AWSS3 : {
               bucket: "your-actual-bucket-name",
               region: "region-of-your-bucket"
           }
       }
    };
    export default awsconfig;

    You can then use the below code to configure and sign in via social authentication.

    import Amplify, {Auth} from 'aws-amplify';
    import awsconfig from './aws-config';
    Amplify.configure(awsconfig);
    //once the amplify is configured you can use below call with onClick event of buttons or any other visual component to sign in.
    //Example
    <Button startIcon={<img alt="Sign in With Google" src={logo} />} fullWidth variant="outlined" color="primary" onClick={() => Auth.federatedSignIn({provider: 'Google'})}>
       Sign in with Google
    </Button>

    Gallery View:

    When the application loads, we use the PhotoGallery component to load photos and show thumbnails on the page. The PhotoGallery component is a wrapper around the InfiniteScroller component, which keeps loading images as the user scrolls. The idea here is that we query a maximum of 10 images in one go. Our backend returns a list of 10 images (just the map and metadata pointing to the S3 bucket). We must load these images from the S3 bucket and then show thumbnails on-screen as a gallery view. When the user reaches the bottom of the screen or there is empty space left, the InfiniteScroller component loads 10 more images. This continues until our backend replies with a stop marker.

    The key point here is that we need to send the JWT token as a header to our backend service via an AJAX call. The JWT token is obtained after sign-in from the Amplify framework. An example of obtaining a JWT token:

    let authsession = await Auth.currentSession();
    let jwtToken = authsession.getIdToken().jwtToken;
    let photoList = await axios.get(url,{
       headers : {
           Authorization: jwtToken
       },
       responseType : "json"
    });

    An example of infinite scroller component usage is given below. Note that “gallery” is a JSX-composed array of photo thumbnails. The “loadMore” method calls our AJAX function to the server-side backend, updates the “gallery” variable, and sets the “hasMore” variable to true/false so that the infinite scroller component can stop querying when there are no photos left to display on the screen.

    <InfiniteScroll
       loadMore={this.fetchPhotos}
       hasMore={this.state.hasMore}
       loader={<div style={{padding:"70px"}} key={0}><LinearProgress color="secondary" /></div>}
    >
       <div style={{ marginTop: "80px", position: "relative", textAlign: "center" }}>
           <div className="image-grid" style={{ marginTop: "30px" }}>
               {gallery}
           </div>
           {this.state.openLightBox ?
           <LightBox src={this.state.lightBoxImg} callback={this.closeLightBox} />
           : null}
       </div>
    </InfiniteScroll>

    The Lightbox component gives a zoom effect to the thumbnail. When the thumbnail is clicked, a higher-resolution picture (the webp version) is downloaded from the S3 bucket and shown on the screen. We use the Storage object from the Amplify library. Downloaded content is a blob and must be converted into image data. To do so, we use the native JavaScript method createObjectURL. Below is the sample code that downloads the object from the S3 bucket and then converts it into a viewable image for the HTML IMG tag.

    thumbClick = (index) => {
       const urlCreator = window.URL || window.webkitURL;
       try {
           this.setState({
               openLightBox: true
           });
           Storage.get(this.state.photoList[index].src,{download: true}).then(data=>{
           let image = urlCreator.createObjectURL(data.Body);
               this.setState({
                   lightBoxImg : image
               });
           });
              
       } catch (error) {
           console.log(error);
           this.setState({
               openLightBox: false,
               lightBoxImg : null
           });
       }
    };

    Uploading Photos:

    The S3 SDK lets you generate a pre-signed POST URL. Anyone who gets this URL will be able to upload objects to the S3 bucket directly without needing credentials. Of course, we can actually set up some boundaries, like a max object size, key of the uploaded object, etc. Refer to this AWS blog for more on pre-signed URLs. Here is the sample code to generate a pre-signed URL.

    // AWS SDK v2 import added for completeness
    const { S3 } = require("aws-sdk");

    let s3Params = {
       Bucket: "your-temporary-bucket-name",
       Conditions : [
           ["content-length-range", 1, 31457280] // allow uploads from 1 byte up to 30 MB
       ],
       Fields : {
           key: "path/to/your/object"
       },
       Expires: 300 // in seconds
    };
    const s3 = new S3({region : process.env.AWSREGION });
    // returns { url, fields } to hand back to the frontend
    const presignedPost = s3.createPresignedPost(s3Params);

    For a better UX, we can allow our users to upload more than one photo at a time. However, a pre-signed URL lets you upload a single object at a time. To overcome this, we generate multiple pre-signed URLs. Initially, we send a request to our backend asking to upload photos with expected keys. This request is originated once the user selects photos to upload. Our backend then generates pre-signed URLs for us. Our frontend React app then provides the illusion that all photos are being uploaded as a whole.
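
    On the browser side, here is a hedged sketch of uploading against the { url, fields } pair returned by createPresignedPost (the function names are illustrative):

    import axios from "axios";

    // upload one file with a pre-signed POST; the policy fields must be
    // appended to the form before the file itself
    async function uploadWithPresignedPost(presigned, file) {
      const form = new FormData();
      Object.entries(presigned.fields).forEach(([name, value]) => form.append(name, value));
      form.append("file", file);
      await axios.post(presigned.url, form); // the browser sets the multipart boundary
    }

    // uploading every selected photo in parallel gives the "all at once" illusion
    const uploadAll = (presignedList, files) =>
      Promise.all(files.map((file, i) => uploadWithPresignedPost(presignedList[i], file)));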

    When the upload is successful, the S3 PUT event is triggered, which we discussed earlier. The complete flow of the application is given in a sequence diagram. You can find the complete source code here in my GitHub repository.

    React Build Steps and Hosting:

    The ideal way to build the React app is to execute npm run build. However, we take a slightly different approach. We are not using an S3 static website for serving the frontend UI, for one reason: S3 static websites are non-SSL unless we use CloudFront. Therefore, we will make the API gateway our application’s entry point, so the UI will also be served from the API gateway. However, we want to reduce the calls made to the API gateway. For this reason, we will only deliver the index.html file with the help of API Gateway/Lambda, and serve the rest of the static files (React supporting JS files) from the S3 bucket.

    Your index.html should have all the reference paths pointed to the S3 bucket. The build must explicitly specify that static files are located somewhere other than the location relative to the index.html file. Your S3 bucket needs to be public with the right bucket policy and CORS set, so that end-users can only retrieve files and not upload nasty objects. Those who are confused about how an S3 static website and an S3 public bucket differ may refer here. Below are the React build steps, bucket policy, and CORS.

    PUBLIC_URL=https://{your-static-bucket-name}.s3.{aws_region}.amazonaws.com/ npm run build
    //Bucket Policy
    {
       "Version": "2012-10-17",
       "Id": "http referer from your domain only",
       "Statement": [
           {
               "Sid": "Allow get requests originating from",
               "Effect": "Allow",
               "Principal": "*",
               "Action": "s3:GetObject",
               "Resource": "arn:aws:s3:::{your-static-bucket-name}/static/*",
               "Condition": {
                   "StringLike": {
                       "aws:Referer": [
                           "https://your-app-domain-name"
                       ]
                   }
               }
           }
       ]
    }
    //CORS
    [
       {
           "AllowedHeaders": [
               "*"
           ],
           "AllowedMethods": [
               "GET"
           ],
           "AllowedOrigins": [
               "https://your-app-domain-name"
           ],
           "ExposeHeaders": []
       }
    ]

    Once a build is complete, upload index.html to a lambda that serves your UI. Run the below shell commands to compress static contents and host them on our static S3 bucket.

    #assuming you are in your react app directory
    mkdir /tmp/s3uploads
    cp -ar build/static /tmp/s3uploads/
    cd /tmp/s3uploads
    #add gzip encoding to all the files
    gzip -9 `find ./ -type f`
    #remove .gz extension from compressed files
    for i in `find ./ -type f`
    do
       mv $i ${i%.*}
    done
    #sync your files to s3 static bucket and mention that these files are compressed with gzip encoding
    #so that browser will not treat them as regular files
    aws s3 --region $AWSREGION sync . s3://${S3_STATIC_BUCKET}/static/ --content-encoding gzip --delete --sse
    cd -
    rm -rf /tmp/s3uploads

    Our backend uses the Node.js Express framework. Since this is a serverless application, we need to wrap Express with the serverless-http framework to work with Lambda. Sample source code is given below, along with the serverless framework resource definition. Notice that, except for the UI home endpoint (“/”), the rest of the API endpoints are authenticated with Cognito on the API gateway itself.

    const serverless = require("serverless-http");
    const express = require("express");
    const app = express();
    .
    .
    .
    .
    .
    .
    app.get("/",(req,res)=> {
     res.sendFile(path.join(__dirname + "/index.html"));
    });
    module.exports.uihome = serverless(app);

    provider:
     name: aws
     runtime: nodejs12.x
     lambdaHashingVersion: 20201221
     httpApi:
       authorizers:
         cognitoJWTAuth:
           identitySource: $request.header.Authorization
           issuerUrl: https://cognito-idp.{AWS_REGION}.amazonaws.com/{COGNITO_USER_POOL_ID}
           audience:
             - COGNITO_APP_CLIENT_ID
    .
    .
    .
    .
    .
    .
    .
    functions:
     react-serve-ui:
       handler: handler.uihome
       memorySize: 256
       timeout: 29
       layers:
         - {Ref: CommonLibsLambdaLayer}
       events:
         - httpApi:
             path: /prep/photoupload
             method: post
             authorizer:
               name: cognitoJWTAuth
         - httpApi:
             path: /list/photos
             method: get
             authorizer:
               name: cognitoJWTAuth
         - httpApi:
             path: /
             method: get

    Final Steps :

    Lastly, we will set up a custom domain, so that we don’t need to use the gibberish domain name generated by the API gateway, along with a certificate for our custom domain. You don’t need to use Route 53 for this part. If you have an existing domain, you can create a subdomain and point it to the API gateway. First things first: head to the AWS ACM console and generate a certificate request for the domain name. Once the request is generated, you need to validate your domain by creating the DNS record shown in the ACM console. ACM is a free service. Domain verification may take a few minutes to several hours. Once you have the certificate ready, head back to the API gateway console. Navigate to “custom domain names” and click create.

    1. Enter your application domain name
    2. Check TLS 1.2 as TLS version
    3. Select Endpoint type as Regional
    4. Select ACM certificate from dropdown list
    5. Create domain name

    Select the newly created custom domain. Note the API gateway domain name from Domain Details -> Configuration tab. You will need this to map a CNAME/ALIAS record with your DNS provider. Click on the API mappings tab. Click configure API mappings. From the dropdown, select your API gateway, select stage as default, and click save. You are done here.

    Future Scope and Improvements :

    To improve application latency, we can use CloudFront as a CDN. This way, our entry point could be S3, and we would no longer need the API gateway regional endpoint. We can also add AWS WAF as added security in front of our API gateway to inspect incoming requests and payloads. We can use DynamoDB secondary indexes to search metadata in the table more efficiently. We can also add a lifecycle rule that transitions raw photos older than a year to the S3 Glacier storage class, and further add a Glacier Deep Archive transition to save even more on storage costs.
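
    Here is a hedged sketch of such a lifecycle rule with the AWS SDK v2, assuming raw photos carry a raw/ key prefix (adapt the prefix and rule ID to your actual key layout):

    const AWS = require("aws-sdk");
    const s3 = new AWS.S3();

    async function addArchiveRule() {
      await s3.putBucketLifecycleConfiguration({
        Bucket: "your-actual-photo-bucket-name",
        LifecycleConfiguration: {
          Rules: [{
            ID: "archive-raw-photos",
            Filter: { Prefix: "raw/" },
            Status: "Enabled",
            // move to Glacier after a year; a second entry with
            // StorageClass: "DEEP_ARCHIVE" would cut costs further
            Transitions: [{ Days: 365, StorageClass: "GLACIER" }],
          }],
        },
      }).promise();
    }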

  • Building High-performance Apps: A Checklist To Get It Right

    An app is only as good as the problem it solves. But your app’s performance can be extremely critical to its success as well. A slow-loading web app can make users quit and try out an alternative in no time. Testing an app’s performance should thus be an integral part of your development process and not an afterthought.

    In this article, we will talk about how you can proactively monitor and boost your app’s performance as well as fix common issues that are slowing down the performance of your app.

    I’ll use the following tools for this blog.

    • Lighthouse – A performance audit tool, developed by Google
    • Webpack – A JavaScript bundler

    You can find similar tools online, both free and paid. So let’s give our Vue a new Angular perspective to make our apps React faster.

    Performance Metrics

    First, we need to understand which metrics play an important role in determining an app’s performance. Lighthouse helps us calculate a score based on a weighted average of the below metrics:

    1. First Contentful Paint (FCP) – 15%
    2. Speed Index (SI) – 15%
    3. Largest Contentful Paint (LCP) – 25%
    4. Time to Interactive (TTI) – 15%
    5. Total Blocking Time (TBT) – 25%
    6. Cumulative Layout Shift (CLS) – 5%

    By taking the above stats into account, Lighthouse gauges your app’s performance as such:

    • 0 to 49 (slow): Red
    • 50 to 89 (moderate): Orange
    • 90 to 100 (fast): Green

    I would recommend going through Lighthouse performance scoring to learn more. Once you understand Lighthouse, you can audit websites of your choosing.

    I gathered audit scores for a few websites, including Walmart, Zomato, Reddit, and British Airways. Almost all of them had a performance score below 30. A few even scored in the single digits.

    To attract more customers, businesses fill their apps with many attractive features. But they ignore the most important thing: performance, which degrades with the addition of each such feature.

    As I said earlier, it’s all about the user experience. You can read more about why performance matters and how it impacts the overall experience.

    Now, with that being said, I want to challenge you to conduct a performance test on your favorite app. Let me know if it receives a good score. If not, then don’t feel bad.

    Follow along with me. 

    Let’s get your app fixed!

    Exploring Opportunities

    If you’re still reading this blog, I expect that your app received a low score, or maybe you’re just curious.

    Whatever the reason, let’s get started.

    Below your scores are the possible opportunities suggested by Lighthouse. Fixing these affects the performance metrics above and eventually boosts your app’s performance. So let’s check them out one-by-one.

    Here are all the possible opportunities listed by Lighthouse:

    1. Eliminate render-blocking resources
    2. Properly size images
    3. Defer offscreen images
    4. Minify CSS & JavaScript
    5. Serve images in the next-gen formats
    6. Enable text compression
    7. Preconnect to required origins
    8. Avoid multiple page redirects
    9. Use video formats for animated content

    A few other opportunities won’t be covered in this blog, but they are just an extension of the above points. Feel free to read them under the further reading section.

    Eliminate Render-blocking Resources

    This section lists all the render-blocking resources. The main goal is to reduce their impact by:

    • removing unnecessary resources,
    • deferring non-critical resources, and
    • in-lining critical resources.

    To do that, we need to understand what a render-blocking resource is.

    Render-blocking resources and how to identify them

    As the name suggests, it’s a resource that prevents a browser from rendering processed content. Lighthouse identifies the following as render-blocking resources:

    • A <script></script> tag in <head></head> that doesn’t have a defer or async attribute
    • A <link rel="stylesheet"> tag that doesn’t have a media attribute matching the user’s device, or a disabled attribute hinting the browser not to download it when unnecessary
    • A <link rel="import"> that doesn’t have an async attribute

    To reduce the impact, you need to identify what’s critical and what’s not. You can read how to identify critical resources using the Chrome dev tool.

    Classify Resources

    Classify resources as critical and non-critical based on the following color code:

    • Green (critical): Needed for the first paint.
    • Red (non-critical): Not needed for the first paint but will be needed later.

    Solution

    Now, to eliminate render-blocking resources:

    Extract the critical part into an inline resource and add the correct attributes to the non-critical resources. These attributes will indicate to the browser what to download asynchronously. This can be done manually or by using a JS bundler.

    Webpack users can use the libraries below to do it in a few easy steps:

    • For extracting critical CSS, you can use html-critical-webpack-plugin or critters-webpack-plugin. It’ll generate an inline <style></style> tag in the <head></head> with the critical CSS stripped out of the main CSS chunk, and preload the main file
    • For extracting CSS depending on media queries, use media-query-splitting-plugin or media-query-plugin
    • The first paint doesn’t need to depend on the JavaScript files. Use lazy loading and code splitting to download resources only when the browser requests them. The magic comments in lazy loading make it easy
    • And finally, for the main chunk, vendor chunk, or any other external scripts (included in index.html), you can defer them using script-ext-html-webpack-plugin (see the config sketch below)

    There are many more libraries for inlining CSS and deferring external scripts. Feel free to use as per the use case.
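
    For illustration, here is a hedged webpack config sketch combining two of the plugins above; the options shown are the plugins’ documented basics, not a drop-in config:

    const HtmlWebpackPlugin = require("html-webpack-plugin");
    const ScriptExtHtmlWebpackPlugin = require("script-ext-html-webpack-plugin");
    const Critters = require("critters-webpack-plugin");

    module.exports = {
      // ...entry, output, and loaders go here...
      plugins: [
        new HtmlWebpackPlugin(),
        // add defer to every injected <script> so it no longer blocks rendering
        new ScriptExtHtmlWebpackPlugin({ defaultAttribute: "defer" }),
        // inline the above-the-fold CSS and lazy-load the rest
        new Critters(),
      ],
    };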

    Use Properly Sized Images

    This section lists all the images used in a page that aren’t properly sized, along with the stats on potential savings for each image.

    How Does Lighthouse Calculate Oversized Images? 

    Lighthouse calculates potential savings by comparing the rendered size of each image on the page with its actual size. The rendered size varies based on the device pixel ratio. If the size difference is at least 25 KB, the image fails the audit.

    Solution 

    DO NOT serve images that are larger than their rendered versions! The wasted size just hampers the load time. 

    Alternatively,

    • Use responsive images. With this technique, create multiple versions of the images to be used in the application and serve them depending on the media queries, viewport dimensions, etc
    • Use image CDNs to optimize images. These are like a web service API for transforming images
    • Use vector images, like SVG. These are built on simple primitives and can scale without losing data or change in the file size

    You can resize images online or on your system using tools. Learn how to serve responsive images.

    Learn more about replacing complex icons with SVG. For browsers that don’t support SVG format, here’s A Complete Guide to SVG fallbacks.

    Defer Offscreen Images

    An offscreen image is an image located outside of the visible browser viewport. 

    The audit fails if the page has offscreen images. Lighthouse lists all offscreen or hidden images in your page, along with the potential savings. 

    Solution 

    Load offscreen images only when the user scrolls to that part of the viewport. To achieve this, lazy-load these images after all critical resources have loaded.

    There are many libraries available online that will load images depending on the visible viewport. Feel free to use them as per the use case.
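
    If you prefer to hand-roll it, here is a minimal sketch using IntersectionObserver; keeping the real URL in a data-src attribute is a common convention, not a requirement:

    // swap in the real image source only when the image nears the viewport
    const observer = new IntersectionObserver((entries) => {
      entries.forEach((entry) => {
        if (!entry.isIntersecting) return;
        const img = entry.target;
        img.src = img.dataset.src; // load the real source
        observer.unobserve(img);   // each image only needs loading once
      });
    }, { rootMargin: "200px" });   // start loading slightly before it is visible

    document.querySelectorAll("img[data-src]").forEach((img) => observer.observe(img));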

    Minify CSS and JavaScript

    Lighthouse identifies all the CSS and JS files that are not minified. It will list all of them along with potential savings.

    Solution 

    Do as the heading says!

    Minifiers can do it for you. Webpack users can use mini-css-extract-plugin and terser-webpack-plugin for minifying CSS and JS, respectively; a config sketch follows.
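
    Here is a webpack 4-era production sketch. Note that mini-css-extract-plugin only extracts CSS into its own files; a minimizer such as optimize-css-assets-webpack-plugin (an extra plugin, not mentioned above) performs the actual CSS minification:

    const TerserPlugin = require("terser-webpack-plugin");
    const MiniCssExtractPlugin = require("mini-css-extract-plugin");
    const OptimizeCSSAssetsPlugin = require("optimize-css-assets-webpack-plugin");

    module.exports = {
      mode: "production",
      optimization: {
        minimize: true,
        // minify JS with Terser and the extracted CSS with cssnano
        minimizer: [new TerserPlugin(), new OptimizeCSSAssetsPlugin()],
      },
      plugins: [new MiniCssExtractPlugin({ filename: "[name].[contenthash].css" })],
      module: {
        rules: [{ test: /\.css$/, use: [MiniCssExtractPlugin.loader, "css-loader"] }],
      },
    };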

    Serve Images in Next-gen Formats

    Following are the next-gen image formats:

    • WebP
    • JPEG 2000
    • JPEG XR

    The image formats we use regularly (i.e., JPEG and PNG) have inferior compression and quality characteristics compared to next-gen formats. Encoding images in these formats can load your website faster and consume less cellular data.

    Lighthouse converts each image in an older format to WebP and reports the ones with potential savings of more than 8 KB.

    Solution 

    Convert all, or at least the images Lighthouse recommends, into the above formats. Use your converted images with the fallback technique below to support all browsers.

    <picture>
      <source type="image/jp2" srcset="my-image.jp2">
      <source type="image/jxr" srcset="my-image.jxr">
      <source type="image/webp" srcset="my-image.webp">
      <source type="image/jpeg" srcset="my-image.jpg">
      <img src="my-image.jpg" alt="">
    </picture>

    Enable Text Compression

    This technique of compressing the original textual information uses compression algorithms to find repeated sequences and replace them with shorter representations. It’s done to further minimize the total network bytes.

    Lighthouse lists all the text-based resources that are not compressed. 

    It computes the potential savings by identifying text-based resources that do not include a content-encoding header set to br, gzip or deflate and compresses each of them with gzip.

    If the potential compression savings is more than 10% of the original size, then the file fails the audit.

    Solution

    Webpack users can use compression-webpack-plugin for text compression. 

    The best part about this plugin is that it supports Google’s Brotli compression algorithm which is superior to gzip. Alternatively, you can also use brotli-webpack-plugin. All you need to do is configure your server to return Content-Encoding as br.

    Brotli produces smaller files than gzip (up to 20% smaller), though compression itself can take longer. As of June 2020, Brotli is supported by all major browsers except Internet Explorer.

    Don’t worry. You can still use gzip as a fallback.
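
    A hedged config sketch (option names follow compression-webpack-plugin v6+) that emits both Brotli files and gzip fallbacks next to your bundles:

    const CompressionPlugin = require("compression-webpack-plugin");

    module.exports = {
      plugins: [
        // Brotli for browsers that send Accept-Encoding: br
        new CompressionPlugin({
          filename: "[path][base].br",
          algorithm: "brotliCompress", // uses Node's built-in zlib Brotli support
          test: /\.(js|css|html|svg)$/,
        }),
        // gzip as the fallback for everything else
        new CompressionPlugin({
          filename: "[path][base].gz",
          algorithm: "gzip",
          test: /\.(js|css|html|svg)$/,
        }),
      ],
    };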

    Preconnect to Required Origins

    This section lists all the key fetch requests that are not yet prioritized using <link rel="preconnect">.

    Establishing connections often involves significant time, especially when it comes to secure connections. It encounters DNS lookups, redirects, and several round trips to the final server handling the user’s request.

    Solution

    Establish an early connection to required origins. Doing so will improve the user experience without affecting bandwidth usage. 

    To achieve this connection, use preconnect or dns-prefetch. This informs the browser that the app wants to establish a connection to the third-party origin as soon as possible.

    Use preconnect for most critical connections. For non-critical connections, use dns-prefetch. Check out the browser support for preconnect. You can use dns-prefetch as the fallback.
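
    The hint is usually a <link rel="preconnect"> tag in the page’s <head>; if you need to add it from JavaScript (for example, when the origin is only known at runtime), a minimal sketch looks like this (the origin is just an example):

    // hint the browser to open a connection to a third-party origin early
    const hint = document.createElement("link");
    hint.rel = "preconnect";
    hint.href = "https://fonts.gstatic.com"; // example third-party origin
    document.head.appendChild(hint);

    // dns-prefetch as a fallback for browsers without preconnect support
    const fallback = document.createElement("link");
    fallback.rel = "dns-prefetch";
    fallback.href = "https://fonts.gstatic.com";
    document.head.appendChild(fallback);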

    Avoid Multiple Page Redirects

    This section focuses on requested resources that have been redirected multiple times. One must avoid multiple redirects on the final landing pages.

    In the case of an HTTP redirect, the browser receives a response like this from the server:

    HTTP/1.1 301 Moved Permanently
    Location: /path/to/new/location

    A typical example of a redirect looks like this:

    example.com → www.example.com → m.example.com – a chain that makes for a very slow mobile experience.

    This eventually makes your page load more slowly.

    Solution

    Don’t leave them hanging!

    Point all your flagged resources to their current location. It’ll help you optimize your pages’ Critical Rendering Path.

    Use Video Formats for Animated Content

    This section lists all the animated GIFs on your page, along with the potential savings. 

    Large GIFs are inefficient when delivering animated content. You can save a significant amount of bandwidth by using videos over GIFs.

    Solution

    Consider using MPEG4 or WebM videos instead of GIFs. Many tools can convert a GIF into a video, such as FFmpeg.

    Use the code below to replicate a GIF’s behavior using MPEG4 and WebM. It’ll be played silent and automatically in an endless loop, just like a GIF. The code ensures that the unsupported format has a fallback.

    <video autoplay loop muted playsinline>  
      <source src="my-funny-animation.webm" type="video/webm">
      <source src="my-funny-animation.mp4" type="video/mp4">
    </video>

    Note: Do not use video formats for a small batch of GIF animations; it’s not worth the effort. This comes in handy when your website makes heavy use of animated content.

    Final Thoughts

    I saw a great improvement in my app’s performance after trying out the techniques above.

    While they may not all fit your app, try them and see what works and what doesn’t. I have compiled a list of resources that will help you enhance performance. Hopefully, they help.

    Do share your starting and final audit scores with me.

    Happy optimized coding!

    Further Reading

    Learn more – web.dev

    Other opportunities to explore:

    1. Remove unused CSS
    2. Efficiently encode images
    3. Reduce server response times (TTFB)
    4. Preload key requests
    5. Reduce the impact of third-party code

  • Implementing Async Features in Python – A Step-by-step Guide

    Asynchronous programming is a feature of modern programming languages that allows an application to perform multiple operations without waiting for each of them to finish. Asynchronicity is one of the big reasons for the popularity of Node.js.

    We have discussed Python’s asynchronous features as part of our previous post: an introduction to asynchronous programming in Python. This blog is a natural progression on the same topic. We are going to discuss async features in Python in detail and look at some hands-on examples.

    Consider a traditional web scraping application that needs to open thousands of network connections. We could open one network connection, fetch the result, and then move to the next ones iteratively. This approach increases the latency of the program. It spends a lot of time opening a connection and waiting for others to finish their bit of work.

    On the other hand, async provides a way of opening thousands of connections at once and swapping among the connections as they finish and return their results. Basically, it sends a request on one connection and moves to the next one instead of waiting for the previous one’s response. It continues like this until all the connections have returned their outputs.

    Fig:- Synchronous vs. asynchronous execution of four tasks (Source: phpmind)

    From the above chart, we can see that using synchronous programming on four tasks took 45 seconds to complete, while in asynchronous programming, those four tasks took only 20 seconds.

    Where Does Asynchronous Programming Fit in the Real-world?

    Asynchronous programming is best suited for popular scenarios such as:

    1. The program takes too much time to execute.

    2. The reason for the delay is waiting for input or output operations, not computation.

    3. Tasks that have multiple input or output operations to be executed at once.

    And application-wise, these are the example use cases:

    • Web Scraping
    • Network Services

    Difference Between Parallelism, Concurrency, Threading, and Async IO

    Because we discussed this comparison in detail in our previous post, we will just quickly go through the concept as it will help us with our hands-on example later.

    Parallelism involves performing multiple operations at the same time; multiprocessing is an example of it. It is well suited for CPU-bound tasks.

    Concurrency is slightly broader than parallelism. It involves multiple tasks running in an overlapping manner.

    Threading – a thread is a separate flow of execution. One process can contain multiple threads, and each thread runs independently. It is ideal for IO-bound tasks.

    Async IO is a single-threaded, single-process design that uses cooperative multitasking. In simple words, async IO gives a feeling of concurrency despite using a single thread in a single process.

    Fig:- A comparison of concurrency and parallelism

    Components of Async IO Programming

    Let’s explore the various components of Async IO in depth. We will also look at an example code to help us understand the implementation.

    1. Coroutines

    Coroutines are generalized forms of subroutines. They are generally used for cooperative multitasking and behave like Python generators.

    An async function uses the await keyword to suspend a coroutine. When a coroutine hits await, it releases the flow of control back to the event loop.

    To run a coroutine, we need to schedule it on the event loop. After scheduling, coroutines are wrapped in Tasks, which are a type of Future object.

    Example:

    In the snippet below, we call async_func from the main function. We have to add the await keyword while calling the async function. As you can see, async_func does nothing unless the await keyword accompanies the call.

    import asyncio
    async def async_func():
        print('Velotio ...')
        await asyncio.sleep(1)
        print('... Technologies!')
    
    async def main():
        async_func()#this will do nothing because coroutine object is created but not awaited
        await async_func()
    
    asyncio.run(main())

    Output

    RuntimeWarning: coroutine 'async_func' was never awaited
     async_func()#this will do nothing because coroutine object is created but not awaited
    RuntimeWarning: Enable tracemalloc to get the object allocation traceback
    Velotio ...
    ... Technologies!

    2. Tasks

    Tasks are used to schedule coroutines concurrently.

    When submitting a coroutine to an event loop for processing, you can get a Task object, which provides a way to control the coroutine’s behavior from outside the event loop.

    Example:

    In the snippet below, we are creating a task using create_task (an inbuilt function of asyncio library), and then we are running it.

    import asyncio
    async def async_func():
        print('Velotio ...')
        await asyncio.sleep(1)
        print('... Blog!')
    
    async def main():
        task = asyncio.create_task (async_func())
        await task
    asyncio.run(main())

    Output

    Velotio ...
    ... Blog!

    3. Event Loops

    This mechanism runs coroutines until they complete. You can imagine it as a while(True) loop that monitors coroutines, takes feedback on what’s idle, and looks around for things that can be executed in the meantime.

    It can wake up an idle coroutine when whatever that coroutine is waiting on becomes available.

    Only one event loop can run at a time in Python.

    Example:

    In the snippet below, we create three tasks, append them to a list, and execute them all asynchronously using get_event_loop and create_task from the asyncio library, then await their completion.

    import asyncio
    async def async_func(task_no):
        print(f'{task_no} :Velotio ...')
        await asyncio.sleep(1)
        print(f'{task_no}... Blog!')
    
    async def main():
        taskA = loop.create_task (async_func('taskA'))
        taskB = loop.create_task(async_func('taskB'))
        taskC = loop.create_task(async_func('taskC'))
        await asyncio.wait([taskA,taskB,taskC])
    
    if __name__ == "__main__":
        try:
            loop = asyncio.get_event_loop()
            loop.run_until_complete(main())
        except :
            pass

    Output

    taskA :Velotio ...
    taskB :Velotio ...
    taskC :Velotio ...
    taskA... Blog!
    taskB... Blog!
    taskC... Blog!

    4. Futures

    A Future is a special low-level awaitable object that represents the eventual result of an asynchronous operation.

    When a Future object is awaited, the coroutine will wait until the Future is resolved in some other place.

    We will look into the sample code for Future objects in the next section.

    A Comparison Between Multithreading and Async IO

    Before we get to Async IO, let’s use multithreading as a benchmark and then compare them to see which is more efficient.

    For this benchmark, we will fetch data from a sample URL (the Velotio Careers webpage) a varying number of times: 1, 10, 50, 100, and 500.

    We will then compare the time taken by both of these approaches to fetch the required data.

    Implementation

    Code of Multithreading:

    import requests
    import time
    from concurrent.futures import ProcessPoolExecutor
    
    
    def fetch_url_data(pg_url):
        try:
            resp = requests.get(pg_url)
        except Exception as e:
            print(f"Error occurred while fetching data from url {pg_url}")
        else:
            return resp.content
            
    
    def get_all_url_data(url_list):
        with ProcessPoolExecutor() as executor:
            resp = executor.map(fetch_url_data, url_list)
        return resp
        
    
    if __name__=='__main__':
        url = "https://www.velotio.com/careers"
        for ntimes in [1,10,50,100,500]:
            start_time = time.time()
            responses = get_all_url_data([url] * ntimes)
            print(f'Fetch total {ntimes} urls and process takes {time.time() - start_time} seconds')

    Output

    Fetch total 1 urls and process takes 1.8822264671325684 seconds
    Fetch total 10 urls and process takes 2.3358211517333984 seconds
    Fetch total 50 urls and process takes 8.05638575553894 seconds
    Fetch total 100 urls and process takes 14.43302869796753 seconds
    Fetch total 500 urls and process takes 65.25404500961304 seconds

    ProcessPoolExecutor is a class from Python's concurrent.futures package that implements the Executor interface (ThreadPoolExecutor is its thread-based counterpart). fetch_url_data is a function to fetch the data from the given URL using the requests package, and the get_all_url_data function maps fetch_url_data over the list of URLs.

    Async IO Programming Example:

    import asyncio
    import time
    from aiohttp import ClientSession, ClientResponseError
    
    
    async def fetch_url_data(session, url):
        try:
            async with session.get(url, timeout=60) as response:
                resp = await response.read()
        except Exception as e:
            print(e)
        else:
            return resp
        return
    
    
    async def fetch_async(loop, r):
        url = "https://www.velotio.com/careers"
        tasks = []
        async with ClientSession() as session:
            for i in range(r):
                task = asyncio.ensure_future(fetch_url_data(session, url))
                tasks.append(task)
            responses = await asyncio.gather(*tasks)
        return responses
    
    
    if __name__ == '__main__':
        for ntimes in [1, 10, 50, 100, 500]:
            start_time = time.time()
            loop = asyncio.get_event_loop()
            future = asyncio.ensure_future(fetch_async(loop, ntimes))
        loop.run_until_complete(future) # runs until it finishes or raises an error
            responses = future.result()
            print(f'Fetch total {ntimes} urls and process takes {time.time() - start_time} seconds')

    Output

    Fetch total 1 urls and process takes 1.3974951362609863 seconds
    Fetch total 10 urls and process takes 1.4191942596435547 seconds
    Fetch total 50 urls and process takes 2.6497368812561035 seconds
    Fetch total 100 urls and process takes 4.391665458679199 seconds
    Fetch total 500 urls and process takes 4.960426330566406 seconds

    We need to use the get_event_loop function to create and add the tasks. To fetch more than one URL, we have to use ensure_future and the gather function.

    The fetch_async function adds the tasks to the event loop object, and the fetch_url_data function reads the data from the URL using the session object. The future.result() method returns the responses of all the tasks.

    Results:

    As you can see from the plot, async programming is much more efficient than multi-threading for the program above. 

    The graph of the multithreading program looks linear, while the asyncio program’s graph looks closer to logarithmic.

    Conclusion

    As we saw in the experiment above, Async IO showed better performance than multi-threading through its efficient use of concurrency.

    Async IO can be beneficial in applications that can exploit concurrency, though whether to choose Async IO over other implementations depends on the kind of application you are dealing with.

    We hope this article helped further your understanding of the async feature in Python and gave you some quick hands-on experience using the code examples shared above.

  • Building a Progressive Web Application in React [With Live Code Examples]

    What is a PWA:

    A Progressive Web Application, or PWA, is a web application built to look and behave like a native app: it operates offline-first and is optimized for a variety of viewports, ranging from mobile and tablets to FHD desktop monitors and more. PWAs are built using front-end technologies such as HTML, CSS, and JavaScript and bring a native-like user experience to the web platform. PWAs can also be installed on devices just like native apps.

    For an application to be classified as a PWA, it must tick all of these boxes:

    • PWAs must implement service workers. Service workers act as a proxy between the web browsers and API servers. This allows web apps to manage and cache network requests and assets
    • PWAs must be served over a secure network, i.e. the application must be served over HTTPS
    • PWAs must have a web manifest definition, which is a JSON file that provides basic information about the PWA, such as name, different icons, look and feel of the app, splash screen, version of the app, description, author, etc

    Why build a PWA?

    Businesses and engineering teams should consider building a progressive web app instead of a traditional web app. Here are some of the most prominent arguments in favor of PWAs:

    • PWAs are responsive. The mobile-first design approach enables PWAs to support a variety of viewports and orientation
    • PWAs can work on slow Internet or no Internet environment. App developers can choose how a PWA will behave when there’s no Internet connectivity, whereas traditional web apps or websites simply stop working without an active Internet connection
    • PWAs are secure because they are always served over HTTPs
    • PWAs can be installed on the home screen, making the application more accessible
    • PWAs bring in rich features, such as push notification, application updates and more

    PWA and React

    There are various ways to build a progressive web application. One can just use Vanilla JS, HTML and CSS or pick up a framework or library. Some of the popular choices in 2020 are Ionic, Vue, Angular, Polymer, and of course React, which happens to be my favorite front-end library.

    Building PWAs with React

    To get started, let’s create a PWA which lists all the users in a system.

    npm init react-app users
    cd users
    yarn add react-router-dom
    yarn run start

    Next, we will replace the default App.js file with our own implementation.

    import React from "react";
    import { BrowserRouter, Route } from "react-router-dom";
    import "./App.css";
    const Users = () => {
     // state
     const [users, setUsers] = React.useState([]);
     // effects
     React.useEffect(() => {
       fetch("https://jsonplaceholder.typicode.com/users")
         .then((res) => res.json())
         .then((users) => {
           setUsers(users);
         })
         .catch((err) => {});
     }, []);
     // render
     return (
       <div>
         <h2>Users</h2>
         <ul>
           {users.map((user) => (
             <li key={user.id}>
               {user.name} ({user.email})
             </li>
           ))}
         </ul>
       </div>
     );
    };
    const App = () => (
     <BrowserRouter>
       <Route path="/" exact component={Users} />
     </BrowserRouter>
    );
    export default App;

    This displays a list of users fetched from the server.

    Let’s also remove the logo.svg file inside the src directory and truncate the App.css file that is populated as a part of the boilerplate code.

    To make this app a PWA, we need to follow these steps:

    1. Register service worker

    • In the file /src/index.js, replace serviceWorker.unregister() with serviceWorker.register().
    import React from 'react';
    import ReactDOM from 'react-dom';
    import './index.css';
    import App from './App';
    import * as serviceWorker from './serviceWorker';
    ReactDOM.render(
     <React.StrictMode>
       <App />
     </React.StrictMode>,
     document.getElementById('root')
    );
    serviceWorker.register();

    • The default behavior here is to not set up a service worker, i.e. the CRA boilerplate lets users opt in to the offline-first experience.

    2. Update the manifest file

    • The CRA boilerplate provides a manifest file out of the box. This file is located at /public/manifest.json and needs to be modified to include the name of the PWA, description, splash screen configuration and much more. You can read more about available configuration options in the manifest file here.

    Our modified manifest file looks like this:

    {
     "short_name": "User Mgmt.",
     "name": "User Management",
     "icons": [
       {
         "src": "favicon.ico",
         "sizes": "64x64 32x32 24x24 16x16",
         "type": "image/x-icon"
       },
       {
         "src": "logo192.png",
         "type": "image/png",
         "sizes": "192x192"
       },
       {
         "src": "logo512.png",
         "type": "image/png",
         "sizes": "512x512"
       }
     ],
     "start_url": ".",
     "display": "standalone",
     "theme_color": "#aaffaa",
     "background_color": "#ffffff"
    }

    PWA Splash Screen

    Here, the display mode selected is “standalone,” which tells web browsers to give this PWA the same look and feel as a standalone app. Other display options include “browser,” which is the default mode and launches the PWA like a traditional web app, and “fullscreen,” which opens the PWA in fullscreen mode, hiding all other elements such as the navigation, the address bar, and the status bar.

    The manifest can be inspected using Chrome dev tools > Application tab > Manifest.

    3. Test the PWA:

    • To test a progressive web app, build it completely first. This is because PWA features such as caching aren’t enabled while running the app in dev mode, to ensure hassle-free development.
    • Create a production build with: npm run build
    • Change into the build directory: cd build
    • Host the app locally: http-server or python3 -m http.server 8080
    • Test the application by navigating to http://localhost:8080

    4. Audit the PWA: If you are testing the app for the first time on a desktop or laptop browser, the PWA may look like just another website. To test and audit various aspects of the PWA, let’s use Lighthouse, a tool built by Google specifically for this purpose.
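
    If you prefer the command line to the dev tools UI, Lighthouse can also be run via npx against the locally hosted build (assuming a recent Node.js; the URL matches the local server started above):

    npx lighthouse http://localhost:8080 --view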

    PWA on mobile

    At this point, we already have a simple PWA which can be published on the Internet and made available to billions of devices. Now let’s try to enhance the app by improving its offline viewing experience.

    1. Offline indication: Since service workers can operate without the Internet as well, let’s add an offline indicator banner to let users know the current state of the application. We will use navigator.onLine along with the “online” and “offline” window events to detect the connection status.

      // state
      const [offline, setOffline] = React.useState(!navigator.onLine);
      // effects
      React.useEffect(() => {
        const offlineListener = () => setOffline(true);
        const onlineListener = () => setOffline(false);
        window.addEventListener("offline", offlineListener);
        window.addEventListener("online", onlineListener);
        return () => {
          window.removeEventListener("offline", offlineListener);
          window.removeEventListener("online", onlineListener);
        };
      }, []);

      {/* add to jsx */}
      {offline ? (
        <div className="banner-offline">The app is currently offline</div>
      ) : null}

    The easiest way to test this is to just turn off the Wi-Fi on your dev machine. Chrome dev tools also provide an option to test this without actually going offline. Head over to Dev tools > Network and then select “Offline” from the dropdown in the top section. This should bring up the banner when the app is offline.

    2. Let’s cache a network request using the service worker

    CRA comes with its own service-worker.js file, which caches all static assets such as the JavaScript and CSS files that are part of the application bundle. To put custom logic into the service worker, let’s create a new file called ‘service-worker-custom.js’ and combine the two.

    • Install react-app-rewired and update package.json:
    1. yarn add react-app-rewired
    2. Update the package.json as follows:
    "scripts": {
       "start": "react-app-rewired start",
       "build": "react-app-rewired build",
       "test": "react-app-rewired test",
       "eject": "react-app-rewired eject"
    },

    • Create a config-overrides.js file (the override file react-app-rewired looks for) to change how CRA generates the service worker and inject our custom one, i.e. combine the two service worker files.
    const WorkboxWebpackPlugin = require("workbox-webpack-plugin");
    module.exports = function override(config, env) {
      config.plugins = config.plugins.map((plugin) => {
        if (plugin.constructor.name === "GenerateSW") {
          return new WorkboxWebpackPlugin.InjectManifest({
           swSrc: "./src/service-worker-custom.js",
           swDest: "service-worker.js"
          });
        }
        return plugin;
      });
      return config;
    };

    • Create the service-worker-custom.js file and cache the network request there:
    workbox.core.skipWaiting();
    workbox.core.clientsClaim();
    workbox.routing.registerRoute(
      new RegExp("/users"),
      new workbox.strategies.NetworkFirst()
    );
    workbox.precaching.precacheAndRoute(self.__precacheManifest || []);

    Your app should now work correctly in offline mode.

    Distributing and publishing a PWA

    PWAs can be published just like any other website and have only one additional requirement: they must be served over HTTPS. When a user visits the PWA from a mobile or tablet, a pop-up is displayed asking the user if they’d like to install the app to their home screen.
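
    Browsers that support installation fire a beforeinstallprompt event, which you can intercept to offer your own install button instead of the default mini-infobar. A minimal sketch (the install-btn element is a hypothetical button you’d add to the page):

    const installBtn = document.getElementById("install-btn"); // hypothetical button
    let deferredPrompt = null;

    window.addEventListener("beforeinstallprompt", (e) => {
      e.preventDefault();       // suppress the default install banner
      deferredPrompt = e;       // stash the event for later use
      installBtn.style.display = "block";
    });

    installBtn.addEventListener("click", async () => {
      if (!deferredPrompt) return;
      deferredPrompt.prompt();  // show the native install dialog
      const { outcome } = await deferredPrompt.userChoice;
      console.log(`Install prompt outcome: ${outcome}`); // "accepted" or "dismissed"
      deferredPrompt = null;
      installBtn.style.display = "none";
    });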

    Conclusion

    Building PWAs with React enables engineering teams to develop, deploy, and publish progressive web apps for billions of devices using technologies they’re already familiar with. Existing React apps can also be converted to PWAs. PWAs are fun to build, easy to ship and distribute, and add a lot of value for customers by providing a native-like experience and better engagement via features such as add to homescreen and push notifications, all without an installation process.

  • Building a Collaborative Editor Using Quill and Yjs

    “Hope this email finds you well” is how 2020–2021 has felt in a nutshell. With all of us working remotely since last year, actively collaborating with teammates became one notch harder, from brainstorming a topic on a whiteboard to building documentation.

    Tools powered by collaborative systems have become a necessity. To explore this space, following the principle of “build fast, fail fast,” I started building a collaborative editor using existing open-source tools, which can eventually be extended for needs across different projects.

    Conflicts, as they say, are inevitable when multiple users are constantly modifying the same document, especially the same block of content. Ultimately, the end-user experience is defined by how such conflicts are resolved.

    There are various conflict resolution mechanisms, but two of the most commonly discussed ones are Operational Transformation (OT) and Conflict-Free Replicated Data Type (CRDT). So, let’s briefly talk about those first.

    Operational Transformation

    The order of operations matters in OT because each user has their own local copy of the document, and mutations are atomic, such as “insert V at index 4” or “delete X at index 2.” If the order of these operations changes, the end result will differ. That is why all operations are synchronized through a central server, which can alter the indices of incoming operations before forwarding them to the clients. For example, in the below image, User2 performs a delete(0) operation, but since the OT server realizes that User1 has made an insert, User2’s operation must be changed to delete(1) before being applied to User1’s copy.

    OT with a central server is typically easier to implement. In its basic plain-text form, OT defines only three operations: insert, delete, and apply.

    Source: Conclave

    “Fully distributed OT and adding rich text operations are very hard, and that’s why there’s a million papers.”
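
    To make the transformation step concrete, here is a minimal sketch of index transformation for single-character insert/delete operations (illustrative only; real OT also needs tie-breaking rules and richer operations):

    // Transform opB so it can be applied after opA has already been applied.
    function transform(opB, opA) {
      if (opA.type === "insert" && opA.index <= opB.index) {
        // opA inserted a character at or before opB's position: shift right
        return { ...opB, index: opB.index + 1 };
      }
      if (opA.type === "delete" && opA.index < opB.index) {
        // opA removed a character before opB's position: shift left
        return { ...opB, index: opB.index - 1 };
      }
      return opB; // otherwise opB's index is unaffected
    }

    // The example from above: User1 inserts at 0, so User2's delete(0)
    // must become delete(1) before it is applied to User1's copy.
    console.log(
      transform({ type: "delete", index: 0 }, { type: "insert", index: 0 })
    ); // { type: 'delete', index: 1 }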

    CRDT

    Instead of performing operations directly on character indices as in OT, a CRDT uses a more complex data structure to which it can add/update/remove properties to represent transformations, enabling commutativity and idempotency. CRDTs guarantee eventual consistency.

    There are different algorithms, but in general, a CRDT has two requirements: globally unique characters and globally ordered characters. Basically, this means a global reference for each object instead of positional indices, with the ordering based on the neighboring objects. Fractional indices can be used to assign an index to an object.

    Source: Conclave

    As every object has its own unique reference, the delete operation becomes idempotent. Assigning fractional indices is one way to generate unique references during insertion and updates.
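
    A minimal sketch of the fractional-index idea (illustrative only; real CRDTs use more robust identifiers than plain floats):

    // Each character gets a globally unique id and a fractional position.
    function between(a, b) {
      return (a + b) / 2; // a position strictly between two neighbors
    }

    let doc = [
      { id: "user1:1", pos: 1.0, ch: "C" },
      { id: "user1:2", pos: 2.0, ch: "T" },
    ];

    // Another user inserts "A" between "C" and "T" without renumbering anything:
    doc.push({ id: "user2:1", pos: between(1.0, 2.0), ch: "A" }); // pos 1.5
    doc.sort((x, y) => x.pos - y.pos);
    console.log(doc.map((c) => c.ch).join("")); // "CAT"

    // Deleting by unique id (not by index) is naturally idempotent:
    doc = doc.filter((c) => c.id !== "user2:1");
    doc = doc.filter((c) => c.id !== "user2:1"); // applying twice changes nothing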

    There are two types of CRDT: one is state-based, where the whole state (or a delta) is shared between instances and merged continuously. The other is operation-based, where only individual operations are sent between replicas. If you want to dive deep into CRDT, here’s a nice resource.

    For our purposes, we chose CRDT, since it can also support peer-to-peer networks. If you want to jump directly to the code, you can visit the repo here.

    Tools used for this project:

    As our goal was a quick implementation, we targeted off-the-shelf tools for the editor and for the backend that manages collaborative operations.

    • Quill.js is an API-driven WYSIWYG rich text editor built for compatibility and extensibility. We chose Quill as our editor because of how easily it plugs into an application and the availability of extensions.
    • Yjs is a framework that provides shared editing capabilities by exposing different shared data types (Array, Map, Text, etc.) that are synced automatically. It’s also network agnostic, so changes are synced whenever a client comes online. We used it because it’s a CRDT implementation and, surprisingly, had readily available bindings for quill.js (see the short sketch after this list).
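
    To see Yjs’s conflict-free merging in isolation, here is a minimal sketch with two in-memory documents standing in for two clients (no network involved):

    import * as Y from 'yjs';

    const doc1 = new Y.Doc();
    const doc2 = new Y.Doc();

    // Each "client" edits its own copy of the shared text type.
    doc1.getText('demo').insert(0, 'Hello ');
    doc2.getText('demo').insert(0, 'World');

    // Exchange state updates in both directions; the CRDT merge converges.
    Y.applyUpdate(doc2, Y.encodeStateAsUpdate(doc1));
    Y.applyUpdate(doc1, Y.encodeStateAsUpdate(doc2));

    console.log(doc1.getText('demo').toString()); // identical on both docs
    console.log(doc1.getText('demo').toString() === doc2.getText('demo').toString()); // true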

    Prerequisites:

    To keep it simple, we’ll set up a client and server both in the same code base. Initialize a project with npm init and install the below dependencies:

    npm i quill quill-cursors webpack webpack-cli webpack-dev-server y-quill y-websocket yjs

    • Quill: Quill is the WYSIWYG rich text editor we will use as our editor.
    • quill-cursors is an extension that helps us to display cursors of other connected clients to the same editor room.
    • Webpack, webpack-cli, and webpack-dev-server are developer utilities, webpack being the bundler that creates a deployable bundle for your application.
    • The y-quill module provides bindings between Yjs and QuillJS using the shared type Y.Text. For more information, you can check out the module’s source on Github.
    • y-websocket provides a WebsocketProvider to communicate with a Yjs server in a client-server manner, exchanging awareness information and document data.
    • Yjs is the CRDT framework that orchestrates conflict resolution between multiple clients.

    Code to use

    First, create a webpack.config.js file at the project root:

    const path = require('path');
    
    module.exports = {
      mode: 'development',
      devtool: 'source-map',
      entry: {
        index: './index.js'
      },
      output: {
        globalObject: 'self',
        path: path.resolve(__dirname, './dist/'),
        filename: '[name].bundle.js',
        publicPath: '/quill/dist'
      },
      devServer: {
        contentBase: path.join(__dirname),
        compress: true,
        publicPath: '/dist/'
      }
    }

    This is a basic webpack config where we specify the entry point of our frontend project, i.e., the index.js file. Webpack uses that file to build the internal dependency graph of the project. The output property defines where and how the generated bundles should be saved, and the devServer config defines the parameters for the local dev server that runs when you execute “npm start”.

    We’ll first create an index.html file to define the basic skeleton:

    <!DOCTYPE html>
    <html>
      <head>
        <title>Yjs Quill Example</title>
        <script src="./dist/index.bundle.js" async defer></script>
        <link rel=stylesheet href="//cdn.quilljs.com/1.3.6/quill.snow.css" async defer>
      </head>
      <body>
        <button type="button" id="connect-btn">Disconnect</button>
        <div id="editor" style="height: 500px;"></div>
      </body>
    </html>

    The index.html has a pretty basic structure. In <head>, we’ve provided the path of the bundled JS file that webpack will create, and the CSS theme for the Quill editor. In <body>, we’ve created a button to connect/disconnect from the backend and a placeholder div where the Quill editor will be plugged in.

    • Here, we’ve just made the imports, registered quill-cursors extension, and added an event listener for window load:
    import Quill from "quill";
    import * as Y from 'yjs';
    import { QuillBinding } from 'y-quill';
    import { WebsocketProvider } from 'y-websocket';
    import QuillCursors from "quill-cursors";
    
    // Register QuillCursors module to add the ability to show multiple cursors on the editor.
    Quill.register('modules/cursors', QuillCursors);
    
    window.addEventListener('load', () => {
      // We'll add more blocks as we continue
    });

    • Let’s initialize the Yjs document, socket provider, and load the document:
    window.addEventListener('load', () => {
      const ydoc = new Y.Doc();
      const provider = new WebsocketProvider('ws://localhost:3312', 'velotio-demo', ydoc);
      const type = ydoc.getText('Velotio-Blog');
    });

    • We’ll now initialize and plug the Quill editor with its bindings:
    window.addEventListener('load', () => {
      // ### ABOVE CODE HERE ###
    
      const editorContainer = document.getElementById('editor');
      const toolbarOptions = [
        ['bold', 'italic', 'underline', 'strike'],  // toggled buttons
        ['blockquote', 'code-block'],
        [{ 'header': 1 }, { 'header': 2 }],               // custom button values
        [{ 'list': 'ordered' }, { 'list': 'bullet' }],
        [{ 'script': 'sub' }, { 'script': 'super' }],      // superscript/subscript
        [{ 'indent': '-1' }, { 'indent': '+1' }],          // outdent/indent
        [{ 'direction': 'rtl' }],                         // text direction
        // array for drop-downs, empty array = defaults
        [{ 'size': [] }],
        [{ 'header': [1, 2, 3, 4, 5, 6, false] }],
        [{ 'color': [] }, { 'background': [] }],          // dropdown with defaults from theme
        [{ 'font': [] }],
        [{ 'align': [] }],
        ['image', 'video'],
        ['clean']                                         // remove formatting button
      ];
    
      const editor = new Quill(editorContainer, {
        modules: {
          cursors: true,
          toolbar: toolbarOptions,
          history: {
            userOnly: true  // only user changes will be undone or redone.
          }
        },
        placeholder: "collab-edit-test",
        theme: "snow"
      });
    
      const binding = new QuillBinding(type, editor, provider.awareness);
    });

    • Finally, let’s implement the Connect/Disconnect button and complete the callback:
    window.addEventListener('load', () => {
      // ### ABOVE CODE HERE ###

      const connectBtn = document.getElementById('connect-btn');
      connectBtn.addEventListener('click', () => {
        if (provider.shouldConnect) {
          provider.disconnect();
          connectBtn.textContent = 'Connect';
        } else {
          provider.connect();
          connectBtn.textContent = 'Disconnect';
        }
      });

      window.example = { provider, ydoc, type, binding, Y };
    });

    Steps to run:

    • Server:

    For simplicity, we’ll directly use the y-websocket-server out of the box.
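
    One way to start it on the port our client expects (assuming the y-websocket package still ships its demo server script at bin/server.js, as it did at the time of writing):

    PORT=3312 node ./node_modules/y-websocket/bin/server.js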

    NOTE: You can either let it run and open a new terminal for the next commands, or let it run in the background using `&` at the end of the command.

    • Client:

    Start the client by running npm start. On successful compilation, it should open in your default browser, or you can just go to http://localhost:8080.

    Show me the repo

    You can find the repository here.

    Conclusion:

    Conflict resolution approaches are not new, but with the shift to remote work culture, it is important to have good collaborative systems in place to enhance productivity.

    Although this example covered just rich-text editing capabilities, we can extend existing resources to build more features and structures like tabular data, graphs, charts, etc. Yjs shared types can be used to define your own data format based on how your custom editor represents data internally.

  • BigQuery 101: All the Basics You Need to Know

    Google BigQuery is an enterprise data warehouse built using BigTable and Google Cloud Platform. It’s serverless and completely managed. BigQuery works great with all sizes of data, from a 100-row Excel spreadsheet to several petabytes of data. Most importantly, it can execute complex queries on that data within a few seconds.

    We need to note before we proceed: BigQuery is not a transactional database. It takes around 2 seconds to run a simple query like ‘SELECT * FROM bigquery-public-data.object LIMIT 10’ on a 100 KB table with 500 rows. Hence, it shouldn’t be thought of as an OLTP (Online Transaction Processing) database. BigQuery is for Big Data!

    BigQuery supports SQL-like queries, which makes it user-friendly and beginner-friendly. It’s accessible via its web UI, command-line tool, or client libraries (written in C#, Go, Java, Node.js, PHP, Python, and Ruby). You can also take advantage of its REST APIs and get your job done by sending a JSON request.
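
    For example, here is a minimal sketch using the official Node.js client library, @google-cloud/bigquery (it assumes Google Cloud credentials are already configured; the public dataset and query are illustrative):

    // npm i @google-cloud/bigquery
    const { BigQuery } = require("@google-cloud/bigquery");

    async function main() {
      const bigquery = new BigQuery();

      // Run a query against a public dataset and print the rows.
      const [rows] = await bigquery.query({
        query:
          "SELECT name, SUM(number) AS total " +
          "FROM `bigquery-public-data.usa_names.usa_1910_2013` " +
          "GROUP BY name ORDER BY total DESC LIMIT 10",
        location: "US",
      });

      rows.forEach((row) => console.log(`${row.name}: ${row.total}`));
    }

    main().catch(console.error);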

    Now, let’s dive deeper to understand it better. Suppose you are a data scientist (or a startup that analyzes data) and you need to analyze terabytes of data. If you choose a tool like MySQL, the first step, before even thinking about any query, is to put an infrastructure in place that can store this magnitude of data.

    Designing this setup is itself a difficult task because you have to figure out the RAM size, whether to use DC/OS or Kubernetes, and other factors. And if you have streaming data coming in, you will need to set up and maintain a Kafka cluster. In BigQuery, all you have to do is bulk upload your CSV/JSON file, and you are done. BigQuery handles all the backend for you. If you need streaming data ingestion, you can use Fluentd. Another advantage is that you can connect Google Analytics with BigQuery seamlessly.

    BigQuery is a serverless, highly available, petabyte-scale service that allows you to execute complex SQL queries quickly. It lets you focus on analysis rather than handling infrastructure. Hardware is completely abstracted away and not visible to us, not even as virtual machines.

    Architecture of Google BigQuery

    You don’t need to know too much about the underlying architecture of BigQuery. That’s actually the whole idea: you don’t need to worry about architecture and operations.

    However, understanding BigQuery’s architecture helps us control costs, optimize query performance, and optimize storage. BigQuery is built on the ideas described in Google’s Dremel paper.

    Quoting an Abstract from the Google Dremel Paper –

    “Dremel is a scalable, interactive ad-hoc query system for analysis of read-only nested data. By combining multi-level execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. The system scales to thousands of CPUs and petabytes of data and has thousands of users at Google. In this paper, we describe the architecture and implementation of Dremel and explain how it complements MapReduce-based computing. We present a novel columnar storage representation for nested records and discuss experiments on few-thousand node instances of the system.”

    Dremel has been in production at Google since 2006. Google has used it for the following tasks –

    • Analysis of crawled web documents.
    • Tracking install data for applications on Android Market.
    • Crash reporting for Google products.
    • OCR results from Google Books.
    • Spam analysis.
    • Debugging of map tiles on Google Maps.
    • Tablet migrations in managed Bigtable instances.
    • Results of tests run on Google’s distributed build system.
    • Disk I/O statistics for hundreds of thousands of disks.
    • Resource monitoring for jobs run in Google’s data centers.
    • Symbols and dependencies in Google’s codebase.

    BigQuery is much more than Dremel. Dremel is just the query execution engine, whereas BigQuery is based on other interesting technologies like Borg (the predecessor of Kubernetes) and Colossus. Colossus is the successor to the Google File System (GFS), as mentioned in the Google Spanner paper.

    How Does BigQuery Store Data?

    BigQuery stores data in a columnar format called Capacitor (the successor of ColumnarIO), which achieves a very high compression ratio and scan throughput. Unlike with ColumnarIO, BigQuery can operate directly on compressed data without decompressing it.

    Columnar storage has the following advantages:

    • Traffic minimization – When you submit a query, the required column values on each query are scanned and only those are transferred on query execution. E.g., a query `SELECT title FROM Collection` would access the title column values only.
    • Higher compression ratio – Columnar storage can achieve a compression ratio of 1:10, whereas ordinary row-based storage can compress at roughly 1:3.

    (Image source:  Google Dremel Paper)

    Columnar storage has the disadvantage of not working efficiently when updating existing records. That is why Dremel doesn’t support any update queries.  

    How Does a Query Get Executed?

    BigQuery depends on Borg for data processing. Borg simultaneously instantiates hundreds of Dremel jobs across required clusters made up of thousands of machines. In addition to assigning compute capacity for Dremel jobs, Borg handles fault-tolerance as well.

    Now, how do you design and execute a query that can run on thousands of nodes and fetch the result? This challenge was overcome using the Tree Architecture: a massively parallel distributed tree that pushes a query down to the leaves and aggregates the results back up at blazing speed.

    (Image source: Google Dremel Paper)

    BigQuery vs. MapReduce

    The key differences between BigQuery and MapReduce are –

    • Dremel is designed as an interactive data analysis tool for large datasets
    • MapReduce is designed as a programming framework to batch process large datasets

    Moreover, Dremel finishes most queries within seconds or tens of seconds and can even be used by non-programmers, whereas MapReduce takes much longer (sometimes even hours or days) to process a query.

    Following is a comparison of running MapReduce on row-based and columnar storage:

    (Image source: Google Dremel Paper)

    Another important thing to note is that BigQuery is meant to analyze structured data (SQL) but in MapReduce, you can write logic for unstructured data as well.

    Comparing BigQuery and Redshift

    In Redshift, you need to allocate different instance types and create your own clusters. The benefit of this is that it lets you tune the compute/storage to meet your needs. However, you have to be aware of (virtualized) hardware limits and scale up/out based on that. Note that you are charged by the hour for each instance you spin up.

    In BigQuery, you just upload the data and query it. It is a truly managed service. You are charged by storage, streaming inserts, and queries.
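
    Since queries are billed by the bytes they scan, it’s worth estimating a query’s cost before running it. A minimal sketch using a dry run with the Node.js client (the query is illustrative):

    const { BigQuery } = require("@google-cloud/bigquery");

    async function estimateBytes(query) {
      const bigquery = new BigQuery();
      // dryRun validates the query and reports bytes scanned without billing.
      const [job] = await bigquery.createQueryJob({ query, dryRun: true });
      return Number(job.metadata.statistics.totalBytesProcessed);
    }

    estimateBytes(
      "SELECT title FROM `bigquery-public-data.samples.wikipedia` LIMIT 10"
    ).then((bytes) => console.log(`Would scan ~${bytes} bytes`));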

    There are more similarities between the two data warehouses than differences.

    A smart user will definitely take advantage of the hybrid cloud (GCP + AWS) and leverage different services offered by both ecosystems. Check out your quintessential guide to AWS Athena here.

    Getting Started With Google BigQuery

    Following is a quick example to show how you can quickly get started with BigQuery:

    1. There are many public datasets available on BigQuery; you are going to play with the ‘bigquery-public-data:stackoverflow’ dataset. You can click on the “Add Data” button on the left panel and select the dataset.

    2. Next, find the language that has the best community based on response time. You can write the following query to do that:

    WITH question_answers_join AS (
      SELECT *
        , GREATEST(1, TIMESTAMP_DIFF(answers.first, creation_date, minute)) minutes_2_answer
      FROM (
        SELECT id, creation_date, title
          , (SELECT AS STRUCT MIN(creation_date) first, COUNT(*) c
             FROM `bigquery-public-data.stackoverflow.posts_answers` 
             WHERE a.id=parent_id
          ) answers
          , SPLIT(tags, '|') tags
        FROM `bigquery-public-data.stackoverflow.posts_questions` a
        WHERE EXTRACT(year FROM creation_date) > 2014
      )
    )
    SELECT COUNT(*) questions, tag
      , ROUND(EXP(AVG(LOG(minutes_2_answer))), 2) mean_geo_minutes
      , APPROX_QUANTILES(minutes_2_answer, 100)[SAFE_OFFSET(50)] median
    FROM question_answers_join, UNNEST(tags) tag
    WHERE tag IN ('javascript', 'python', 'rust', 'java', 'scala', 'ruby', 'go', 'react', 'c', 'c++')
    AND answers.c > 0
    GROUP BY tag
    ORDER BY mean_geo_minutes

    3. Now you can execute the query and get results –

    You can see that C has the best community followed by JavaScript!

    How to Do Machine Learning on BigQuery

    Now that you have a sound understanding of BigQuery, it’s time for some real action.

    As discussed above, you can connect Google Analytics with BigQuery: go to the Google Analytics Admin panel, find the PROPERTY column, click All Products, then click Link BigQuery. After that, enter your BigQuery ID (or project number), and BigQuery will be linked to Google Analytics. Note: right now, BigQuery integration is only available for Google Analytics 360.

    Assuming that you have already uploaded your Google Analytics data, here is how you can create a logistic regression model. Here, you are predicting whether a website visitor will make a transaction or not.

    CREATE MODEL `velotio_tutorial.sample_model`
    OPTIONS(model_type='logistic_reg') AS
    SELECT
      IF(totals.transactions IS NULL, 0, 1) AS label,
      IFNULL(device.operatingSystem, "") AS os,
      device.isMobile AS is_mobile,
      IFNULL(geoNetwork.country, "") AS country,
      IFNULL(totals.pageviews, 0) AS pageviews
    FROM
      `bigquery-public-data.google_analytics_sample.ga_sessions_*`
    WHERE
      _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'

    This creates a model named ‘velotio_tutorial.sample_model’. The ‘model_type’ is set to ‘logistic_reg’ because you want to train a logistic regression model, which splits input data into two classes and gives the probability that a row belongs to one of them. Usually, in “spam or not spam” types of problems, you use logistic regression. Here, the problem is similar: a transaction will be made or not.

    The above query gets the total number of pageviews, the country from which the session originated, the operating system of the visitor’s device, the total number of e-commerce transactions within the session, etc.

    Now you just press “Run Query” to execute it.
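
    Once trained, the model can be checked with BigQuery ML’s evaluation function. A minimal sketch, reusing the feature columns from the training query above (the held-out date range is illustrative):

    SELECT *
    FROM ML.EVALUATE(MODEL `velotio_tutorial.sample_model`, (
      SELECT
        IF(totals.transactions IS NULL, 0, 1) AS label,
        IFNULL(device.operatingSystem, "") AS os,
        device.isMobile AS is_mobile,
        IFNULL(geoNetwork.country, "") AS country,
        IFNULL(totals.pageviews, 0) AS pageviews
      FROM
        `bigquery-public-data.google_analytics_sample.ga_sessions_*`
      WHERE
        _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'
    ))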

    Conclusion

    BigQuery is a query service that allows us to run SQL-like queries against multiple terabytes of data in a matter of seconds. If you have structured data, BigQuery is the best option to go for. It can help even a non-programmer to get the analytics right!

    Learn how to build an ETL Pipeline for MongoDB & Amazon Redshift using Apache Airflow.

    If you need help with using machine learning in product development for your organization, connect with experts at Velotio!