Category: Services

  • Eliminate Render-blocking Resources using React and Webpack

    In the previous blog, we learned how a browser downloads many scripts and useful resources to render a webpage. But not all of them are necessary to show a page’s content. Because of this, the page rendering is delayed. However, most of them will be needed as the user navigates through the website’s various pages.

    In this article, we’ll learn to identify such resources and classify them as critical and non-critical. Once identified, we’ll inline the critical resources and defer the non-critical resources.

    For this blog, we’ll use the following tools:

    • Google Lighthouse and other Chrome DevTools to identify render-blocking resources.
    • Webpack and CRACO to fix it.

    Demo Configuration

    For the demo, I have added the JavaScript below to the <head></head> of index.html as a render-blocking JS resource. This script loads two more CSS resources on the page.

    https://use.fontawesome.com/3ec06e3d93.js

    Other configurations are as follows:

    • Create React App v4.0
    • Formik and Yup for handling form validations
    • Font Awesome and Bootstrap
    • Lazy loading and code splitting using Suspense, React lazy, and dynamic import
    • CRACO
    • html-critical-webpack-plugin
    • ngrok and serve for serving build

    Render-Blocking Resources

    A render-blocking resource typically refers to a script or link that prevents a browser from rendering the processed content.

    Lighthouse will flag the below as render-blocking resources:

    • A <script></script> tag in <head></head> that doesn’t have a defer or async attribute.
    • A <link rel=””stylesheet””> tag that doesn’t have a media attribute to match a user’s device or a disabled attribute to hint browser to not download if unnecessary.
    • A <link rel=””import””> that doesn’t have an async attribute.

    Identifying Render-Blocking Resources

    To reduce the impact of render-blocking resources, find out what’s critical for loading and what’s not.

    To do that, we’re going to use the Coverage Tab in Chrome DevTools. Follow the steps below:

    1. Open the Chrome DevTools (press F12)

    2. Go to the Sources tab and press the keys to Run command

    The below screenshot is taken on a macOS.

    3. Search for Show Coverage and select it, which will show the Coverage tab below. Expand the tab.

    4. Click on the reload button on the Coverage tab to reload the page and start instrumenting the coverage of all the resources loading on the current page.

    5. After capturing the coverage, the resources loaded on the page will get listed (refer to the screenshot below). This will show you the code being used vs. the code loaded on the page.

    The list will display coverage in 2 colors:

    a. Green (critical) – The code needed for the first paint

    b. Red (non-critical) – The code not needed for the first paint.

    After checking each file and the generated index.html after the build, I found three primary non-critical files –

    a. 5.20aa2d7b.chunk.css98% non-critical code

    b. https://use.fontawesome.com/3ec06e3d93.js – 69.8% non-critical code. This script loads below CSS –

    1. font-awesome-css.min.css – 100% non-critical code

    2. https://use.fontawesome.com/3ec06e3d93.css – 100% non-critical code

    c. main.6f8298b5.chunk.css – 58.6% non-critical code

    The above resources satisfy the condition of a render-blocking resource and hence are prompted by the Lighthouse Performance report as an opportunity to eliminate the render-blocking resources (refer screenshot). You can reduce the page size by only shipping the code that you need.

    Solution

    Once you’ve identified critical and non-critical code, it is time to extract the critical part as an inline resource in index.html and deferring the non-critical part by using the webpack plugin configuration.

    For Inlining and Preloading CSS: 

    Use html-critical-webpack-plugin to inline the critical CSS into index.html. This will generate a <style></style> tag in the <head> with critical CSS stripped out of the main CSS chunk and preloading the main file.</head>

    const path = require('path');
    const { whenProd } = require('@craco/craco');
    const HtmlCriticalWebpackPlugin = require('html-critical-webpack-plugin');
    
    module.exports = {
      webpack: {
        configure: (webpackConfig) => {
          return {
            ...webpackConfig,
            plugins: [
              ...webpackConfig.plugins,
              ...whenProd(
                () => [
                  new HtmlCriticalWebpackPlugin({
                    base: path.resolve(__dirname, 'build'),
                    src: 'index.html',
                    dest: 'index.html',
                    inline: true,
                    minify: true,
                    extract: true,
                    width: 320,
                    height: 565,
                    penthouse: {
                      blockJSRequests: false,
                    },
                  }),
                ],
                []
              ),
            ],
          };
        },
      },
    };

    Once done, create a build and deploy. Here’s a screenshot of the improved opportunities:

    To use CRACO, refer to its README file.

    NOTE: If you’re planning to use the critters-webpack-plugin please check these issues first: Could not find HTML asset and Incompatible with html-webpack-plugin v4.

    For Deferring Routes/Pages:

    Use lazy-loading and code-splitting techniques along with webpack’s magic comments as below to preload or prefetch a route/page according to your use case.

    import { Suspense, lazy } from 'react';
    import { Redirect, Route, Switch } from 'react-router-dom';
    import Loader from '../../components/Loader';
    
    import './style.scss';
    
    const Login = lazy(() =>
      import(
        /* webpackChunkName: "login" */ /* webpackPreload: true */ '../../containers/Login'
      )
    );
    const Signup = lazy(() =>
      import(
        /* webpackChunkName: "signup" */ /* webpackPrefetch: true */ '../../containers/Signup'
      )
    );
    
    const AuthLayout = () => {
      return (
        <Suspense fallback={<Loader />}>
          <Switch>
            <Route path="/auth/login" component={Login} />
            <Route path="/auth/signup" component={Signup} />
            <Redirect from="/auth" to="/auth/login" />
          </Switch>
        </Suspense>
      );
    };
    
    export default AuthLayout;

    The magic comments enable webpack to add correct attributes to defer the scripts according to the use-case.

    For Deferring External Scripts:

    For those who are using a version of webpack lower than 5, use script-ext-html-webpack-plugin or resource-hints-webpack-plugin.

    I would recommend following the simple way given below to defer an external script.

    // Add defer/async attribute to external render-blocking script
    <script async defer src="https://use.fontawesome.com/3ec06e3d93.js"></script>

    The defer and async attributes can be specified on an external script. The async attribute has a higher preference. For older browsers, it will fallback to the defer behaviour.

    If you want to know more about the async/defer, read the further reading section.

    Along with defer/async, we can also use media attributes to load CSS conditionally.

    It’s also suggested to load fonts locally instead of using full CDN in case we don’t need all the font-face rules added by Font providers.

    Now, let’s create and deploy the build once more and check the results.

    The opportunity to eliminate render-blocking resources shows no more in the list.

    We have finally achieved our goal!

    Final Thoughts

    The above configuration is a basic one. You can read the libraries’ docs for more complex implementation.

    Let me know if this helps you eliminate render-blocking resources from your app.

    If you want to check out the full implementation, here’s the link to the repo. I have created two branches—one with the problem and another with the solution. Read the further reading section for more details on the topics.

    Hope this helps.

    Happy Coding!

    Further Reading

  • Installing Redis Cluster with Persistent Storage on Mesosphere DC/OS

    In the first part of this blog, we saw how to install standalone redis service on DC/OS with Persistent storage using RexRay and AWS EBS volumes.

    A single server is a single point of failure in every system, so to ensure high availability of redis database, we can deploy a master-slave cluster of Redis servers. In this blog, we will see how to setup such 6 node (3 master, 3 slave) Redis cluster and persist data using RexRay and AWS EBS volumes. After that we will see how to import existing data into this cluster.

    Redis Cluster

    It is form of replicated Redis servers in multi-master architecture. All the data is sharded into 16384 buckets, where every master node is assigned subset of buckets out of them (generally evenly sharded) and each master replicated by its slaves.  It provides more resilience and scaling for production grade deployments where heavy workload is expected. Applications can connect to any node in cluster mode and the request will be redirected to respective master node.

     Source:  Octo

         

    Objective: To create a Redis cluster with number of services in DCOC environment with persistent storage and import the existing Redis dump.rdb data to the cluster.

    Prerequisites :  

    • Make sure rexray component is running and is in a healthy state for DCOS cluster.

    Steps:

    • As per Redis doc, the minimal cluster should have at least 3 master and 3 slave nodes, so making it a total 6 Redis services.
    • All services will use similar json configuration except changes in names of service, external volume, and port mappings.
    • We will deploy one Redis service for each Redis cluster node and once all services are running, we will form cluster among them.
    • We will use host network for Redis node containers, for that we will restrict Redis nodes to run on particular node. This will help us to troubleshoot cluster (fixed IP, so we can restart Redis node any time without data loss).
    • Using host network adds a prerequisites that number of dcos nodes >= number of Redis nodes.
    1. First create Redis node services on DCOS:
    2. Click on the Add button in Services tab of DCOS UI
    • Click on JSON configuration
    • Add below json config for Redis service, change the values which are written in BLOCK letters with # as prefix and suffix.
    • #NODENAME# – Name of Redis node (Ex. redis-node-1)
    • #NODEHOSTIP# – IP of dcos node on which this Redis node will run. This ip must be unique for each Redis node. (Ex. 10.2.12.23)
    • #VOLUMENAME# – Name of persistent volume, Give name to identify volume on AWS EBS (Ex. <dcos cluster=”” name=””>-redis-node-<node number=””>)</node></dcos>
    • #NODEVIP# – VIP For the Redis node. It must be ‘Redis’ for first Redis node, for others it can be the same as NODENAME (Ex. redis-node-2)
    {
       "id": "/#NODENAME#",
       "backoffFactor": 1.15,
       "backoffSeconds": 1,
       "constraints": [
         [
           "hostname",
           "CLUSTER",
           "#NODEHOSTIP#"
         ]
       ],
       "container": {
         "type": "DOCKER",
         "volumes": [
           {
             "external": {
               "name": "#VOLUMENAME#",
               "provider": "dvdi",
               "options": {
                 "dvdi/driver": "rexray"
               }
             },
             "mode": "RW",
             "containerPath": "/data"
           }
         ],
         "docker": {
           "image": "parvezkazi13/redis:latest",
           "forcePullImage": false,
           "privileged": false,
           "parameters": []
         }
       },
       "cpus": 0.5,
       "disk": 0,
       "fetch": [],
       "healthChecks": [],
       "instances": 1,
       "maxLaunchDelaySeconds": 3600,
       "mem": 4096,
       "gpus": 0,
       "networks": [
         {
           "mode": "host"
         }
       ],
       "portDefinitions": [
         {
           "labels": {
             "VIP_0": "/#NODEVIP#:6379"
           },
           "name": "#NODEVIP#",
           "protocol": "tcp",
           "port": 6379
         }
       ],
       "requirePorts": true,
       "upgradeStrategy": {
         "maximumOverCapacity": 0,
         "minimumHealthCapacity": 0.5
       },
       "killSelection": "YOUNGEST_FIRST",
       "unreachableStrategy": {
         "inactiveAfterSeconds": 300,
         "expungeAfterSeconds": 600
       }
     }

    • After updating the highlighted fields, copy above json to json configuration box, click on ‘Review & Run’ button in the right corner, this will start the service with above configuration.
    • Once above service is UP and Running, then repeat the step 2 to 4 for each Redis node with respective values for highlighted fields.
    • So if we go with 6 node cluster, at the end we will have 6 Redis nodes UP and Running, like:

    Note: Since we are using external volume for persistent storage, we can not scale our services, i.e. each service will only one instance max. If we try to scale, we will get below error :

    2. Form the Redis cluster between Redis node services:

    • To create or manage Redis-cluster, first deploy redis-cluster-util container on DCOS using below json config:
    {
     "id": "/infrastructure/redis-cluster-util",
     "backoffFactor": 1.15,
     "backoffSeconds": 1,
     "constraints": [],
     "container": {
       "type": "DOCKER",
       "volumes": [
         {
           "containerPath": "/backup",
           "hostPath": "backups",
           "mode": "RW"
         }
       ],
       "docker": {
         "image": "parvezkazi13/redis-util",
         "forcePullImage": true,
         "privileged": false,
         "parameters": []
       }
     },
     "cpus": 0.25,
     "disk": 0,
     "fetch": [],
     "instances": 1,
     "maxLaunchDelaySeconds": 3600,
     "mem": 4096,
     "gpus": 0,
     "networks": [
       {
         "mode": "host"
       }
     ],
     "portDefinitions": [],
     "requirePorts": true,
     "upgradeStrategy": {
       "maximumOverCapacity": 0,
       "minimumHealthCapacity": 0.5
     },
     "killSelection": "YOUNGEST_FIRST",
     "unreachableStrategy": {
       "inactiveAfterSeconds": 300,
       "expungeAfterSeconds": 600
     },
     "healthChecks": []
    }

    This will run service as :

    • Get the IP addresses of all Redis nodes to form the cluster, as Redis-cluster can not be created with node’s hostname / dns. This is an open issue.

    Since we are using host network, we need the dcos node IP on which Redis nodes are running.

    Get all Redis nodes IP using:

    NODE_BASE_NAME=redis-nodedcos task $NODE_BASE_NAME | grep -E "$NODE_BASE_NAME-<[0-9]>" | awk '{print $2":6379"}' | paste -s -d' '  

    Here Redis-node is the prefix used for all Redis nodes.

    Note the output of this command, we will use it in further steps.

    • Get the node where redis-cluster-util container is running and ssh to dcos node using:
    dcos node ssh --master-proxy --private-ip $(dcos task | grep "redis-cluster-util" | awk '{print $2}')

    • Now find the docker container id of redis-cluster-util and exec it using:
    docker exec -it $(docker ps -qf ancestor="parvezkazi13/redis-util") bash  

    • No we are inside the redis-cluster-util container. Run below command to form Redis cluster.
    redis-trib.rb create --replicas 1 <Space separated IP address:PORT pair of all Redis nodes>

    • Here use the Redis nodes IP addresses which retrieved in step 2.
    redis-trib.rb create --replicas 1 10.0.1.90:6379 10.0.0.19:6379 10.0.9.203:6379 10.0.9.79:6379 10.0.3.199:6379 10.0.9.104:6379

    • Parameters:
    • The option –replicas 1 means that we want a slave for every master created.
    • The other arguments are the list of addresses (host:port) of the instances we want to use to create the new cluster.
    • Output:
    • Select ‘yes’ when it prompts to set the slot configuration shown.
    • Run below command to check the status of the newly create cluster
    redis-trib.rb check <Any redis node host:PORT>
    Ex:
    redis-trib.rb check 10.0.1.90:6379

    • Parameters:
    • host:port of any node from the cluster.
    • Output:
    • If all OK, it will show OK with status, else it will show ERR with the error message.

    3. Import existing dump.rdb to Redis cluster

    • At this point, all the Redis nodes should be empty and each one should have an ID and some assigned slots:

    Before reuse an existing dump data, we have to reshard all slots to one instance. We specify the number of slots to move (all, so 16384), the id we move to (here Node 1 – 10.0.1.90:6379) and where we take these slots from (all other nodes).

    redis-trib.rb reshard 10.0.1.90:6379  

    Parameters:

    host:port of any node from the cluster.

    Output:

    It will prompt for number of slots to move – here all. i.e 16384

    Receiving node id – here id of node 10.0.1.90:6379 (redis-node-1)

    Source node IDs  – here all, as we want to shard  all slots to one node.

    And prompt to proceed – press ‘yes’

    • Now check again node 10.0.1.90:6379  
    redis-trib.rb check 10.0.1.90:6379  

    Parameters: host:port of any node from the cluster.

    Output: it will show all (16384) slots moved to node 10.0.1.90:6379

    • Next step is Importing our existing Redis dump data.  

    Now copy the existing dump.rdb to our redis-cluster-util container using below steps:

    – Copy existing dump.rdb to dcos node on which redis-cluster-util container is running. Can use scp from any other public server to dcos node.

    – Now we have dump .rdb in our dcos node, copy this dump.rdb to redis-cluster-util container using below command:

    docker cp dump.rdb $(docker ps -qf ancestor="parvezkazi13/redis-util"):/data

    Now we have dump.rdb in our redis-cluster-util container, we can import it to our Redis cluster. Execute and go to the redis-cluster-util container using:

    docker exec -it $(docker ps -qf ancestor="parvezkazi13/redis-util") bash

    It will execute redis-cluster-util container which is already running and start its bash cmd.

    Run below command to import dump.rdb to Redis cluster:

    rdb --command protocol /data/dump.rdb | redis-cli --pipe -h 10.0.1.90 -p 6379

    Parameters:

    Path to dump.rdb

    host:port of any node from the cluster.

    Output:

    If successful, you’ll see something like:

    All data transferred. Waiting for the last reply...Last reply received from server.errors: 0, replies: 4341259  

    as well as this in the Redis server logs:

    95086:M 01 Mar 21:53:42.071 * 10000 changes in 60 seconds. Saving...95086:M 01 Mar 21:53:42.072 * Background saving started by pid 9822398223:C 01 Mar 21:53:44.277 * DB saved on disk

    WARNING:
    Like our Oracle DB instance can have multiple databases, similarly Redis saves keys in keyspaces.
    Now when Redis is in cluster mode, it does not accept the dumps which has more than one keyspaces. As per documentation:

    Redis Cluster does not support multiple databases like the stand alone version of Redis. There is just database 0 and the SELECT command is not allowed. “

    So while importing such multi-keyspace Redis dump, server fails while starting on below issue :

    23049:M 16 Mar 17:21:17.772 * DB loaded from disk: 5.222 seconds
    23049:M 16 Mar 17:21:17.772 # You can't have keys in a DB different than DB 0 when in Cluster mode. Exiting.
    Solution / WA :

    There is redis-cli command “MOVE” to move keys from one keyspace to another keyspace.

    Also can run below command to move all the keys from keyspace 1 to keyspace 0 :

    redis-cli -h "$HOST" -p "$PORT" -n 1 --raw keys "*" |  xargs -I{} redis-cli -h "$HOST" -p "$PORT" -n 1 move {} 0

    • Verify import status, using below commands : (inside redis-cluster-util container)
    redis-cli -h 10.0.1.90 -p 6379 info keyspace

    It will run Redis info command on node 10.0.1.90:6379 and fetch keyspace information, like below:

    # Keyspace
    db0:keys=33283,expires=0,avg_ttl=0

    • Now reshard all the slots to all instances evenly

    The reshard command will again list the existing nodes, their IDs and the assigned slots.

    redis-trib.rb reshard 10.0.1.90:6379

    Parameters:

    host:port of any node from the cluster.

    Output:

    It will prompt for number of slots to move – here (16384 /3 Masters = 5461)

    Receiving node id – here id of master node 2  

    Source node IDs  – id of first instance which has currently all the slots. (master 1)

    And prompt to proceed – press ‘yes’

    Repeat above step and for receiving node id, give id of master node 3.

    • After the above step, all 3 masters will have equal slots and imported keys will be distributed among the master nodes.
    • Put keys to cluster for verification
    redis-cli -h 10.0.1.90 -p 6379 set foo bar
    OK
    redis-cli -h 10.0.1.90 -p 6379 set foo bar
    (error) MOVED 4813 10.0.9.203:6379

    Above error shows that server saved this key to instance 10.0.9.203:6379, so client redirected it. To follow redirection, use flag “-c” which says it is a cluster mode, like:

    redis-cli -h 10.0.1.90 -p 6379 -c set foo bar
    OK

    Redis Entrypoint

    Application entrypoint for Redis cluster is mostly depends how your Redis client handles cluster support. Generally connecting to one of master nodes should do the work.

    Use below host:port in your applications :

    redis.marathon.l4lb.thisdcos.directory:6379

    Automation of Redis Cluster Creation

    We have automation script in place to deploy 6 node Redis cluster and form a cluster between them.

    Script location: Github

    • It deploys 6 marathon apps for 6 Redis nodes. All nodes are deployed on different nodes with CLUSTER_NAME as prefix to volume name.
    • Once all nodes are up and running, it deploys redis-cluster-util app which will be used to form Redis cluster.
    • Then it will print the Redis nodes and their IP addresses and prompt the user to proceed cluster creation.
    • If user selects to proceed, it will run redis-cluster-util app and create the cluster using IP addresses collected. Util container will prompt for some input that the user has to select.

    Conclusion

    We learned about Redis cluster deployment on DCOS with Persistent storage using RexRay. We also learned how rexray automatically manages volumes over aws ebs and how to integrate them in DCOS apps/services. We saw how to use redis-cluster-util container to manage Redis cluster for different purposes, like forming cluster, resharding, importing existing dump.rdb data etc. Finally, we looked at the automation part of whole cluster setup using dcos cli and bash.

    Reference

  • Micro Frontends: Reinventing UI In The Microservices World

    It is amazing how the software industry has evolved. Back in the day, a software was a simple program. Some of the first software applications like The Apollo Missions Landing modules and Manchester Baby were basic stored procedures. Software was primarily used for research and mathematical purposes.

    The invention of personal computers and the prominence of the Internet changed the software world. Desktop applications like word processors, spreadsheets, and games grew. Websites gradually emerged. Back then, simple pages were delivered to the client as static documents for viewing. By the mid-1990s, with Netscape introducing client-side scripting language, JavaScript and Macromedia bringing in Flash, the browser became more powerful, allowing websites to become richer and more interactive. In 1999, the Java language introduced Servlets. And thus born the Web Application. Nevertheless, these developments and applications were still simpler. Engineers didn’t emphasize enough on structuring them and mostly built unstructured monolithic applications.

    The advent of disruptive technologies like cloud computing and Big data paved the way for more intricate, convolute web and native mobile applications. From e-commerce and video streaming apps to social media and photo editing, we had applications doing some of the most complicated data processing and storage tasks. The traditional monolithic way now posed several challenges in terms of scalability, team collaboration and integration/deployment, and often led to huge and messy The Ball Of Mud codebases.

    Fig: Monolithic Application Problems – Source

    To untangle this ball of software, came in a number of service-oriented architectures. The most promising of them was Microservices – breaking an application into smaller chunks that can be developed, deployed and tested independently but worked as a single cohesive unit. Its benefits of scalability and ease of deployment by multiple teams proved as a panacea to most of the architectural problems. A few front-end architectures also came up, such as MVC, MVVM, Web Components, to name a few. But none of them were fully able to reap the benefits of Microservices.

    Fig: Microservice Architecture – Source

    ‍Micro Frontends: The Concept‍

    Micro Frontends first came up in ThoughtWorks Technology Radar where they assessed, tried and eventually adopted the technology after noticing significant benefits. It is a Microservice approach to front-end web development where independently deliverable front-end applications are composed as a whole. 

    With Microservices, Micro Frontends breaks the last monolith to create a complete Micro-Architecture design pattern for web applications. It is entirely composed of loosely coupled vertical slices of business functionality rather than in horizontals. We can term these verticals as ‘Microapps’. This concept is not new and has appeared in Scaling with Microservices and Vertical Decomposition. It first presented the idea of every vertical being responsible for a single business domain and having its presentation layer, persistence layer, and a separate database. From the development perspective, every vertical is implemented by exactly one team and no code is shared among different systems.

    Fig: Micro Frontends with Microservices (Micro-architecture)

    Why Micro Frontends?

    A microservice architecture has a whole slew of advantages when compared to monolithic architectures.

    Ease of Upgrades – Micro Frontends build strict bounded contexts in the application. Applications can be updated in a more incremental and isolated fashion without worrying about the risks of breaking up another part of the application.

    Scalability – Horizontal scaling is easy for Micro Frontends. Each Micro Frontend has to be stateless for easier scalability.

    Ease of deployability: Each Micro Frontend has its CI/CD pipeline, that builds, tests and deploys it to production. So it doesn’t matter if another team is working on a feature and has pushed a bug fix or if a cutover or refactoring is taking place. There should be no risks involved in pushing changes done on a Micro Frontend as long as there is only one team working on it.

    Team Collaboration and Ownership: The Scrum Guide says that “Optimal Development Team size is small enough to remain nimble and large enough to complete significant work within a Sprint”. Micro Frontends are perfect for multiple cross-functional teams that can completely own a stack (Micro Frontend) of an application from UX to Database design. In case of an E-commerce site, the Product team and the Payment team can concurrently work on the app without stepping on each other’s toes.

    Micro Frontend Integration Approaches

    There is a multitude of ways to implement Micro Frontends. It is recommended that any approach for this should take a Runtime integration route instead of a Build Time integration, as the former has to re-compile and release on every single Micro Frontend to release any one of the Micro Frontend’s changes.

    We shall learn some of the prominent approaches of Micro Frontends by building a simple Pet Store E-Commerce site. The site has the following aspects (or Microapps, if you will) – Home or Search, Cart, Checkout, Product, and Contact Us. We shall only be working on the Front-end aspect of the site. You can assume that each Microapp has a microservice dedicated to it in the backend. You can view the project demo here and the code repository here. Each way of integration has a branch in the repo code that you can check out to view.

    Single Page Frontends –

    The simplest way (but not the most elegant) to implement Micro Frontends is to treat each Micro Frontend as a single page.

    Fig: Single Page Micro Frontends: Each HTML file is a frontend.
    !DOCTYPE html>
    <html lang="zxx">
    <head>
    	<title>The MicroFrontend - eCommerce Template</title>
    </head>
    <body>
      <header class="header-section header-normal">
        <!-- Header is repeated in each frontend which is difficult to maintain -->
        ....
        ....
      </header
      <main>
      </main>
      <footer
        <!-- Footer is repeated in each frontend which means we have to multiple changes across all frontends-->
      </footer>
      <script>
        <!-- Cross Cutting features like notification, authentication are all replicated in all frontends-->
      </script>
    </body>

    It is one of the purest ways of doing Micro Frontends because no container or stitching element binds the front ends together into an application. Each Micro Frontend is a standalone app with each dependency encapsulated in it and no coupling with the others. The flipside of this approach is that each frontend has a lot of duplication in terms of cross-cutting concerns like headers and footers, which adds redundancy and maintenance burden.

    JavaScript Rendering Components (Or Web Components, Custom Element)-

    As we saw above, single-page Micro Frontend architecture has its share of drawbacks. To overcome these, we should opt for an architecture that has a container element that builds the context of the app and the cross-cutting concerns like authentication, and stitches all the Micro Frontends together to create a cohesive application.

    // A virtual class from which all micro-frontends would extend
    class MicroFrontend {
      
      beforeMount() {
        // do things before the micro front-end mounts
      }
    
      onChange() {
        // do things when the attributes of a micro front-end changes
      }
    
      render() {
        // html of the micro frontend
        return '<div></div>';
      }
    
      onDismount() {
        // do things after the micro front-end dismounts 
      }
    }

    class Cart extends MicroFrontend {
      beforeMount() {
        // get previously saved cart from backend
      }
    
      render() {
        return `<!-- Page -->
        <div class="page-area cart-page spad">
          <div class="container">
            <div class="cart-table">
              <table>
                <thead>
                .....
                
         `
      }
    
      addItemToCart(){
        ...
      }
        
      deleteItemFromCart () {
        ...
      }
    
      applyCouponToCart() {
        ...
      }
        
      onDismount() {
        // save Cart for the user to get back to afterwards
      }
    }

    class Product extends MicroFrontend {
      static get productDetails() {
        return {
          '1': {
            name: 'Cat Table',
            img: 'img/product/cat-table.jpg'
          },
          '2': {
            name: 'Dog House Sofa',
            img: 'img/product/doghousesofa.jpg'
          },
        }
      }
      getProductDetails() {
        var urlParams = new URLSearchParams(window.location.search);
        const productId = urlParams.get('productId');
        return this.constructor.productDetails[productId];
      }
      render() {
        const product = this.getProductDetails();
        return `	<!-- Page -->
        <div class="page-area product-page spad">
          <div class="container">
            <div class="row">
              <div class="col-lg-6">
                <figure>
                  <img class="product-big-img" src="${product.img}" alt="">`
      }
      selectProductColor(color) {}
    
      selectProductSize(size) {}
     
      addToCart() {
        // delegate call to MicroFrontend Cart.addToCart function
      }
      
    }

    <!DOCTYPE html>
    <html lang="zxx">
    <head>
    	<title>PetStore - because Pets love pampering</title>
    	<meta charset="UTF-8
      <link rel="stylesheet" href="css/style.css"/>
    
    </head>
    <body>
    	<!-- Header section -->
    	<header class="header-section">
      ....
      </header>
    	<!-- Header section end -->
    	<main id='microfrontend'>
        <!-- This is where the Micro-frontend gets rendered by utility renderMicroFrontend.js-->
    	</main>
                                    <!-- Header section -->
    	<footer class="header-section">
      ....
      </footer>
    	<!-- Footer section end -->
      	<script src="frontends/MicroFrontend.js"></script>
    	<script src="frontends/Home.js"></script>
    	<script src="frontends/Cart.js"></script>
    	<script src="frontends/Checkout.js"></script>
    	<script src="frontends/Product.js"></script>
    	<script src="frontends/Contact.js"></script>
    	<script src="routes.js"></script>
    	<script src="renderMicroFrontend.js"></script>

    function renderMicroFrontend(pathname) {
      const microFrontend = routes[pathname || window.location.hash];
      const root = document.getElementById('microfrontend');
      root.innerHTML = microFrontend ? new microFrontend().render(): new Home().render();
      $(window).scrollTop(0);
    }
    
    $(window).bind( 'hashchange', function(e) { renderFrontend(window.location.hash); });
    renderFrontend(window.location.hash);
    
    utility routes.js (A map of the hash route to the Microfrontend class)
    const routes = {
      '#': Home,
      '': Home,
      '#home': Home,
      '#cart': Cart,
      '#checkout': Checkout,
      '#product': Product,
      '#contact': Contact,
    };

    As you can see, this approach is pretty neat and encapsulates a separate class for Micro Frontends. All other Micro Frontends extend from this. Notice how all the functionality related to Microapp is encapsulated in the respective Micro Frontend. This makes sure that concurrent work on a Micro Frontend doesn’t mess up some other Micro Frontends.

    Everything will work in a similar paradigm when it comes to Web Components and Custom Elements.

    React

    With the client-side JavaScript frameworks being very popular, it is impossible to leave React from any Front End discussion. React being a component-based JS library, much of the things discussed above will also apply to React. I am going to discuss some of the technicalities and challenges when it comes to Micro Frontends with React.

    Styling

    Since there should be minimum sharing of code between any Micro Frontend, styling the React components can be challenging, considering the global and cascading nature of CSS. We should make sure styles are targeted on a specific Micro Frontend without spilling over to other Micro Frontends. Inline CSS, CSS in JS libraries like Radium,  and CSS Modules, can be used with React.

    Redux

    Using React with Redux is kind of a norm in today’s front-end world. The convention is to use Redux as a single global store for the entire app for cross application communication. A Micro Frontend should be self-contained with no dependencies. Hence each Micro Frontend should have its own Redux store, moving towards a multiple Redux store architecture. 

    Other Noteworthy Integration Approaches  –

    Server-side Rendering – One can use a server to assemble Micro Frontend templates before dispatching it to the browser. SSI techniques can be used too.

    iframes – Each Micro Frontend can be an iframe. They also offer a good degree of isolation in terms of styling, and global variables don’t interfere with each other.

    Summary

    With Microservices, Micro Frontends promise to  bring in a lot of benefits when it comes to structuring a complex application and simplifying its development, deployment and maintenance.

    But there is a wonderful saying that goes “there is no one-size-fits-all approach that anyone can offer you. The same hot water that softens a carrot hardens an egg”. Micro Frontend is no silver bullet for your architectural problems and comes with its own share of downsides. With more repositories, more tools, more build/deploy pipelines, more servers, more domains to maintain, Micro Frontends can increase the complexity of an app. It may render cross-application communication difficult to establish. It can also lead to duplication of dependencies and an increase in application size.

    Your decision to implement this architecture will depend on many factors like the size of your organization and the complexity of your application. Whether it is a new or legacy codebase, it is advisable to apply the technique gradually over time and review its efficacy over time.

  • Build and Deploy a Real-Time React App Using AWS Amplify and GraphQL

    GraphQL is becoming a popular way to use APIs in modern web and mobile apps.

    However, learning new things is always time-consuming and without getting your hands dirty, it’s very difficult to understand the nuances of a new technology.

    So, we have put together a powerful and concise tutorial that will guide you through setting up a GraphQL backend and integration into your React app in the shortest time possible. This tutorial is light on opinions, so that once you get a hang of the fundamentals, you can go on and tailor your workflow.

    Key topics and takeaways:

    • Authentication
    • GraphQL API with AWS AppSync
    • Hosting
    • Working with multiple environments
    • Removing services

    What will we be building?

    We will build a basic real-time Restaurant CRUD app using authenticated GraphQL APIs. Click here to try the deployed version of the app to see what we’ll be building.

    Will this tutorial teach React or GraphQL concepts as well?

    No. The focus is to learn how to use AWS Amplify to build cloud-enabled, real-time web applications. If you are new to React or GraphQL, we recommend going through the official documentation and then coming back here.

    What do I need to take this tutorial?

    • Node >= v10.9.0
    • NPM >= v6.9.0 packaged with Node.

    Getting started – Creating the application

    To get started, we first need to create a React project using the create-react-app boilerplate:

    npx create-react-app amplify-app --typescript
    cd amplify-app

    Let’s now install the AWS Amplify and AWS Amplify React bindings and try running the application:

    npm install --save aws-amplify aws-amplify-react
    npm start

    If you have initialized the app with Typescript and see errors while using

    aws-amplify-react, add aws-amplify-react.d.ts to src with:

    declare module 'aws-amplify-react';

    Installing the AWS Amplify CLI and adding it to the project

    To install the CLI:

    npm install -g @aws-amplify/cli

    Now we need to configure the CLI with our credentials:

    amplify configure

    If you’d like to see a video walkthrough of this process, click here

    Here we’ll walk you through the amplify configure setup. After you sign in to the AWS console, follow these steps:

    • Specify the AWS region: ap-south-1 (Mumbai) <Select the region based on your location. Click here for reference>
    • Specify the username of the new IAM user: amplify-app <name of=”” your=”” app=””></name>

    In the AWS Console, click Next: Permissions, Next: Tags, Next: Review, and Create User to create the new IAM user. Then, return to the command line and press Enter.

    • Enter the credentials of the newly created user:
      accessKeyId: <your_access_key_id> </your_access_key_id>
      secretAccessKey: <your_secret_access_key></your_secret_access_key>
    • Profile Name: default

    To view the newly created IAM user, go to the dashboard. Also, make sure that your region matches your selection.

    To add amplify to your project:

    amplify init

    Answer the following questions:

    • Enter a name for the project: amplify-app <name of=”” your=”” app=””></name>
    • Enter a name for the environment: dev <name of=”” your=”” environment=””></name>
    • Choose your default editor: Visual Studio Code <your default editor=””></your>
    • Choose the type of app that you’re building: javaScript
    • What JavaScript framework are you using: React
    • Source Directory Path: src
    • Distribution Directory Path: build
    • Build Command: npm run build (for macOS/Linux), npm.cmd run-script build (for Windows)
    • Start Command: npm start (for macOS/Linux), npm.cmd run-script start (for Windows)
    • Do you want to use an AWS profile: Yes
    • Please choose the profile you want to use: default

    Now, the AWS Amplify CLI has initialized a new project and you will see a new folder: amplify. This folder has files that hold your project configuration.

    <amplify-app>
    |_ amplify
    |_ .config
    |_ #current-cloud-backend
    |_ backend
    team-provider-info.json

    Adding Authentication

    To add authentication:

    amplify add auth

    When prompted, choose:

    • Do you want to use default authentication and security configuration: Default configuration
    • How do you want users to be able to sign in when using your Cognito User Pool: Username
    • What attributes are required for signing up: Email

    Now, let’s run the push command to create the cloud resources in our AWS account:

    amplify push

    To quickly check your newly created Cognito User Pool, you can run

    amplify status

    To access the AWS Cognito Console at any time, go to the dashboard. Also, ensure that your region is set correctly.

    Now, our resources are created and we can start using them.

    The first thing is to connect our React application to our new AWS Amplify project. To do this, reference the auto-generated aws-exports.js file that is now in our src folder.

    To configure the app, open App.tsx and add the following code below the last import:

    import Amplify from 'aws-amplify';
    import awsConfig from './aws-exports';
    
    Amplify.configure(awsConfig);

    Now, we can start using our AWS services.
    To add the Authentication flow to the UI, export the app component by wrapping it with the authenticator HOC:

    import { withAuthenticator } from 'aws-amplify-react';
    ...
    // app component
    ...
    export default withAuthenticator(App);

    Now, let’s run the app to check if an Authentication flow has been added before our App component is rendered.

    This flow gives users the ability to sign up and sign in. To view any users that were created, go back to the Cognito dashboard. Alternatively, you can also use:

    amplify console auth

    The withAuthenticator HOC is a really easy way to get up and running with authentication, but in a real-world application, we probably want more control over how our form looks and functions. We can use the aws-amplify/Auth class to do this. This class has more than 30 methods including signUp, signIn, confirmSignUp, confirmSignIn, and forgotPassword. These functions return a promise, so they need to be handled asynchronously.

    Adding and Integrating the GraphQL API

    To add GraphQL API, use the following command:

    amplify add api

    Answer the following questions:

    • Please select from one of the below mentioned services: GraphQL
    • Provide API name: RestaurantAPI
    • Choose an authorization type for the API: API key
    • Do you have an annotated GraphQL schema: No
    • Do you want a guided schema creation: Yes
    • What best describes your project: Single object with fields (e.g., “Todo” with ID, name, description)
    • Do you want to edit the schema now: Yes

    When prompted, update the schema to the following:

    type Restaurant @model {
      id: ID!
      name: String!
      description: String!
      city: String!
    }

    Next, let’s run the push command to create the cloud resources in our AWS account:

    amplify push

    • Are you sure you want to continue: Yes
    • Do you want to generate code for your newly created GraphQL API: Yes
    • Choose the code generation language target: typescript
    • Enter the file name pattern of graphql queries, mutations and subscriptions: src/graphql/**/*.ts
    • Do you want to generate/update all possible GraphQL operations – queries, mutations and subscriptions: Yes
    • Enter maximum statement depth [increase from default if your schema is deeply nested]: 2
    • Enter the file name for the generated code: src/API.ts

    Notice your GraphQL endpoint and API KEY. This step has created a new AWS AppSync API and generated the GraphQL queries, mutations, and subscriptions on your local. To check, see src/graphql or visit the AppSync dashboard. Alternatively, you can use:

    amplify console api

    Please select from one of the below mentioned services: GraphQL

    Now, in the AppSync console, on the left side click on Queries. Execute the following mutation to create a restaurant in the API:

    mutation createRestaurant {
      createRestaurant(input: {
        name: "Nobu"
        description: "Great Sushi"
        city: "New York"
      }) {
        id name description city
      }
    }

    Now, let’s query for the restaurant:

    query listRestaurants {
      listRestaurants {
        items {
          id
          name
          description
          city
        }
      }
    }

    We can even search / filter data when querying:

    query searchRestaurants {
      listRestaurants(filter: {
        city: {
          contains: "New York"
        }
      }) {
        items {
          id
          name
          description
          city
        }
      }
    }

    Now that the GraphQL API is created, we can begin interacting with it from our client application. Here is how we’ll add queries, mutations, and subscriptions:

    import Amplify, { API, graphqlOperation } from 'aws-amplify';
    import { withAuthenticator } from 'aws-amplify-react';
    import React, { useEffect, useReducer } from 'react';
    import { Button, Col, Container, Form, Row, Table } from 'react-bootstrap';
    
    import './App.css';
    import awsConfig from './aws-exports';
    import { createRestaurant } from './graphql/mutations';
    import { listRestaurants } from './graphql/queries';
    import { onCreateRestaurant } from './graphql/subscriptions';
    
    Amplify.configure(awsConfig);
    
    type Restaurant = {
      name: string;
      description: string;
      city: string;
    };
    
    type AppState = {
      restaurants: Restaurant[];
      formData: Restaurant;
    };
    
    type Action =
      | {
          type: 'QUERY';
          payload: Restaurant[];
        }
      | {
          type: 'SUBSCRIPTION';
          payload: Restaurant;
        }
      | {
          type: 'SET_FORM_DATA';
          payload: { [field: string]: string };
        };
    
    type SubscriptionEvent<D> = {
      value: {
        data: D;
      };
    };
    
    const initialState: AppState = {
      restaurants: [],
      formData: {
        name: '',
        city: '',
        description: '',
      },
    };
    const reducer = (state: AppState, action: Action) => {
      switch (action.type) {
        case 'QUERY':
          return { ...state, restaurants: action.payload };
        case 'SUBSCRIPTION':
          return { ...state, restaurants: [...state.restaurants, action.payload] };
        case 'SET_FORM_DATA':
          return { ...state, formData: { ...state.formData, ...action.payload } };
        default:
          return state;
      }
    };
    
    const App: React.FC = () => {
      const createNewRestaurant = async (e: React.SyntheticEvent) => {
        e.stopPropagation();
        const { name, description, city } = state.formData;
        const restaurant = {
          name,
          description,
          city,
        };
        await API.graphql(graphqlOperation(createRestaurant, { input: restaurant }));
      };
    
      const [state, dispatch] = useReducer(reducer, initialState);
    
      useEffect(() => {
        getRestaurantList();
    
        const subscription = API.graphql(graphqlOperation(onCreateRestaurant)).subscribe({
          next: (eventData: SubscriptionEvent<{ onCreateRestaurant: Restaurant }>) => {
            const payload = eventData.value.data.onCreateRestaurant;
            dispatch({ type: 'SUBSCRIPTION', payload });
          },
        });
    
        return () => subscription.unsubscribe();
      }, []);
    
      const getRestaurantList = async () => {
        const restaurants = await API.graphql(graphqlOperation(listRestaurants));
        dispatch({
          type: 'QUERY',
          payload: restaurants.data.listRestaurants.items,
        });
      };
    
      const handleChange = (e: React.ChangeEvent<HTMLInputElement>) =>
        dispatch({
          type: 'SET_FORM_DATA',
          payload: { [e.target.name]: e.target.value },
        });
    
      return (
        <div className="App">
          <Container>
            <Row className="mt-3">
              <Col md={4}>
                <Form>
                  <Form.Group controlId="formDataName">
                    <Form.Control onChange={handleChange} type="text" name="name" placeholder="Name" />
                  </Form.Group>
                  <Form.Group controlId="formDataDescription">
                    <Form.Control onChange={handleChange} type="text" name="description" placeholder="Description" />
                  </Form.Group>
                  <Form.Group controlId="formDataCity">
                    <Form.Control onChange={handleChange} type="text" name="city" placeholder="City" />
                  </Form.Group>
                  <Button onClick={createNewRestaurant} className="float-left">
                    Add New Restaurant
                  </Button>
                </Form>
              </Col>
            </Row>
    
            {state.restaurants.length ? (
              <Row className="my-3">
                <Col>
                  <Table striped bordered hover>
                    <thead>
                      <tr>
                        <th>#</th>
                        <th>Name</th>
                        <th>Description</th>
                        <th>City</th>
                      </tr>
                    </thead>
                    <tbody>
                      {state.restaurants.map((restaurant, index) => (
                        <tr key={`restaurant-${index}`}>
                          <td>{index + 1}</td>
                          <td>{restaurant.name}</td>
                          <td>{restaurant.description}</td>
                          <td>{restaurant.city}</td>
                        </tr>
                      ))}
                    </tbody>
                  </Table>
                </Col>
              </Row>
            ) : null}
          </Container>
        </div>
      );
    };
    
    export default withAuthenticator(App);

    Finally, we have our app ready. You can now sign-up,sign-in, add new restaurants, see real-time updates of newly added restaurants.

    Hosting

    The hosting category enables you to deploy and host your app on AWS.

    amplify add hosting

    • Select the environment setup: DEV (S3 only with HTTP)
    • hosting bucket name: <YOUR_BUCKET_NAME>
    • index doc for the website: index.html
    • error doc for the website: index.html

    Now, everything is set up & we can publish it:

    amplify publish

    Working with multiple environments

    You can create multiple environments for your application to create & test out new features without affecting the main environment which you are working on.

    When you use an existing environment to create a new environment, you get a copy of the entire backend application stack (CloudFormation) for the current environment. When you make changes in the new environment, you are then able to test these new changes in the new environment & merge only the changes that have been made since the new environment was created.

    Let’s take a look at how to create a new environment. In this new environment, we’ll add another field for the restaurant owner to the GraphQL Schema.

    First, we’ll initialize a new environment using amplify init:

    amplify init

    • Do you want to use an existing environment: N
    • Enter a name for the environment: apiupdate
    • Do you want to use an AWS profile: Y

    Once the new environment is initialized, we should be able to see some information about our environment setup by running:

    amplify env list
    
    | Environments |
    | ------------ |
    | dev |
    | *apiupdate |

    Now, add the owner field to the GraphQL Schema in

    amplify/backend/api/RestaurantAPI/schema.graphql:

    type Restaurant @model {
      ...
      owner: String
    }

    Run the push command to create a new stack:

    amplify push.

    After testing it out, it can be merged into our original dev environment:

    amplify env checkout dev
    amplify status
    amplify push

    • Do you want to update code for your updated GraphQL API: Y
    • Do you want to generate GraphQL statements: Y

    Removing Services

    If at any time, you would like to delete a service from your project & your account, you can do this by running the amplify remove command:

    amplify remove auth
    amplify push

    If you are unsure of what services you have enabled at any time, amplify status will give you the list of resources that are currently enabled in your app.

    Sample code

    The sample code for this blog post with an end to end working app is available here.

    Summary

    Once you’ve worked through all the sections above, your app should now have all the capabilities of a modern app, and building GraphQL + React apps should now be easier and faster with Amplify.

  • Surviving & Thriving in the Age of Software Accelerations

    It has almost been a decade since Marc Andreessen made this prescient statement. Software is not only eating the world but doing so at an accelerating pace. There is no industry that hasn’t been challenged by technology startups with disruptive approaches.

    • Automakers are no longer just manufacturing companies: Tesla is disrupting the industry with their software approach to vehicle development and continuous over-the-air software delivery. Waymo’s autonomous cars have driven millions of miles and self-driving cars are a near-term reality. Uber is transforming the transportation industry into a service, potentially affecting the economics and incentives of almost 3–4% of the world GDP!
    • Social networks and media platforms had a significant and decisive impact on the US election results.
    • Banks and large financial institutions are being attacked by FinTech startups like WealthFront, Venmo, Affirm, Stripe, SoFi, etc. Bitcoin, Ethereum and the broader blockchain revolution can upend the core structure of banks and even sovereign currencies.
    • Traditional retail businesses are under tremendous pressure due to Amazon and other e-commerce vendors. Retail is now a customer ownership, recommendations, and optimization business rather than a brick and mortar one.

    Enterprises need to adopt a new approach to software development and digital innovation. At Velotio, we are helping customers to modernize and transform their business with all of the approaches and best practices listed below.

    Agility

    In this fast-changing world, your business needs to be agile and fast-moving. You need to ship software faster, at a regular cadence, with high quality and be able to scale it globally.

    Agile practices allow companies to rally diverse teams behind a defined process that helps to achieve inclusivity and drives productivity. Agile is about getting cross-functional teams to work in concert in planned short iterations with continuous learning and improvement.

    Generally, teams that work in an Agile methodology will:

    • Conduct regular stand-ups and Scrum/Kanban planning meetings with the optimal use of tools like Jira, PivotalTracker, Rally, etc.
    • Use pair programming and code review practices to ensure better code quality.
    • Use continuous integration and delivery tools like Jenkins or CircleCI.
    • Design processes for all aspects of product management, development, QA, DevOps and SRE.
    • Use Slack, Hipchat or Teams for communication between team members and geographically diverse teams. Integrate all tools with Slack to ensure that it becomes the central hub for notifications and engagement.

    Cloud-Native

    Businesses need software that is purpose-built for the cloud model. What does that mean? Software team sizes are now in the hundreds of thousands. The number of applications and software stacks is growing rapidly in most companies. All companies use various cloud providers, SaaS vendors and best-of-breed hosted or on-premise software. Essentially, software complexity has increased exponentially which required a “cloud-native” approach to manage effectively. Cloud Native Computing Foundation defines cloud native as a software stack which is:

    1. Containerized: Each part (applications, processes, etc) is packaged in its own container. This facilitates reproducibility, transparency, and resource isolation.
    2. Dynamically orchestrated: Containers are actively scheduled and managed to optimize resource utilization.
    3. Microservices oriented: Applications are segmented into micro services. This significantly increases the overall agility and maintainability of applications.

    You can deep-dive into cloud native with this blog by our CTO, Chirag Jog.

    Cloud native is disrupting the traditional enterprise software vendors. Software is getting decomposed into specialized best of breed components — much like the micro-services architecture. See the Cloud Native landscape below from CNCF.

    DevOps

    Process and toolsets need to change to enable faster development and deployment of software. Enterprises cannot compete without mature DevOps strategies. DevOps is essentially a set of practices, processes, culture, tooling, and automation that focuses on delivering software continuously with high quality.

    DevOps tool chains & process

    As you begin or expand your DevOps journey, a few things to keep in mind:

    • Customize to your needs: There is no single DevOps process or toolchain that suits all needs. Take into account your organization structure, team capabilities, current software process, opportunities for automation and goals while making decisions. For example, your infrastructure team may have automated deployments but the main source of your quality issues could be the lack of code reviews in your development team. So identify the critical pain points and sources of delay to address those first.
    • Automation: Automate everything that can be. The lesser the dependency on human intervention, the higher are the chances for success.
    • Culture: Align the incentives and goals with your development, ITOps, SecOps, SRE teams. Ensure that they collaborate effectively and ownership in the DevOps pipeline is well established.
    • Small wins: Pick one application or team and implement your DevOps strategy within it. That way you can focus your energies and refine your experiments before applying them broadly. Show success as measured by quantifiable parameters and use that to transform the rest of your teams.
    • Organizational dynamics & integrations: Adoption of new processes and tools will cause some disruptions and you may need to re-skill part of your team or hire externally. Ensure that compliance, SecOps & audit teams are aware of your DevOps journey and get their buy-in.
    • DevOps is a continuous journey: DevOps will never be done. Train your team to learn continuously and refine your DevOps practice to keep achieving your goal: delivering software reliably and quickly.

    Micro-services

    As the amount of software in an enterprise explodes, so does the complexity. The only way to manage this complexity is by splitting your software and teams into smaller manageable units. Micro-services adoption is primarily to manage this complexity.

    Development teams across the board are choosing micro services to develop new applications and break down legacy monoliths. Every micro-service can be deployed, upgraded, scaled, monitored and restarted independent of other services. Micro-services should ideally be managed by an automated system so that teams can easily update live applications without affecting end-users.

    There are companies with 100s of micro-services in production which is only possible with mature DevOps, cloud-native and agile practice adoption.

    Interestingly, serverless platforms like Google Functions and AWS Lambda are taking the concept of micro-services to the extreme by allowing each function to act like an independent piece of the application. You can read about my thoughts on serverless computing in this blog: Serverless Computing Predictions for 2017.

    Digital Transformation

    Digital transformation involves making strategic changes to business processes, competencies, and models to leverage digital technologies. It is a very broad term and every consulting vendor twists it in various ways. Let me give a couple of examples to drive home the point that digital transformation is about using technology to improve your business model, gain efficiencies or built a moat around your business:

    • GE has done an excellent job transforming themselves from a manufacturing company into an IoT/software company with Predix. GE builds airplane engines, medical equipment, oil & gas equipment and much more. Predix is an IoT platform that is being embedded into all of GE’s products. This enabled them to charge airlines on a per-mile basis by taking the ownership of maintenance and quality instead of charging on a one-time basis. This also gives them huge amounts of data that they can leverage to improve the business as a whole. So digital innovation has enabled a business model improvement leading to higher profits.
    • Car companies are exploring models where they can provide autonomous car fleets to cities where they will charge on a per-mile basis. This will convert them into a “service” & “data” company from a pure manufacturing one.
    • Insurance companies need to built digital capabilities to acquire and retain customers. They need to build data capabilities and provide ongoing value with services rather than interact with the customer just once a year.

    You would be better placed to compete in the market if you have automation and digital process in place so that you can build new products and pivot in an agile manner.

    Big Data / Data Science

    Businesses need to deal with increasing amounts of data due to IoT, social media, mobile and due to the adoption of software for various processes. And they need to use this data intelligently. Cloud platforms provide the services and solutions to accelerate your data science and machine learning strategies. AWS, Google Cloud & open-source libraries like Tensorflow, SciPy, Keras, etc. have a broad set of machine learning and big data services that can be leveraged. Companies need to build mature data processing pipelines to aggregate data from various sources and store it for quick and efficient access to various teams. Companies are leveraging these services and libraries to build solutions like:

    • Predictive analytics
    • Cognitive computing
    • Robotic Process Automation
    • Fraud detection
    • Customer churn and segmentation analysis
    • Recommendation engines
    • Forecasting
    • Anomaly detection

    Companies are creating data science teams to build long term capabilities and moats around their business by using their data smartly.

    Re-platforming & App Modernization

    Enterprises want to modernize their legacy, often monolithic apps as they migrate to the cloud. The move can be triggered due to hardware refresh cycles or license renewals or IT cost optimization or adoption of software-focused business models.

     Benefits of modernization to customers and businesses

    Intelligent Applications

    Software is getting more intelligent and to enable this, businesses need to integrate disparate datasets, distributed teams, and processes. This is best done on a scalable global cloud platform with agile processes. Big data and data science enables the creation of intelligent applications.

    How can smart applications help your business?

    • New intelligent systems of engagement: intelligent apps surface insights to users enabling the user to be more effective and efficient. For example, CRMs and marketing software is getting intelligent and multi-platform enabling sales and marketing reps to become more productive.
    • Personalisation: E-Commerce, social networks and now B2B software is getting personalized. In order to improve user experience and reduce churn, your applications should be personalized based on the user preferences and traits.
    • Drive efficiencies: IoT is an excellent example where the efficiency of machines can be improved with data and cloud software. Real-time insights can help to optimize processes or can be used for preventive maintenance.
    • Creation of new business models: Traditional and modern industries can use AI to build new business models. For example, what if insurance companies allow you to pay insurance premiums only for the miles driven?

    Security

    Security threats to governments, enterprises and data have never been greater. As business adopt cloud native, DevOps & micro-services practices, their security practices need to evolve.

    In our experience, these are few of the features of a mature cloud native security practice:

    • Automated: Systems are updated automatically with the latest fixes. Another approach is immutable infrastructure with the adoption of containers and serverless.
    • Proactive: Automated security processes tend to be proactive. For example, if a malware of vulnerability is found in one environment, automation can fix it in all environments. Mature DevOps & CI/CD processes ensure that fixes can be deployed in hours or days instead of weeks or months.
    • Cloud Platforms: Businesses have realized that the mega-clouds are way more secure than their own data centers can be. Many of these cloud platforms have audit, security and compliance services which should be leveraged.
    • Protecting credentials: Use AWS KMS, Hashicorp Vault or other solutions for protecting keys, passwords and authorizations.
    • Bug bounties: Either setup bug bounties internally or through sites like HackerOne. You want the good guys to work for you and this is an easy way to do that.

    Conclusion

    As you can see, all of these approaches and best practices are intertwined and need to be implemented in concert to gain the desired results. It is best to start with one project, one group or one application and build on early wins. Remember, that is is a process and you are looking for gradual improvements to achieve your final objectives.

    Please let us know your thoughts and experiences by adding comments to this blog or reaching out to @kalpakshah or RSI. We would love to help your business adopt these best practices and help to build great software together. Drop me a note at kalpak (at) velotio (dot) com.

  • API Testing Using Postman and Newman

    In the last few years, we have an exponential increase in the development and use of APIs. We are in the era of API-first companies like Stripe, Twilio, Mailgun etc. where the entire product or service is exposed via REST APIs. Web applications also today are powered by REST-based Web Services. APIs today encapsulate critical business logic with high SLAs. Hence it is important to test APIs as part of the continuous integration process to reduce errors, improve predictability and catch nasty bugs.

    In the context of API development, Postman is great REST client to test APIs. Although Postman is not just a REST Client, it contains a full-featured testing sandbox that lets you write and execute Javascript based tests for your API.

    Postman comes with a nifty CLI tool – Newman. Newman is the Postman’s Collection Runner engine that sends API requests, receives the response and then runs your tests against the response. Newman lets developments easily integrate Postman into continuous integration systems like Jenkins. Some of the important features of Postman & Newman include:-

    1. Ability to test any API and see the response instantly.
    2. Ability to create test suites or collections using a collection of API endpoints.
    3. Ability to collaborate with team members on these collections.
    4. Ability to easily export/import collections as JSON files.

    We are going to look at all these features, some are intuitive and some not so much unless you’ve been using Postman for a while.

    Setting up Your Postman

    You can install Postman either as a Chrome extension or as a native application

    Later, can then look it up in your installed apps and open it. You can choose to Sign Up & create an account if you want, this is important especially for saving your API collections and accessing them anytime on any machine. However, for this article, we can skip this. There’s a button for that towards the bottom when you first launch the app.

    Postman Collections

    Postman Collections in simple words is a collection of tests. It is essentially a test suite of related tests. These tests can be scenario-based tests or sequence/workflow-based tests.

    There’s a Collections tab on the top left of Postman, with an example Postman Echo collection. You can open and go through it.

    Just like in the above screenshot, select a API request and click on the Tests. Check the first line:

    tests["response code is 200"] = responseCode.code === 200;

    The above line is a simple test to check if the response code for the API is 200. This is the pattern for writing Assertions/Tests in Postman (using JavaScript), and this is actually how you are going to write the tests for API’s need to be tested.You can open the other API requests in the POSTMAN Echo collection to get a sense of how requests are made.

    Adding a COLLECTION

    To make your own collection, click on the ‘Add Collection‘ button on the top left of Postman and call it “Test API”

    You will be prompted to give details about the collection, I’ve added a name Github API and given it a description.

    Clicking on Create should add the collection to the left pane, above, or below the example “POSTMAN Echo” collection.

    If you need a hierarchy for maintaining relevance between multiple API’s inside a collection, APIs can further be added to a folder inside a collection. Folders are a great way of separating different parts of your API workflow. You can be added folders through the “3 dot” button beside Collection Name:

    Eg.: name the folder “Get Calls” and give a description once again.

    Now that we have the folder, the next task is to add an API call that is related to the TEST_API_COLLECTION to that folder. That API call is to https://api.github.com/.

    If you still have one of the TEST_API_COLLECTION collections open, you can close it the same way you close tabs in a browser, or just click on the plus button to add a new tab on the right pane where we make requests.

    Type in or paste in https://api.github.com/ and press Send to see the response.

    Once you get the response, you can click on the arrow next to the Save button on the far right, and select Save As, a pop up will be displayed asking where to save the API call.

    Give a name, it can be the request URL, or a name like “GET Github Basic”, and a description, then choose the collection and folder, in this case, TEST_API_COLLECTION> GET CALLS, then click on Save. The API call will be added to the Github Root API folder on the left pane.

    Whenever you click on this request from the collection, it will open in the center pane.

    Write the Tests

    We’ve seen that the GET Github Basic request has a JSON response, which is usually the case for most of the APIs.This response has properties such as current_user_url, emails_url, followers_url and following_url to pick a few. The current_user_url has a value of https://api.github.com/user.  Let’s add a test, for this URL. Click on the ‘GET Github Basic‘ and click on the test tab in the section just below where the URL is put.

    You will notice on the right pane, we have some snippets which Postman creates when you click so that you don’t have to write a lot of code. Let’s add Response Body: JSON value check. Clicking on it produces the following snippet.

    var jsonData = JSON.parse(responseBody);
    tests["Your test name"] = jsonData.value === 100;

    From these two lines, it is apparent that Postman stores the response in a global object called responseBody, and we can use this to access response and assert values in tests as required.

    Postman also has another global variable object called tests, which is an object you can use to name your tests, and equate it to a boolean expression. If the boolean expression returns true, then the test passes.

    tests['some random test'] = x === y

    If you click on Send to make the request, you will see one of the tests failing.

    Lets create a test that relevant to our usecase.

    var jsonData = JSON.parse(responseBody);
    var usersURL = "https://api.github.com/user"
    tests["Gets the correct users url"] = jsonData.current_user_url === usersURL;

    Clicking on ‘Send‘, you’ll see the test passing.

    Let’s modify the test further to test some of the properties we want to check

    Ideally the things to be tested in an API Response Body should be:

    • Response Code ( Assert Correct Response Code for any request)
    • Response Time ( to check api responds in an acceptable time range / is not delayed)
    • Response Body is not empty / null
    tests["Status code is 200"] = responseCode.code === 200;
    tests["Response time is less than 200ms"] = responseTime < 200;
    tests["Response time is acceptable"] = _.inRange(responseTime, 0, 500);
    tests["Body is not empty"] = (responseBody!==null || responseBody.length!==0);

    Newman CLI

    Once you’ve set up all your collections and written tests for them, it may be tedious to go through them one by one and clicking send to see if a given collection test passes. This is where Newman comes in. Newman is a command-line collection runner for Postman.

    All you need to do is export your collection and the environment variables, then use Newman to run the tests from your terminal.

    NOTE: Make sure you’ve clicked on ‘Save’ to save your collection first before exporting.

    USING NEWMAN

    So the first step is to export your collection and environment variables. Click on the Menu icon for Github API collection, and select export.

    Select version 2, and click on “Export”

    Save the JSON file in a location you can access with your terminal. I created a local directory/folder called “postman” and saved it there.

    Install Newman CLI globally, then navigate to the where you saved the collection.

    npm install -g newman 
    cd postman

    Using Newman is quite straight-forward, and the documentation is extensive. You can even require it as a Node.js module and run the tests there. However, we will use the CLI.

    Once you are in the directory, run newman run <collection_name.json>, </collection_name.json> replacing the collection_name with the name you used to save the collection.

    newman run TEST_API_COLLECTION.postman_collection.json     

    NEWMAN CLI Options

    Newman provides a rich set of options to customize a run. A list of options can be retrieved by running it with the -h flag.

    
    $ newman run -h
    Options - Additional args: 
    Utility:
    -h, --help output usage information
    -v, --version output the version number
    Basic setup:
    --folder [folderName] Specify a single folder to run from a collection.
    -e, --environment [file|URL] Specify a Postman environment as a JSON [file]
    -d, --data [file] Specify a data file to use either json or csv
    -g, --global [file] Specify a Postman globals file as JSON [file]
    -n, --iteration-count [number] Define the number of iterations to run
    Request options:
    --delay-request [number] Specify a delay (in ms) between requests [number] --timeout-request [number] Specify a request timeout (in ms) for a request
    Misc.:
    --bail Stops the runner when a test case fails
    --silent Disable terminal output --no-color Disable colored output
    -k, --insecure Disable strict ssl
    -x, --suppress-exit-code Continue running tests even after a failure, but exit with code=0
    --ignore-redirects Disable automatic following of 3XX responses

    Lets try out of some of the options.

    Iterations

    Lets use the -n option to set the number of iterations to run the collection.

    $ newman run mycollection.json -n 10 # runs the collection 10 times

    To provide a different set of data, i.e. variables for each iteration, you can use the -d to specify a JSON or CSV file. For example, a data file such as the one shown below will run 2 iterations, with each iteration using a set of variables.

    [{
    "url": "http://127.0.0.1:5000",
      "user_id": "1",
      "id": "1",
      "token_id": "123123",
    },{
      "url": "http://postman-echo.com",
      "user_id": "2",
      "id": "2",
      "token_id": "899899",
    }]$ newman run mycollection.json -d data.json

    Alternately, the CSV file for the above set of variables would look like:

    url, user_id, id, token_id 
    http://127.0.0.1:5000, 1, 1, 123123123 
    http://postman-echo.com, 2, 2, 899899

    Environment Variables

    Each environment is a set of key-value pairs, with the key as the variable name. These Environment configurations can be used to differentiate between configurations specific to your execution environments eg. Dev, Test & Production.

    To provide a different execution environment, you can use the -e to specify a JSON or CSV file. For example, a environment file such as the one shown below will provide the environment variables globally to all tests during execution.

    postman_dev_env.json
    {
    "id": "b5c617ad-7aaf-6cdf-25c8-fc0711f8941b",
    "name": "dev env",
    "values": [
    {
    "enabled": true,
    "key": "env",
    "value": "dev.example.com",
    "type": "text"
    }  
    ],
    "timestamp": 1507210123364,
    "_postman_variable_scope": "environment",
    "_postman_exported_at": "2017-10-05T13:28:45.041Z",
    "_postman_exported_using": "Postman/5.2.1"
    }

    Bail FLAG

    Newman, by default, exits with a status code of 0 if everything runs well i.e. without any exceptions. Continuous integration tools respond to these exit codes and correspondingly pass or fail a build. You can use the –bail flag to tell Newman to halt on a test case error with a status code of 1 which can then be picked up by a CI tool or build system.

    $ newman run PostmanCollection.json -e environment.json --bail newman

    Conclusion

    Postman and Newman can be used for a number of test cases, including creating usage scenarios, Suites, Packs for your API Test Cases. Further NEWMAN / POSTMAN can be very well Integrated with CI/CD Tools such as Jenkins, Travis etc.

  • A Practical Guide to Deploying Multi-tier Applications on Google Container Engine (GKE)

    Introduction

    All modern era programmers can attest that containerization has afforded more flexibility and allows us to build truly cloud-native applications. Containers provide portability – ability to easily move applications across environments. Although complex applications comprise of many (10s or 100s) containers. Managing such applications is a real challenge and that’s where container orchestration and scheduling platforms like Kubernetes, Mesosphere, Docker Swarm, etc. come into the picture. 
    Kubernetes, backed by Google is leading the pack given that Redhat, Microsoft and now Amazon are putting their weight behind it.

    Kubernetes can run on any cloud or bare metal infrastructure. Setting up & managing Kubernetes can be a challenge but Google provides an easy way to use Kubernetes through the Google Container Engine(GKE) service.

    What is GKE?

    Google Container Engine is a Management and orchestration system for Containers. In short, it is a hosted Kubernetes. The goal of GKE is to increase the productivity of DevOps and development teams by hiding the complexity of setting up the Kubernetes cluster, the overlay network, etc.

    Why GKE? What are the things that GKE does for the user?

    • GKE abstracts away the complexity of managing a highly available Kubernetes cluster.
    • GKE takes care of the overlay network
    • GKE also provides built-in authentication
    • GKE also provides built-in auto-scaling.
    • GKE also provides easy integration with the Google storage services.

    In this blog, we will see how to create your own Kubernetes cluster in GKE and how to deploy a multi-tier application in it. The blog assumes you have a basic understanding of Kubernetes and have used it before. It also assumes you have created an account with Google Cloud Platform. If you are not familiar with Kubernetes, this guide from Deis  is a good place to start.

    Google provides a Command-line interface (gcloud) to interact with all Google Cloud Platform products and services. gcloud is a tool that provides the primary command-line interface to Google Cloud Platform. Gcloud tool can be used in the scripts to automate the tasks or directly from the command-line. Follow this guide to install the gcloud tool.

    Now let’s begin! The first step is to create the cluster.

    Basic Steps to create cluster

    In this section, I would like to explain about how to create GKE cluster. We will use a command-line tool to setup the cluster.

    Set the zone in which you want to deploy the cluster

    $ gcloud config set compute/zone us-west1-a

    Create the cluster using following command,

    $ gcloud container --project <project-name> 
    clusters create <cluster-name> 
    --machine-type n1-standard-2 
    --image-type "COS" --disk-size "50" 
    --num-nodes 2 --network default 
    --enable-cloud-logging --no-enable-cloud-monitoring

    Let’s try to understand what each of these parameters mean:

    –project: Project Name

    –machine-type: Type of the machine like n1-standard-2, n1-standard-4

    –image-type: OS image.”COS” i.e. Container Optimized OS from Google: More Info here.

    –disk-size: Disk size of each instance.

    –num-nodes: Number of nodes in the cluster.

    –network: Network that users want to use for the cluster. In this case, we are using default network.

    Apart from the above options, you can also use the following to provide specific requirements while creating the cluster:

    –scopes: Scopes enable containers to direct access any Google service without needs credentials. You can specify comma separated list of scope APIs. For example:

    • Compute: Lets you view and manage your Google Compute Engine resources
    • Logging.write: Submit log data to Stackdriver.

    You can find all the Scopes that Google supports here: .

    –additional-zones: Specify additional zones to high availability. Eg. –additional-zones us-east1-b, us-east1-d . Here Kubernetes will create a cluster in 3 zones (1 specified at the beginning and additional 2 here).

    –enable-autoscaling : To enable the autoscaling option. If you specify this option then you have to specify the minimum and maximum required nodes as follows; You can read more about how auto-scaling works here. Eg:   –enable-autoscaling –min-nodes=15 –max-nodes=50

    You can fetch the credentials of the created cluster. This step is to update the credentials in the kubeconfig file, so that kubectl will point to required cluster.

    $ gcloud container clusters get-credentials my-first-cluster --project project-name

    Now, your First Kubernetes cluster is ready. Let’s check the cluster information & health.

    $ kubectl get nodes
    NAME    STATUS    AGE   VERSION
    gke-first-cluster-default-pool-d344484d-vnj1  Ready  2h  v1.6.4
    gke-first-cluster-default-pool-d344484d-kdd7  Ready  2h  v1.6.4
    gke-first-cluster-default-pool-d344484d-ytre2  Ready  2h  v1.6.4

    After creating Cluster, now let’s see how to deploy a multi tier application on it. Let’s use simple Python Flask app which will greet the user, store employee data & get employee data.

    Application Deployment

    I have created simple Python Flask application to deploy on K8S cluster created using GKE. you can go through the source code here. If you check the source code then you will find directory structure as follows:

    TryGKE/
    ├── Dockerfile
    ├── mysql-deployment.yaml
    ├── mysql-service.yaml
    ├── src    
      ├── app.py    
      └── requirements.txt    
      ├── testapp-deployment.yaml    
      └── testapp-service.yaml

    In this, I have written a Dockerfile for the Python Flask application in order to build our own image to deploy. For MySQL, we won’t build an image of our own. We will use the latest MySQL image from the public docker repository.

    Before deploying the application, let’s re-visit some of the important Kubernetes terms:

    Pods:

    The pod is a Docker container or a group of Docker containers which are deployed together on the host machine. It acts as a single unit of deployment.

    Deployments:

    Deployment is an entity which manages the ReplicaSets and provides declarative updates to pods. It is recommended to use Deployments instead of directly using ReplicaSets. We can use deployment to create, remove and update ReplicaSets. Deployments have the ability to rollout and rollback the changes.

    Services:

    Service in K8S is an abstraction which will connect you to one or more pods. You can connect to pod using the pod’s IP Address but since pods come and go, their IP Addresses change.  Services get their own IP & DNS and those remain for the entire lifetime of the service. 

    Each tier in an application is represented by a Deployment. A Deployment is described by the YAML file. We have two YAML files – one for MySQL and one for the Python application.

    1. MySQL Deployment YAML

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: mysql
    spec:
      template:
        metadata:
          labels:
            app: mysql
        spec:
          containers:
            - env:
                - name: MYSQL_DATABASE
                  value: admin
                - name: MYSQL_ROOT_PASSWORD
                  value: admin
              image: 'mysql:latest'
              name: mysql
              ports:
                - name: mysqlport
                  containerPort: 3306
                  protocol: TCP

    2. Python Application Deployment YAML

    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: test-app
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - name: test-app
            image: ajaynemade/pymy:latest
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 5000

    Each Service is also represented by a YAML file as follows:

    1. MySQL service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: mysql-service
    spec:
      ports:
      - port: 3306
        targetPort: 3306
        protocol: TCP
        name: http
      selector:
        app: mysql

    2. Python Application service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: test-service
    spec:
      type: LoadBalancer
      ports:
      - name: test-service
        port: 80
        protocol: TCP
        targetPort: 5000
      selector:
        app: test-app

    You will find a ‘kind’ field in each YAML file. It is used to specify whether the given configuration is for deployment, service, pod, etc.

    In the Python app service YAML, I am using type = LoadBalancer. In GKE, There are two types of cloud load balancers available to expose the application to outside world.

    1. TCP load balancer: This is a TCP Proxy-based load balancer. We will use this in our example.
    2. HTTP(s) load balancer: It can be created using Ingress. For more information, refer to this post that talks about Ingress in detail.

    In the MySQL service, I’ve not specified any type, in that case, type ‘ClusterIP’ will get used, which will make sure that MySQL container is exposed to the cluster and the Python app can access it.

    If you check the app.py, you can see that I have used “mysql-service.default” as a hostname. “Mysql-service.default” is a DNS name of the service. The Python application will refer to that DNS name while accessing the MySQL Database.

    Now, let’s actually setup the components from the configurations. As mentioned above, we will first create services followed by deployments.

    Services:

    $ kubectl create -f mysql-service.yaml
    $ kubectl create -f testapp-service.yaml

    Deployments:

    $ kubectl create -f mysql-deployment.yaml
    $ kubectl create -f testapp-deployment.yaml

    Check the status of the pods and services. Wait till all pods come to the running state and Python application service to get external IP like below:

    $ kubectl get services
    NAME            CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
    kubernetes      10.55.240.1     <none>        443/TCP        5h
    mysql-service   10.55.240.57    <none>        3306/TCP       1m
    test-service    10.55.246.105   35.185.225.67     80:32546/TCP   11s

    Once you get the external IP, then you should be able to make APIs calls using simple curl requests.

    Eg. To Store Data :

    curl -H "Content-Type: application/x-www-form-urlencoded" -X POST  http://35.185.225.67:80/storedata -d id=1 -d name=NoOne

    Eg. To Get Data :

    curl 35.185.225.67:80/getdata/1

    At this stage your application is completely deployed and is externally accessible.

    Manual scaling of pods

    Scaling your application up or down in Kubernetes is quite straightforward. Let’s scale up the test-app deployment.

    $ kubectl scale deployment test-app --replicas=3

    Deployment configuration for test-app will get updated and you can see 3 replicas of test-app are running. Verify it using,

    kubectl get pods

    In the same manner, you can scale down your application by reducing the replica count.

    Cleanup :

    Un-deploying an application from Kubernetes is also quite straightforward. All we have to do is delete the services and delete the deployments. The only caveat is that the deletion of the load balancer is an asynchronous process. You have to wait until it gets deleted.

    $ kubectl delete service mysql-service
    $ kubectl delete service test-service

    The above command will deallocate Load Balancer which was created as a part of test-service. You can check the status of the load balancer with the following command.

    $ gcloud compute forwarding-rules list

    Once the load balancer is deleted, you can clean-up the deployments as well.

    $ kubectl delete deployments test-app
    $ kubectl delete deployments mysql

    Delete the Cluster:

    $ gcloud container clusters delete my-first-cluster

    Conclusion

    In this blog, we saw how easy it is to deploy, scale & terminate applications on Google Container Engine. Google Container Engine abstracts away all the complexity of Kubernetes and gives us a robust platform to run containerised applications. I am super excited about what the future holds for Kubernetes!

    Check out some of Velotio’s other blogs on Kubernetes.

  • An Introduction To Cloudflare Workers And Cloudflare KV store

    Cloudflare Workers

    This post gives a brief introduction to Cloudflare Workers and Cloudflare KV store. They address a fairly common set of problems around scaling an application globally. There are standard ways of doing this but they usually require a considerable amount of upfront engineering work and developers have to be aware of the ‘scalability’ issues to some degree. Serverless application tools target easy scalability and quick response times around the globe while keeping the developers focused on the application logic rather than infra nitty-gritties.

    Global responsiveness

    When an application is expected to be accessed around the globe, requests from users sitting in different time-zones should take a similar amount of time. There can be multiple ways of achieving this depending upon how data intensive the requests are and what those requests actually do.

    Data intensive requests are harder and more expensive to globalize, but again not all the requests are same. On the other hand, static requests like getting a documentation page or a blog post can be globalized by generating markup at build time and deploying them on a CDN.

    And there are semi-dynamic requests. They render static content either with some small amount of data or their content change based on the timezone the request came from.

    The above is a loose classification of requests but there are exceptions, for example, not all the static requests are presentational.

    Serverless frameworks are particularly useful in scaling static and semi-static requests.

    Cloudflare Workers Overview

    Cloudflare worker is essentially a function deployment service. They provide a serverless execution environment which can be used to develop and deploy small(although not necessarily) and modular cloud functions with minimal effort.

    It is very trivial to start with workers. First, lets install wrangler, a tool for managing Cloudfare Worker projects.

    npm i @cloudflare/wrangler -g

    Wrangler handles all the standard stuff for you like project generation from templates, build, config, publishing among other things.

    A worker primarily contains 2 parts: an event listener that invokes a worker and an event handler that returns a response object. Creating a worker is as easy as adding an event listener to a button.

    addEventListener('fetch', event => {
        event.respondWith(handleRequest(event.request))
    })
    
    async function handleRequest(request) {
        return new Response("hello world")
    }

    Above is a simple hello world example. Wrangler can be used to build and get a live preview of your worker.

    wrangler build

    will build your worker. And 

    wrangler preview 

    can be used to take a live preview on the browser. The preview is only meant to be used for testing(either by you or others). If you want the workers to be triggered by your own domain or a workers.dev subdomain, you need to publish it.

    Publishing is fairly straightforward and requires very less configuration on both wrangler and your project.

    Wrangler Configuration

    Just create an account on Cloudflare and get API key. To configure wrangler, just do:

    wrangler config

    It will ask for the registered email and API key, and you are good to go.

    To publish your worker on a workers.dev subdomain, just fill your account ID in the wrangler.toml and hit wrangler publish. The worker will be deployed and live at a generated workers.dev subdomain.

    Regarding Routes

    When you publish on a {script-name}.{subdomain}.workers.dev domain, the script or project associated with script-name will be invoked. There is no way to call a script just from {subdomain}.workers.dev.

    Worker KV

    Workers alone can’t be used to make anything complex without any persistent storage, that’s where Workers KV comes into the picture. Workers KV as it sounds, is a low-latency, high-volume, key-value store that is designed for efficient reads.

    It optimizes the read latency by dynamically spreading the most frequently read entries to the edges(replicated in several regions) and storing less frequent entries centrally.

    Newly added keys(or a CREATE) are immediately reflected in every region while a value change in the keys(or an UPDATE) may take as long as 60 seconds to propagate, depending upon the region.

    Workers KV is only available to paid users of Cloudflare.

    Writing Data in Workers KV

    curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces" 
    -X POST 
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" 
    -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" 
    -H "Content-Type: application/json" 
    --data '{"title": "Requests"}'
    The above HTTP request will create a namespace by the name Requests. The response should look something like this:
    {
        "result": {
            "id": "30b52f55aafb41d88546d01d5f69440a",
            "title": "Requests",
            "supports_url_encoding": true
        },
        "success": true,
        "errors": [],
        "messages": []
    }

    Now we can write KV pairs in this namespace. The following HTTP requests will do the same:

    curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/values/first-key" 
    -X PUT 
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" 
    -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" 
    --data 'My first value!'

    Here the NAMESPACE_ID is the same ID that we received in the last request. First-key is the key name and the My first value is the value.

    Let’s complicate things a little

    Above overview just introduces the managed cloud workers with a ‘hello world’ app and basics of the Workers KV, but now let’s make something more complicated. We will make an app which will tell how many requests have been made from your country till now. For example, if you pinged the worker from the US then it will return number of requests made so far from the US.

    We will need: 

    • Some place to store the count of requests for each country. 
    • Find from which country the Worker was invoked.

    For the first part, we will use the Workers KV to store the count for every request.

    Let’s start

    First, we will create a new project using wrangler: wrangler generate request-count.

    We will be making HTTP calls to write values in the Workers KV, so let’s add ‘node-fetch’ to the project:

    npm install node-fetch

    Now, how do we find from which country each request is coming from? The answer is the cf object that is provided with each request to a worker.

    The cf object is a special object that is passed with each request and can be accessed with request.cf. This mainly contains region specific information along with TLS and Auth information. The details of what is provided in the cf, can be found here.

    As we can see from the documentation, we can get country from

    request.cf.country.

    The cf object is not correctly populated in the wrangler preview, you will need to publish your worker in order to test cf’s usage. An open issue mentioning the same can be found here.

    Now, the logic is pretty straightforward here. When we get a request from a country for which we don’t have an entry in the Worker’s KV, we make an entry with value 1, else we increment the value of the country key.

    To use Workers KV, we need to create a namespace. A namespace is just a collection of key-value pairs where all the keys have to be unique.

    A namespace can be created under the KV tab in Cloudflare web UI by giving the name or using the API call above. You can also view/browse all of your namespaces from the web UI. Following API call can be used to read the value of a key from a namespace:

    curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/values/first-key" 
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" 
    -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" 

    But, it is neither the fastest nor the easiest way. Cloudflare provides a better and faster way to read data from your namespaces. It’s called binding. Each KV namespace can be bound to a worker script so to make it available in the script by the variable name. Any namespace can be bound with any worker. A KV namespace can be bound to a worker by going to the editing menu of a worker from the Cloudflare UI. 

    Following steps show you how to bind a namespace to a worker:

    Go to the edit page of the worker in Cloudflare web UI and click on the KV tab:

    Then add a binding by clicking the ‘Add binding’ button.

    You can select the namespace name and the variable name by which it will be bound. More details can be found here. A binding that I’ve made can be seen in the above image.

    That’s all we need to get this to work. Following is the relevant part of the script:

    const fetch = require('node-fetch')
    
    addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request))
    })
    
    /**
    * Fetch and log a request
    * @param {Request} request
    */
    async function handleRequest(request) {
        const country = request.cf.country
    
        const url = `https://api.cloudflare.com/client/v4/accounts/account-id/storage/kv/namespaces/namespace-id/values/${country}`
    
        let count = await requests.get(country)
    
        if (!count) {
            count = 1
        } else {
            count = parseInt(count) + 1
        }
    
        try {
            response = await fetch(url, {
            method: 'PUT',
            headers: {"X-Auth-Email": "email", "X-Auth-Key": "auth-key"},
            body: `${count}`
            })
        } catch (error) {
            return new Response(error, { status: 500 })
        }
    
        return new Response(`${country}: ${count}`, { status: 200 }) 
    }

    In the above code, I bound the Requests namespace that we created by the requests variable that would be dynamically resolved when we publish.

    The full source of this can be found here.

    This small application also demonstrates some of the practical aspects of the workers. For example, you would notice that the updates take some time to get reflected and response time of the workers is quick, especially when they are deployed on a .workers.dev subdomain here.

    Side note: You will have to recreate the namespace-worker binding everytime you deploy the worker or you do wrangler publish.

    Workers vs. AWS Lambda

    AWS Lambda has been a major player in the serverless market for a while now. So, how is Cloudflare Workers as compared to it? Let’s see.

    Architecture:

    Cloudflare Workers `Isolates` instead of a container based underlying architecture. `Isolates` is the technology that allows V8(Google Chrome’s JavaScript Engine) to run thousands of processes on a single server in an efficient and secure manner. This effectively translates into faster code execution and lowers memory usage. More details can be found here.

    Price:

    The above mentioned architectural difference allows Workers to be significantly cheaper than Lambda. While a Worker offering 50 milliseconds of CPU costs $0.50 per million requests, the equivalent Lambda costs $1.84 per million. A more detailed price comparison can be found here.

    Speed:

    Workers also show significantly better performance numbers than Lambda and Lambda@Edge. Tests run by Cloudflare claim that they are 441% faster than Lambda and 192% faster than Lambda@Edge. A detailed performance comparison can be found here.

    This better performance is also confirmed by serverless-benchmark.

    Wrapping Up:

    As we have seen, Cloudflare Workers along with the KV Store does make it very easy to start with a serverless application. They provide fantastic performance while using less cost along with intuitive deployment. These properties make them ideal for making globally accessible serverless applications.

  • Explanatory vs. Predictive Models in Machine Learning

    My vision on Data Analysis is that there is continuum between explanatory models on one side and predictive models on the other side. The decisions you make during the modeling process depend on your goal. Let’s take Customer Churn as an example, you can ask yourself why are customers leaving? Or you can ask yourself which customers are leaving? The first question has as its primary goal to explain churn, while the second question has as its primary goal to predict churn. These are two fundamentally different questions and this has implications for the decisions you take along the way. The predictive side of Data Analysis is closely related to terms like Data Mining and Machine Learning.

    SPSS & SAS

    When we’re looking at SPSS and SAS, both of these languages originate from the explanatory side of Data Analysis. They are developed in an academic environment, where hypotheses testing plays a major role. This makes that they have significant less methods and techniques in comparison to R and Python. Nowadays, SAS and SPSS both have data mining tools (SAS Enterprise Miner and SPSS Modeler), however these are different tools and you’ll need extra licenses.

    I have spent some time to build extensive macros in SAS EG to seamlessly create predictive models, which also does a decent job at explaining the feature importance. While a Neural Network may do a fair job at making predictions, it is extremely difficult to explain such models, let alone feature importance. The macros that I have built in SAS EG does precisely the job of explaining the features, apart from producing excellent predictions.

    Open source TOOLS: R & PYTHON

    One of the major advantages of open source tools is that the community continuously improves and increases functionality. R was created by academics, who wanted their algorithms to spread as easily as possible. R has the widest range of algorithms, which makes R strong on the explanatory side and on the predictive side of Data Analysis.

    Python is developed with a strong focus on (business) applications, not from an academic or statistical standpoint. This makes Python very powerful when algorithms are directly used in applications. Hence, we see that the statistical capabilities are primarily focused on the predictive side. Python is mostly used in Data Mining or Machine Learning applications where a data analyst doesn’t need to intervene. Python is therefore also strong in analyzing images and videos. Python is also the easiest language to use when using Big Data Frameworks like Spark. With the plethora of packages and ever improving functionality, Python is a very accessible tool for data scientists.

    MACHINE LEARNING MODELS

    While procedures like Logistic Regression are very good at explaining the features used in a prediction, some others like Neural Networks are not. The latter procedures may be preferred over the former when it comes to only prediction accuracy and not explaining the models. Interpreting or explaining the model becomes an issue for Neural Networks. You can’t just peek inside a deep neural network to figure out how it works. A network’s reasoning is embedded in the behavior of numerous simulated neurons, arranged into dozens or even hundreds of interconnected layers. In most cases the Product Marketing Officer may be interested in knowing what are the factors that are most important for a specific advertising project. What can they concentrate on to get the response rates higher, rather than, what will be their response rate, or revenues in the upcoming year. These questions are better answered by procedures which can be interpreted in an easier way. This is a great article about the technical and ethical consequences of the lack of explanations provided by complex AI models.

    Procedures like Decision Trees are very good at explaining and visualizing what exactly are the decision points (features and their metrics). However, those do not produce the best models. Random Forests, Boosting are the procedures which use Decision Trees as the basic starting point to build the predictive models, which are by far some of the best methods to build sophisticated prediction models.

    While Random Forests use fully grown (highly complex) Trees, and by taking random samples from the training set (a process called Bootstrapping), then each split uses only a proper subset of features from the entire feature set to actually make the split, rather than using all of the features. This process of bootstrapping helps with lower number of training data (in many cases there is no choice to get more data). The (proper) subsetting of the features has a tremendous effect on de-correlating the Trees grown in the Forest (hence randomizing it), leading to a drop in Test Set error. A fresh subset of features is chosen at each step of splitting, making the method robust. The strategy also stops the strongest feature from appearing each time a split is considered, making all the trees in the forest similar. The final result is obtained by averaging the result over all trees (in case of Regression problems), or by taking a majority class vote (in case of classification problem).

    On the other hand, Boosting is a method where a Forest is grown using Trees which are NOT fully grown, or in other words, with Weak Learners. One has to specify the number of trees to be grown, and the initial weights of those trees for taking a majority vote for class selection. The default weight, if not specified is the average of the number of trees requested. At each iteration, the method fits these weak learners, finds the residuals. Then the weights of those trees which failed to predict the correct class is increased so that those trees can concentrate better on the failed examples. This way, the method proceeds by improving the accuracy of the Boosted Trees, stopping when the improvement is below a threshold. One particularly implementation of Boosting, AdaBoost has very good accuracy over other implementations. AdaBoost uses Trees of depth 1, known as Decision Stump as each member of the Forest. These are slightly better than random guessing to start with, but over time they learn the pattern and perform extremely well on test set. This method is more like a feedback control mechanism (where the system learns from the errors). To address overfitting, one can use the hyper-parameter Learning Rate (lambda) by choosing values in the range: (0,1]. Very small values of lambda will take more time to converge, however larger values may have difficulty converging. This can be achieved by a iterative process to select the correct value for lambda, plotting the test error rate against values of lambda. The value of lambda with the lowest test error should be chosen.

    In all these methods, as we move from Logistic Regression, to Decision Trees to Random Forests and Boosting, the complexity of the models increase, making it almost impossible to EXPLAIN the Boosting model to marketers/product managers. Decision Trees are easy to visualize, Logisitic Regression results can be used to demonstrate the most important factors in a customer acquisition model and hence will be well received by business leaders. On the other hand, the Random Forest and Boosting methods are extremely good predictors, without much scope for explaining. But there is hope: These models have functions for revealing the most important variables, although it is not possible to visualize why. 

    USING A BALANCED APPROACH

    So I use a mixed strategy: Use the previous methods as a step in Exploratory Data Analysis, present the importance of features, characteristics of the data to the business leaders in phase one, and then use the more complicated models to build the prediction models for deployment, after building competing models. That way, one not only gets to understand what is happening and why, but also gets the best predictive power. In most cases that I have worked, I have rarely seen a mismatch between the explanation and the predictions using different methods. After all, this is all math and the way of delivery should not change end results. Now that’s a happy ending for all sides of the business!

  • Installing Redis Service in DC/OS With Persistent Storage Using AWS Volumes

    Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker.

    It supports various data structures such as Strings, Hashes, Lists, Sets etc. DCOS offers Redis as a service

    Why Do We Use External Persistent Storage for Redis Mesos Containers?

    Since Redis is an in-memory database, an instance/service restart will result in loss of data. To counter this, it is always advisable to snapshot the Redis in-memory database from time to time.

    This helps Redis instance to recover from the point in time failure.

    In DCOS, Redis is deployed as a stateless service. To make it a stateful and persistent data, we can configure local volumes or external volumes.

    The disadvantage of having a local volume mapped to Mesos containers is when a slave node goes down, your local volume becomes unavailable, and the data loss occurs.

    However, with external persistent volumes, as they are available on each node of the DCOS cluster, a slave node failure does not impact the data availability.

    Rex-Ray

    REX-Ray is an open source, storage management solution designed to support container runtimes such as Docker and Mesos.

    REX-Ray enables stateful applications such as databases to persist and maintain its data after the life cycle of the container has ended. Built-in high availability enables orchestrators such as Docker Swarm, Kubernetes, and Mesos Frameworks like Marathon to automatically orchestrate storage tasks between hosts in a cluster.

    Built on top of the libStorage framework, REX-Ray’s simplified architecture consists of a single binary and runs as a stateless service on every host using a configuration file to orchestrate multiple storage platforms.

    Objective: To create a Redis service in DC/OS environment with persistent storage.

    Warning: The Persistent Volume feature is still in beta Phase for DC/OS Version 1.11.

    Prerequisites:

    • Make sure the rexray service is running and is in a healthy state for the cluster.

    Steps:

    • Click on the Add button in Services component of DC/OS GUI.
    • Click on JSON Configuration.  

    Note: For persistent storage, below code should be added in the normal Redis service configuration JSON file to mount external persistent volumes.

    "volumes": [
          {
            "containerPath": "/data",
            "mode": "RW",
            "external": {
              "name": "redis4volume",
              "provider": "dvdi",
              "options": {
                "dvdi/driver": "rexray"
              }
            }
          }
        ],

    • Make sure the service is up and in a running state:

    If you look closely, the service was suspended and respawned on a different slave node. We populated the database with dummy data and saved the snapshot in the data directory.

    When the service did come upon a different node 10.0.3.204, the data persisted and the volume was visible on the new node.

    core@ip-10-0-3-204 ~ $ /opt/mesosphere/bin/rexray volume list
    
    - name: datavolume
      volumeid: vol-00aacade602cf960c
      availabilityzone: us-east-1a
      status: in-use
      volumetype: standard
      iops: 0
      size: "16"
      networkname: ""
      attachments:
      - volumeid: vol-00aacade602cf960c
        instanceid: i-0d7cad91b62ec9a64
        devicename: /dev/xvdb
    

    •  Check the volume tab :

    Note: For external volumes, the status will be unavailable. This is an issue with DC/OS.

    The Entire Service JSON file:

    {
      "id": "/redis4.0-new-failover-test",
      "instances": 1,
      "cpus": 1.001,
      "mem": 2,
      "disk": 0,
      "gpus": 0,
      "backoffSeconds": 1,
      "backoffFactor": 1.15,
      "maxLaunchDelaySeconds": 3600,
      "container": {
        "type": "DOCKER",
        "volumes": [
          {
            "containerPath": "/data",
            "mode": "RW",
            "external": {
              "name": "redis4volume",
              "provider": "dvdi",
              "options": {
                "dvdi/driver": "rexray"
              }
            }
          }
        ],
        "docker": {
          "image": "redis:4",
          "network": "BRIDGE",
          "portMappings": [
            {
              "containerPort": 6379,
              "hostPort": 0,
              "servicePort": 10101,
              "protocol": "tcp",
              "name": "default",
              "labels": {
                "VIP_0": "/redis4.0:6379"
              }
            }
          ],
          "privileged": false,
          "forcePullImage": false
        }
      },
      "healthChecks": [
        {
          "gracePeriodSeconds": 60,
          "intervalSeconds": 5,
          "timeoutSeconds": 5,
          "maxConsecutiveFailures": 3,
          "portIndex": 0,
          "protocol": "TCP"
        }
      ],
      "upgradeStrategy": {
        "minimumHealthCapacity": 0.5,
        "maximumOverCapacity": 0
      },
      "unreachableStrategy": {
        "inactiveAfterSeconds": 300,
        "expungeAfterSeconds": 600
      },
      "killSelection": "YOUNGEST_FIRST",
      "requirePorts": true
    }

    Redis entrypoint

    To connect with Redis service, use below host:port in your applications:

    redis.marathon.l4lb.thisdcos.directory:6379

    Conclusion

    We learned about Standalone Redis Service deployment from DCOS catalog on DCOS.  Also, we saw how to add Persistent storage to it using RexRay. We also learned how RexRay automatically manages volumes over AWS ebs and how to integrate them in DCOS apps/services.  Finally, we saw how other applications can communicate with this Redis service.

    References