Author: admin

  • Implementing Federated GraphQL Microservices using Apollo Federation

    Introduction

    GraphQL has revolutionized how clients query servers. With a thin layer of GraphQL middleware, the client can query data more flexibly and comprehensively than with typical REST APIs.

    One of the key principles of GraphQL involves having a single data graph of the implementing services that will allow the client to have a unified interface to access more data and services through a single query. Having said that, it can be challenging to follow this principle for an enterprise-level application on a single, monolith GraphQL server.

    The Need for Federated Services

    James Baxley III, the Engineering Manager at Apollo, in his talk here, makes a strong case for an independently managed, federated set of services.

    To summarize his point, let’s consider a very complex enterprise product. This product would essentially have multiple teams responsible for maintaining different modules of the product. Now, if we’re considering implementing a GraphQL layer at the backend, it would only make sense to follow the one graph principle of GraphQL: this says that to maximize the value of GraphQL, we should have a single unified data graph that’s operating at the data layer of this product. With that, it will be easier for a client to query a single graph and get all the data without having to query different graphs for different data portions.

    However, it would be challenging to have all of the huge enterprise data graphs’ layer logic residing on a single codebase. In addition, we want teams to be able to independently implement, maintain, and ship different schemas of the data graph on their own release cycles.

    Though there is only one graph, the implementation of that graph should be federated across multiple teams.

    Now, let’s consider a massive enterprise e-commerce platform as an example. The different schemas of the e-commerce platform look something like:

    Fig:- E-commerce platform set of schemas

    Considering the above example, it would be a chaotic task to maintain the graph implementation logic of all these schemas on a single code base. Another overhead that this would bring is having to scale a huge monolith that’s implementing all these services. 

    Thus, one solution is a federation of services forming a single distributed data graph. Each service can be implemented independently by an individual team, with its own release cycle and its own iterations. A federated set of services still follows the one graph principle of GraphQL, allowing the client to query a single endpoint to fetch any part of the data graph.

    To further demonstrate the example above, let’s say the client asks for the top five products, their reviews, and the vendors selling them. On a monolithic GraphQL server, this query would involve writing resolvers that mesh together the data sources of these individual schemas, forcing teams to collaborate on a single tangled implementation. With a federated approach, separate services implement products, reviews, and vendors, and each service resolves only the part of the data graph whose schema and data source it owns. This keeps ownership clean and lets the teams managing different schemas collaborate easily.
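    For instance, a single federated query touching all three services might look like this (the field names here are illustrative, not from the original post):

```graphql
{
  topProducts(first: 5) {
    name
    reviews {
      rating
    }
    vendor {
      name
    }
  }
}
```

    The client sends this to one endpoint; the gateway splits it so the products, reviews, and vendors services each resolve only their own fields.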

    Another advantage is scaling individual services rather than maintaining a compute-heavy monolith for a huge data graph. For example, if the products service is used heavily on the platform while the vendors service is scarcely used, a monolith would have to be scaled as a whole. Federated services eliminate this: we can independently maintain and scale individual services like the products service.

    Federated Implementation of GraphQL Services

    A monolith GraphQL server that implements a lot of services for different schemas can be challenging to scale. Instead of implementing the complete data graph on a single codebase, the responsibilities of different parts of the data graph can be split across multiple composable services. Each one will contain the implementation of only the part of the data graph it is responsible for. Apollo Federation allows this division of services and follows a declarative programming model to allow splitting of concerns.

    Architecture Overview

    This article will not cover the basics of GraphQL, such as writing resolvers and schemas. If you’re not acquainted with the basics of GraphQL and setting up a basic GraphQL server using Apollo, I would highly recommend reading about it here. Then, you can come back here to understand the implementation of federated services using Apollo Federation.

    Apollo Federation has two principal parts to it:

    • A collection of services that each define a distinct part of the overall GraphQL schema
    • A gateway that composes these schemas into the federated data graph and acts as the single entry point, routing each part of an incoming query to the service that implements it
    Fig:- Apollo Federation Architecture

    Separation of Concerns

    The usual way to implement federated services is to split an existing monolith along its already-defined schemas. Although this seems like a clean approach, it quickly causes problems when multiple schemas are involved.

    To illustrate, this is a typical way to split services from a monolith based on the existing defined Schemas:
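    (The snippet embedded in the original post is not reproduced here; a plausible reconstruction of such a naive split, based on the discussion that follows, would be:)

```graphql
# User service — naive split: keeps the tweets field too
type User {
  id: ID!
  username: String!
  tweets: [Tweet]
}

# Tweet service — naive split: keeps the creator field too
type Tweet {
  id: ID!
  text: String
  creator: User
}
```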

     

    In the example above, although the tweets field belongs to the User schema, it wouldn’t make sense to populate this field in the User service. The tweets field of a User should be declared and resolved in the Tweet service itself. Similarly, it wouldn’t be right to resolve the creator field inside the Tweet service.

    The reason behind this approach is the separation of concerns. The User service might not even have access to the Tweet datastore to be able to resolve the tweets field of a user. On the other hand, the Tweet service might not have access to the User datastore to resolve the creator field of the Tweet schema.

    Considering the above schemas, each service should resolve only the fields of the schemas it owns.

    Implementation

    To illustrate Apollo Federation, we’ll build a Node.js server written in TypeScript. The packages used are provided by the Apollo libraries:

    npm i --save apollo-server @apollo/federation @apollo/gateway

    Some additional libraries to help run the services in parallel:

    npm i --save nodemon ts-node concurrently

    Let’s go ahead and write the structure for the gateway service first. Let’s create a file gateway.ts:

    touch gateway.ts

    And add the following code snippet:

    import { ApolloServer } from 'apollo-server';
    import { ApolloGateway } from '@apollo/gateway';
    
    const gateway = new ApolloGateway({
      serviceList: [],
    });
    
    const server = new ApolloServer({ gateway, subscriptions: false });
    
    server.listen().then(({ url }) => {
      console.log(`Server ready at url: ${url}`);
    });

    Note that serviceList is an empty array for now, since we’ve yet to implement the individual services. In addition, we pass the subscriptions: false option to the Apollo Server config because Apollo Federation does not currently support subscriptions.

    Next, let’s add the User service in a separate file user.ts using:

    touch user.ts

    The code will go in the user service as follows:

    import { buildFederatedSchema } from '@apollo/federation';
    import { ApolloServer, gql } from 'apollo-server';
    import User from './datasources/models/User';
    import mongoStore from './mongoStore';
    
    const typeDefs = gql`
      type User @key(fields: "id") {
        id: ID!
        username: String!
      }
      extend type Query {
        users: [User]
        user(id: ID!): User
      }
      extend type Mutation {
        createUser(userPayload: UserPayload): User
      }
      input UserPayload {
        username: String!
      }
    `;
    
    const resolvers = {
      Query: {
        users: async () => {
          const allUsers = await User.find({});
          return allUsers;
        },
        user: async (_, { id }) => {
          const currentUser = await User.findOne({ _id: id });
          return currentUser;
        },
      },
      User: {
        __resolveReference: async (ref) => {
          const currentUser = await User.findOne({ _id: ref.id });
          return currentUser;
        },
      },
      Mutation: {
        createUser: async (_, { userPayload: { username } }) => {
          const user = new User({ username });
          const createdUser = await user.save();
          return createdUser;
        },
      },
    };
    
    mongoStore();
    
    const server = new ApolloServer({
      schema: buildFederatedSchema([{ typeDefs, resolvers }]),
    });
    
    server.listen({ port: 4001 }).then(({ url }) => {
      console.log(`User service ready at url: ${url}`);
    });

    Let’s break down the code that went into the User service.

    Consider the User schema definition:

    type User @key(fields: "id") {
       id: ID!
       username: String!
    }

    The @key directive tells other services that User is an entity whose instances can be referenced and extended from within other services. The fields argument specifies that an individual User instance is uniquely identified by its id.

    According to the Apollo Federation documentation, the Query and Mutation types must be extended by all implementing services, since they are always defined at the gateway level.

    As a side note, the User model imported from ./datasources/models/User is essentially a Mongoose ODM model for MongoDB that handles all the CRUD operations for the User entity. In addition, the mongoStore() function is responsible for establishing a connection to the MongoDB database server.

    The User model implementation in Mongoose looks something like this:

    import mongoose, { Schema } from 'mongoose';
    
    export const UserSchema = new Schema({
      username: {
        type: String,
      },
    });
    
    export default mongoose.model('User', UserSchema);

    In the Query type, the users and the user(id: ID!) queries fetch a list or the details of individual users.

    In the resolvers, we define a __resolveReference function that returns an instance of the User entity to any other implementing service that holds only a reference id of a User and needs the full entity. The ref parameter is an object of the shape { id: 'userEntityId' }, passed down from other implementing services that need a User reference resolved. Internally, we fire a Mongoose findOne query to return the matching User instance from the users collection. To illustrate the resolver:

    User: {
      __resolveReference: async (ref) => {
        const currentUser = await User.findOne({ _id: ref.id });
        return currentUser;
      },
    },

    At the end of the file, we make sure the service runs on a unique port, 4001, which we pass as an option to server.listen. That concludes the User service.

    Next, let’s add the tweet service by creating a file tweet.ts using:

    touch tweet.ts

    The following code goes as a part of the tweet service:

    import { buildFederatedSchema } from '@apollo/federation';
    import { ApolloServer, gql } from 'apollo-server';
    import Tweet from './datasources/models/Tweet';
    import TweetAPI from './datasources/tweet';
    import mongoStore from './mongoStore';
    
    const typeDefs = gql`
      type Tweet {
        text: String
        id: ID!
        creator: User
      }
      extend type User @key(fields: "id") {
        id: ID! @external
        tweets: [Tweet]
      }
      extend type Query {
        tweet(id: ID!): Tweet
        tweets: [Tweet]
      }
      extend type Mutation {
        createTweet(tweetPayload: TweetPayload): Tweet
      }
      input TweetPayload {
        userId: String
        text: String
      }
    `;
    
    const resolvers = {
      Query: {
        tweet: async (_, { id }) => {
          const currentTweet = await Tweet.findOne({ _id: id });
          return currentTweet;
        },
        tweets: async () => {
          const tweetsList = await Tweet.find({});
          return tweetsList;
        },
      },
      Tweet: {
        creator: (tweet) => ({ __typename: 'User', id: tweet.userId }),
      },
      User: {
        tweets: async (user) => {
          const tweetsByUser = await Tweet.find({ userId: user.id });
          return tweetsByUser;
        },
      },
      Mutation: {
        createTweet: async (_, { tweetPayload: { text, userId } }) => {
          const newTweet = new Tweet({ text, userId });
          const createdTweet = await newTweet.save();
          return createdTweet;
        },
      },
    };
    
    mongoStore();
    
    const server = new ApolloServer({
      schema: buildFederatedSchema([{ typeDefs, resolvers }]),
    });
    
    server.listen({ port: 4002 }).then(({ url }) => {
      console.log(`Tweet service ready at url: ${url}`);
    });

    Let’s break down the Tweet service as well.

    type Tweet {
       text: String
       id: ID!
       creator: User
    }

    The Tweet schema has a text field, which is the content of the tweet; a unique id; and a creator field, which is of the User entity type and resolves to the details of the user that created the tweet:

    extend type User @key(fields: "id") {
       id: ID! @external
       tweets: [Tweet]
    }

    We extend the User entity schema in this service; its id field carries the @external directive. This tells the Tweet service that the id field is defined elsewhere, and that the full User instance must be derived from another service (the User service in this case) based on that id.

    As we discussed previously, the tweets field of the extended User schema for the user entity should be resolved in the Tweet service since all the resolvers and access to the data sources with respect to the Tweets entity resides in this service.

    The Query and Mutation types of the Tweet service are straightforward: we have tweets and tweet(id: ID!) queries to resolve a list of tweets or an individual Tweet instance.

    Let’s further break down the resolvers:

    Tweet: {
       creator: (tweet) => ({ __typename: 'User', id: tweet.userId }),
    },

    To resolve the creator field of the Tweet entity, the Tweet service needs to tell the gateway that this field will be resolved by the User service. Hence, we pass the id of the User and a __typename for the gateway to be able to call the right service to resolve the User entity instance. In the User service earlier, we wrote a  __resolveReference resolver, which will resolve the reference of a User based on an id.
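    To make this expansion step concrete, here is a simplified, hypothetical sketch of what the gateway conceptually does with the { __typename, id } representation: route it to the owning type’s __resolveReference and swap in the full entity. The names and in-memory store below are illustrative, not Apollo internals:

```typescript
// Shape of the stub that the Tweet service's creator resolver emits.
interface Representation {
  __typename: string;
  id: string;
}

interface User {
  id: string;
  username: string;
}

// Stand-in for the User service's datastore.
const usersById = new Map<string, User>([
  ['u1', { id: 'u1', username: '@elonmusk' }],
]);

// Per-type reference resolvers, mirroring __resolveReference per entity.
const referenceResolvers: Record<string, (ref: Representation) => User | null> = {
  User: (ref) => usersById.get(ref.id) ?? null,
};

// The gateway-side step: expand representations into full entities.
function resolveEntities(reps: Representation[]): (User | null)[] {
  return reps.map((rep) => referenceResolvers[rep.__typename](rep));
}

// The creator resolver only emits a representation...
const creatorRep: Representation = { __typename: 'User', id: 'u1' };
// ...which the gateway expands into the full User entity.
const [creator] = resolveEntities([creatorRep]);
console.log(creator?.username); // "@elonmusk"
```

    The real gateway does this over the network per service, but the shape of the exchange is the same: a stub with a type name and key fields goes in, a full entity comes out.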

    User: {
       tweets: async (user) => {
           const tweetsByUser = await Tweet.find({ userId: user.id });
           return tweetsByUser;
       },
    },

    Now, we need to resolve the tweets field of the User entity extended in the Tweet service. We write a resolver that receives the parent User entity reference as its first argument and uses its id in a Mongoose query to return all the tweets created by that user.

    At the end of the file, similar to the User service, we make sure the Tweet service runs on a different port by adding the port: 4002 option to the Apollo server config. That concludes both our implementing services.

    Now that we have our services ready, let’s update our gateway.ts file to reflect the added services:

    import { ApolloServer } from 'apollo-server';
    import { ApolloGateway } from '@apollo/gateway';
    
    const gateway = new ApolloGateway({
      serviceList: [
        { name: 'users', url: 'http://localhost:4001' },
        { name: 'tweets', url: 'http://localhost:4002' },
      ],
    });
    
    const server = new ApolloServer({ gateway, subscriptions: false });
    
    server.listen().then(({ url }) => {
      console.log(`Server ready at url: ${url}`);
    });

    We’ve added two services to the serviceList with a unique name to identify each service followed by the URL they are running on.

    Next, let’s make some small changes to the package.json file to make sure the services and the gateway run in parallel:

    "scripts": {
        "start": "concurrently -k npm:server:*",
        "server:gateway": "nodemon --watch 'src/**/*.ts' --exec 'ts-node' src/gateway.ts",
        "server:user": "nodemon --watch 'src/**/*.ts' --exec 'ts-node' src/user.ts",
        "server:tweet": "nodemon --watch 'src/**/*.ts' --exec 'ts-node' src/tweet.ts"
      },

    The concurrently library runs the three scripts in parallel. Each server:* script spins up a dev server, using nodemon to watch for changes and reload, and ts-node to execute the TypeScript entry point with Node.

    Let’s spin up our server:

    npm start

    On visiting http://localhost:4000, you should see the GraphQL query playground served by the Apollo gateway:

    Querying and Mutation from the Client

    Initially, let’s fire some mutations to create two users and some tweets by those users.

    Mutations

    First, fire a mutation in the GraphQL playground to create a user with the username “@elonmusk”; it returns the id of the created user:

     

    We will create another user named “@billgates” and take a note of the ID.

    Now that we have two users, let’s fire some mutations to create tweets by those users. Here is a simple mutation that creates a tweet by the user “@elonmusk”.

    Here is another mutation that creates a tweet by the user “@billgates”.
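    The mutation screenshots from the original post are not reproduced here; based on the schemas above, the mutations being fired would look something like:

```graphql
mutation {
  createUser(userPayload: { username: "@elonmusk" }) {
    id
  }
}

mutation {
  createTweet(tweetPayload: { userId: "<id returned above>", text: "I own Tesla" }) {
    id
    text
  }
}
```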

    After adding a couple of those, we are good to fire our queries, which will allow the gateway to compose the data by resolving fields through different services.

    Queries

    Initially, let’s list all the tweets along with their creator, which is of type User. The query will look something like:

    {
     tweets {
       text
       creator {
         username
       }
     }
    }

    When the gateway encounters a query asking for tweet data, it forwards that query to the Tweet service, since the Tweet service extends the Query type with the tweets query.

    On encountering the creator field of the tweet schema, which is of the type User, the creator resolver within the Tweet service is invoked. This is essentially just passing a __typename and an id, which tells the gateway to resolve this reference from another service.

    In the User service, the __resolveReference function returns the complete User instance given the id passed from the Tweet service. It likewise serves any other implementing service that needs a User entity reference resolved.
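    Under the hood, the gateway performs this expansion by querying the reference-owning service through Federation’s built-in _entities field, with a request roughly of this shape:

```graphql
query ($representations: [_Any!]!) {
  _entities(representations: $representations) {
    ... on User {
      username
    }
  }
}
```

    Here $representations carries objects like { "__typename": "User", "id": "..." }, which the User service hands to __resolveReference.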

    On firing the query, the response should look something like:

    {
      "data": {
        "tweets": [
          {
            "text": "I own Tesla",
            "creator": {
              "username": "@elonmusk"
            }
          },
          {
            "text": "I own SpaceX",
            "creator": {
              "username": "@elonmusk"
            }
          },
          {
            "text": "I own PayPal",
            "creator": {
              "username": "@elonmusk"
            }
          },
          {
            "text": "I own Microsoft",
            "creator": {
              "username": "@billgates"
            }
          },
          {
            "text": "I own XBOX",
            "creator": {
              "username": "@billgates"
            }
          }
        ]
      }
    }

    Now, let’s try it the other way round. Let’s list all users and add the field tweets that will be an array of all the tweets created by that user. The query should look something like:

    {
     users {
       username
       tweets {
         text
       }
     }
    }

    When the gateway encounters the users query, it passes that query down to the User service, which is responsible for resolving the username field.

    On encountering the tweets field of the users query, the gateway checks if any other implementing service has extended the User entity and has a resolver written within the service to resolve any additional fields of the type User.

    The Tweet service has extended the type User and has a resolver for the User type to resolve the tweets field, which will fetch all the tweets created by the user given the id of the user.

    On firing the query, the response should be something like:

    {
      "data": {
        "users": [
          {
            "username": "@elonmusk",
            "tweets": [
              {
                "text": "I own Tesla"
              },
              {
                "text": "I own SpaceX"
              },
              {
                "text": "I own PayPal"
              }
            ]
          },
          {
            "username": "@billgates",
            "tweets": [
              {
                "text": "I own Microsoft"
              },
              {
                "text": "I own XBOX"
              }
            ]
          }
        ]
      }
    }

    Conclusion

    Scaling an enterprise data graph on a monolithic GraphQL server brings many challenges. Apollo Federation lets us distribute the data graph across implementing services that can be individually maintained and scaled, which addresses these concerns.

    There are further advantages to federated services. In our example above, the User and Tweet services could use two different kinds of datastores: the User data could reside in a NoSQL database like MongoDB while the Tweet data sits in a SQL database like Postgres. This is easy to implement because each service is responsible for resolving references only for the types it owns.

    Final Thoughts

    One of the key advantages of having different services that can be maintained individually is the ability to deploy each service separately. In addition, this also enables deployment of different services independently to different platforms such as Firebase, Lambdas, etc.

    A single monolith GraphQL server deployed on an instance or a single serverless platform can have some challenges with respect to scaling an instance or handling high concurrency as mentioned above.

    By splitting out the services, we could have a separate serverless function for each implementing service that can be maintained or scaled individually and also a separate function on which the gateway can be deployed.

    One popular usage of GraphQL federation can be seen in this Netflix Technology blog, where they explain how they solved a bottleneck with the GraphQL APIs in Netflix Studio. They built a federated GraphQL microservices architecture, along with a schema store, using Apollo Federation. This solution gave them a unified schema with distributed ownership and implementation.

  • How to Write Jenkinsfile for Angular and .Net Based Applications

    If you landed here directly and want to know how to set up a Jenkins master-slave architecture, please visit this post on Setting-up the Jenkins Master-Slave Architecture.

    The source code that we are using here is also a continuation of the code that was written in this GitHub Packer-Terraform-Jenkins repository.

    Creating Jenkinsfile

    We will create some Jenkinsfiles to execute jobs from our Jenkins master.

    Here I will create two Jenkinsfiles. Ideally, your Jenkinsfile should live in the source code repo, but it can also be passed directly in the job.

    There are two ways of writing a Jenkinsfile: scripted and declarative. You can find numerous comparisons online. We will write one of each to do a build, so that we get the hang of both.

    Jenkinsfile for Angular App (Scripted)

    As mentioned before, we will highlight both formats of writing a Jenkinsfile. For the Angular app we will write a scripted one, though it could easily be written in declarative format too.

    We will be running this inside a docker container. Thus, the tests are also going to get executed in a headless manner.

    Here is the Jenkinsfile for reference.

    Here we leverage a Docker volume so the source code stays on the host machine while the Docker containers provide the build environments.

    Dissecting Node App’s Jenkinsfile

    1. We use cleanWs() to clear the workspace.
    2. Next is the main build, in which we define our complete build process.
    3. We pull the required images.
    4. We highlight the steps that we will be executing.
    5. Checkout SCM: checking out our code from Git.
    6. We start the Node container, inside which we run npm install and npm run lint.
    7. Get test dependency: here we download chrome.json, which is used in the next step when starting the container.
    8. Here we test our app. The specific changes needed for running the tests are mentioned below.
    9. Build: finally, we build the app.
    10. Deploy: once CI is complete, we start with CD. CD deserves a blog post of its own, but we want to highlight what a basic deployment would do.
    11. We use an Nginx container to host our application.
    12. If the container does not exist, it is created and the “dist” folder is used for deployment.
    13. If the Nginx container already exists, the pipeline asks for user input on whether to recreate it.
    14. If you choose not to recreate it, don’t worry: since we are using Nginx, it will do a hot reload with the new changes.
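    The Jenkinsfile itself is embedded as a gist in the original post and not reproduced here; a trimmed scripted sketch of the steps above (image tags, ports, and paths are illustrative) would look like:

```groovy
node {
    stage('Clean workspace') {
        cleanWs()
    }
    stage('Checkout SCM') {
        checkout scm
    }
    stage('Install and Lint') {
        // Run npm steps inside a Node container; the workspace is mounted in
        docker.image('node:12').inside {
            sh 'npm install'
            sh 'npm run lint'
        }
    }
    stage('Test') {
        // Headless Chrome tests inside the container
        docker.image('node:12').inside {
            sh 'npm run test -- --watch=false --browsers=ChromeHeadless'
        }
    }
    stage('Build') {
        docker.image('node:12').inside {
            sh 'npm run build'
        }
    }
    stage('Deploy') {
        // Serve the dist folder via an Nginx container
        sh 'docker run -d --name angular-app -p 80:80 -v $WORKSPACE/dist:/usr/share/nginx/html nginx'
    }
}
```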

    The Angular application used here was created with the standard generate command of the Angular CLI. Although the build and install steps give no trouble on bare metal, some tweaks are required for running the tests in a container.

    In karma.conf.js, update browsers with ChromeHeadless.

    Next, in protractor.conf.js, update browserName with chrome and add:

    chromeOptions: {
      args: ['--headless', '--disable-gpu', '--window-size=800x600']
    },

    That’s it! We now have our CI pipeline set up for an Angular-based application.

    Jenkinsfile for .Net App (Declarative)

    For a .Net application, we have to set up MSBuild and MSDeploy. In the blog post mentioned above we already set up MSBuild, and we will shortly discuss how to set up MSDeploy.

    To do the Windows deployment we have two options: either configure MSBuild in Jenkins Global Tool Configuration, or use the full path of MSBuild on the slave machine.

    Passing the path is fairly simple and here we will discuss how to use global tool configuration in a Jenkinsfile.

    First, get the path of MSBuild from your server. Depending on the version installed, MSBuild lives under either a Current directory or a directory named after the version.

    As we are using MSBuild 2017, our MSBuild path is:

    C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\MSBuild\15.0\Bin

    Place this in /configureTools/ —> MSBuild

    Now you have your configuration ready to be used in Jenkinsfile.

    Jenkinsfile to build and test the app is given below.
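    (The Jenkinsfile is embedded in the original post; a condensed declarative sketch matching the dissection that follows would be something like this — ‘MSBuild’ is assumed to be the tool name you configured, and the solution/project paths are illustrative:)

```groovy
pipeline {
    agent any
    stages {
        stage('Clean workspace') {
            steps { cleanWs() }
        }
        stage('Checkout') {
            steps { checkout scm }
        }
        stage('Nuget Restore') {
            steps { bat 'nuget restore PrimeService.sln' }
        }
        stage('Build') {
            steps {
                // Resolve MSBuild from Global Tool Configuration by name
                bat "\"${tool 'MSBuild'}\\MSBuild.exe\" PrimeService.sln /p:Configuration=Release"
            }
        }
        stage('UnitTest') {
            steps { bat 'dotnet test PrimeService.Tests' }
        }
        stage('Deploy') {
            steps {
                // An MSDeploy push to the IIS server would go here
                bat 'echo deploy'
            }
        }
    }
}
```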

    As seen above, the structure of the declarative syntax is almost the same as that of the scripted one. Opt for whichever syntax you find easier to read.

    Dissecting Dotnet App’s Jenkinsfile

    1. In this case too we are cleaning the workspace as the first step.
    2. Checkout: This is also the same as before.
    3. Nuget Restore: we download the required dependency packages for both PrimeService and PrimeService.Tests.
    4. Build: building the .Net app using the MSBuild tool we configured earlier, before writing the Jenkinsfile.
    5. UnitTest: here we use dotnet test. We could have used MSTest as well; we just wanted to highlight how easy the dotnet utility makes this. We could even use dotnet build for the build step.
    6. Deploy: deploying on the IIS server. We cover creating the IIS server below.

    From the examples above, you get a sense of what a Jenkinsfile looks like and how it can be used for creating jobs. These files cover basic job creation, but they can be extended to everything that old-style job creation could do.

    Creating IIS Server

    Unlike our Angular application, where we just had to pull another image and we were good to go, here we will have to use Packer to create our IIS server. We will automate the creation process and use the server to host applications.

    Here is a Powershell script for IIS for reference.

    # To list all Windows Features: dism /online /Get-Features
    # Get-WindowsOptionalFeature -Online 
    # LIST All IIS FEATURES: 
    # Get-WindowsOptionalFeature -Online | where FeatureName -like 'IIS-*'
    
    # NetFx dependencies
    dism /online /Enable-Feature /FeatureName:NetFx4 /All
    
    # ASP dependencies
    dism /online /enable-feature /all /featurename:IIS-ASPNET45
    
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-WebServerRole
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-WebServer 
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-CommonHttpFeatures
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-Security 
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-RequestFiltering 
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-StaticContent
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-DefaultDocument
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-DirectoryBrowsing
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-HttpErrors 
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-ApplicationDevelopment
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-WebSockets 
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-ApplicationInit
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-NetFxExtensibility45
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-ISAPIExtensions
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-ISAPIFilter
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-ASP
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-ASPNET45
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-ServerSideIncludes
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-HealthAndDiagnostics
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-HttpLogging 
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-Performance
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-HttpCompressionStatic
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-WebServerManagementTools
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-ManagementConsole 
    Enable-WindowsOptionalFeature -Online -FeatureName IIS-ManagementService
    
    # Install Chocolatey
    Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
    
    # Install WebDeploy (It will deploy 3.6)
    choco install webdeploy -y

    We won’t be deploying any application on it, as we have only created a sample PrimeNumber app. But in the real world you might be deploying web-based applications, and you will need IIS. We have covered the basic idea of installing IIS along with any dependencies that might be required.

    Conclusion

    In this post, we covered deploying Windows- and Linux-based applications using a Jenkinsfile in both scripted and declarative formats.

    Thanks for Reading! Till next time…!!

  • Getting Started With Kubernetes Operators (Ansible Based) – Part 2

    Introduction

    In the first part of this blog series, Getting Started with Kubernetes Operators (Helm Based), we learned the basics of operators and built a Helm-based operator. In this blog post, we will try out an Ansible-based operator. Ansible is a very popular tool used by organizations across the globe for configuration management, deployment, and automation of other operational tasks. This makes Ansible an ideal tool for building operators, since operators, too, aim to eliminate or minimize the manual intervention required to run and manage applications on Kubernetes. Ansible-based operators allow us to use Ansible playbooks and roles to manage our applications on Kubernetes.

    Operator Maturity Model

    Image source: Github

    Before we start building the operator, let’s spend some time understanding the operator maturity model. The maturity model gives an idea of the kinds of application management capabilities different types of operators can have. As we can see in the diagram above, the model describes five generic phases of maturity/capability for operators. The minimum expectation of an operator is that it should be able to deploy/install and upgrade an application, and all operator types provide this. Helm-based operators are the simplest of them all, since Helm is a chart manager and can only perform installs and upgrades. Ansible-based operators can be more mature because Ansible has modules to perform a wide variety of operational tasks; we can use these modules in the Ansible roles/playbooks our operator runs and have it handle more complex applications or use cases. In the case of Golang-based operators, we write the operational logic ourselves, so we have the liberty to customize it as per our requirements.

    Building an Ansible Based Operator

    1. Let’s first install the operator sdk

    go get -d github.com/operator-framework/operator-sdk
    cd $GOPATH/src/github.com/operator-framework/operator-sdk
    git checkout master
    make dep
    make install

    Now we will have the operator-sdk binary in the $GOPATH/bin folder.

    2. Set up the project

    operator-sdk new bookstore-operator --api-version=blog.velotio.com/v1alpha1 --kind=BookStore --type=ansible

    In the above command, we have set the operator type to ansible because we want an Ansible-based operator. It creates a folder structure as shown below:

    bookstore-operator/
    |- build/        # Contains the Dockerfile to build the operator image.
    |- deploy/       # Contains the CRD, CR, and manifest files for deploying the operator.
    |- roles/        # Contains the Ansible role the operator executes.
    |- molecule/     # Molecule is used for testing the Ansible roles.
    |- watches.yaml  # Specifies the resource the operator watches (maintains the state of).

    Inside the roles folder, it creates an Ansible role named `bookstore`. This role is bootstrapped with all the directories and files that are part of a standard Ansible role.
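    For reference, a freshly bootstrapped role follows the usual ansible-galaxy layout, roughly like this (a sketch of the conventional structure, not the exact output of the SDK):

```
roles/bookstore/
├── defaults/      # Default values for role variables.
├── files/         # Static files the role can deploy.
├── handlers/      # Handlers triggered by tasks.
├── meta/          # Role metadata and dependencies.
├── tasks/         # main.yml with the tasks the role executes.
├── templates/     # Jinja2 templates.
└── vars/          # Additional role variables.
```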

    Now let’s take a look at the watches.yaml file:

    ---
    - version: v1alpha1
      group: blog.velotio.com
      kind: BookStore
      role: /opt/ansible/roles/bookstore

    Here we can see that the operator is going to watch events related to objects of the BookStore kind and execute the Ansible role bookstore. Drawing parallels from our Helm-based operator, the behavior in both cases is similar; the only difference is that the Helm-based operator executed the specified Helm chart in response to events on the object it was watching, whereas here we execute an Ansible role.

    In the case of Ansible-based operators, we can also have the operator execute an Ansible playbook rather than a role.
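    For example, a watches.yaml entry that points the operator at a playbook instead of a role could look like this (the playbook path here is illustrative, not generated by the SDK):

```yaml
---
- version: v1alpha1
  group: blog.velotio.com
  kind: BookStore
  playbook: /opt/ansible/playbook.yml
```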

    3.  Building the bookstore Ansible role        

    Now we need to modify the bookstore Ansible role created for us by the operator-framework.

    First, we will update the custom resource (CR) file (blog_v1alpha1_bookstore_cr.yaml) available at the deploy/crds/ location. In this CR, we can configure all the values we want to pass to the bookstore Ansible role. By default, the CR contains only the size field; we will update it to include the other fields we need in our role. To keep things simple, we will just include some basic variables like image name, tag, etc. in our spec.

    apiVersion: blog.velotio.com/v1alpha1
    kind: BookStore
    metadata:
      name: my-bookstore
    spec:
      image:
        app:
          repository: akash125/pyapp
          tag: latest
          pullPolicy: IfNotPresent
        mongodb:
          repository: mongo
          tag: latest
          pullPolicy: IfNotPresent
        
      service:
        app:
          type: LoadBalancer
        mongodb:
          type: ClusterIP

    The Ansible operator passes the key-value pairs listed in the spec of the CR as variables to Ansible. The operator changes the names of the variables to snake_case before running Ansible, so when we use the variables in our role, we will refer to them in snake case.
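    The conversion the operator applies is essentially camelCase to snake_case, applied recursively over the spec. A minimal Python sketch of that mapping (illustrative only, not the operator's actual code):

```python
import re

def to_snake_case(name: str) -> str:
    # Insert an underscore before each uppercase letter, then lowercase.
    return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()

def convert_keys(value):
    # Recursively convert dict keys, mirroring how the Ansible operator
    # exposes CR spec fields as snake_case Ansible variables.
    if isinstance(value, dict):
        return {to_snake_case(k): convert_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_keys(v) for v in value]
    return value

cr_spec = {"image": {"app": {"pullPolicy": "IfNotPresent"}}}
print(convert_keys(cr_spec))  # {'image': {'app': {'pull_policy': 'IfNotPresent'}}}
```

    So the pullPolicy field in the CR becomes pull_policy inside the role, which is why the tasks reference image.mongodb.pull_policy.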

    Next, we need to create the tasks the bookstore role will execute. By default, an Ansible role executes the tasks defined at `tasks/main.yml`, so we will update that file to define our deployment. For this, we will leverage the k8s module of Ansible and create a Kubernetes deployment and service for our app as well as for MongoDB.

    ---
    # tasks file for bookstore
    
    - name: Create the mongodb deployment
      k8s:
        definition:
          kind: Deployment
          apiVersion: apps/v1beta1
          metadata:
            name: mongodb-deployment
            namespace: '{{ meta.namespace }}'
          spec:
            replicas: 1
            selector:
              matchLabels:
                app: book-store-mongodb
            template:
              metadata:
                labels:
                  app: book-store-mongodb
              spec:
                containers:
                - name: mongodb
                  image: "{{image.mongodb.repository}}:{{image.mongodb.tag}}"
                  imagePullPolicy: "{{ image.mongodb.pull_policy }}"
                  ports:
                  - containerPort: 27017
    
    - name: Create the mongodb service
      k8s:
        definition:
          apiVersion: v1
          kind: Service
          metadata:
            name: mongodb-service
            namespace: '{{ meta.namespace }}'
            labels:
              app: book-store-mongodb
          spec:
            type: "{{service.mongodb.type}}"
            ports:
            - name: elb-port
              port: 27017
              protocol: TCP
              targetPort: 27017
            selector:
              app: book-store-mongodb
    - name: Create the bookstore deployment
      k8s:
        definition: 
          kind: Deployment
          apiVersion: apps/v1beta1
          metadata:
            name: book-store
            namespace: '{{ meta.namespace }}'
          spec:
            replicas: 1
            selector:
              matchLabels:
                app: book-store
            template:
              metadata:
                labels:
                  app: book-store
              spec:
                containers:
                - name: book-store
                  image: "{{image.app.repository}}:{{image.app.tag}}"
                  imagePullPolicy: "{{image.app.pull_policy}}"         
                  ports:
                  - containerPort: 3000
    
    - name: Create the bookstore service
      k8s:
        definition:
          apiVersion: v1
          kind: Service
          metadata:
            name: book-store
            namespace: '{{ meta.namespace }}'
            labels:
              app: book-store
          spec:
            type: "{{service.app.type}}"
            ports:
            - name: elb-port
              port: 80
              protocol: TCP
              targetPort: 3000
            selector:
              app: book-store

    In the above file, we can see that the pullPolicy field defined in our CR spec is used as ‘pull_policy’ in our tasks. Here we have used inline definitions to create our k8s objects because our app is quite simple. For large applications, creating objects using separate definition files would be a better approach.
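    As a sketch of that file-based approach (the file name here is illustrative), the same MongoDB deployment could be created from a definition file placed in the role's files/ directory:

```yaml
- name: Create the mongodb deployment from a definition file
  k8s:
    state: present
    namespace: '{{ meta.namespace }}'
    definition: "{{ lookup('file', 'mongodb-deployment.yaml') }}"
```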

    4. Build the bookstore-operator image

    The Dockerfile for building the operator image is already in our build folder; we just need to run the below command from the root folder of our operator project to build the image.

    operator-sdk build akash125/bookstore-operator:ansible

    You can use your own Docker repository instead of akash125/bookstore-operator.

    5. Run the bookstore-operator

    As we have our operator image ready, we can now go ahead and run it. The deployment file for the operator (operator.yaml under the deploy folder) was created as part of our project setup; we just need to set the image for this deployment to the one we built in the previous step.

    After updating the image in the operator.yaml we are ready to deploy the operator.

    kubectl create -f deploy/service_account.yaml
    kubectl create -f deploy/role.yaml
    kubectl create -f deploy/role_binding.yaml
    kubectl create -f deploy/operator.yaml

    Note: The role created might have more permissions than actually required for the operator, so it is always a good idea to review it and trim down the permissions in production setups.

    Verify that the operator pod is in running state.

    Here, two containers have been started as part of the operator deployment. One is the operator and the other one is ansible. The ansible container exists only to make the logs available on stdout in Ansible format.

    6. Deploy the bookstore app

    Now that we have the bookstore-operator running in our cluster, we just need to create the custom resource to deploy our bookstore app.

    Before we can create the bookstore CR, we need to register its CRD:

    kubectl create -f deploy/crds/blog_v1alpha1_bookstore_crd.yaml

    Now we can create the bookstore object:

    kubectl create -f deploy/crds/blog_v1alpha1_bookstore_cr.yaml

    Now we can see that our operator has deployed our book-store app:

    Now let’s grab the external IP of the app and make some requests to store details of books.

    Let’s hit the external IP on the browser and see if it lists the books we just stored:

    We can see that our ‘book-store’ app is up and running.

    The operator build is available here.

    Conclusion

    In this blog post, we learned how we can create an Ansible based operator using the operator framework. Ansible based operators are a great way to combine the power of Ansible and Kubernetes as it allows us to deploy our applications using Ansible role and playbooks and we can pass parameters to them (control them) using custom K8s resources. If Ansible is being heavily used across your organization and you are migrating to Kubernetes then Ansible based operators are an ideal choice for managing deployments. In the next blog, we will learn about Golang based operators.

  • Flutter vs React Native: A Detailed Comparison

    Flutter and React Native are two of the most popular cross-platform development frameworks on the market. Both of these technologies enable you to develop applications for iOS and Android with a single codebase. However, they’re not entirely interchangeable.

    Flutter allows developers to create Material Design-like applications with ease. React Native, on the other hand, has an active community of open source contributors, which means that it can be easily modified to meet almost any standard.

    In this blog, we have compared both of these technologies based on popularity, performance, learning curve, community support, and developer mindshare to help you decide which one you can use for your next project.

    But before digging into the comparison, let’s have a brief look at both these technologies:

    ‍About React Native

    React Native has gained the attention of many developers for its ease of use with JavaScript code. Facebook developed the framework to enable cross-platform application development using React and introduced React Native at its first React.js conference in 2015.

    React Native enables developers to create high-end mobile apps with the help of JavaScript. This eventually comes in handy for speeding up the process of developing mobile apps. The framework also makes use of the impressive features of JavaScript while maintaining excellent performance. React Native is highly feature-rich and allows you to create dynamic animations and gestures which are usually unavailable in the native platform.

    React Native has been adopted by many companies as their preferred technology. 

    For example:

    • Facebook
    • Instagram
    • Skype
    • Shopify
    • Tesla
    • Salesforce

    About Flutter

    Flutter is an open-source mobile development kit that makes it easy for developers to build high-quality applications for Android and iOS. It has a library with widgets to create the user interface of the application independent of the platform on which it is supported.

    Flutter has extended the reach of mobile app development by enabling developers to build apps on any platform without being restrained by mobile development limitations. The framework started as an internal project at Google back in 2015, with its first stable release in 2018.

    Since its inception, Google aimed to provide a simplistic, usable programming language for building sophisticated apps and wanted to carry out Dart’s goal of replacing JavaScript as the next-generation web programming language.

    Let’s see what all apps are built using Flutter:

    • Google Ads
    • eBay
    • Alibaba
    • BMW
    • Philips Hue

    React Native vs. Flutter – An overall comparison

    Design Capabilities

    React Native is based on React.js, one of the most popular JavaScript libraries for building user interfaces. It is often used with Redux, which provides a solid basis for creating predictable web applications.

    Flutter, on the other hand, is Google’s new mobile UI framework. Flutter uses Dart language to write code, compiled to native code for iOS and Android apps.

    Both React Native and Flutter can be used to create applications with beautiful graphics and smooth animations.

    React Native

    In the React Native framework, UI elements look native to both iOS and Android platforms. These UI elements make it easier for developers to build apps because they only have to write them once. In addition, many of these components also render natively on each platform. The user experiences an interface that feels more natural and seamless while maintaining the capability to customize the app’s look and feel.

    The framework allows developers to use JavaScript or a combination of HTML/CSS/JavaScript for cross-platform development. While React Native allows you to build native apps, it does not mean that your app will look the same on both iOS and Android.

    Flutter

    Flutter is a toolkit for creating high-performance, high-fidelity mobile apps for iOS and Android. Flutter works with existing code, is used by developers and organizations worldwide, and is free and open source. The standard neutral style is what Flutter offers.

    Flutter has its own widgets library, which includes Material Design Components and Cupertino. 

    The Material package contains widgets that look like they belong on Android devices. The Cupertino package contains widgets that look like they belong on iOS devices. By default, a Flutter application uses Material widgets. If you want to use Cupertino widgets, then import the Cupertino library and change your app’s theme to CupertinoTheme.

    Community

    Flutter and React Native have a very active community of developers. Both frameworks have extensive support and documentation and an active GitHub repository, which means they are constantly being maintained and updated.

    With the Flutter community, we can even find exciting tools such as Flutter Inspector or Flutter WebView Plugin. In the case of React Native, Facebook has been investing heavily in this framework. Besides the fact that the development process is entirely open-source, Facebook has created various tools to make the developer’s life easier.

    Also, the more updates and versions come out, the more interest and appreciation the developer community shows. Let’s see how both frameworks stack up when it comes to community engagement.

    For React Native

    The Facebook community is the most significant contributor to the React Native framework, followed by the community members themselves.

    React Native has garnered over 1,162 contributors on GitHub since its launch in 2015. The number of commits (or changes) to the framework has increased over time. It increased from 1,183 commits in 2016 to 1,722 commits in 2017.

    This increase indicates that more and more developers are interested in improving React Native.

    Moreover, there are over 19.8k live projects where developers share their experiences to resolve existing issues. The official React Native website offers tutorials for beginners who want to get started quickly with developing applications for Android and iOS while also providing advanced users with the necessary documentation.

    Also, there are a few other platforms where you can ask your question to the community, meet other React Native developers, and gain new contacts:

    Reddit: https://www.reddit.com/r/reactnative/

    Stack Overflow: http://stackoverflow.com/questions/tagged/react-native

    Meetup: https://www.meetup.com/topics/react-native/

    Facebook: https://www.facebook.com/groups/reactnativecommunity/

    For Flutter

    The Flutter community is smaller than React Native. The main reason is that Flutter is relatively new and is not yet widely used in production apps. But it’s not hard to see that its popularity is growing day by day. Flutter has excellent documentation with examples, articles, and tutorials that you can find online. It also has direct contact with its users through channels, such as Stack Overflow and Google Groups.

    The community of Flutter is growing at a steady pace with around 662 contributors. The total count of projects being forked by the community is 13.7k, where anybody can seek help for development purposes.

    Here are some platforms to connect with other developers in the Flutter community:

    GitHub: https://github.com/flutter

    Google Groups: https://groups.google.com/g/flutter-announce

    Stack Overflow: https://stackoverflow.com/tags/flutter

    Reddit: https://www.reddit.com/r/FlutterDev/

    Discord: https://discord.com/invite/N7Yshp4

    Slack: https://fluttercommunity.slack.com/

    Learning curve

    The learning curve of Flutter is steeper than React Native. However, you can learn both frameworks within a reasonable time frame. So, let’s discuss what would be required to learn React Native and Flutter.

    React Native

    The language of React Native is JavaScript. Any person who knows how to write JS will be able to utilize this framework. But, it’s different from building web applications. So if you are a mobile developer, you need to get the hang of things that might take some time.

    However, React Native is relatively easy to learn for newbies. For starters, it offers a variety of resources, both online and offline. On the React website, users can find the documentation, guides, FAQs, and learning resources.

    Flutter

    Flutter has a bit steeper learning curve than React Native. You need to know some basic concepts of native Android or iOS development, and experience with Java or Kotlin for Android, or Objective-C or Swift for iOS, helps. It can be a challenge if you’re not accustomed to languages with type casts and generics. However, once you’ve learned how to use it, it can speed up your development process.

    Flutter provides great documentation of its APIs that you can refer to. Since the framework is still new, some information might not be updated yet.

    Team size

    The central aspect of choosing between React Native and Flutter is the team size. To set a realistic expectation on the cost, you need to consider the type of application you will develop.

    React Native

    Technically, React Native’s core library can be implemented by a single developer. This developer will have to build all the native modules themselves, which is not an easy task. The required team size for React Native depends on the complexity of the mobile app you want to build. If you plan to create a simple mobile app, such as a mobile-only website, then one developer will be enough. However, if your project requires complex UI and animations, then you will need more skillful and experienced developers.

    Flutter

    Team size is a very important factor for Flutter app development. The number of people on your team might depend on the requirements and type of app you need to develop.

    Flutter makes it easy to use existing code that you might already have, or share code with other apps that you might already be building. You can even use Java or Kotlin if you prefer (though Dart is preferred).

    UI component

    When developing a cross-platform app, keep in mind that not all platforms behave identically. You will need to choose a library that supports the core element of the app consistent for each platform. We need the framework to have an API so that we can access the native modules.

    React Native

    There are two aspects to implementing React Native in your app development. The first one is writing the apps in JavaScript. This is the easiest part since it’s somewhat similar to writing web apps. The second aspect is the integration of third-party modules that are not part of the core framework.

    The reason third-party modules are required is that React Native does not support all native functionalities. For instance, if you want to implement an alert box, you need to import the UIAlertController module from Apple’s SDK.

    This makes the React Native framework somehow dependent on third-party libraries. There are lots of third-party libraries for React Native. You can use these libraries in your project to add native app features which are not available in React Native. Mostly, it is used to include maps, camera, sharing, and other native features.

    Flutter

    Flutter offers rich GUI components called widgets. A widget can be anything from simple text fields, buttons, switches, sliders, etc., to complex layouts that include multiple pages with split views, navigation bars, tab bars, etc., that are present in modern mobile apps.

    The Flutter toolkit is cross-platform and it has its own widgets, but it still needs third-party libraries to create applications. It also depends on the Android SDK and the iOS SDK for compilation and deployment. Developers can use any third-party library they want as long as it does not have any restrictions on open source licensing. Developers are also allowed to create their own libraries for Flutter app development.

    Testing Framework and Support

    React Native and Flutter have been used to develop many high-quality mobile applications. Of course, in any technology, a well-developed testing framework is essential.

    Based on this, we can see that both React Native and Flutter have a relatively mature testing framework. 

    React Native

    React Native uses the same UI components and APIs as a web application written in React.js. This means you can use the same frameworks and libraries for both platforms. Testing a React Native application can be more complex than a traditional web-based application because it relies heavily on the device itself. If you’re using a hybrid JavaScript approach, you can use tools like WebdriverIO or Appium to run the same tests across different browsers. Still, if you’re going native, you need to make sure you choose a tool with solid native support.

    Flutter

    Flutter has developed a testing framework that helps ensure your application is high quality. It is based on the premise of these three pillars: unit tests, widget tests, and integration tests. As you build out your Flutter applications, you can combine all three types of tests to ensure that your application works perfectly.

    Programming language

    One of the most important benefits of using Flutter and React Native to develop your mobile app is using a single programming language. This reduces the time required to hire developers and allows you to complete projects faster.

    React Native

    React Native bridges the gap between the native and JavaScript environments. It allows developers to build mobile apps that run across platforms using JavaScript. This makes mobile app development faster, as it only requires one language, JavaScript, to create a cross-platform mobile app. It also gives web developers a significant advantage over native application developers, as they already know JavaScript and can build a mobile app prototype in a couple of days. There is no need to learn Java or Swift. They can even use the same JavaScript libraries they use at work, like Redux and ImmutableJS.

    Flutter

    Flutter provides tools to create native mobile apps for both Android and iOS. Furthermore, it allows you to reuse code between the platforms because it supports code sharing using libraries written in Dart.

    You can also choose between two different ways of creating layouts for Flutter apps. The first one is similar to CSS, while the second one is more like HTML. Both are very powerful and simple to use. By default, you should use widgets built by the Flutter team, but if needed, you can also create your own custom widgets or modify existing ones.

    Tooling and DX

    While using either Flutter or React Native for mobile app development, it is likely that your development team will also be responsible for the CI/CD pipeline used to release new versions of your app.

    CI/CD support for Flutter and React Native is very similar at the moment. Both frameworks have good support for continuous integration (CI), continuous delivery (CD), and continuous deployment (CD). Both offer a first-class experience for building, testing, and deploying apps.

    React Native

    The React Native framework has existed for some time now and is pretty mature. However, it still lacks documentation around continuous integration (CI) and continuous delivery (CD) solutions. Considering the maturity of the framework, we might expect to see more investment here. 

    Expo, meanwhile, is a development environment and build tool for React Native. It lets you develop and run React Native apps on your computer just like you would any other web app.

    Expo turns a React Native app into a single JavaScript bundle, which is then published to one of the app stores using Expo’s tools. It provides all the necessary tooling—like bundling, building, and hot reloading—and manages the technical details of publishing to each app store. Expo provides the tooling and environment so that you can develop and test your app in a familiar way, while it also takes care of deploying to production.

    Flutter

    Flutter’s open-source project is complete, so the next step is to develop a rich ecosystem around it. The good news is that Flutter works from the command line and integrates with Xcode, Android Studio, IntelliJ IDEA, and other fully featured IDEs. This means Flutter can easily integrate with continuous integration/continuous deployment tools. Some CI/CD tools for Flutter include Bitrise and Codemagic. These tools are free to use, although they offer paid plans with more features.

    Here is an example of a to-do list app built with React Native and Flutter.

    Flutter: https://github.com/velotiotech/simple_todo_flutter

    React Native: https://github.com/velotiotech/react-native-todo-example

    Conclusion

    As you can see, both Flutter and React Native are excellent cross-platform app development tools that will be able to offer you high-quality apps for iOS and Android. The choice between React Native vs Flutter will depend on the complexity of the app that you are looking to create, your team size, and your needs for the app. Still, all in all, both of these frameworks are great options to consider to develop cross-platform native mobile applications.

  • UI Automation and API Testing with Cypress – A Step-by-step Guide

    These days, most web applications are driven by JavaScript frameworks that include front-end and back-end development. So, we need to have a robust QA automation framework that covers APIs as well as end-to-end tests (E2E tests). These tests check the user flow over a web application and confirm whether it meets the requirement. 

    Full-stack QA testing is critical in stabilizing APIs and UI, ensuring a high-quality product that satisfies user needs.

    To test UI and APIs independently, we can use several tools and frameworks, like Selenium, Postman, Rest-Assured, Nightwatch, Katalon Studio, and Jest, but this article will be focusing on Cypress.

    We will cover how we can do full stack QA testing using Cypress. 

    What exactly is Cypress?

    Cypress is a free, open-source, locally installed Test Runner and Dashboard Service for recording your tests. It is a frontend and backend test automation tool built for the next generation of modern web applications.

    It is useful for developers as well as QA engineers to test real-life applications developed in React.js, Angular.js, Node.js, Vue.js, and other front-end technologies.

    How does Cypress Work Functionally?

    Cypress is executed in the same run loop as your application. Behind Cypress is a Node.js server process.

    Most testing tools operate by running outside of the browser and executing remote commands across the network. Cypress does the opposite, while at the same time working outside of the browser for tasks that require higher privileges.

    Cypress takes snapshots of your application and enables you to time travel back to the state it was in when commands ran. 

    Why Use Cypress Over Other Automation Frameworks?

    Cypress is a JavaScript test automation solution for web applications.

    This all-in-one testing framework provides a chai assertion library with mocking and stubbing all without Selenium. Moreover, it supports the Mocha test framework, which can be used to develop web test automations.

    Key Features of Cypress:

    • Mocking – By mocking the server response, it has the ability to test edge cases.
    • Time Travel – It takes snapshots as your tests run, allowing users to go back and forth in time during test scenarios.
    • Flake Resistant – It automatically waits for commands and assertions before moving on.
    • Spies, Stubs, and Clocks – It can verify and control the behavior of functions, server responses, or timers.
    • Real Time Reloads – It automatically reloads whenever you make changes to your tests.
    • Consistent Results – It gives consistent and reliable tests that aren’t flaky.
    • Network Traffic Control – Easily control, stub, and test edge cases without involving your server.
    • Automatic Waiting – It automatically waits for commands and assertions without ever adding waits or sleeps to your tests. No more async hell. 
    • Screenshots and Videos – View screenshots taken automatically on failure, or videos of your entire test suite when it has run smoothly.
    • Debuggability – Readable error messages help you to debug quickly.

       


    Fig:- How Cypress works 

     

     Installation and Configuration of the Cypress Framework
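    Assuming an npm-based project (the exact setup commands are not shown in the original figure, so this is a sketch), Cypress can be installed as a dev dependency:

    ```shell
    # Initialize a project and add Cypress as a dev dependency
    npm init -y
    npm install cypress --save-dev

    # Open the Cypress Test Runner (scaffolds the cypress/ folder on first run)
    npx cypress open
    ```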

    Setting up Cypress in a project will also create a package.json file for the test settings and project dependencies.

    The test naming convention should be test_name.spec.js 

    • To run the Cypress test, use the following command:
    $ npx cypress run --spec "cypress/integration/examples/tests/e2e_test.spec.js"

    • This is how the folder structure will look: 
    Fig:- Cypress Framework Outline

    REST API Testing Using Cypress

    It’s important to test APIs along with E2E UI tests, and it can also be helpful to stabilize APIs and prepare data to interact with third-party servers.

    Cypress provides the functionality to make an HTTP request.

    Using Cypress’s cy.request() method, we can validate GET, POST, PUT, and DELETE API endpoints.

    Here are some examples: 

    describe("Testing API Endpoints Using Cypress", () => {

      it("Test GET Request", () => {
        cy.request("http://localhost:3000/api/posts/1")
          .then((response) => {
            expect(response.body).to.have.property('code', 200);
          });
      });

      it("Test POST Request", () => {
        cy.request({
          method: 'POST',
          url: 'http://localhost:3000/api/posts',
          body: {
            id: 2,
            title: 'Automation'
          }
        }).then((response) => {
          expect(response.body).to.have.property('title', 'Automation');
        });
      });

      it("Test PUT Request", () => {
        cy.request({
          method: 'PUT',
          url: 'http://localhost:3000/api/posts/2',
          body: {
            id: 2,
            title: 'Test Automation'
          }
        }).then((response) => {
          expect(response.body).to.have.property('title', 'Test Automation');
        });
      });

      it("Test DELETE Request", () => {
        cy.request({
          method: 'DELETE',
          url: 'http://localhost:3000/api/posts/2'
        }).then((response) => {
          expect(response.body).to.be.empty;
        });
      });

    });

    How to Write End-to-End UI Tests Using Cypress

    With Cypress end-to-end testing, you can replicate user behaviour on your application and cross-check whether everything is working as expected. In this section, we’ll cover useful ways to write E2E tests on the front end using Cypress.

    Here is an example of how to write an E2E test in Cypress:

    How to Pass Test Case Using Cypress

    1. Navigate to the Google website
    2. Click on the search input field 
    3. Type Cypress and press enter  
    4. The search results should contain Cypress

    How to Fail Test Case Using Cypress

    1. Navigate to the wrong URL http://localhost:8080
    2. Click on the search input field 
    3. Type Cypress and press enter
    4. The search results should contain Cypress  
    describe('Testing Google Search', () => {

         // To Pass the Test Case 1

         it('I can search for Valid Content on Google', () => {

              cy.visit('https://www.google.com');
              cy.get("input[title='Search']").type('Cypress').type('{enter}');
              cy.contains('https://www.cypress.io');

         });

         // To Fail the Test Case 2

         it('I can navigate to Wrong URL', () => {

              cy.visit('http://localhost:8080');
              cy.get("input[title='Search']").type('Cypress').type('{enter}');
              cy.contains('https://www.cypress.io');

         });

    });

    Cross Browser Testing Using Cypress 

    Cypress can run tests across the latest releases of multiple browsers. It currently supports Chrome-family browsers and Firefox (beta).

    Cypress supports the following browsers:

    • Google Chrome
    • Firefox (beta)
    • Chromium
    • Edge
    • Electron

    Browsers can be specified via the --browser flag when using the run command to launch Cypress. npm scripts can be used as shortcuts in package.json to launch Cypress with a specific browser more conveniently.

    To run tests on browsers:

    $ npx cypress run --browser chrome --spec "cypress/integration/examples/tests/e2e_test.spec.js"

    Here is an example of a package.json file to show how to define the npm script:

    "scripts": {
      "cy:run:chrome": "cypress run --browser chrome",
      "cy:run:firefox": "cypress run --browser firefox"
    }
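    With those scripts in place, a browser-specific run becomes a single npm command (assuming the script names defined above):

    ```shell
    npm run cy:run:chrome
    npm run cy:run:firefox
    ```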

    Cypress Reporting

    Reporter options can be specified in the cypress.json configuration file or via CLI options. Cypress supports the following reporting capabilities:

    • Mocha Built-in Reporting – As Cypress is built on top of Mocha, any reporter built for Mocha can be used; the spec reporter is the default, and Mochawesome is a popular third-party option.
    • JUnit and TeamCity – These 3rd party Mocha reporters are built into Cypress.

    To install additional dependencies for report generation: 

    Installing Mochawesome:

    $ npm install mochawesome

    The JUnit reporter is already bundled with Cypress, so no additional installation is needed for it.

    Examples of a config file and CLI for the Mochawesome report 

    • Cypress.json config file:
    {
        "reporter": "mochawesome",
        "reporterOptions":
        {
            "reportDir": "cypress/results",
            "overwrite": false,
            "html": true,
            "json": true
        }
    }

    • CLI Reporting:
    $ npx cypress run --reporter mochawesome --spec "cypress/integration/examples/tests/e2e_test.spec.js"

    Examples of a config File and CLI for the JUnit Report: 

    • Cypress.json config file for JUnit: 
    {
        "reporter": "junit",
        "reporterOptions": 
        {
            "reportDir": "cypress/results",
            "mochaFile": "results/my-test-output.xml",
            "toConsole": true
        }
    }

    • CLI Reporting:
    $ npx cypress run --reporter junit --reporter-options "mochaFile=results/my-test-output.xml,toConsole=true"

    Fig:- Collapsed View of Mochawesome Report

     

    Fig:- Expanded View of Mochawesome Report

     

    Fig:- Mochawesome Report Settings

    Additional Possibilities of Using Cypress 

    There are several other things we can do using Cypress that we could not cover in this article, although we’ve covered the most important aspects of the tool.

    Here are some other usages of Cypress that we could not explore here:

    • Continuous integration and continuous deployment with Jenkins 
    • Behavior-driven development (BDD) using Cucumber
    • Automating applications with XHR
    • Test retries and retry ability
    • Custom commands
    • Environment variables
    • Plugins
    • Visual testing
    • Slack integration 
    • Model-based testing
    • GraphQL API Testing 

    Limitations with Cypress

    Cypress is a great tool with a great community supporting it. Although it is still young, it is being continuously developed and is quickly catching up with the other full-stack automation tools on the market.

    So, before you decide to use Cypress, we would like to touch upon some of its limitations. These limitations are for version 5.2.0, the latest version of Cypress at the time of this article’s publishing.

    Here are the current limitations of using Cypress:

    • It can’t use two browsers at the same time.
    • It doesn’t provide support for multi-tabs.
    • It only supports the JavaScript language for creating test cases.
    • It doesn’t currently provide support for browsers like Safari and IE.
    • It has limited support for iFrames.

    Conclusion

    Cypress is a great tool with a growing feature-set. It makes setting up, writing, running, and debugging tests easy for QA automation engineers. It also has a quicker learning cycle with a good, baked-in execution environment.

    It is fully JavaScript/MochaJS-oriented, with new APIs designed to make scripting easier. It also provides a flexible test execution plan that can accommodate significant and unexpected changes.

    In this blog, we talked about how Cypress works functionally, performed end-to-end UI testing, and touched upon its limitations. We hope you learned more about using Cypress as a full-stack test automation tool.

    Related QA Articles

    1. Building a scalable API testing framework with Jest and SuperTest
    2. Automation testing with Nightwatch JS and Cucumber: Everything you need to know
    3. API testing using Postman and Newman
  • Flannel: A Network Fabric for Containers

    Containers are a disruptive technology and are being adopted by startups and enterprises alike. Whenever a new infrastructure technology comes along, two areas require a lot of innovation: storage and networking. Anyone adopting containers will have faced challenges in these two areas.

    Flannel is an overlay network that helps to connect containers across multiple hosts. This blog provides an overview of container networking followed by details of Flannel.

    What is Docker?

    Docker is the world’s leading software container platform. Developers use Docker to eliminate “works on my machine” problems when collaborating on software with co-workers. Operators use Docker to run and manage apps side-by-side in isolated containers to get better compute density. Enterprises use Docker to build agile software delivery pipelines to ship new features faster, more securely and with repeatability for both Linux and Windows Server apps.

    Need for Container networking

    • Containers need to talk to the external world.
    • Containers should be reachable from the external world so that the external world can use the services that containers provide.
    • Containers need to talk to the host machine. An example can be getting memory usage of the underlying host.
    • There should be inter-container connectivity in the same host and across hosts. An example is a LAMP stack running Apache, MySQL and PHP in different containers across hosts.

    How does Docker’s original networking work?

    Docker uses host-private networking. It creates a virtual bridge, called docker0 by default, and allocates a subnet from one of the private address blocks defined in RFC1918 for that bridge. For each container that Docker creates, it allocates a virtual ethernet device (called veth) which is attached to the bridge. The veth is mapped to appear as eth0 in the container, using Linux namespaces. The in-container eth0 interface is given an IP address from the bridge’s address range.
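    The bridge and the per-container addresses described above can be inspected with standard tooling. These commands are illustrative and assume a local Docker daemon:

    ```shell
    # Subnet that Docker allocated to the default bridge network
    docker network inspect bridge --format '{{ (index .IPAM.Config 0).Subnet }}'

    # The docker0 bridge interface on the host
    ip addr show docker0

    # Inside a container, eth0 carries an address from that same range
    docker run --rm alpine ip addr show eth0
    ```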

    Drawbacks of Docker networking

    Docker containers can talk to other containers only if they are on the same machine (and thus the same virtual bridge). Containers on different machines cannot reach each other – in fact they may end up with the exact same network ranges and IP addresses. This limits the system’s effectiveness on cloud platforms.

    In order for Docker containers to communicate across nodes, they must be allocated ports on the machine’s own IP address, which are then forwarded or proxied to the containers. This obviously means that containers must either coordinate which ports they use very carefully or be allocated ports dynamically. This approach also fails if a container dies, as the new container will get a new IP, breaking the proxy rules.

    Real world expectations from Docker

    Enterprises expect Docker containers to be used in production-grade systems, where each component of the application can run in different containers across different grades of underlying hardware. Not all application components are the same, and some may be resource-intensive. It makes sense to run such resource-intensive components on compute-heavy physical servers and the others on cost-saving cloud virtual machines. Enterprises also expect Docker containers to be replicated on demand and the application load to be distributed across the replicas. This is where Google’s Kubernetes project fits in.

    What is Kubernetes?

    Kubernetes is an open-source platform for automating deployment, scaling, and operations of application containers across clusters of hosts, providing container-centric infrastructure. It provides portability for an application to run on public, private, hybrid, and multi-cloud environments. It offers extensibility, as it is modular, pluggable, hookable, and composable. It also self-heals through auto-placement, auto-restart, auto-replication, and auto-scaling of application containers. Kubernetes does not provide a way for containers across nodes to communicate with each other; it assumes that each container (pod) has a unique, routable IP inside the cluster. To facilitate inter-container connectivity across nodes, any networking solution based on a pure Layer-3, VxLAN, or UDP model can be used. Flannel is one such solution, providing an overlay network using both UDP-based and VxLAN-based models.

    Flannel: a solution for networking for Kubernetes

    Flannel is a basic overlay network that works by assigning a range of subnet addresses (usually IPv4 with a /24 or /16 subnet mask). An overlay network is a computer network that is built on top of another network. Nodes in the overlay network can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through many physical links, in the underlying network.

    While flannel was originally designed for Kubernetes, it is a generic overlay network that can be used as a simple alternative to existing software-defined networking solutions. More specifically, flannel gives each host an IP subnet (/24 by default) from which the Docker daemon is able to allocate IPs to the individual containers. Each address corresponds to a container, so containers remain uniquely addressable even when they reside on different hosts.

    It works by first configuring an overlay network, with an IP range and the size of the subnet for each host. For example, one could configure the overlay to use 10.1.0.0/16 and each host to receive a /24 subnet. Host A could then receive 10.1.15.1/24 and host B could get 10.1.20.1/24. Flannel uses etcd to maintain a mapping between allocated subnets and real host IP addresses. For the data path, flannel uses UDP to encapsulate IP datagrams to transmit them to the remote host.
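    The subnet carving in this example can be sketched with a few lines of shell arithmetic (purely illustrative; flannel itself does this allocation via etcd):

    ```shell
    # Convert a dotted-quad IPv4 address to a 32-bit integer
    ip_to_int() {
      local IFS=.
      set -- $1
      echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
    }

    overlay=$(ip_to_int 10.1.0.0)     # the 10.1.0.0/16 overlay
    hostA=$(ip_to_int 10.1.15.0)      # host A's /24 subnet base
    container=$(ip_to_int 10.1.15.7)  # a container IP handed out on host A

    # The container address falls inside host A's /24 ...
    [ $(( container & 0xFFFFFF00 )) -eq "$hostA" ] && echo "container is in host A's /24"
    # ... and host A's /24 falls inside the /16 overlay
    [ $(( hostA & 0xFFFF0000 )) -eq "$overlay" ] && echo "host A's /24 is in the overlay /16"
    ```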

    As a result, complex, multi-host systems such as Hadoop can be distributed across multiple Docker container hosts, using Flannel as the underlying fabric, resolving a deficiency in Docker’s native container address mapping system.

    Integrating Flannel with Kubernetes

    A Kubernetes cluster consists of a master node and multiple minion nodes. Each minion node gets its own subnet through the flannel service. Docker needs to be configured to use the subnet created by Flannel. The master starts an etcd server, and the flannel service running on each minion uses that etcd server to register its containers’ IPs. The etcd server stores a key-value mapping of each container to its IP. kube-apiserver uses the etcd server to get the IP mappings and assign service IPs accordingly. Kubernetes creates iptables rules through kube-proxy, which provide static endpoints and load balancing. In case a minion node goes down or a pod restarts, it will get a new local IP, but the service IP created by Kubernetes will remain the same, enabling Kubernetes to route traffic to the correct set of pods.
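    The etcd-backed mapping described above can be seeded and inspected with etcdctl (illustrative; assumes flannel’s default etcd v2 key prefix):

    ```shell
    # Write the overlay configuration that flannel reads at startup
    etcdctl set /coreos.com/network/config '{ "Network": "10.1.0.0/16", "SubnetLen": 24 }'

    # After flannel starts on each minion, list the per-host subnet leases
    etcdctl ls /coreos.com/network/subnets
    ```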

    Alternatives to Flannel

    Flannel is not the only solution for this; other options like Calico and Weave are available. Weave is the closest competitor, as it provides a similar set of features to Flannel. Flannel gets an edge in its ease of configuration, and some benchmarks have found Weave to be slower than Flannel.

    PS: Velotio is helping enterprises and product companies modernize their infrastructure using containers, Docker, and Kubernetes.

  • Extending Kubernetes APIs with Custom Resource Definitions (CRDs)

    Introduction:

    Custom resource definitions (CRDs) are a powerful feature introduced in Kubernetes 1.7 that enables users to add their own custom objects to the Kubernetes cluster and use them like any other native Kubernetes objects. In this blog post, we will see how we can add a custom resource to a Kubernetes cluster using the command line as well as the Golang client library, thus also learning how to programmatically interact with a Kubernetes cluster.

    What is a Custom Resource Definition (CRD)?

    In the Kubernetes API, every resource is an endpoint that stores API objects of a certain kind. For example, the built-in service resource contains a collection of service objects. The standard Kubernetes distribution ships with many inbuilt API objects/resources. CRDs come into the picture when we want to introduce our own objects into the Kubernetes cluster to fulfill our requirements. Once we create a CRD in Kubernetes, we can use it like any other native Kubernetes object, thus leveraging all the features of Kubernetes like its CLI, security, API services, RBAC, etc.

    The custom resource created is also stored in the etcd cluster with proper replication and lifecycle management. CRD allows us to use all the functionalities provided by a Kubernetes cluster for our custom objects and saves us the overhead of implementing them on our own.

    How to register a CRD using command line interface (CLI)

    Step-1: Create a CRD definition file sslconfig-crd.yaml

    apiVersion: "apiextensions.k8s.io/v1beta1"
    kind: "CustomResourceDefinition"
    metadata:
      name: "sslconfigs.blog.velotio.com"
    spec:
      group: "blog.velotio.com"
      version: "v1alpha1"
      scope: "Namespaced"
      names:
        plural: "sslconfigs"
        singular: "sslconfig"
        kind: "SslConfig"
      validation:
        openAPIV3Schema:
          required: ["spec"]
          properties:
            spec:
              required: ["cert","key","domain"]
              properties:
                cert:
                  type: "string"
                  minimum: 1
                key:
                  type: "string"
                  minimum: 1
                domain:
                  type: "string"
                  minimum: 1 

    Here we are creating a custom resource definition for an object of kind SslConfig. This object allows us to store the SSL configuration information for a domain. As we can see under the validation section, specifying the cert, key, and domain is mandatory for creating objects of this kind; along with these, we can store other information like the provider of the certificate. The name metadata that we specify must be spec.names.plural+"."+spec.group.

    An API group (blog.velotio.com here) is a collection of API objects that are logically related to each other. We have also specified a version for our custom objects (spec.version); as the definition of the object is expected to change/evolve in the future, it’s better to start with alpha so that users of the object know that the definition might change later. In the scope, we have specified Namespaced; by default, a custom resource is cluster-scoped.
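    The naming rule above (spec.names.plural, a dot, then spec.group) can be sanity-checked with a quick one-liner:

    ```shell
    plural=sslconfigs
    group=blog.velotio.com
    name="${plural}.${group}"
    echo "$name"   # must match metadata.name in the CRD manifest
    ```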

    # kubectl create -f sslconfig-crd.yaml
    # kubectl get crd
    NAME                          AGE
    sslconfigs.blog.velotio.com   5s

    Step-2:  Create objects using the definition we created above

    apiVersion: "blog.velotio.com/v1alpha1"
    kind: "SslConfig"
    metadata:
      name: "sslconfig-velotio.com"
    spec:
      cert: "my cert file"
      key: "my private key"
      domain: "*.velotio.com"
      provider: "digicert"

    # kubectl create -f crd-obj.yaml
    # kubectl get sslconfig
    NAME                    AGE
    sslconfig-velotio.com   12s

    Along with the mandatory fields cert, key, and domain, we have also stored the information of the provider (certifying authority) of the cert.

    How to register a CRD programmatically using client-go

    The client-go project provides us with packages using which we can easily create a Go client and access the Kubernetes cluster. For creating a client, we first need to create a connection with the API server.
    How we connect to the API server depends on whether we will be accessing it from within the cluster (our code running in the Kubernetes cluster itself) or from outside the cluster (locally).

    If the code is running outside the cluster then we need to provide either the path of the config file or URL of the Kubernetes proxy server running on the cluster.

    kubeconfig := filepath.Join(
    	os.Getenv("HOME"), ".kube", "config",
    )
    config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
    if err != nil {
    	log.Fatal(err)
    }

    OR

    var (
    	// Set during build
    	version string

    	proxyURL = flag.String("proxy", "",
    		`If specified, it is assumed that a kubectl proxy server is running on the
    		given url and creates a proxy client. In case it is not given InCluster
    		kubernetes setup will be used`)
    )

    if *proxyURL != "" {
    	config, err = clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
    		&clientcmd.ClientConfigLoadingRules{},
    		&clientcmd.ConfigOverrides{
    			ClusterInfo: clientcmdapi.Cluster{
    				Server: *proxyURL,
    			},
    		}).ClientConfig()
    	if err != nil {
    		glog.Fatalf("error creating client configuration: %v", err)
    	}
    }

    When the code is to be run as a part of the cluster then we can simply use

    import "k8s.io/client-go/rest"
    ...
    config, err := rest.InClusterConfig()

    Once the connection is established, we can use it to create a clientset. For accessing Kubernetes objects, generally the clientset from the client-go project is used, but for CRD-related operations we need to use the clientset from the apiextensions-apiserver project:

    apiextension "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"

    kubeClient, err := apiextension.NewForConfig(config)
    if err != nil {
    	glog.Fatalf("Failed to create client: %v.", err)
    }

    Now we can use the client to make the API call which will create the CRD for us.

    package v1alpha1
    
    import (
    	"reflect"
    
    	apiextensionv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
    	apiextension "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
    	apierrors "k8s.io/apimachinery/pkg/api/errors"
    	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )
    
    const (
    	CRDPlural   string = "sslconfigs"
    	CRDGroup    string = "blog.velotio.com"
    	CRDVersion  string = "v1alpha1"
    	FullCRDName string = CRDPlural + "." + CRDGroup
    )
    
    func CreateCRD(clientset apiextension.Interface) error {
    	crd := &apiextensionv1beta1.CustomResourceDefinition{
    		ObjectMeta: meta_v1.ObjectMeta{Name: FullCRDName},
    		Spec: apiextensionv1beta1.CustomResourceDefinitionSpec{
    			Group:   CRDGroup,
    			Version: CRDVersion,
    			Scope:   apiextensionv1beta1.NamespaceScoped,
    			Names: apiextensionv1beta1.CustomResourceDefinitionNames{
    				Plural: CRDPlural,
    				Kind:   reflect.TypeOf(SslConfig{}).Name(),
    			},
    		},
    	}
    
    	_, err := clientset.ApiextensionsV1beta1().CustomResourceDefinitions().Create(crd)
    	if err != nil && apierrors.IsAlreadyExists(err) {
    		return nil
    	}
    	return err
    }

    In the CreateCRD function, we first create the definition of our custom object and then pass it to the Create method, which creates it in our cluster. Just as we did while creating our definition using the CLI, here we also set parameters like version, group, kind, etc.

    Once our definition is ready we can create objects of its type just like we did earlier using the CLI. First we need to define our object.

    package v1alpha1
    
    import meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    
    type SslConfig struct {
    	meta_v1.TypeMeta   `json:",inline"`
    	meta_v1.ObjectMeta `json:"metadata"`
    	Spec               SslConfigSpec   `json:"spec"`
    	Status             SslConfigStatus `json:"status,omitempty"`
    }
    type SslConfigSpec struct {
    	Cert   string `json:"cert"`
    	Key    string `json:"key"`
    	Domain string `json:"domain"`
    }
    
    type SslConfigStatus struct {
    	State   string `json:"state,omitempty"`
    	Message string `json:"message,omitempty"`
    }
    
    type SslConfigList struct {
    	meta_v1.TypeMeta `json:",inline"`
    	meta_v1.ListMeta `json:"metadata"`
    	Items            []SslConfig `json:"items"`
    }

    Kubernetes API conventions suggest that each object must have two nested object fields that govern the object’s configuration: the object spec and the object status. Objects must also have metadata associated with them. The custom objects that we define here comply with these standards. It is also recommended to create a list type for every type, so we have also created a SslConfigList struct.

    Now we need to write a function which will create a custom client which is aware of the new resource that we have created.

    package v1alpha1
    
    import (
    	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/apimachinery/pkg/runtime"
    	"k8s.io/apimachinery/pkg/runtime/schema"
    	"k8s.io/apimachinery/pkg/runtime/serializer"
    	"k8s.io/client-go/rest"
    )
    
    var SchemeGroupVersion = schema.GroupVersion{Group: CRDGroup, Version: CRDVersion}
    
    func addKnownTypes(scheme *runtime.Scheme) error {
    	scheme.AddKnownTypes(SchemeGroupVersion,
    		&SslConfig{},
    		&SslConfigList{},
    	)
    	meta_v1.AddToGroupVersion(scheme, SchemeGroupVersion)
    	return nil
    }
    
    func NewClient(cfg *rest.Config) (*SslConfigV1Alpha1Client, error) {
    	scheme := runtime.NewScheme()
    	SchemeBuilder := runtime.NewSchemeBuilder(addKnownTypes)
    	if err := SchemeBuilder.AddToScheme(scheme); err != nil {
    		return nil, err
    	}
    	config := *cfg
    	config.GroupVersion = &SchemeGroupVersion
    	config.APIPath = "/apis"
    	config.ContentType = runtime.ContentTypeJSON
    	config.NegotiatedSerializer = serializer.DirectCodecFactory{CodecFactory: serializer.NewCodecFactory(scheme)}
    	client, err := rest.RESTClientFor(&config)
    	if err != nil {
    		return nil, err
    	}
    	return &SslConfigV1Alpha1Client{restClient: client}, nil
    }

    Building the custom client library

    Once we have registered our custom resource definition with the Kubernetes cluster, we can create objects of its type using the Kubernetes CLI as we did earlier. But for creating controllers for these objects, or for developing custom functionality around them, we need to build a client library through which we can access them from the Go API. For native Kubernetes objects, this type of library is provided for each object.

    package v1alpha1
    
    import (
    	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/client-go/rest"
    )
    
    func (c *SslConfigV1Alpha1Client) SslConfigs(namespace string) SslConfigInterface {
    	return &sslConfigclient{
    		client: c.restClient,
    		ns:     namespace,
    	}
    }
    
    type SslConfigV1Alpha1Client struct {
    	restClient rest.Interface
    }
    
    type SslConfigInterface interface {
    	Create(obj *SslConfig) (*SslConfig, error)
    	Update(obj *SslConfig) (*SslConfig, error)
    	Delete(name string, options *meta_v1.DeleteOptions) error
    	Get(name string) (*SslConfig, error)
    }
    
    type sslConfigclient struct {
    	client rest.Interface
    	ns     string
    }
    
    func (c *sslConfigclient) Create(obj *SslConfig) (*SslConfig, error) {
    	result := &SslConfig{}
    	err := c.client.Post().
    		Namespace(c.ns).Resource("sslconfigs").
    		Body(obj).Do().Into(result)
    	return result, err
    }
    
    func (c *sslConfigclient) Update(obj *SslConfig) (*SslConfig, error) {
    	result := &SslConfig{}
    	err := c.client.Put().
    		Namespace(c.ns).Resource("sslconfigs").
    		Body(obj).Do().Into(result)
    	return result, err
    }
    
    func (c *sslConfigclient) Delete(name string, options *meta_v1.DeleteOptions) error {
    	return c.client.Delete().
    		Namespace(c.ns).Resource("sslconfigs").
    		Name(name).Body(options).Do().
    		Error()
    }
    
    func (c *sslConfigclient) Get(name string) (*SslConfig, error) {
    	result := &SslConfig{}
    	err := c.client.Get().
    		Namespace(c.ns).Resource("sslconfigs").
    		Name(name).Do().Into(result)
    	return result, err
    }

    We can add more methods like Watch, UpdateStatus, etc.; their implementations will be similar to the methods we have defined above. To look at the methods available for various Kubernetes objects like Pod, Node, etc., we can refer to the v1 package.

    Putting it all together

    Now, in our main function, we will bring all the pieces together.

    package main
    
    import (
    	"flag"
    	"fmt"
    	"time"
    
    	"blog.velotio.com/crd-blog/v1alpha1"
    	"github.com/golang/glog"
    	apiextension "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
    	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/client-go/rest"
    	"k8s.io/client-go/tools/clientcmd"
    	clientcmdapi "k8s.io/client-go/tools/clientcmd/api"
    )
    
    var (
    	// Set during build
    	version string
    
    	proxyURL = flag.String("proxy", "",
    		`If specified, it is assumed that a kubectl proxy server is running on the
    		given url and creates a proxy client. In case it is not given InCluster
    		kubernetes setup will be used`)
    )
    
    func main() {
    
    	flag.Parse()
    	var err error
    
    	var config *rest.Config
    	if *proxyURL != "" {
    		config, err = clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
    			&clientcmd.ClientConfigLoadingRules{},
    			&clientcmd.ConfigOverrides{
    				ClusterInfo: clientcmdapi.Cluster{
    					Server: *proxyURL,
    				},
    			}).ClientConfig()
    		if err != nil {
    			glog.Fatalf("error creating client configuration: %v", err)
    		}
    	} else {
    		if config, err = rest.InClusterConfig(); err != nil {
    			glog.Fatalf("error creating client configuration: %v", err)
    		}
    	}
    
    	kubeClient, err := apiextension.NewForConfig(config)
    	if err != nil {
    		glog.Fatalf("Failed to create client: %v", err)
    	}
    	// Create the CRD
    	err = v1alpha1.CreateCRD(kubeClient)
    	if err != nil {
    		glog.Fatalf("Failed to create crd: %v", err)
    	}
    
    	// Wait for the CRD to be created before we use it.
    	time.Sleep(5 * time.Second)
    
    	// Create a new clientset which include our CRD schema
    	crdclient, err := v1alpha1.NewClient(config)
    	if err != nil {
    		panic(err)
    	}
    
    	// Create a new SslConfig object
    
    	SslConfig := &v1alpha1.SslConfig{
    		ObjectMeta: meta_v1.ObjectMeta{
    			Name:   "sslconfigobj",
    			Labels: map[string]string{"mylabel": "test"},
    		},
    		Spec: v1alpha1.SslConfigSpec{
    			Cert:   "my-cert",
    			Key:    "my-key",
    			Domain: "*.velotio.com",
    		},
    		Status: v1alpha1.SslConfigStatus{
    			State:   "created",
    			Message: "Created, not processed yet",
    		},
    	}
    	// Create the SslConfig object we create above in the k8s cluster
    	resp, err := crdclient.SslConfigs("default").Create(SslConfig)
    	if err != nil {
		fmt.Printf("error while creating object: %v\n", err)
    	} else {
		fmt.Printf("object created: %v\n", resp)
    	}
    
    	obj, err := crdclient.SslConfigs("default").Get(SslConfig.ObjectMeta.Name)
    	if err != nil {
		glog.Infof("error while getting the object %v\n", err)
    	}
	fmt.Printf("SslConfig Objects Found:\n%v\n", obj)
    	select {}
    }

    Now, if we run this code, our custom resource definition will be created in the Kubernetes cluster, along with an object of its type, just as with the CLI. The Docker image akash125/crdblog is built from the code discussed above; it can be pulled directly from Docker Hub and run in a Kubernetes cluster. Once the image runs successfully, the CRD definition we discussed will be created in the cluster along with an object of its type. We can verify this using the CLI the way we did earlier, or by checking the logs of the pod running the Docker image. The complete code is available here.

    Conclusion and future work

    We learned how to create a custom resource definition and objects using the Kubernetes command-line interface as well as the Golang client. We also learned how to programmatically access a Kubernetes cluster, which lets us build some really cool stuff on Kubernetes. For example, we can now create custom controllers for our resources that continuously watch the cluster for various lifecycle events of our objects and take the desired action accordingly. To read more about CRDs, refer to the following links:

  • Exploring Upgrade Strategies for Stateful Sets in Kubernetes

    Introduction

    In the age of continuous delivery and agility, where software is deployed tens of times per day (and sometimes per hour) using container orchestration platforms, a seamless upgrade mechanism becomes a critical aspect of any technology adoption, and Kubernetes is no exception.

    Kubernetes provides a variety of controllers that define how pods are set up and deployed within the Kubernetes cluster. These controllers can group pods together according to their runtime needs and can be used to define pod replication and pod startup ordering. A Kubernetes controller is essentially an application pattern: the controller manages the pods (the smallest unit in Kubernetes), so you don't need to create, manage, and delete pods yourself. There are a few types of controllers in Kubernetes:

    1. Deployment
    2. Statefulset
    3. Daemonset
    4. Job
    5. Replica sets

    Each controller represents an application pattern. For example, a Deployment represents the stateless application pattern, in which you don't store the state of your application. A StatefulSet represents the stateful application pattern, where you store data, for example, in databases or message queues. We will be focusing on the StatefulSet controller and its update feature in this blog.

    Statefulset

    The StatefulSet acts as a controller in Kubernetes to deploy applications according to a specified rule set and is aimed at persistent, stateful applications. It provides an ordered and graceful deployment. A StatefulSet is generally used with distributed applications that require each node to have a persistent state and the ability to configure an arbitrary number of nodes. StatefulSet pods have a unique identity that is comprised of an ordinal, a stable network identity, and stable storage. The identity sticks to the pod, regardless of which node it's scheduled on. For more details, check here.
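    As an illustrative sketch (this is just the naming convention, not an API), the stable network identity of each pod can be derived from its ordinal and the headless service name:

    ```python
    def statefulset_pod_dns(statefulset: str, ordinal: int, service: str,
                            namespace: str = "default") -> str:
        """Stable DNS name of a StatefulSet pod:
        <statefulset>-<ordinal>.<service>.<namespace>.svc.cluster.local"""
        return f"{statefulset}-{ordinal}.{service}.{namespace}.svc.cluster.local"

    # The Cassandra seed node used later in this post resolves this way:
    print(statefulset_pod_dns("cassandra", 0, "cassandra"))
    # cassandra-0.cassandra.default.svc.cluster.local
    ```

    Because the name is a pure function of the StatefulSet name and the ordinal, peers can be configured up front, which is exactly how the CASSANDRA_SEEDS value in the manifest below is formed.
    
    
    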

    Update Strategies for StatefulSets

    There are a couple of different strategies available for upgrades – Blue/Green and Rolling updates. Let’s review them in detail:

    Blue-Green Deployment

    Blue-green deployment is one of the commonly used update strategies. In this strategy, there are two identical environments of your application: the Blue environment, which runs the current deployment, and the Green environment, which hosts the new deployment we want to upgrade to. The approach is simple:

    1. Switch the load balancer to route traffic to the Green environment.
    2. Delete the Blue environment once the Green environment is verified. 

    Disadvantages of Blue-Green deployment:

    1. All in-flight transactions and sessions are lost, due to the physical switch from one environment serving the traffic to the other.
    2. Implementing blue-green deployment becomes complex with databases, especially if the database schema changes across versions.
    3. Blue-green deployment needs extra cloud setup/hardware, which increases the overall cost.

    Rolling update strategy

    After Blue-Green deployment, let’s take a look at Rolling updates and how it works.

    1. In short, as the name suggests, this strategy replaces currently running instances of the application with new instances, one by one.
    2. Health checks play an important role here: old instances of the application are removed only if the new versions are healthy. Because of this, the deployment is temporarily heterogeneous while moving from the old version of the application to the new one.
    3. The benefit of this strategy is its incremental approach: the update rolls out and verification happens in parallel while traffic to the application increases.
    4. A rolling update doesn't need extra hardware/cloud setup and is hence a cost-effective upgrade technique.

    Statefulset upgrade strategies

    With a basic understanding of upgrade strategies, let's explore the update strategies available for StatefulSets in Kubernetes. StatefulSets are used for databases and other applications where state is a crucial part of the deployment. We will take Cassandra as an example to learn about the StatefulSet upgrade feature, and we will use gce-pd storage to store the data. StatefulSets (since Kubernetes 1.7) use an update strategy to configure and disable automated rolling updates for containers, labels, resource requests/limits, and annotations of their pods. The update strategy is configured using the updateStrategy field.

    The updateStrategy field accepts one of the following values:

    1. OnDelete
    2. RollingUpdate

    OnDelete update strategy

    OnDelete prevents the controller from automatically updating its pods. One needs to delete the pods manually for the changes to take effect. It's more of a manual update process for the StatefulSet application, and this is the main difference between the OnDelete and RollingUpdate strategies. The OnDelete update strategy plays an important role when the user needs to perform some action or verification after each pod is updated. For example, after updating a single Cassandra pod, the user might need to check that the updated pod joined the Cassandra cluster correctly.

    We will now create a StatefulSet deployment first. Let's take a simple example of Cassandra and deploy it using a StatefulSet controller. Persistent storage is the key point of the StatefulSet controller. You can read more about storage classes here.

    For the purpose of this blog, we will use the Google Kubernetes Engine.

    • First, define the storage class as follows:
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: fast
    provisioner: kubernetes.io/gce-pd
    parameters:
      type: pd-ssd

    • Then create the Storage class using kubectl:
    $ kubectl create -f storage_class.yaml

    • Here is the YAML file for the Cassandra service and the Statefulset deployment.
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: cassandra
      name: cassandra
    spec:
      clusterIP: None
      ports:
      - port: 9042
      selector:
        app: cassandra
    ---
    apiVersion: apps/v1beta2
    kind: StatefulSet
    metadata:
      name: cassandra
      labels:
        app: cassandra
    spec:
      serviceName: cassandra
      replicas: 3
      updateStrategy:
        type: OnDelete
      selector:
        matchLabels:
          app: cassandra
      template:
        metadata:
          labels:
            app: cassandra
        spec:
          terminationGracePeriodSeconds: 1800
          containers:
          - name: cassandra
            image: gcr.io/google-samples/cassandra:v12
            imagePullPolicy: Always
            ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
            resources:
              limits:
                cpu: "500m"
                memory: 1Gi
              requests:
               cpu: "500m"
               memory: 1Gi
            securityContext:
              capabilities:
                add:
                  - IPC_LOCK
            lifecycle:
              preStop:
                exec:
                  command: 
                  - /bin/sh
                  - -c
                  - nodetool drain
            env:
              - name: MAX_HEAP_SIZE
                value: 512M
              - name: HEAP_NEWSIZE
                value: 100M
              - name: CASSANDRA_SEEDS
                value: "cassandra-0.cassandra.default.svc.cluster.local"
              - name: CASSANDRA_CLUSTER_NAME
                value: "K8Demo"
              - name: CASSANDRA_DC
                value: "DC1-K8Demo"
              - name: CASSANDRA_RACK
                value: "Rack1-K8Demo"
              - name: POD_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
            readinessProbe:
              exec:
                command:
                - /bin/bash
                - -c
                - /ready-probe.sh
              initialDelaySeconds: 15
              timeoutSeconds: 5
            volumeMounts:
            - name: cassandra-data
              mountPath: /cassandra_data
      volumeClaimTemplates:
      - metadata:
          name: cassandra-data
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "fast"
          resources:
            requests:
              storage: 5Gi

    • Let’s create the Statefulset now.
    $ kubectl create -f cassandra.yaml

    • After creating the Cassandra StatefulSet, if you check the running pods, you will find something like:
    $ kubectl get pods
    NAME READY STATUS RESTARTS AGE
    cassandra-0 1/1 Running 0 2m
    cassandra-1 1/1 Running 0 2m
    cassandra-2 1/1 Running 0 2m

    • Check if the Cassandra cluster has formed correctly using the following command:
    $ kubectl exec -it cassandra-0 -- nodetool status
    Datacenter: DC1-K8Demo
    ======================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    -- Address Load Tokens Owns Host ID Rack
    UN 192.168.4.193 101.15 KiB 32 72.0% abd9f52d-85ef-44ee-863c-e1b174cd9412 Rack1-K8Demo
    UN 192.168.199.67 187.81 KiB 32 72.8% c40e89e4-44fe-4fc2-9e8a-863b6a74c90c Rack1-K8Demo
    UN 192.168.187.196 131.42 KiB 32 55.2% c235505c-eec5-43bc-a4d9-350858814fe5 Rack1-K8Demo

    • Let's describe the running pod before updating. Look for the Image field in the output of the following command:
    $ kubectl describe pod cassandra-0

    • The Image field will show gcr.io/google-samples/cassandra:v12. Now, let's patch the Cassandra StatefulSet with the new image to which we want to update. The new image might contain a new Cassandra version or database schema changes; before upgrading such crucial components, it's always safe to have a backup of the data.
    $ kubectl patch statefulset cassandra --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/google-samples/cassandra:v13"}]'

    You will see the output `statefulset.apps "cassandra" patched`, but the controller won't update the running pods automatically under this strategy. You need to delete the pods and wait until pods with the new configuration come up. Let's try deleting the cassandra-0 pod.

    $ kubectl delete pod cassandra-0

    • Wait till cassandra-0 comes up in the Running state, then check that cassandra-0 is running with the intended/updated image, i.e., gcr.io/google-samples/cassandra:v13. Now, cassandra-0 is running the new image while cassandra-1 and cassandra-2 are still running the old image. You need to delete these pods too for the new image to take effect under this strategy.

    Rolling update strategy

    Rolling update is an automated update process. Here, the controller deletes and then recreates each of its pods, one at a time. While updating, the controller makes sure that an updated pod is running and in the Ready state before updating its predecessor. The pods in the StatefulSet are updated in reverse ordinal order (the same as the pod termination order, i.e., from the largest ordinal to the smallest).
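    This ordering can be sketched as follows (an illustrative helper, not a Kubernetes API):

    ```python
    def rolling_update_order(name: str, replicas: int) -> list[str]:
        """Pods are updated one at a time, from the largest ordinal down to 0."""
        return [f"{name}-{i}" for i in range(replicas - 1, -1, -1)]

    print(rolling_update_order("cassandra", 3))
    # ['cassandra-2', 'cassandra-1', 'cassandra-0']
    ```
    
    
    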

    For the rolling update strategy, we will create the Cassandra StatefulSet with the .spec.updateStrategy field set to RollingUpdate:

    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: cassandra
      name: cassandra
    spec:
      clusterIP: None
      ports:
      - port: 9042
      selector:
        app: cassandra
    ---
    apiVersion: apps/v1beta2
    kind: StatefulSet
    metadata:
      name: cassandra
      labels:
        app: cassandra
    spec:
      serviceName: cassandra
      replicas: 3
      updateStrategy:
        type: RollingUpdate
      selector:
        matchLabels:
          app: cassandra
      template:
        metadata:
          labels:
            app: cassandra
        spec:
          terminationGracePeriodSeconds: 1800
          containers:
          - name: cassandra
            image: gcr.io/google-samples/cassandra:v12
            imagePullPolicy: Always
            ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
            resources:
              limits:
                cpu: "500m"
                memory: 1Gi
              requests:
               cpu: "500m"
               memory: 1Gi
            securityContext:
              capabilities:
                add:
                  - IPC_LOCK
            lifecycle:
              preStop:
                exec:
                  command: 
                  - /bin/sh
                  - -c
                  - nodetool drain
            env:
              - name: MAX_HEAP_SIZE
                value: 512M
              - name: HEAP_NEWSIZE
                value: 100M
              - name: CASSANDRA_SEEDS
                value: "cassandra-0.cassandra.default.svc.cluster.local"
              - name: CASSANDRA_CLUSTER_NAME
                value: "K8Demo"
              - name: CASSANDRA_DC
                value: "DC1-K8Demo"
              - name: CASSANDRA_RACK
                value: "Rack1-K8Demo"
              - name: POD_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
            readinessProbe:
              exec:
                command:
                - /bin/bash
                - -c
                - /ready-probe.sh
              initialDelaySeconds: 15
              timeoutSeconds: 5
            volumeMounts:
            - name: cassandra-data
              mountPath: /cassandra_data
      volumeClaimTemplates:
      - metadata:
          name: cassandra-data
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "fast"
          resources:
            requests:
              storage: 5Gi

    • To try the rolling update feature, we can patch the existing statefulset with the updated image.
    $ kubectl patch statefulset cassandra --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/google-samples/cassandra:v13"}]'

    • Once you execute the above command, monitor the output of the following command,
    $ kubectl get pods -w

    In case of a failure during the update process, the controller restores any pod that fails to its current version, i.e., pods that have already received the update will be restored to the updated version, and pods that have not yet received the update will be restored to the previous version.

    Partitioning a RollingUpdate (Staging an Update)

    The updateStrategy contains one more field, for partitioning the RollingUpdate. If a partition is specified, all pods with an ordinal greater than or equal to the partition will be updated, while pods with an ordinal less than the partition will not. If a pod with an ordinal less than the partition gets deleted, it will be recreated with the old definition/version. This partitioned rolling update plays an important role when you want to stage an update, roll out a canary, or perform a phased rollout.
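    The partition rule above can be sketched as a small function (a hypothetical helper for illustration, not part of kubectl or any Kubernetes client):

    ```python
    def pods_to_update(name: str, replicas: int, partition: int) -> list[str]:
        """Only pods whose ordinal is >= partition receive the new revision,
        still in reverse ordinal order."""
        return [f"{name}-{i}" for i in range(replicas - 1, partition - 1, -1)]

    print(pods_to_update("cassandra", 3, 2))  # ['cassandra-2']
    print(pods_to_update("cassandra", 3, 0))  # all three pods, cassandra-2 first
    ```
    
    
    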

    RollingUpdate supports a partitioning option. You can define the partition parameter in .spec.updateStrategy:

    $ kubectl patch statefulset cassandra -p '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":2}}}}'

    In the above command, we set the partition value to 2, which patches the Cassandra StatefulSet such that whenever we update it, only the cassandra-2 pod will be updated. Let's try patching the updated image onto the existing StatefulSet.

    $ kubectl patch statefulset cassandra --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"gcr.io/google-samples/cassandra:v14"}]'

    After patching, watch the output of the following command:

    $ kubectl get pods -w

    You can keep decrementing the partition value, and correspondingly more pods will pick up the applied patch. For example, if you patch the StatefulSet with partition=0, then all the pods of the Cassandra StatefulSet will be updated with the provided upgrade configuration.

    Verifying if the upgrade was successful

    Verifying the upgrade process of your application is an important step to conclude the upgrade. This step may differ per application. In this blog we have taken the Cassandra example, so we will verify that the cluster of Cassandra nodes forms properly.

    Use the `nodetool status` command to verify the cluster. After upgrading all the pods, you might want to run some post-processing, such as migrating the schema if your upgrade dictates it.

    Depending on the upgrade strategy, verification of your application can be done in the following ways:

    1. With the OnDelete update strategy, you can update pods one by one and keep checking the application status to make sure the upgrade is working fine.
    2. With the RollingUpdate strategy, you can check the application status once all the running pods of your application have been upgraded.

    For an application like Cassandra, the OnDelete update is preferable to RollingUpdate. In a rolling update, we saw that the Cassandra pods get updated one by one, from the highest to the lowest ordinal index. There might be a case where, after updating two pods, the Cassandra cluster goes into a failed state; unlike with the OnDelete strategy, you cannot intervene mid-update, and you have to wait to recover Cassandra until the complete upgrade is done, i.e., once all the pods are upgraded to the provided image. If you have to use the rolling update, try partitioning it.

    Conclusion

    In this blog, we went through Kubernetes controllers, focusing mainly on StatefulSets. We learned the differences between the blue-green deployment and rolling update strategies, then played with the Cassandra StatefulSet example and successfully upgraded it using the OnDelete and RollingUpdate update strategies. Do let us know if you have any questions, queries, or additional thoughts in the comments section below.

  • Explanatory vs. Predictive Models in Machine Learning

    My vision of Data Analysis is that there is a continuum between explanatory models on one side and predictive models on the other. The decisions you make during the modeling process depend on your goal. Let's take customer churn as an example: you can ask yourself why customers are leaving, or you can ask which customers are leaving. The first question has explaining churn as its primary goal, while the second aims to predict churn. These are two fundamentally different questions, and this has implications for the decisions you take along the way. The predictive side of Data Analysis is closely related to terms like Data Mining and Machine Learning.

    SPSS & SAS

    Looking at SPSS and SAS, both of these languages originate from the explanatory side of Data Analysis. They were developed in an academic environment where hypothesis testing plays a major role. As a result, they have significantly fewer methods and techniques in comparison to R and Python. Nowadays, SAS and SPSS both have data mining tools (SAS Enterprise Miner and SPSS Modeler); however, these are separate tools and you'll need extra licenses.

    I have spent some time building extensive macros in SAS EG to seamlessly create predictive models, which also do a decent job of explaining feature importance. While a Neural Network may do a fair job at making predictions, it is extremely difficult to explain such models, let alone feature importance. The macros that I have built in SAS EG do precisely the job of explaining the features, apart from producing excellent predictions.

    Open source TOOLS: R & PYTHON

    One of the major advantages of open source tools is that the community continuously improves and extends their functionality. R was created by academics who wanted their algorithms to spread as easily as possible. R has the widest range of algorithms, which makes it strong on both the explanatory and the predictive side of Data Analysis.

    Python was developed with a strong focus on (business) applications, not from an academic or statistical standpoint. This makes Python very powerful when algorithms are used directly in applications, and we see that its statistical capabilities are primarily focused on the predictive side. Python is mostly used in Data Mining or Machine Learning applications where a data analyst doesn't need to intervene. Python is therefore also strong in analyzing images and videos, and it is the easiest language to use with Big Data frameworks like Spark. With its plethora of packages and ever-improving functionality, Python is a very accessible tool for data scientists.

    MACHINE LEARNING MODELS

    While procedures like Logistic Regression are very good at explaining the features used in a prediction, others like Neural Networks are not. The latter may be preferred over the former when only prediction accuracy matters, not explaining the models. Interpreting or explaining the model becomes an issue for Neural Networks: you can't just peek inside a deep neural network to figure out how it works. A network's reasoning is embedded in the behavior of numerous simulated neurons, arranged into dozens or even hundreds of interconnected layers. In many cases, a Product Marketing Officer is interested in knowing which factors are most important for a specific advertising project, i.e., what they can concentrate on to raise response rates, rather than what their response rate or revenue will be in the upcoming year. Such questions are better answered by procedures that can be interpreted more easily. This is a great article about the technical and ethical consequences of the lack of explanations provided by complex AI models.
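    To make the contrast concrete, here is a small sketch using synthetic data and scikit-learn (the library, data, and parameter choices are my assumptions, not the author's): a logistic regression exposes one coefficient per feature, whose sign and magnitude directly explain how that feature moves the predicted odds.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Synthetic stand-in for a churn-style dataset: 2 informative features, 2 noise.
    X, y = make_classification(n_samples=500, n_features=4, n_informative=2,
                               n_redundant=0, random_state=0)
    model = LogisticRegression().fit(X, y)

    # One readable coefficient per feature -- this is what makes the model explainable.
    for name, coef in zip(["f0", "f1", "f2", "f3"], model.coef_[0]):
        print(f"{name}: {coef:+.3f}")
    ```

    A neural network trained on the same data offers no such per-feature reading; its "coefficients" are spread across many layers of weights.
    
    
    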

    Procedures like Decision Trees are very good at explaining and visualizing what exactly the decision points are (features and their thresholds). However, they do not produce the best models. Random Forests and Boosting are procedures that use Decision Trees as the basic building block for predictive models, and they are by far some of the best methods to build sophisticated prediction models.

    Random Forests use fully grown (highly complex) trees, each trained on a random sample drawn from the training set (a process called bootstrapping); each split then uses only a random subset of features from the entire feature set, rather than all of them. Bootstrapping helps when the amount of training data is small (in many cases there is no choice to get more data). Subsetting the features has a tremendous effect on de-correlating the trees grown in the forest (hence randomizing it), leading to a drop in test-set error. A fresh subset of features is chosen at each split, making the method robust. This also stops the strongest feature from appearing every time a split is considered, which would otherwise make all the trees in the forest similar. The final result is obtained by averaging the results over all trees (for regression problems) or by taking a majority class vote (for classification problems).
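    The two ingredients described above, bootstrapped samples plus per-split feature subsetting, can be sketched with scikit-learn (the library and parameter values are my assumptions; the article does not name a specific implementation):

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                               random_state=0)

    # bootstrap=True resamples the training set for each tree;
    # max_features="sqrt" offers each split only a random subset of the features,
    # which de-correlates the trees in the forest.
    forest = RandomForestClassifier(n_estimators=200, bootstrap=True,
                                    max_features="sqrt", random_state=0).fit(X, y)

    # Averaging / majority voting happens inside predict(); the importances below
    # are the "revealing the most important variables" functions mentioned later.
    print(forest.feature_importances_)  # one non-negative score per feature
    ```
    
    
    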

    On the other hand, Boosting is a method where a forest is grown from trees that are NOT fully grown, in other words, from weak learners. One has to specify the number of trees to grow. At each iteration, the method fits a weak learner and finds the residuals; the weights of the training examples that were misclassified are then increased, so that subsequent trees concentrate on those failed examples. This way, the method proceeds by improving the accuracy of the boosted ensemble, stopping when the improvement falls below a threshold. One particular implementation of Boosting, AdaBoost, has very good accuracy compared to other implementations. AdaBoost uses trees of depth 1, known as decision stumps, as the members of the forest. These start out only slightly better than random guessing, but over time they learn the pattern and perform extremely well on the test set. The method behaves like a feedback control mechanism, where the system learns from its errors. To address overfitting, one can tune the learning rate hyper-parameter (lambda), choosing values in the range (0, 1]. Very small values of lambda take more time to converge, while larger values may have difficulty converging. The correct value for lambda can be selected iteratively by plotting the test error rate against values of lambda and choosing the value with the lowest test error.
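    The stump-plus-learning-rate recipe can be sketched with scikit-learn's AdaBoostClassifier (the library choice, data, and parameter values are my assumptions, not the author's; the classifier's default base learner is already a depth-1 decision stump):

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=600, n_features=10, random_state=1)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

    # Sweep the learning rate (lambda) and keep the value with the best test
    # score, mirroring the iterative selection described above.
    scores = {}
    for lr in (0.05, 0.5, 1.0):
        clf = AdaBoostClassifier(n_estimators=100, learning_rate=lr,
                                 random_state=1).fit(X_tr, y_tr)
        scores[lr] = clf.score(X_te, y_te)

    best_lr = max(scores, key=scores.get)
    print(scores, "-> best lambda:", best_lr)
    ```
    
    
    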

    In all these methods, as we move from Logistic Regression to Decision Trees to Random Forests and Boosting, the complexity of the models increases, making it almost impossible to EXPLAIN a Boosting model to marketers/product managers. Decision Trees are easy to visualize, and Logistic Regression results can be used to demonstrate the most important factors in a customer acquisition model, so they are well received by business leaders. On the other hand, Random Forests and Boosting are extremely good predictors, without much scope for explanation. But there is hope: these models have functions for revealing the most important variables, although it is not possible to visualize why.

    USING A BALANCED APPROACH

    So I use a mixed strategy: use the simpler, explainable methods as a step in exploratory data analysis, present the importance of features and the characteristics of the data to business leaders in phase one, and then, after building competing models, use the more complicated models to build the prediction models for deployment. That way, one not only gets to understand what is happening and why, but also gets the best predictive power. In most cases that I have worked on, I have rarely seen a mismatch between the explanation and the predictions from different methods. After all, this is all math, and the method of delivery should not change the end results. Now that's a happy ending for all sides of the business!

  • Eliminate Render-blocking Resources using React and Webpack

    In the previous blog, we learned how a browser downloads many scripts and other useful resources to render a webpage. But not all of them are necessary to show a page's content, and loading them delays the page rendering, even though most of them will be needed as the user navigates through the website's various pages.

    In this article, we’ll learn to identify such resources and classify them as critical and non-critical. Once identified, we’ll inline the critical resources and defer the non-critical resources.

    For this blog, we’ll use the following tools:

    • Google Lighthouse and other Chrome DevTools to identify render-blocking resources.
    • Webpack and CRACO to fix it.

    Demo Configuration

    For the demo, I have added the JavaScript below to the <head></head> of index.html as a render-blocking JS resource. This script loads two more CSS resources on the page.

    https://use.fontawesome.com/3ec06e3d93.js

    Other configurations are as follows:

    • Create React App v4.0
    • Formik and Yup for handling form validations
    • Font Awesome and Bootstrap
    • Lazy loading and code splitting using Suspense, React lazy, and dynamic import
    • CRACO
    • html-critical-webpack-plugin
    • ngrok and serve for serving build

    Render-Blocking Resources

    A render-blocking resource typically refers to a script or link that prevents a browser from rendering the processed content.

    Lighthouse will flag the below as render-blocking resources:

    • A <script></script> tag in <head></head> that doesn’t have a defer or async attribute.
    • A <link rel="stylesheet"> tag that doesn’t have a media attribute to match the user’s device, or a disabled attribute to hint the browser not to download it if unnecessary.
    • A <link rel="import"> that doesn’t have an async attribute.

    Identifying Render-Blocking Resources

    To reduce the impact of render-blocking resources, find out what’s critical for loading and what’s not.

    To do that, we’re going to use the Coverage Tab in Chrome DevTools. Follow the steps below:

    1. Open the Chrome DevTools (press F12)

    2. Go to the Sources tab and press Ctrl+Shift+P (Cmd+Shift+P on macOS) to open the Command Menu

    The below screenshot is taken on a macOS.

    3. Search for Show Coverage and select it, which will show the Coverage tab below. Expand the tab.

    4. Click on the reload button on the Coverage tab to reload the page and start instrumenting the coverage of all the resources loading on the current page.

    5. After capturing the coverage, the resources loaded on the page will get listed (refer to the screenshot below). This will show you the code being used vs. the code loaded on the page.

    The list will display coverage in 2 colors:

    a. Green (critical) – The code needed for the first paint

    b. Red (non-critical) – The code not needed for the first paint.

    After checking each file and the generated index.html after the build, I found three primary non-critical files –

    a. 5.20aa2d7b.chunk.css – 98% non-critical code

    b. https://use.fontawesome.com/3ec06e3d93.js – 69.8% non-critical code. This script loads below CSS –

    1. font-awesome-css.min.css – 100% non-critical code

    2. https://use.fontawesome.com/3ec06e3d93.css – 100% non-critical code

    c. main.6f8298b5.chunk.css – 58.6% non-critical code

    The above resources satisfy the conditions of a render-blocking resource and hence are flagged by the Lighthouse Performance report as an opportunity to eliminate render-blocking resources (refer to the screenshot). You can reduce the page size by shipping only the code that you need.

    Solution

    Once you’ve identified critical and non-critical code, it is time to extract the critical part as an inline resource in index.html and defer the non-critical part using the webpack plugin configuration.

    For Inlining and Preloading CSS: 

    Use html-critical-webpack-plugin to inline the critical CSS into index.html. This will generate a <style></style> tag in the <head> containing the critical CSS extracted from the main CSS chunk, and preload the main CSS file.

    const path = require('path');
    const { whenProd } = require('@craco/craco');
    const HtmlCriticalWebpackPlugin = require('html-critical-webpack-plugin');
    
    module.exports = {
      webpack: {
        configure: (webpackConfig) => {
          return {
            ...webpackConfig,
            plugins: [
              ...webpackConfig.plugins,
              ...whenProd(
                // Run the critical-CSS extraction only for production builds
                () => [
                  new HtmlCriticalWebpackPlugin({
                    base: path.resolve(__dirname, 'build'),
                    src: 'index.html',
                    dest: 'index.html',
                    inline: true, // inline the critical CSS into index.html
                    minify: true,
                    extract: true, // strip the inlined rules from the main CSS file
                    width: 320, // viewport used to decide what is above the fold
                    height: 565,
                    penthouse: {
                      blockJSRequests: false,
                    },
                  }),
                ],
                []
              ),
            ],
          };
        },
      },
    };

    Once done, create a build and deploy. Here’s a screenshot of the improved opportunities:

    To use CRACO, refer to its README file.

    NOTE: If you’re planning to use the critters-webpack-plugin please check these issues first: Could not find HTML asset and Incompatible with html-webpack-plugin v4.

    For Deferring Routes/Pages:

    Use lazy-loading and code-splitting techniques along with webpack’s magic comments as below to preload or prefetch a route/page according to your use case.

    import { Suspense, lazy } from 'react';
    import { Redirect, Route, Switch } from 'react-router-dom';
    import Loader from '../../components/Loader';
    
    import './style.scss';
    
    const Login = lazy(() =>
      import(
        /* webpackChunkName: "login" */ /* webpackPreload: true */ '../../containers/Login'
      )
    );
    const Signup = lazy(() =>
      import(
        /* webpackChunkName: "signup" */ /* webpackPrefetch: true */ '../../containers/Signup'
      )
    );
    
    const AuthLayout = () => {
      return (
        <Suspense fallback={<Loader />}>
          <Switch>
            <Route path="/auth/login" component={Login} />
            <Route path="/auth/signup" component={Signup} />
            <Redirect from="/auth" to="/auth/login" />
          </Switch>
        </Suspense>
      );
    };
    
    export default AuthLayout;

    The magic comments enable webpack to emit the appropriate resource hints for each chunk, deferring the scripts according to the use case.
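
    Concretely, webpack’s runtime should inject hints along these lines when the parent chunk loads — webpackPreload producing a preload hint and webpackPrefetch a prefetch hint (chunk file names are illustrative):

    ```html
    <!-- from webpackPreload: fetch alongside the parent chunk, high priority -->
    <link rel="preload" as="script" href="/static/js/login.chunk.js">

    <!-- from webpackPrefetch: fetch during idle time, low priority -->
    <link rel="prefetch" href="/static/js/signup.chunk.js">
    ```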

    For Deferring External Scripts:

    For those who are using a version of webpack lower than 5, use script-ext-html-webpack-plugin or resource-hints-webpack-plugin.

    I would recommend following the simple way given below to defer an external script.

    // Add defer/async attribute to external render-blocking script
    <script async defer src="https://use.fontawesome.com/3ec06e3d93.js"></script>

    The defer and async attributes can be specified together on an external script. The async attribute takes precedence in browsers that support it; older browsers that only understand defer will fall back to the defer behaviour.

    If you want to know more about the async/defer, read the further reading section.

    Along with defer/async, we can also use media attributes to load CSS conditionally.
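
    For example, a stylesheet can be scoped to a media condition so it only blocks rendering when that condition matches; the media="print" trick below is a common pattern for loading CSS asynchronously (file names are illustrative):

    ```html
    <!-- Only render-blocking on viewports that match the media query -->
    <link rel="stylesheet" href="desktop.css" media="(min-width: 1024px)">

    <!-- Async-load pattern: fetched as non-blocking "print" CSS,
         then applied to all media once it has loaded -->
    <link rel="stylesheet" href="non-critical.css" media="print" onload="this.media='all'">
    ```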

    It’s also suggested to self-host fonts instead of pulling a provider’s full CDN bundle when we don’t need all the font-face rules the provider adds.
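
    A minimal sketch of a self-hosted rule (font name and file path are illustrative), keeping only the weights actually used and showing fallback text while the webfont loads:

    ```css
    @font-face {
      font-family: 'MyFont';
      src: url('/fonts/myfont-regular.woff2') format('woff2');
      font-weight: 400;
      font-style: normal;
      font-display: swap; /* render fallback text immediately, swap in when loaded */
    }
    ```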

    Now, let’s create and deploy the build once more and check the results.

    The opportunity to eliminate render-blocking resources no longer appears in the list.

    We have finally achieved our goal!

    Final Thoughts

    The above configuration is a basic one. You can read the libraries’ docs for more advanced setups.

    Let me know if this helps you eliminate render-blocking resources from your app.

    If you want to check out the full implementation, here’s the link to the repo. I have created two branches—one with the problem and another with the solution. Read the further reading section for more details on the topics.

    Hope this helps.

    Happy Coding!

    Further Reading