Category: Software Engineering & Architecture

  • Setting Up A Single Sign On (SSO) Environment For Your App

    Single Sign On (SSO) makes it simple for users to begin using an application. Support for SSO is crucial for enterprise apps, as many corporate security policies mandate that all applications use certified SSO mechanisms. While the SSO experience is straightforward, the SSO standard is anything but straightforward. It’s easy to get confused when you’re surrounded by complex jargon, including SAML, OAuth 1.0, 1.0a, 2.0, OpenID, OpenID Connect, JWT, and tokens like refresh tokens, access tokens, bearer tokens, and authorization tokens. Standards documentation is too precise to allow generalization, and vendor literature can make you believe it’s too difficult to do it yourself.

I’ve implemented SSO for many applications in the past. Knowing your target market, the relevant standards, and your platform is crucial.

    Single Sign On

Single Sign On is an authentication method that allows users to securely authenticate to numerous applications using just one set of login credentials.

    This allows applications to avoid the hassle of storing and managing user information like passwords and also cuts down on troubleshooting login-related issues. With SSO configured, applications check with the SSO provider (Okta, Google, Salesforce, Microsoft) if the user’s identity can be verified.

    Types of SSO

• Security Assertion Markup Language (SAML)
    • OpenID Connect (OIDC)
    • OAuth (specifically OAuth 2.0 nowadays)
    • Federated Identity Management (FIM)

    Security Assertion Markup Language – SAML

    SAML (Security Assertion Markup Language) is an open standard that enables identity providers (IdP) to send authorization credentials to service providers (SP). Meaning you can use one set of credentials to log in to many different websites. It’s considerably easier to manage a single login per user than to handle several logins to email, CRM software, Active Directory, and other systems.

    For standardized interactions between the identity provider and service providers, SAML transactions employ Extensible Markup Language (XML). SAML is the link between a user’s identity authentication and authorization to use a service.

    In our example implementation, we will be using SAML 2.0 as the standard for the authentication flow.

    Technical details

    • A Service Provider (SP) is the entity that provides the service, which is in the form of an application. Examples: Google Workspace applications such as Drive, Meet, and Gmail.
    • An Identity Provider (IdP) is the entity that provides identities, including the ability to authenticate a user. The user profile is normally stored in the Identity Provider and typically includes additional information about the user, such as first name, last name, job code, phone number, and address. Depending on the application, some service providers might require a very simple profile (username, email), while others may need a richer set of user data (department, job code, address, location, and so on). Examples: Active Directory, Okta’s built-in IdP, Salesforce IdP.
    • The SAML sign-in flow initiated by the Identity Provider is referred to as an Identity Provider Initiated (IdP-initiated) sign-in. In this flow, the Identity Provider creates a SAML response that is routed to the Service Provider to assert the user’s identity, rather than the SAML flow being triggered by a redirect from the Service Provider. When a Service Provider initiates the SAML sign-in process, it is referred to as an SP-initiated sign-in. This is often triggered when end users try to access a protected resource, for example, when the browser tries to load a page from a protected network share.

    Configuration details

    • Certificate – To validate signatures, the SP must receive the IdP’s public certificate. The certificate is kept on the SP side and used whenever a SAML response is received.
    • Assertion Consumer Service (ACS) Endpoint – Sometimes referred to as the SP sign-in URL, this is the endpoint exposed by the SP where SAML responses are posted. The SP must share this information with the IdP.
    • IdP Sign-in URL – This is the endpoint where SAML requests are posted on the IdP side. This information must be obtained by the SP from the IdP.
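As a rough sketch of how these three configuration items fit together on the SP side, here is a settings layout modeled on OneLogin’s python3-saml library. Every entity ID, URL, and the certificate value below is a placeholder:

```python
# Sketch of SP-side SAML settings in the layout used by OneLogin's
# python3-saml library. Every value here is a placeholder.
saml_settings = {
    "strict": True,
    "sp": {
        "entityId": "https://app.example.com/metadata/",
        # ACS endpoint: where the IdP posts SAML responses back to us.
        "assertionConsumerService": {
            "url": "https://app.example.com/acs/",
            "binding": "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST",
        },
    },
    "idp": {
        "entityId": "https://idp.example.com/metadata/",
        # IdP sign-in URL: where the SP sends SAML requests.
        "singleSignOnService": {
            "url": "https://idp.example.com/sso/",
            "binding": "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect",
        },
        # The IdP's public certificate, used to validate response signatures.
        "x509cert": "MIIC...placeholder...",
    },
}
```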

    OpenID Connect – OIDC

The OIDC protocol is built on top of the OAuth 2.0 framework. OIDC authenticates the identity of a specific user, while OAuth 2.0 allows two applications to trust each other and exchange data.

    So, while the main flow appears to be the same, the labels are different.

    How are SAML and OIDC similar?

    The basic login flow for both is the same.

    1. A user tries to log into the application directly.

    2. The program sends the user’s login request to the IdP via the browser.

    3. The user logs in to the IdP or confirms that they are already logged in.

    4. The IdP verifies that the user has permission to use the program that initiated the request.

    5. Information about the user is sent from the IdP to the user’s browser.

    6. Their data is subsequently forwarded to the application.

    7. The application verifies that they have permission to use the resources.

    8. The user has been granted access to the program.
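As a toy illustration of the eight steps above (all names here are hypothetical, and a real flow runs over browser redirects rather than direct function calls):

```python
# Toy simulation of the shared SAML/OIDC login flow described above.
# All names are hypothetical; a real flow runs over browser redirects.

idp_sessions = {"alice"}          # users already signed in at the IdP
idp_grants = {("alice", "crm")}   # (user, app) pairs the IdP authorizes

def idp_authenticate(user):
    # Step 3: the user logs in to the IdP or is already logged in.
    return user in idp_sessions

def idp_authorize(user, app):
    # Step 4: the IdP checks that the user may use the requesting app.
    return (user, app) in idp_grants

def login(user, app):
    # Steps 1-2: the app forwards the login request to the IdP.
    if not idp_authenticate(user):
        return "login required at IdP"
    if not idp_authorize(user, app):
        return "access denied"
    # Steps 5-8: identity data flows back via the browser and the
    # application grants the user access.
    return f"{user} granted access to {app}"

print(login("alice", "crm"))  # -> alice granted access to crm
```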

    Difference between SAML and OIDC

    1. SAML transmits user data in XML, while OpenID Connect transmits data in JSON.

2. SAML calls the data it sends an assertion. OIDC calls the data it sends claims.

    3. In SAML, the application or system the user is trying to get into is referred to as the Service Provider. In OIDC, it’s called the Relying Party.

    SAML vs. OIDC

    1. OpenID Connect is becoming increasingly popular. Because it interacts with RESTful API endpoints, it is easier to build than SAML and is easily available through APIs. This also implies that it is considerably more compatible with mobile apps.

    2. You won’t often have a choice between SAML and OIDC when configuring Single Sign On (SSO) for an application through an identity provider like OneLogin. If you do have a choice, it is important to understand not only the differences between the two, but also which one is more likely to be sustained over time. OIDC appears to be the clear winner at this time because developers find it much easier to work with as it is more versatile.

    Use Cases

1. SAML and OIDC:

    – Log in with Salesforce: SAML Authentication where Salesforce was used as IdP and the web application as an SP.

    Key Reason:

    All users are centrally managed in Salesforce, so SAML was the preferred choice for authentication.

– Log in with Okta: OIDC Authentication where Okta is used as the IdP and the web application as the SP.

    Key Reason:

Okta Active Directory (AD) is already used for user provisioning and de-provisioning of all internal users and employees. Okta’s AD integration makes it possible to connect Okta with any on-premises Active Directory.

In both implementations, user provisioning and de-provisioning take place on the IdP side.

Both sign-in flows are used:

    • SP-initiated (from the web application)
    • IdP-initiated (from Okta Active Directory)

    2. Only OIDC login flow:

    • OIDC Authentication where Google, Salesforce, Office365, and Okta are used as IdP and the web application as SP.

    Why not use OAuth for SSO

    1. OAuth 2.0 is not a protocol for authentication. It explicitly states this in its documentation.

2. With authentication, you’re basically attempting to figure out who the user is, when they authenticated, and how they authenticated. These questions are usually answered with SAML assertions rather than access tokens and permission grants.

    OIDC vs. OAuth 2.0

    • OAuth 2.0 is a framework that allows a user of a service to grant third-party application access to the service’s data without revealing the user’s credentials (ID and password).
    • OpenID Connect is a framework on top of OAuth 2.0 where a third-party application can obtain a user’s identity information which is managed by a service. OpenID Connect can be used for SSO.
    • In the OAuth flow, the Authorization Server returns only an Access Token. In the OpenID Connect flow, the Authorization Server returns an Access Token and an ID Token. The ID Token is a JSON Web Token (JWT), a specially formatted string of characters from which the Client can extract information such as your ID, name, when you logged in, the expiration of the ID Token, and whether the JWT has been tampered with.
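As a sketch of the last point: a JWT is three base64url-encoded segments separated by dots, and the middle segment carries the JSON claims. The toy token below is unsigned and for illustration only; real ID Tokens are signed, and the signature must be verified with a proper JWT library before trusting any claim.

```python
import base64
import json

def b64url_decode(part: str) -> bytes:
    # JWTs use URL-safe base64 without padding; restore the padding first.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

# Build a toy (unsigned) ID Token purely for illustration.
header = {"alg": "none", "typ": "JWT"}
claims = {"sub": "user-123", "name": "Alice", "iat": 1700000000, "exp": 1700003600}
encode = lambda obj: base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()
token = f"{encode(header)}.{encode(claims)}."

# The client splits the token and reads the middle (payload) segment.
payload = json.loads(b64url_decode(token.split(".")[1]))
print(payload["name"], payload["exp"])  # -> Alice 1700003600
```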

    Federated Identity Management (FIM)

    Identity Federation, also known as federated identity management, is a system that allows users from different companies to utilize the same verification method for access to apps and other resources.

    In short, it’s what allows you to sign in to Spotify with your Facebook account.

    • Single Sign On (SSO) is a subset of identity federation.
    • SSO generally enables users to use a single set of credentials to access multiple systems within a single organization, while FIM enables users to access systems across different organizations.

    How does FIM work?

    • Users authenticate against their home security domain when they log in to their home network.
    • After authenticating to their home domain, users attempt to connect to a remote application that employs identity federation.
    • Instead of the remote application authenticating the user itself, the user is prompted to authenticate against their home authentication server.
    • The user’s home authentication server vouches for the user to the remote application, and the user is permitted to access the app.

A user logs in to their home domain once; remote apps in other domains can then grant access to the user without an additional login process.

    Applications:

    • Auth0: Auth0 uses OpenID Connect and OAuth 2.0 to authenticate users and get their permission to access protected resources. Auth0 allows developers to design and deploy applications and APIs that handle authentication and authorization concerns via the OIDC/OAuth 2.0 protocols with minimal effort.
    • AWS Cognito
    • User pools – In Amazon Cognito, a user pool is a user directory. With a user pool, your users can sign in to your web or mobile app through Amazon Cognito or federate through a third-party identity provider (IdP). All members of the user pool have a directory profile that you may access using an SDK, whether they sign in directly or through a third party.
    • Identity pools – An identity pool allows your users to get temporary AWS credentials for services like Amazon S3 and DynamoDB.

    Conclusion:

    I hope you found the summary of my SSO research beneficial. The optimum implementation approach is determined by your unique situation, technological architecture, and business requirements.

  • Enable Real-time Functionality in Your App with GraphQL and Pusher

    The most recognized solution for real-time problems is WebSockets (WS), where there is a persistent connection between the client and the server, and either can start sending data at any time. One of the latest implementations of WS is GraphQL subscriptions.

With GraphQL subscriptions, you can easily add real-time functionality to your application. There is an easy, standard way to implement a subscription in a GraphQL app: the client makes a subscription query to the server that specifies the event and the data shape. With this query, the client establishes a long-lived connection with the server on which it listens for specific events. Just as GraphQL solves the over-fetching problem of REST APIs, subscriptions extend that solution to real-time data.

    In this post, we will learn how to bring real-time functionality to your app by implementing GraphQL subscriptions with Pusher to manage Pub/Sub capabilities. The goal is to configure a Pusher channel and implement two subscriptions to be exposed by your GraphQL server. We will be implementing this in a Node.js runtime environment.

    Why Pusher?

    Why are we doing this using Pusher? 

    • Pusher, being a hosted real-time services provider, relieves us from managing our own real-time infrastructure, which is a highly complex problem.
    • Pusher provides an easy and consistent API.
    • Pusher also provides an entire set of tools to monitor and debug your realtime events.
    • Events can be triggered by and consumed easily from different applications written in different frameworks.

    Project Setup

    We will start with a repository that contains a codebase for a simple GraphQL backend in Node.js, which is a minimal representation of a blog post application. The entities included are:

    1. Link – Represents a URL and a short description of the link
    2. User – A Link belongs to a User
    3. Vote – Represents a user’s vote for a Link

In this application, a User can sign up and add or vote on a Link, and other users can upvote the Link. The database schema is built using Prisma and SQLite for quick bootstrapping. In the backend, we will use graphql-yoga as the GraphQL server implementation. To test our GraphQL backend, we will use graphql-playground by Prisma as a client, which will perform all queries and mutations against the server.

    To set up the application:

    1. Clone the repository here
    2. Install all dependencies using 
    npm install

    3. Set up the database using prisma-cli with the following commands
    npx prisma migrate save --experimental
    #! Select ‘yes’ for the prompt to add an SQLite db after this command and enter a name for the migration. 
    npx prisma migrate up --experimental
    npx prisma generate

    Note: Migrations are experimental features of the Prisma ORM, but you can ignore them because you can have a different backend setup for DB interactions. The purpose of using Prisma here is to quickly set up the project and dive into subscriptions.

A new directory named prisma will be created, containing the schema and the SQLite database. Now, you have your database and app set up and ready to use.

    To start the Node.js application, execute the command:

    npm start

    Navigate to http://localhost:4000 to see the graphql-playground where we will execute our queries and mutations.

    Our next task is to add a GraphQL subscription to our server to allow clients to listen to the following events:

    • A new Link is created
    • A Link is upvoted

    To add subscriptions, we will need an npm package called graphql-pusher-subscriptions to help us interact with the Pusher service from within the GraphQL resolvers. The module will trigger events and listen to events for a channel from the Pusher service.

Before that, let’s first create a channel in Pusher. To configure a Pusher channel, head to the Pusher website and create an account. Then, go to your dashboard and create a Channels application. Choose a name, the cluster closest to your location, React as the frontend tech, and Node.js as the backend tech.

Pusher will then show you starter code along with your app credentials (app ID, key, secret, and cluster).

Now, we add the graphql-pusher-subscriptions package. This package takes the Pusher channel configuration and gives you an API to trigger and listen to events published on the channel.

    Now, we import the package in the src/index.js file.

    const { PusherChannel } = require('graphql-pusher-subscriptions');

The PusherChannel class provided by the module accepts a configuration for the channel. We instantiate the class and keep a reference to the Pub/Sub object, passing the Pusher config values given when the channel app was created.

    const pubsub = new PusherChannel({
      appId: '<YOUR_APP_ID>',
      key: '<YOUR_APP_KEY>',
      secret: '<YOUR_APP_SECRET>',
      cluster: 'ap2',
      encrypted: true,
      channel: 'graphql-subscription'
    });

    Now, we add “pubsub” to the context so that it is available to all the resolvers. The channel field tells the client which channel to subscribe to. Here we have the channel “graphql-subscription”.

    const server = new GraphQLServer({
      typeDefs: './src/schema.graphql',
      resolvers,
      context: request => {
        return {
          ...request,
          prisma,
          pubsub
        }
      },
    })

    The above part enables us to access the methods we need to implement our subscriptions from inside our resolvers via context.pubsub.

    Subscribing to Link-created Event

    The first step to add a subscription is to extend the GraphQL schema definition.

    type Subscription {
      newLink: Link
    }

    Next, we implement the resolver for the “newLink” subscription type field. It is important to note that resolvers for subscriptions are different from queries and mutations in minor ways.

    1. They return an AsyncIterator instead of data, which is then used by a GraphQL server to publish the event payload to the subscriber client.

2. The subscription resolver is provided as the value of a field named “subscribe” inside an object. The object should also contain another field named “resolve” that returns the payload data from the data emitted by the AsyncIterator.

To add the resolvers for the subscription, we start by adding a new file called Subscription.js

    Inside the project directory, add the file as src/resolvers/Subscription.js

    Now, in the new file created, add the following code, which will be the subscription resolver for the “newLink” type we created in GraphQL schema.

    function newLinkSubscribe(parent, args, context, info) {
      return context.pubsub.asyncIterator("NEW_LINK")
    }
    
    const newLink = {
      subscribe: newLinkSubscribe,
      resolve: payload => {
        return payload
      },
    }
    
    module.exports = {
      newLink,
    }

In the code above, the subscription resolver function, newLinkSubscribe, is added as the value of the subscribe property, just as described before. The context provides a reference to the Pub/Sub object, which lets us call asyncIterator() with “NEW_LINK” as a parameter. This iterator is what the server uses to resolve the subscription and deliver published events.

    Adding Subscriptions to Your Resolvers

    The final step for our subscription implementation is to call the function above inside of a resolver. We add the following call to pubsub.publish() inside the post resolver function inside Mutation.js file.

    async function post(parent, args, context, info) {
      const userId = getUserId(context)
      const newLink = await context.prisma.link.create({
        data: {
          url: args.url,
          description: args.description,
          postedBy: { connect: { id: userId } },
        }
      })
      context.pubsub.publish("NEW_LINK", newLink)
      return newLink
    }

In the code above, we pass the same string “NEW_LINK” to the publish method as we used in the newLinkSubscribe function. “NEW_LINK” is the event name: publish sends the event to the Pusher service, and the subscription resolver binds to that same name. We also pass newLink as the second argument, which is the data payload for the published event. The context.pubsub.publish call runs before the newLink data is returned.

    Now, we will update the main resolver object, which is given to the GraphQL server.

    First, import the subscription module inside of the index.js file.

    const Subscription = require('./resolvers/Subscription') 
    const resolvers = {
      Query,
      Mutation,
      Subscription,
      User,
      Link,
    }

Now, with all the code in place, we can start testing our real-time API. We will use multiple tabs of the GraphQL Playground concurrently.

    Testing Subscriptions

    If your server is already running, then kill it with CTRL+C and restart with this command:

    npm start

    Next, open the browser and navigate to http://localhost:4000 to see the GraphQL playground. We will use one tab of the playground to perform the mutation to trigger the event to Pusher and invoke the subscriber.

    We will now start to execute the queries to add some entities in the application.

    First, let’s create a user in the application by using the signup mutation. We send the following mutation to the server to create a new User entity.

    mutation {
      signup(
        name: "Alice"
        email: "alice@prisma.io"
        password: "graphql"
      ) {
        token
        user {
          id
        }
      }
    }

    You will see a response in the playground that contains the authentication token for the user. Copy the token, and open another tab in the playground. Inside that new tab, open the HTTP_HEADERS section in the bottom and add the Authorization header.

    Replace the __TOKEN__  placeholder from the below snippet with the copied token from above.

    {
      "Authorization": "Bearer __TOKEN__"
    }

Now, all the queries or mutations executed from that tab will carry the authentication token. With this in place, we send the following mutation to our GraphQL server.

    mutation {
      post(
        url: "http://velotio.com"
        description: "An awesome GraphQL blog"
      ) {
        id
      }
    }

The mutation above creates a Link entity inside the application. Now that we have created an entity, we can test the subscription part. In another tab, we will send the subscription query and create a persistent WebSocket connection to the server. Before firing off a subscription query, let us first understand its syntax. It starts with the keyword subscription, followed by the subscription name. The subscription is defined in the GraphQL schema and determines the data shape we can resolve to. Here, we subscribe to the newLink subscription, whose resolved data is a Link entity, so we can request any specific part of that entity. We ask for attributes like id, url, and description, and nested attributes of the postedBy field.

    subscription {
      newLink {
          id
          url
          description
          postedBy {
            id
            name
            email
          }
      }
    }

    The response of this operation is different from that of a mutation or query. You see a loading spinner, which indicates that it is waiting for an event to happen. This means the GraphQL client (playground) has established a connection with the server and is listening for response data.

    Before triggering a subscription, we will also keep an eye on the Pusher channel for events triggered to verify that our Pusher service is integrated successfully.

To do this, go to the Pusher dashboard, navigate to the channel app we created, and click on the debug console. The debug console shows the events triggered in real time.

    Now that the Pusher dashboard is visible, we will trigger the subscription event by running the following mutation inside a new Playground tab.

    mutation {
      post(
        url: "www.velotio.com"
        description: "Graphql remote schema stitching"
      ) {
        id
      }
    }

    Now, we observe the Playground where subscription was running.

We can see that the newly created Link is visible in the response section, the subscription continues to listen, and the event has reached the Pusher service.

    You will observe an event on the Pusher console that is the same event and data as sent by your post mutation.


    We have achieved our first goal, i.e., we have integrated the Pusher channel and implemented a subscription for a Link creation event.

    To achieve our second goal, i.e., to listen to Vote events, we repeat the same steps as we did for the Link subscription.

We add a subscription resolver for Vote in the Subscription.js file and update the Subscription type in the GraphQL schema. To trigger a different event, we use “NEW_VOTE” as the event name and add the publish call inside the resolver for the vote mutation.

    function newVoteSubscribe(parent, args, context, info) {
      return context.pubsub.asyncIterator("NEW_VOTE")
    }
    
    const newVote = {
      subscribe: newVoteSubscribe,
      resolve: payload => {
        return payload
      },
    }

    Update the export statement to add the newVote resolver.

    module.exports = {
      newLink,
      newVote,
    }

    Update the Vote mutation to add the publish call before returning the newVote data. Notice that the first parameter, “NEW_VOTE”, is being passed so that the listener can bind to the new event with that name.

    async function vote(parent, args, context, info) {
      const userId = getUserId(context)
      const newVote = await context.prisma.vote.create({
        data: {
          user: { connect: { id: userId } },
          link: { connect: { id: Number(args.linkId) } },
        }
      })
      context.pubsub.publish("NEW_VOTE", newVote)
      return newVote
    }

Now, restart the server, complete the signup process, and set HTTP_HEADERS as we did before. Add the following subscription in a new Playground tab.

    subscription {
      newVote {
        id
        link {
          url
          description
        }
        user {
          name
          email
        }
      }
    }

In another Playground tab, send the following vote mutation to the server to trigger the event, but do not forget to verify the Authorization header. The mutation below adds the user’s Vote to the Link. Replace “__LINK_ID__” with the linkId generated by the previous post mutation.

    mutation {
      vote(linkId: "__LINK_ID__") {
        link {
          url
          description
        }
        user {
          name
          email
        }
      }
    }

Observe the event data on the response tab of the vote subscription. You can also check the triggered event on the Pusher dashboard.

    The final codebase is available on a branch named with-subscription.

    Conclusion

By following the steps above, we saw how easy it is to add real-time features to GraphQL apps with subscriptions. Establishing a connection with the server is no hassle, and it is very similar to how we implement queries and mutations. Unlike the mainstream approach, where one has to build and manage the event handlers, GraphQL subscriptions come with these features built in for both the client and the server. We also saw how useful a managed real-time service like Pusher can be for Pub/Sub events. Together, GraphQL and Pusher can prove to be a solid combination for a reliable real-time system.

    Related Articles

    1. Build and Deploy a Real-Time React App Using AWS Amplify and GraphQL

    2. Scalable Real-time Communication With Pusher

  • Using DRF Effectively to Build Cleaner and Faster APIs in Django

Django REST Framework (DRF) is a popular library choice when it comes to creating REST APIs with Django. With minimal effort and time, you can start creating APIs that support authentication, authorization, pagination, sorting, etc. Once we start creating production-level APIs, we must do a lot of customization, which DRF supports well.

    In this blog post, I will share some of the features that I have used extensively while working with DRF. We will be covering the following use cases:

    1. Using serializer context to pass data from view to serializer
    2. Handling reverse relationships in serializers
    3. Solving slow queries by eliminating the N+1 query problem
    4. Custom Response Format
    5. SerializerMethodField to add read-only derived data to the response
    6. Using Mixin to enable/disable pagination with Query Param

    This will help you to write cleaner code and improve API performance.

    Prerequisite:

To understand the things discussed in this blog, the reader should have some prior experience creating REST APIs using DRF. We will not be covering basic concepts like serializers, API views/viewsets, generic views, permissions, etc. If you need help with the basics, the official documentation provides a good list of resources.

    Let’s explore Django REST Framework’s (DRF) lesser-known but useful features:

    1. Using Serializer Context to Pass Data from View to Serializer

    Let us consider a case when we need to write some complex validation logic in the serializer. 

    The validation method takes two parameters. One is the self or the serializer object, and the other is the field value received in the request payload. Our validation logic may sometimes need some extra information that must be taken from the database or derived from the view calling the serializer. 

This is where the serializer’s context data comes in. The serializer takes the context parameter in the form of a Python dictionary, and this data is available throughout the serializer’s methods. The context data can be accessed using self.context in serializer validation methods or any other serializer method.

    Passing custom context data to the serializer

    To pass the context to the serializer, create a dictionary with the data and pass it in the context parameter when initializing the serializer.

    context_data = {"valid_domains": ValidDomain.objects.all()}
    serializer = MySerializer(data=request.data, context=context_data)

In the case of generic views and viewsets, serializer initialization is handled by the framework, which passes the following as the default context.

    {
       'request': self.request,
       'format': self.format_kwarg,
       'view': self
    }

    Thanks to DRF, we can cleanly and easily customize the context data. 

    # Override the get_serializer_context method in the generic view
    class UserCreateListAPIView(generics.ListCreateAPIView):
        def get_serializer_context(self):
            context = super().get_serializer_context()
            # Update context data to add new data
            context.update({"valid_domains": ValidDomain.objects.all()})
            return context

    # Read the context data in the serializer validation method
    class UserSerializer(serializers.Serializer):
        def validate_email(self, val):
            valid_domains = self.context.get("valid_domains")
            # main validation logic goes here
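To make the mechanism concrete without pulling in Django, here is a runnable stand-in (not real DRF; MiniSerializer and its fields are hypothetical) showing how data passed as context becomes available inside a validation method:

```python
# Minimal stand-in (not real DRF) showing how context flows into validation.
class MiniSerializer:
    def __init__(self, data, context=None):
        self.initial_data = data
        self.context = context or {}

    def is_valid(self):
        # validate_email equivalent: read extra data from self.context
        domain = self.initial_data["email"].split("@")[-1]
        return domain in self.context.get("valid_domains", set())

s = MiniSerializer({"email": "dev@velotio.com"},
                   context={"valid_domains": {"velotio.com"}})
print(s.is_valid())  # -> True
```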

    2. Handling Reverse Relationships in Serializers 

    To better understand this, take the following example. 

    class User(models.Model):
       name = models.CharField(max_length=60)
       email = models.EmailField()
    
    
    class Address(models.Model):
       detail = models.CharField(max_length=100)
       city = models.CharField(max_length=60)
       user = models.ForeignKey(User, related_name="addresses", on_delete=models.CASCADE)

We have a User model, which holds customer data, and an Address model, which holds the user’s addresses. We need to return the user details along with their address details, as given below.

    {
       "name": "Velotio",
       "email": "velotio@example.com",
       "addresses": [
           {
               "detail": "Akshya Nagar 1st Block 1st Cross, Rammurthy Nagar",
               "city": "Bangalore"
           },
           {
               "detail": "50th Floor, Narayan Dhuru Street, Mandvi",
               "city": "Mumbai"
           },
           {
               "detail": "Ground Floor, 8/5, J K Bldg, H G Marg, Opp Gamdevi Temple, Grant Road",
               "city": "Bangalore"
           }
       ]
    }

    • Forward model relationships are automatically included in the fields returned by the ModelSerializer.
    • The relationship between User and Address is a reverse relationship and needs to be explicitly added in the fields. 
    • We have defined related_name="addresses" on the user ForeignKey in Address, so it can be used in the fields Meta option. 
    • If we hadn’t set a related_name, we could use address_set, which is the default reverse accessor name.

    class UserSerializer(serializers.ModelSerializer):
          class Meta:
              model = User
              fields = ("name", "email", "addresses")

    The above code will return the following response:

    {
       "name": "Velotio",
       "email": "velotio@example.com",
       "addresses": [
           10,
           20,
           45
       ]
    }

    But this isn’t what we need. We want to return all the information about the address, not just the IDs. DRF gives us the ability to use a serializer as a field of another serializer. 

    The below code shows how to use the nested Serializer to return the address details.

    class AddressSerializer(serializers.ModelSerializer):
       class Meta:
           model = Address
           fields = ("detail", "city") 
    
    class UserSerializer(serializers.ModelSerializer):
       addresses = AddressSerializer(many=True, read_only=True)
       class Meta:
           model = User
           fields = ("name", "email", "addresses")

    • The read_only=True parameter marks the field as a read-only field. 
    • The addresses field will only be used in GET calls and will be ignored in write operations. 
    • Nested serializers can also be used in write operations, but DRF doesn’t handle the creation/updating of nested objects by default; you have to implement it yourself (for example, by overriding create() and update()).
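    Since DRF leaves the write path of nested serializers to you, the core pattern is: pop the nested payload, create the parent, then create each child. Here is a Django-free sketch of that pattern; User and Address are plain stand-in classes (not the ORM models), and create_user mirrors what an overridden Serializer.create() would do:

```python
class User:
    """Stand-in for the User model."""
    def __init__(self, name, email):
        self.name, self.email, self.addresses = name, email, []


class Address:
    """Stand-in for the Address model; registers itself with its user."""
    def __init__(self, user, detail, city):
        self.detail, self.city = detail, city
        user.addresses.append(self)


def create_user(validated_data):
    # Mirrors an overridden Serializer.create(): the nested list must be
    # popped and handled explicitly before creating the parent object.
    address_data = validated_data.pop("addresses", [])
    user = User(**validated_data)
    for addr in address_data:
        Address(user=user, **addr)
    return user
```

    In a real serializer, the two stand-in constructors would be `User.objects.create(...)` and `Address.objects.create(user=user, ...)` calls.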

    3. Solving Slow Queries by Eliminating the N+1 Query Problem

    When using nested serializers, the API needs to run queries across multiple tables and a large number of records, which can often lead to slow APIs. A common and easy mistake to make while using serializers with relationships is the N+1 queries problem. Let’s first understand the problem and the ways to solve it.

    Identifying the N+1 Queries Problem 

    Let’s take the following API example and count the number of queries hitting the database on each API call.

    class Author(models.Model):
       name = models.CharField(max_length=20)
    
    
    class Book(models.Model):
       name = models.CharField(max_length=20)
       author = models.ForeignKey("Author", models.CASCADE, related_name="books")
       created_at = models.DateTimeField(auto_now_add=True)

    class AuthorSerializer(serializers.ModelSerializer):
       class Meta:
       	model = Author
       	fields = "__all__"
    
    
    class BookSerializer(serializers.ModelSerializer):
       author = AuthorSerializer()
       class Meta:
       	model = Book
       	fields = "__all__"

    class BookListCreateAPIView(generics.ListCreateAPIView):
    
    	serializer_class = BookSerializer
    	queryset = Book.objects.all()

    urlpatterns = [
    	path('admin/', admin.site.urls),
    	path('hello-world/', HelloWorldAPI.as_view()),
    	path('books/', BookListCreateAPIView.as_view(), name="book_list")
    ]

    We are creating a simple API to list the books along with the author’s details. Here is the output:

    {
      "message": "",
      "errors": [],
      "data": [
        {
          "id": 1,
          "author": {
            "id": 3,
            "name": "Meet teacher."
          },
          "name": "Body society.",
          "created_at": "1973-08-03T02:43:22Z"
        },
        {
          "id": 2,
          "author": {
            "id": 49,
            "name": "Cause wait health."
          },
          "name": "Left next pretty.",
          "created_at": "2000-07-07T03:37:10Z"
        },
        {
          "id": 3,
          "author": {
            "id": 7,
            "name": "No figure those."
          },
          "name": "Reflect American.",
          "created_at": "1994-08-14T03:54:38Z"
        },
        {
          "id": 4,
          "author": {
            "id": 35,
            "name": "Garden order table."
          },
          "name": "Throw minute.",
          "created_at": "1993-12-30T20:50:56Z"
        },
        {
          "id": 5,
          "author": {
            "id": 49,
            "name": "Cause wait health."
          },
          "name": "Congress now build.",
          "created_at": "1977-07-21T17:35:42Z"
        },
        {
          "id": 6,
          "author": {
            "id": 39,
            "name": "Involve section."
          },
          "name": "Activity drop fight.",
          "created_at": "2011-04-21T23:09:54Z"
        },
        {
          "id": 7,
          "author": {
            "id": 44,
            "name": "Cost spring our."
          },
          "name": "Because pattern.",
          "created_at": "2010-01-04T08:21:29Z"
        },
        {
          "id": 8,
          "author": {
            "id": 45,
            "name": "Entire we certainly."
          },
          "name": "Program use feel.",
          "created_at": "1972-11-30T15:49:50Z"
        },
        {
          "id": 9,
          "author": {
            "id": 42,
            "name": "Interest drop."
          },
          "name": "Purpose live might.",
          "created_at": "1987-01-31T16:48:54Z"
        },
        {
          "id": 10,
          "author": {
            "id": 12,
            "name": "Sell data contain."
          },
          "name": "Everyone thing seem.",
          "created_at": "2007-10-19T07:16:34Z"
        }
      ],
      "status": "success"
    }

    Ideally, we should be able to get this data in a single SQL query. Now let’s write a test case and see if our assumption is correct:

    from django.urls import reverse
    from django_seed import Seed
    
    from core.models import Author, Book
    from rest_framework.test import APITestCase
    
    seeder = Seed.seeder()
    
    
    class BooksTestCase(APITestCase):
        def test_list_books(self):
            # Add dummy data to the Author and Book Table
            seeder.add_entity(Author, 5)
            seeder.add_entity(Book, 10)
            seeder.execute()
            # we expect the result in 1 query
            with self.assertNumQueries(1):
                response = self.client.get(reverse("book_list"), format="json")
    
    # test output
    $ ./manage.py test
    .
    .
    .
    AssertionError: 11 != 1 : 11 queries executed, 1 expected
    Captured queries were:
    1. SELECT "core_book"."id", "core_book"."name", "core_book"."author_id", "core_book"."created_at" FROM "core_book"
    2. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 4 LIMIT 21
    3. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 1 LIMIT 21
    4. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 4 LIMIT 21
    5. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 4 LIMIT 21
    6. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 5 LIMIT 21
    7. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 5 LIMIT 21
    8. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 1 LIMIT 21
    9. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 3 LIMIT 21
    10. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 3 LIMIT 21
    11. SELECT "core_author"."id", "core_author"."name" FROM "core_author" WHERE "core_author"."id" = 5 LIMIT 21
    
    ----------------------------------------------------------------------
    Ran 1 test in 0.027s
    
    FAILED (failures=1)

    As we can see, our test case failed: 11 queries ran instead of one. In our test case, we added 10 records to the Book model. The number of queries hitting the database is 1 (to fetch the book list) + the number of records in the Book model (to fetch the author details for each book record). The test output shows the SQL queries executed. 

    The side effects of this can easily go unnoticed while working on a test database with a small number of records. But in production, when the data grows to thousands of records, this can seriously degrade the performance of the database and application.

    Let’s Do It the Right Way

    If we think of this in terms of a raw SQL query, it can be achieved with a simple INNER JOIN between the Book and Author tables. We need to do something similar in our Django query. 

    Django provides select_related and prefetch_related to handle query problems around related objects. 

    • select_related works on forward ForeignKey, OneToOne, and backward OneToOne relationships by creating a database JOIN and fetching the related field data in one single query. 
    • prefetch_related works on ManyToMany relationships and reverse ForeignKey relationships. It runs a separate query for each relationship and performs the “joining” in Python. 
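    The difference is easy to demonstrate outside Django with the stdlib sqlite3 module; the schema and data below are made up to mirror the Book/Author example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'A'), (2, 'B');
    INSERT INTO book VALUES (1, 'x', 1), (2, 'y', 2), (3, 'z', 1);
""")

# N+1 pattern: one query for the books, then one more per book for its author.
books = conn.execute("SELECT id, name, author_id FROM book").fetchall()
queries = 1
for _id, _name, author_id in books:
    conn.execute("SELECT name FROM author WHERE id = ?", (author_id,)).fetchone()
    queries += 1
# queries is now 1 + len(books), i.e. 4 here, and grows with the table

# select_related equivalent: a single JOIN fetches everything in one query.
rows = conn.execute(
    "SELECT book.name, author.name FROM book "
    "JOIN author ON book.author_id = author.id"
).fetchall()
```

    The loop is exactly what the naive serializer does behind the scenes; the JOIN is what select_related generates.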

    Let’s rewrite the above code using select_related and check the number of queries. 

    We only need to change the queryset in the view. 

    class BookListCreateAPIView(generics.ListCreateAPIView):
    
       serializer_class = BookSerializer
    
       def get_queryset(self):
           queryset = Book.objects.select_related("author").all()
           return queryset

    Now, we will rerun the test, and this time it should pass:

    $ ./manage.py test	 
    Creating test database for alias 'default'...
    System check identified no issues (0 silenced).
    .
    ----------------------------------------------------------------------
    Ran 1 test in 0.024s
    
    OK
    Destroying test database for alias 'default'...

    If you are interested in knowing the SQL query executed, here it is:

    >> queryset = Book.objects.select_related("author").all()
    >> print(queryset.query)
    
    SELECT "core_book"."id",
           "core_book"."name",
           "core_book"."author_id",
           "core_book"."created_at",
           "core_author"."id",
           "core_author"."name"
    FROM "core_book"
             INNER JOIN "core_author" ON ("core_book"."author_id" = "core_author"."id")

    4. Custom Response Format

    It’s a good practice to decide the API endpoints and their request/response payloads before starting the actual implementation. If the response format has already been agreed upon, you cannot simply go with the default response returned by DRF. 

    Let’s assume that, below is the decided format for returning the response: 

    {
      "message": "",
      "errors": [],
      "data": [
        {
          "id": 1,
          "author": {
            "id": 3,
            "name": "Meet teacher."
          },
          "name": "Body society.",
          "created_at": "1973-08-03T02:43:22Z"
        },
        {
          "id": 2,
          "author": {
            "id": 49,
            "name": "Cause wait health."
          },
          "name": "Left next pretty.",
          "created_at": "2000-07-07T03:37:10Z"
        }
      ],
      "status": "success"
    }

    We can see that the response format has message, errors, status, and data attributes. Next, we will see how to write a custom renderer to achieve this response format. Since the format is JSON, we override rest_framework.renderers.JSONRenderer.

    from rest_framework.renderers import JSONRenderer


    class CustomJSONRenderer(JSONRenderer):
       def render(self, data, accepted_media_type=None, renderer_context=None):
           # wrap the payload in the agreed envelope; for brevity, message,
           # errors, and status are static here. A full implementation would
           # inspect renderer_context["response"] to fill them in.
           response_data = {"message": "", "errors": [], "data": data, "status": "success"}
           # call super to render the reformatted response
           return super().render(
               response_data, accepted_media_type, renderer_context
           )

    To use this new renderer, we need to add it to  DRF settings:

    REST_FRAMEWORK = {
       "DEFAULT_RENDERER_CLASSES": (
           "core.renderer.CustomJSONRenderer",
           "rest_framework.renderers.JSONRenderer",
           "rest_framework.renderers.BrowsableAPIRenderer",
       )
    }
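    Stripped of DRF specifics, the renderer is just wrapping the payload in a fixed envelope before serialization; a minimal stand-alone sketch (the function name is mine):

```python
import json


def wrap_response(data, message="", errors=None, status="success"):
    """Build the envelope described above around an arbitrary payload."""
    return {"message": message, "errors": errors or [], "data": data, "status": status}


# What CustomJSONRenderer ultimately emits for a list payload:
body = json.dumps(wrap_response([{"id": 1, "name": "Body society."}]))
```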

    5. Use the SerializerMethodField to Add Read-only Derived Data to the Response

    The SerializerMethodField can be used when we want to add some derived data to the object. Consider the same Book listing API. If we want to send an additional property, book_display_name (the book name in uppercase), we can use the serializer method field as below.

    class BookSerializer(serializers.ModelSerializer):
       author = AuthorSerializer()
       book_display_name = serializers.SerializerMethodField(method_name="get_book_display_name")
    
       def get_book_display_name(self, book):
           return book.name.upper()
    
       class Meta:
           model = Book
           fields = "__all__"

    • The SerializerMethodField takes a method_name parameter, where we can pass the name of the method that should be called. 
    • The method receives the object as its argument (in addition to self).
    • By default, DRF looks for a method named get_<field_name>, so in the example above, the method_name parameter can be omitted, and it will still give the same result.
    book_display_name = serializers.SerializerMethodField() 

    6. Use a Mixin to Enable/Disable Pagination with a Query Param

    If you are developing APIs for an internal application and want to support both paginated and unpaginated responses, you can make use of the mixin below. It allows the caller to enable or disable pagination with the “pagination” query parameter. This mixin can be used with the generic views.

    class DynamicPaginationMixin(object):
       """
       Controls pagination using the query param "pagination".
       If pagination=false is passed in query params, data is returned without pagination.
       """
       def paginate_queryset(self, queryset):
           pagination = self.request.query_params.get("pagination", "true")
           # note: bool("false") is True, so compare the text explicitly
           if pagination.lower() == "false":
               return None

           return super().paginate_queryset(queryset)

    # Remember to place the mixin before the generic view
    class BookListCreateAPIView(DynamicPaginationMixin, generics.ListCreateAPIView):

        serializer_class = BookSerializer

        def get_queryset(self):
            queryset = Book.objects.select_related("author").all()
            return queryset
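    The mixin hinges on turning a query-string value into a boolean. Since bool("false") is True in Python (any non-empty string is truthy), the comparison has to be textual; a small helper makes the pitfall explicit (the helper name is mine):

```python
def parse_bool_param(value, default=True):
    """Interpret a query-string flag; bool() alone would call any non-empty string True."""
    if value is None:
        return default
    return value.strip().lower() not in ("false", "0", "no")


# The pitfall the mixin has to avoid:
assert bool("false") is True
```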

    Conclusion

    This was just a small selection of all the awesome features provided by Django and DRF, so keep exploring. I hope you learned something new today. If you are interested in learning more about serverless deployment of Django Applications, you can refer to our comprehensive guide to deploy serverless, event-driven Python applications using Zappa.

    Further Reading

    1. Django Rest framework Documentation
    2. Django Documentation

  • Building High-performance Apps: A Checklist To Get It Right

    An app is only as good as the problem it solves. But your app’s performance can be extremely critical to its success as well. A slow-loading web app can make users quit and try out an alternative in no time. Testing an app’s performance should thus be an integral part of your development process and not an afterthought.

    In this article, we will talk about how you can proactively monitor and boost your app’s performance as well as fix common issues that are slowing down the performance of your app.

    I’ll use the following tools for this blog.

    • Lighthouse – A performance audit tool, developed by Google
    • Webpack – A JavaScript bundler

    You can find similar tools online, both free and paid. So let’s give our Vue a new Angular perspective to make our apps React faster.

    Performance Metrics

    First, we need to understand which metrics play an important role in determining an app’s performance. Lighthouse helps us calculate a score based on a weighted average of the below metrics:

    1. First Contentful Paint (FCP) – 15%
    2. Speed Index (SI) – 15%
    3. Largest Contentful Paint (LCP) – 25%
    4. Time to Interactive (TTI) – 15%
    5. Total Blocking Time (TBT) – 25%
    6. Cumulative Layout Shift (CLS) – 5%

    By taking the above stats into account, Lighthouse gauges your app’s performance as such:

    • 0 to 49 (slow): Red
    • 50 to 89 (moderate): Orange
    • 90 to 100 (fast): Green
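    Taking the description above at face value (the real Lighthouse first maps each raw metric value onto a 0-100 score via log-normal curves), the composite score is just a weighted average of per-metric scores; a sketch using the weights and color buckets listed above:

```python
# Weights as listed above (they sum to 1.0).
WEIGHTS = {"FCP": 0.15, "SI": 0.15, "LCP": 0.25, "TTI": 0.15, "TBT": 0.25, "CLS": 0.05}


def composite_score(metric_scores):
    """Weighted average of 0-100 per-metric scores, bucketed into Lighthouse colors."""
    score = sum(WEIGHTS[m] * s for m, s in metric_scores.items())
    color = "red" if score < 50 else "orange" if score < 90 else "green"
    return round(score), color
```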

    I would recommend going through Lighthouse performance scoring to learn more. Once you understand Lighthouse, you can audit websites of your choosing.

    I gathered audit scores for a few websites, including Walmart, Zomato, Reddit, and British Airways. Almost all of them had a performance score below 30. A few even scored in the single digits. 

    To attract more customers, businesses fill their apps with many attractive features. But they ignore the most important thing: performance, which degrades with the addition of each such feature.

    As I said earlier, it’s all about the user experience. You can read more about why performance matters and how it impacts the overall experience.

    Now, with that being said, I want to challenge you to conduct a performance test on your favorite app. Let me know if it receives a good score. If not, then don’t feel bad.

    Follow along with me. 

    Let’s get your app fixed!

    Exploring Opportunities

    If you’re still reading this blog, I expect that your app received a low score, or maybe, you’re just curious.

    Whatever the reason, let’s get started.

    Below your scores are the possible opportunities suggested by Lighthouse. Fixing these affects the performance metrics above and eventually boosts your app’s performance. So let’s check them out one-by-one.

    Here are all the possible opportunities listed by Lighthouse:

    1. Eliminate render-blocking resources
    2. Properly size images
    3. Defer offscreen images
    4. Minify CSS & JavaScript
    5. Serve images in the next-gen formats
    6. Enable text compression
    7. Preconnect to required origins
    8. Avoid multiple page redirects
    9. Use video formats for animated content

    A few other opportunities won’t be covered in this blog, but they are just an extension of the above points. Feel free to read them under the further reading section.

    Eliminate Render-blocking Resources

    This section lists down all the render-blocking resources. The main goal is to reduce their impact by:

    • removing unnecessary resources,
    • deferring non-critical resources, and
    • in-lining critical resources.

    To do that, we need to understand what a render-blocking resource is.

    Render-blocking resources and how to identify them

    As the name suggests, it’s a resource that prevents a browser from rendering processed content. Lighthouse identifies the following as render-blocking resources:

    • A <script></script> tag in <head></head> that doesn’t have a defer or async attribute
    • A <link rel="stylesheet"> tag that doesn’t have a media attribute matching the user’s device, or a disabled attribute hinting the browser not to download it if unnecessary
    • A <link rel="import"> that doesn’t have an async attribute

    To reduce the impact, you need to identify what’s critical and what’s not. You can read how to identify critical resources using the Chrome dev tool.

    Classify Resources

    Classify resources as critical and non-critical based on the following color code:

    • Green (critical): Needed for the first paint.
    • Red (non-critical): Not needed for the first paint but will be needed later.

    Solution

    Now, to eliminate render-blocking resources:

    Extract the critical part into an inline resource and add the correct attributes to the non-critical resources. These attributes will indicate to the browser what to download asynchronously. This can be done manually or by using a JS bundler.

    Webpack users can use the libraries below to do it in a few easy steps:

    • For extracting critical CSS, you can use html-critical-webpack-plugin or critters-webpack-plugin. They generate an inline <style></style> tag in the <head></head> with the critical CSS stripped out of the main CSS chunk, and preload the main file
    • For extracting CSS depending on media queries, use media-query-splitting-plugin or media-query-plugin
    • The first paint doesn’t need to be dependent on the JavaScript files. Use lazy loading and code-splitting techniques so resources are downloaded only when the browser requests them. The magic comments for lazy imports make it easy
    • And finally, for the main chunk, vendor chunk, or any other external scripts (included in index.html), you can defer them using script-ext-html-webpack-plugin

    There are many more libraries for inlining CSS and deferring external scripts. Feel free to use as per the use case.

    Use Properly Sized Images

    This section lists all the images used in a page that aren’t properly sized, along with the stats on potential savings for each image.

    How Does Lighthouse Identify Oversized Images? 

    Lighthouse calculates potential savings by comparing the rendered size of each image on the page with its actual size. The rendered size varies based on the device pixel ratio. If the size difference is at least 25 KB, the image fails the audit.
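    Following the rule as described (sizes in KB; the function name is illustrative), the check reduces to a subtraction against the 25 KB threshold:

```python
def oversized_image_audit(actual_kb, rendered_kb, threshold_kb=25):
    """Potential savings per the rule above; the image fails if savings reach the threshold."""
    savings = max(actual_kb - rendered_kb, 0)
    passes = savings < threshold_kb
    return savings, passes
```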

    Solution 

    DO NOT serve images that are larger than their rendered versions! The wasted size just hampers the load time. 

    Alternatively,

    • Use responsive images. With this technique, create multiple versions of the images to be used in the application and serve them depending on the media queries, viewport dimensions, etc
    • Use image CDNs to optimize images. These are like a web service API for transforming images
    • Use vector images, like SVG. These are built on simple primitives and can scale without losing data or change in the file size

    You can resize images online or on your system using tools. Learn how to serve responsive images.

    Learn more about replacing complex icons with SVG. For browsers that don’t support SVG format, here’s A Complete Guide to SVG fallbacks.

    Defer Offscreen Images

    An offscreen image is an image located outside of the visible browser viewport. 

    The audit fails if the page has offscreen images. Lighthouse lists all offscreen or hidden images in your page, along with the potential savings. 

    Solution 

    Load offscreen images only when the user focuses on that part of the viewport. To achieve this, lazy-load these images after loading all critical resources.

    There are many libraries available online that will load images depending on the visible viewport. Feel free to use them as per the use case.

    Minify CSS and JavaScript

    Lighthouse identifies all the CSS and JS files that are not minified. It will list all of them along with potential savings.

    Solution 

    Do as the heading says!

    Minifiers can do it for you. Webpack users can use mini-css-extract-plugin and terser-webpack-plugin for minifying CSS and JS, respectively.

    Serve Images in Next-gen Formats

    Following are the next-gen image formats:

    • WebP
    • JPEG 2000
    • JPEG XR

    The image formats we use regularly (i.e., JPEG and PNG) have inferior compression and quality characteristics compared to next-gen formats. Encoding images in these formats can load your website faster and consume less cellular data.

    Lighthouse converts each image in an older format to WebP and reports the ones with potential savings of more than 8 KB.

    Solution 

    Convert all, or at least the images Lighthouse recommends, into the above formats. Use your converted images with the fallback technique below to support all browsers.

    <picture>
      <source type="image/jp2" srcset="my-image.jp2">
      <source type="image/jxr" srcset="my-image.jxr">
      <source type="image/webp" srcset="my-image.webp">
      <source type="image/jpeg" srcset="my-image.jpg">
      <img src="my-image.jpg" alt="">
    </picture>

    Enable Text Compression

    This technique of compressing the original textual information uses compression algorithms to find repeated sequences and replace them with shorter representations. It’s done to further minimize the total network bytes.

    Lighthouse lists all the text-based resources that are not compressed. 

    It computes the potential savings by identifying text-based resources that do not include a content-encoding header set to br, gzip or deflate and compresses each of them with gzip.

    If the potential compression savings is more than 10% of the original size, then the file fails the audit.
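    The kind of savings the audit measures can be reproduced with Python’s stdlib gzip module; the sample markup below is made up, but real HTML/CSS/JS is similarly redundant:

```python
import gzip

# Repetitive markup compresses extremely well, which is why the
# audit flags any text resource served without a content-encoding.
html = b"<div class='card'><span>item</span></div>" * 500
compressed = gzip.compress(html)
savings = 1 - len(compressed) / len(html)
# The audit fails the file when savings exceed 10% of the original size.
assert savings > 0.10
```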

    Solution

    Webpack users can use compression-webpack-plugin for text compression. 

    The best part about this plugin is that it supports Google’s Brotli compression algorithm which is superior to gzip. Alternatively, you can also use brotli-webpack-plugin. All you need to do is configure your server to return Content-Encoding as br.

    Brotli compresses faster than gzip and produces smaller files (up to 20% smaller). As of June 2020, Brotli is supported by all major browsers except Safari on iOS and desktop and Internet Explorer.

    Don’t worry. You can still use gzip as a fallback.

    Preconnect to Required Origins

    This section lists all the key fetch requests that are not yet prioritized using <link rel="preconnect">.

    Establishing connections often takes significant time, especially for secure connections, as it involves DNS lookups, redirects, and several round trips to the server that finally handles the user’s request.

    Solution

    Establish an early connection to required origins. Doing so will improve the user experience without affecting bandwidth usage. 

    To achieve this connection, use preconnect or dns-prefetch. This informs the browser that the app wants to establish a connection to the third-party origin as soon as possible.

    Use preconnect for most critical connections. For non-critical connections, use dns-prefetch. Check out the browser support for preconnect. You can use dns-prefetch as the fallback.

    Avoid Multiple Page Redirects

    This section focuses on requesting resources that have been redirected multiple times. One must avoid multiple redirects on the final landing pages.

    A browser encounters this response from a server in case of HTTP-redirect:

    HTTP/1.1 301 Moved Permanently
    Location: /path/to/new/location

    A typical example of a redirect looks like this:

    example.com → www.example.com → m.example.com – very slow mobile experience.

    This eventually makes your page load more slowly.

    Solution

    Don’t leave them hanging!

    Point all your flagged resources to their current location. It’ll help you optimize your pages’ Critical Rendering Path.

    Use Video Formats for Animated Content

    This section lists all the animated GIFs on your page, along with the potential savings. 

    Large GIFs are inefficient when delivering animated content. You can save a significant amount of bandwidth by using videos over GIFs.

    Solution

    Consider using MPEG4 or WebM videos instead of GIFs. Many tools can convert a GIF into a video, such as FFmpeg.

    Use the code below to replicate a GIF’s behavior using MPEG4 and WebM. It’ll be played silent and automatically in an endless loop, just like a GIF. The code ensures that the unsupported format has a fallback.

    <video autoplay loop muted playsinline>  
      <source src="my-funny-animation.webm" type="video/webm">
      <source src="my-funny-animation.mp4" type="video/mp4">
    </video>

    Note: Do not use video formats for a small batch of GIF animations; it’s not worth the effort. The technique pays off when your website makes heavy use of animated content.

    Final Thoughts

    I saw a great improvement in my app’s performance after trying out the techniques above.

    While they may not all fit your app, try them and see what works and what doesn’t. I have compiled a list of resources that will help you enhance performance. Hopefully, they help.

    Do share your starting and final audit scores with me.

    Happy optimized coding!

    Further Reading

    Learn more – web.dev

    Other opportunities to explore:

    1. Remove unused CSS
    2. Efficiently encode images
    3. Reduce server response times (TTFB)
    4. Preload key requests
    5. Reduce the impact of third-party code

  • Idiot-proof Coding with Node.js and Express.js

    Node.js has become the most popular framework for web development, surpassing Ruby on Rails and Django in popularity. The rise of full-stack development, along with the performance benefits of asynchronous programming, has driven Node’s growth. Express.js is a minimalistic, unopinionated, and hugely popular web framework built for Node, and it has become the de-facto choice for many projects.
    Note — This article is about building a RESTful API server with Express.js. I won’t be delving into a templating library like Handlebars to manage the views.

    A quick Google search will lead you to a ton of articles agreeing with what I just said. Your next step would be to go through a couple of videos about Express.js on YouTube, try hello world with a boilerplate template, choose a few recommended middleware for Express (Helmet, Multer, etc.), pick an ORM (Mongoose if you are using MongoDB, or Sequelize if you are using a relational DB), and start building the APIs. Wow, that was fast!

    The problems start to appear after a few weeks, when your code grows larger and more complex and you realize that there is no standard coding practice followed across the client and server code, refactoring or updating the code breaks something else, versioning the APIs becomes difficult, and callbacks have made your life hell (you are smart if you are using Promises, but have you heard of async-await?).

    Do you think your code is not so idiot-proof anymore? Don’t worry! You aren’t the only one who feels this way.

    Let me break the suspense and list down the technologies and libraries used in our idiot-proof code before you get restless.

    1. Node 8.11.3: This is the latest LTS release from Node. We are using all the ES6 features along with async-await. We have the latest version of ExpressJs (4.16.3).
    2. Typescript: It adds optional static typing to JavaScript and gives us familiar constructs like classes (ES6 also provides classes as a construct), which makes it easy to maintain a large codebase.
    3. Swagger: It provides a specification to easily design, develop, test and document RESTful interfaces. Swagger also provides many open source tools like codegen and editor that makes it easy to design the app.
    4. TSLint: It performs static code analysis on Typescript for maintainability, readability and functionality errors.
    5. Prettier: It is an opinionated code formatter which maintains a consistent style throughout the project. It only takes care of styling, like indentation (2 or 4 spaces), or whether arguments should remain on the same line or move to the next line when the line length exceeds 80 characters.
    6. Husky: It allows you to add git hooks (pre-commit, pre-push) which can trigger TSLint, Prettier or Unit tests to automatically format the code and to prevent the push if the lint or the tests fail.

    Before you move to the next section I would recommend going through the links to ensure that you have a sound understanding of these tools.

    Now I’ll talk about some of the challenges we faced in our older projects and how we addressed them in newer projects with the tools and technologies listed above.

    Formal API definition

    A problem that everyone can relate to is the lack of formal documentation in a project. Swagger addresses part of this problem with the OpenAPI specification, which defines a standard way to design REST APIs that can be discovered by both machines and humans. As a practice, we first design the APIs in Swagger before writing the code. This has three benefits:

    • It helps us focus only on the design without having to worry about the code, scaffolding, naming conventions, etc. Our API designs are consistent with the implementation because of this focused approach.
    • We can leverage tools like swagger-express-mw to internally wire the routes in the API doc to the controllers, validate request and response objects against their definitions, etc.
    • Collaboration between teams becomes very easy, simple and standardised because of the Swagger specification.

    Code Consistency

    We wanted our code to look consistent across the stack (UI and backend), and we use ESLint to enforce this consistency.
    Example –
    Node traditionally used “require” while the UI-based frameworks used “import” syntax to load modules. We decided to follow the ES6 style across the project, and these rules are defined with ESLint.

    Note: We have made slight adjustments to the TSLint rules for the backend and the frontend to make it easy for the developers. For example, we allow up to 120 characters in React, as some of our DOM-related code gets lengthy very easily.

    Code Formatting

    This is as important as maintaining code consistency in the project. It’s easy to read code which follows a consistent format: indentation, spaces, line breaks, etc. Prettier does a great job at this. We have also integrated Prettier with Typescript to highlight formatting errors along with linting errors. IDEs like VS Code also have a Prettier plugin which supports features like auto-format to make it easy.

    Strict Typing

    Typescript can be leveraged best only if the application follows strict typing. We try to enforce it as much as possible, with exceptions made in some cases (mostly when a third-party library doesn’t have a type definition). This has the following benefits:

    • Static code analysis works better when your code is strongly typed. We discover about 80–90% of the issues before compilation itself using the plugins mentioned above.
    • Refactoring and enhancements become very simple with Typescript. We first update the interface or the function definition and then follow the errors thrown by the Typescript compiler to refactor the code.
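As an illustrative sketch (the `User` interface and field names below are made up for this example, not from our codebase), adding a field to an interface under strict typing immediately surfaces every call site that needs updating:

```typescript
// Hypothetical interface used only to illustrate strict typing.
interface User {
  id: number;
  name: string;
  email: string; // newly added field; the compiler flags every literal missing it
}

function greet(user: User): string {
  return `Hello, ${user.name} <${user.email}>`;
}

// Under strict typing, omitting "email" here would be a compile error:
// const bad: User = { id: 1, name: 'Asha' }; // error TS2741

const u: User = { id: 1, name: 'Asha', email: 'asha@example.com' };
console.log(greet(u)); // Hello, Asha <asha@example.com>
```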

    Git Hooks

    Husky’s “pre-push” hook runs TSLint to ensure that we don’t push code with linting issues. If you follow TDD (the way it’s supposed to be done), you can also run unit tests before pushing the code. We decided to go with pre-push hooks because:
    • Not everyone has CI from the very first day. With a Git hook, we at least have some code quality checks from day one.
    • Running lint and unit tests on the developer’s system leaves your CI with more resources to run integration and other complex tests which are not possible in a local environment.
    • You force the developer to fix issues at the earliest stage, which results in better code quality, faster code merges, and faster releases.
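One way to wire this up is the husky v1+ `package.json` format; this is a sketch, and the exact lint and test commands shown are illustrative rather than our actual configuration:

```json
{
  "husky": {
    "hooks": {
      "pre-commit": "prettier --write \"src/**/*.ts\" && tslint -p tsconfig.json",
      "pre-push": "tslint -p tsconfig.json && npm test"
    }
  }
}
```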

    Async-await

    We were using Promises in our project for all asynchronous operations. Promises often lead to a long chain of then/catch blocks, which is not very comfortable to read and often results in bugs when the chain gets very long (it goes without saying that Promises are still much better than the callback pattern). Async-await provides a very clean syntax for writing asynchronous operations that reads like sequential code. We have seen a drastic improvement in code quality, fewer bugs, and better readability after moving to async-await.
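To make the contrast concrete, here is a small sketch; the `fetchUser` and `fetchPosts` helpers are hypothetical stand-ins for real async calls, and both load functions do the same work:

```typescript
// Hypothetical async helpers standing in for real I/O calls.
const fetchUser = (id: number) => Promise.resolve({ id, name: 'Asha' });
const fetchPosts = (userId: number) => Promise.resolve([`post-by-${userId}`]);

// Promise-chain style: each dependent step adds another .then() level.
function loadWithPromises(id: number): Promise<string[]> {
  return fetchUser(id).then((user) => fetchPosts(user.id));
}

// async-await style: the same logic reads like sequential code.
async function loadWithAwait(id: number): Promise<string[]> {
  const user = await fetchUser(id);
  return fetchPosts(user.id);
}
```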

    Hope this article gave you some insights into tools and libraries that you can use to build a scalable ExpressJS app.

  • Improving Elasticsearch Indexing in the Rails Model using Searchkick

    Search has become a prominent feature of any web application, and a relevant search feature requires a robust search engine. The search engine should be capable of full-text search, autocompletion, suggestions, spelling corrections, fuzzy search, and analytics.

    Elasticsearch, a distributed, fast, and scalable search and analytic engine, takes care of all these basic search requirements.

    The focus of this post is using a few approaches with Elasticsearch in our Rails application to reduce latency for web requests. Let’s review one of the best ways to improve Elasticsearch indexing in Rails models: moving it to background jobs.

    In a Rails application, Elasticsearch can be integrated with any of several popular gems, such as elasticsearch-rails, chewy, or searchkick.

    We can continue with any of these gems, but for this post, we will be moving forward with the Searchkick gem, which is a much more Rails-friendly option.

    By default, the Searchkick gem uses model callbacks to sync the data to the corresponding Elasticsearch index. Because the sync happens inside these callbacks, any request that creates or updates a resource takes additional time to process.

    The image below shows logs from a Rails application, captured for an update request on a user record. We added a print statement in the Rails model just before Elasticsearch tries to sync, to help identify from the logs where the indexing starts. These logs show that the last two queries were executed to index the data in the Elasticsearch index.

    Since the Elasticsearch sync happens while updating a user record, we can conclude that the user update request takes additional time to complete the Elasticsearch sync.

    Below is the request flow diagram:

    From the request flow diagram, we can see that the end user must wait for steps 3 and 4 to be completed. Step 3 fetches the child object details from the database.

    To tackle the problem, we can move the Elasticsearch indexing to the background jobs. Usually, for Rails apps in production, there are separate app servers, database servers, background job processing servers, and Elasticsearch servers (in this scenario).

    This is how the request flow looks when we move Elasticsearch indexing:

    Let’s get to coding!

    For demo purposes, we will have a Rails app with models: `User` and `Blogpost`. The stack used here:

    • Rails 5.2
    • Elasticsearch 6.6.7
    • MySQL 5.6
    • Searchkick (gem for writing Elasticsearch queries in Ruby)
    • Sidekiq (gem for background processing)

    This approach does not require any specific version of Rails, Elasticsearch, or MySQL. Moreover, it is database agnostic. You can go through the code in this GitHub repo for reference.

    Let’s take a look at the user model with Elasticsearch index.

    # == Schema Information
    #
    # Table name: users
    #
    #  id            :bigint           not null, primary key
    #  name          :string(255)
    #  email         :string(255)
    #  mobile_number :string(255)
    #  created_at    :datetime         not null
    #  updated_at    :datetime         not null
    #
    class User < ApplicationRecord
     searchkick
    
     has_many :blogposts
     def search_data
       {
         name: name,
         email: email,
         total_blogposts: blogposts.count,
         last_published_blogpost_date: last_published_blogpost_date
       }
     end
     ...
    end

    Anytime a user object is inserted, updated, or deleted, Searchkick reindexes the data in the Elasticsearch user index synchronously.

    Searchkick already provides four ways to sync Elasticsearch index:

    • Inline (default)
    • Asynchronous
    • Queuing
    • Manual

    For more detailed information, refer to this page. In this post, we are looking at the manual approach to reindex the model data.

    To manually reindex, the user model will look like:

    class User < ApplicationRecord
     searchkick callbacks: false
    
     def search_data
       ...
     end
    end

    Now, we need to define a callback that syncs the data to the Elasticsearch index. Typically, this callback would have to be written in every model that has an Elasticsearch index. Instead, we can write a common concern and include it in the required models.

    Here is what our concern will look like:

    module ElasticsearchIndexer
     extend ActiveSupport::Concern
    
     included do
       after_commit :reindex_model
       def reindex_model
         ElasticsearchWorker.perform_async(self.id, self.class.name)
       end
     end
    end

    In the above ActiveSupport concern, we call a Sidekiq worker named ElasticsearchWorker. After adding this concern, don’t forget to include it in the user model, like so:

    include ElasticsearchIndexer

    Now, let’s see the Elasticsearch Sidekiq worker:

    class ElasticsearchWorker
     include Sidekiq::Worker

     def perform(id, klass)
       # Constantize the class name, look up the record, and reindex it
       klass.constantize.find(id).reindex
     rescue StandardError => e
       # Handle the exception (e.g. log it; the record may have been deleted)
     end
    end

    That’s it, we’ve done it. Cool, huh? Now, whenever a web request creates, updates, or deletes a user, a background job will be created. The job can be seen in the Sidekiq web UI at localhost:3000/sidekiq.

    Now, there is a little problem in the Elasticsearch indexer concern. To reproduce it, go to your user edit page, click save without changing anything, and look at localhost:3000/sidekiq: a job will still be queued even though no attributes were changed.

    We can handle this case by tracking the dirty (changed) attributes.

    module ElasticsearchIndexer
     extend ActiveSupport::Concern
     included do
       after_commit :reindex_model
       def reindex_model
         return if previous_changes.keys.blank?
         ElasticsearchWorker.perform_async(id, self.class.name)
       end
     end
    end

    Furthermore, there are a few more areas of improvement. Suppose you update a field of the user model that is not part of the Elasticsearch index; the Elasticsearch worker Sidekiq job will still get created and reindex the associated model object. We can instead create the indexing job only if fields that are part of the Elasticsearch index were updated.

    module ElasticsearchIndexer
     extend ActiveSupport::Concern
     included do
       after_commit :reindex_model
       def reindex_model
         updated_fields = previous_changes.keys

         # For the ES index fields you can also maintain a constant
         # at the model level or derive them from the search_data method.
         es_index_fields = search_data.stringify_keys.keys
         return if (updated_fields & es_index_fields).blank?
         ElasticsearchWorker.perform_async(id, self.class.name)
       end
     end
    end

    Conclusion

    Moving Elasticsearch indexing to background jobs is a great way to boost the performance of a web app by reducing the response time of web requests. Implementing this approach for every model would not be ideal; I would recommend it only if the Elasticsearch index data is not needed in real time.

    Since the execution of background jobs depends on the number of jobs queued, it might take time for changes to be reflected in the Elasticsearch index if many jobs are queued up. To mitigate this, the Elasticsearch indexing jobs can be added to a high-priority queue. This approach also works best when the app server is separate from the background job processing server.

  • Build and Deploy a Real-Time React App Using AWS Amplify and GraphQL

    GraphQL is becoming a popular way to use APIs in modern web and mobile apps.

    However, learning new things is always time-consuming and without getting your hands dirty, it’s very difficult to understand the nuances of a new technology.

    So, we have put together a powerful and concise tutorial that will guide you through setting up a GraphQL backend and integrating it into your React app in the shortest time possible. This tutorial is light on opinions, so that once you get the hang of the fundamentals, you can go on and tailor your workflow.

    Key topics and takeaways:

    • Authentication
    • GraphQL API with AWS AppSync
    • Hosting
    • Working with multiple environments
    • Removing services

    What will we be building?

    We will build a basic real-time Restaurant CRUD app using authenticated GraphQL APIs. Click here to try the deployed version of the app to see what we’ll be building.

    Will this tutorial teach React or GraphQL concepts as well?

    No. The focus is to learn how to use AWS Amplify to build cloud-enabled, real-time web applications. If you are new to React or GraphQL, we recommend going through the official documentation and then coming back here.

    What do I need to take this tutorial?

    • Node >= v10.9.0
    • NPM >= v6.9.0 packaged with Node.

    Getting started – Creating the application

    To get started, we first need to create a React project using the create-react-app boilerplate:

    npx create-react-app amplify-app --typescript
    cd amplify-app

    Let’s now install the AWS Amplify and AWS Amplify React bindings and try running the application:

    npm install --save aws-amplify aws-amplify-react
    npm start

    If you have initialized the app with Typescript and see errors while using aws-amplify-react, add aws-amplify-react.d.ts to src with:

    declare module 'aws-amplify-react';

    Installing the AWS Amplify CLI and adding it to the project

    To install the CLI:

    npm install -g @aws-amplify/cli

    Now we need to configure the CLI with our credentials:

    amplify configure

    If you’d like to see a video walkthrough of this process, click here

    Here we’ll walk you through the amplify configure setup. After you sign in to the AWS console, follow these steps:

    • Specify the AWS region: ap-south-1 (Mumbai) <select the region based on your location; click here for reference>
    • Specify the username of the new IAM user: amplify-app <the name of your app>

    In the AWS Console, click Next: Permissions, Next: Tags, Next: Review, and Create User to create the new IAM user. Then, return to the command line and press Enter.

    • Enter the credentials of the newly created user:
      accessKeyId: <YOUR_ACCESS_KEY_ID>
      secretAccessKey: <YOUR_SECRET_ACCESS_KEY>
    • Profile Name: default

    To view the newly created IAM user, go to the dashboard. Also, make sure that your region matches your selection.

    To add amplify to your project:

    amplify init

    Answer the following questions:

    • Enter a name for the project: amplify-app <the name of your app>
    • Enter a name for the environment: dev <the name of your environment>
    • Choose your default editor: Visual Studio Code <your default editor>
    • Choose the type of app that you’re building: javascript
    • What JavaScript framework are you using: React
    • Source Directory Path: src
    • Distribution Directory Path: build
    • Build Command: npm run build (for macOS/Linux), npm.cmd run-script build (for Windows)
    • Start Command: npm start (for macOS/Linux), npm.cmd run-script start (for Windows)
    • Do you want to use an AWS profile: Yes
    • Please choose the profile you want to use: default

    Now, the AWS Amplify CLI has initialized a new project and you will see a new folder: amplify. This folder has files that hold your project configuration.

    <amplify-app>
    |_ amplify
       |_ .config
       |_ #current-cloud-backend
       |_ backend
       |_ team-provider-info.json

    Adding Authentication

    To add authentication:

    amplify add auth

    When prompted, choose:

    • Do you want to use default authentication and security configuration: Default configuration
    • How do you want users to be able to sign in when using your Cognito User Pool: Username
    • What attributes are required for signing up: Email

    Now, let’s run the push command to create the cloud resources in our AWS account:

    amplify push

    To quickly check your newly created Cognito User Pool, you can run

    amplify status

    To access the AWS Cognito Console at any time, go to the dashboard. Also, ensure that your region is set correctly.

    Now, our resources are created and we can start using them.

    The first thing is to connect our React application to our new AWS Amplify project. To do this, reference the auto-generated aws-exports.js file that is now in our src folder.

    To configure the app, open App.tsx and add the following code below the last import:

    import Amplify from 'aws-amplify';
    import awsConfig from './aws-exports';
    
    Amplify.configure(awsConfig);

    Now, we can start using our AWS services.
    To add the Authentication flow to the UI, export the app component by wrapping it with the authenticator HOC:

    import { withAuthenticator } from 'aws-amplify-react';
    ...
    // app component
    ...
    export default withAuthenticator(App);

    Now, let’s run the app to check if an Authentication flow has been added before our App component is rendered.

    This flow gives users the ability to sign up and sign in. To view any users that were created, go back to the Cognito dashboard. Alternatively, you can also use:

    amplify console auth

    The withAuthenticator HOC is a really easy way to get up and running with authentication, but in a real-world application, we probably want more control over how our form looks and functions. We can use the aws-amplify/Auth class to do this. This class has more than 30 methods including signUp, signIn, confirmSignUp, confirmSignIn, and forgotPassword. These functions return a promise, so they need to be handled asynchronously.
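A minimal sketch of such a custom handler follows; the `Auth` object below is a local stub so the async control flow can be shown without cloud credentials, and in a real app you would `import { Auth } from 'aws-amplify'` instead:

```typescript
// Local stub standing in for aws-amplify's Auth so the flow is runnable here.
const Auth = {
  signIn: async (username: string, _password: string) => ({ username }),
};

// Auth methods return promises, so wrap them in async/await with error handling.
async function handleSignIn(username: string, password: string): Promise<string> {
  try {
    const user = await Auth.signIn(username, password);
    return `Signed in as ${user.username}`;
  } catch (err) {
    return 'Sign-in failed';
  }
}
```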

    Adding and Integrating the GraphQL API

    To add GraphQL API, use the following command:

    amplify add api

    Answer the following questions:

    • Please select from one of the below mentioned services: GraphQL
    • Provide API name: RestaurantAPI
    • Choose an authorization type for the API: API key
    • Do you have an annotated GraphQL schema: No
    • Do you want a guided schema creation: Yes
    • What best describes your project: Single object with fields (e.g., “Todo” with ID, name, description)
    • Do you want to edit the schema now: Yes

    When prompted, update the schema to the following:

    type Restaurant @model {
      id: ID!
      name: String!
      description: String!
      city: String!
    }

    Next, let’s run the push command to create the cloud resources in our AWS account:

    amplify push

    • Are you sure you want to continue: Yes
    • Do you want to generate code for your newly created GraphQL API: Yes
    • Choose the code generation language target: typescript
    • Enter the file name pattern of graphql queries, mutations and subscriptions: src/graphql/**/*.ts
    • Do you want to generate/update all possible GraphQL operations – queries, mutations and subscriptions: Yes
    • Enter maximum statement depth [increase from default if your schema is deeply nested]: 2
    • Enter the file name for the generated code: src/API.ts

    Notice your GraphQL endpoint and API key. This step has created a new AWS AppSync API and generated the GraphQL queries, mutations, and subscriptions locally. To check, see src/graphql or visit the AppSync dashboard. Alternatively, you can use:

    amplify console api

    • Please select from one of the below mentioned services: GraphQL

    Now, in the AppSync console, on the left side click on Queries. Execute the following mutation to create a restaurant in the API:

    mutation createRestaurant {
      createRestaurant(input: {
        name: "Nobu"
        description: "Great Sushi"
        city: "New York"
      }) {
        id name description city
      }
    }

    Now, let’s query for the restaurant:

    query listRestaurants {
      listRestaurants {
        items {
          id
          name
          description
          city
        }
      }
    }

    We can even search / filter data when querying:

    query searchRestaurants {
      listRestaurants(filter: {
        city: {
          contains: "New York"
        }
      }) {
        items {
          id
          name
          description
          city
        }
      }
    }

    Now that the GraphQL API is created, we can begin interacting with it from our client application. Here is how we’ll add queries, mutations, and subscriptions:

    import Amplify, { API, graphqlOperation } from 'aws-amplify';
    import { withAuthenticator } from 'aws-amplify-react';
    import React, { useEffect, useReducer } from 'react';
    import { Button, Col, Container, Form, Row, Table } from 'react-bootstrap';
    
    import './App.css';
    import awsConfig from './aws-exports';
    import { createRestaurant } from './graphql/mutations';
    import { listRestaurants } from './graphql/queries';
    import { onCreateRestaurant } from './graphql/subscriptions';
    
    Amplify.configure(awsConfig);
    
    type Restaurant = {
      name: string;
      description: string;
      city: string;
    };
    
    type AppState = {
      restaurants: Restaurant[];
      formData: Restaurant;
    };
    
    type Action =
      | {
          type: 'QUERY';
          payload: Restaurant[];
        }
      | {
          type: 'SUBSCRIPTION';
          payload: Restaurant;
        }
      | {
          type: 'SET_FORM_DATA';
          payload: { [field: string]: string };
        };
    
    type SubscriptionEvent<D> = {
      value: {
        data: D;
      };
    };
    
    const initialState: AppState = {
      restaurants: [],
      formData: {
        name: '',
        city: '',
        description: '',
      },
    };
    const reducer = (state: AppState, action: Action) => {
      switch (action.type) {
        case 'QUERY':
          return { ...state, restaurants: action.payload };
        case 'SUBSCRIPTION':
          return { ...state, restaurants: [...state.restaurants, action.payload] };
        case 'SET_FORM_DATA':
          return { ...state, formData: { ...state.formData, ...action.payload } };
        default:
          return state;
      }
    };
    
    const App: React.FC = () => {
      const createNewRestaurant = async (e: React.SyntheticEvent) => {
        e.stopPropagation();
        const { name, description, city } = state.formData;
        const restaurant = {
          name,
          description,
          city,
        };
        await API.graphql(graphqlOperation(createRestaurant, { input: restaurant }));
      };
    
      const [state, dispatch] = useReducer(reducer, initialState);
    
      useEffect(() => {
        getRestaurantList();
    
        const subscription = API.graphql(graphqlOperation(onCreateRestaurant)).subscribe({
          next: (eventData: SubscriptionEvent<{ onCreateRestaurant: Restaurant }>) => {
            const payload = eventData.value.data.onCreateRestaurant;
            dispatch({ type: 'SUBSCRIPTION', payload });
          },
        });
    
        return () => subscription.unsubscribe();
      }, []);
    
      const getRestaurantList = async () => {
        const restaurants = await API.graphql(graphqlOperation(listRestaurants));
        dispatch({
          type: 'QUERY',
          payload: restaurants.data.listRestaurants.items,
        });
      };
    
      const handleChange = (e: React.ChangeEvent<HTMLInputElement>) =>
        dispatch({
          type: 'SET_FORM_DATA',
          payload: { [e.target.name]: e.target.value },
        });
    
      return (
        <div className="App">
          <Container>
            <Row className="mt-3">
              <Col md={4}>
                <Form>
                  <Form.Group controlId="formDataName">
                    <Form.Control onChange={handleChange} type="text" name="name" placeholder="Name" />
                  </Form.Group>
                  <Form.Group controlId="formDataDescription">
                    <Form.Control onChange={handleChange} type="text" name="description" placeholder="Description" />
                  </Form.Group>
                  <Form.Group controlId="formDataCity">
                    <Form.Control onChange={handleChange} type="text" name="city" placeholder="City" />
                  </Form.Group>
                  <Button onClick={createNewRestaurant} className="float-left">
                    Add New Restaurant
                  </Button>
                </Form>
              </Col>
            </Row>
    
            {state.restaurants.length ? (
              <Row className="my-3">
                <Col>
                  <Table striped bordered hover>
                    <thead>
                      <tr>
                        <th>#</th>
                        <th>Name</th>
                        <th>Description</th>
                        <th>City</th>
                      </tr>
                    </thead>
                    <tbody>
                      {state.restaurants.map((restaurant, index) => (
                        <tr key={`restaurant-${index}`}>
                          <td>{index + 1}</td>
                          <td>{restaurant.name}</td>
                          <td>{restaurant.description}</td>
                          <td>{restaurant.city}</td>
                        </tr>
                      ))}
                    </tbody>
                  </Table>
                </Col>
              </Row>
            ) : null}
          </Container>
        </div>
      );
    };
    
    export default withAuthenticator(App);

    Finally, we have our app ready. You can now sign up, sign in, add new restaurants, and see real-time updates of newly added restaurants.

    Hosting

    The hosting category enables you to deploy and host your app on AWS.

    amplify add hosting

    • Select the environment setup: DEV (S3 only with HTTP)
    • hosting bucket name: <YOUR_BUCKET_NAME>
    • index doc for the website: index.html
    • error doc for the website: index.html

    Now, everything is set up & we can publish it:

    amplify publish

    Working with multiple environments

    You can create multiple environments for your application to create & test out new features without affecting the main environment which you are working on.

    When you use an existing environment to create a new environment, you get a copy of the entire backend application stack (CloudFormation) for the current environment. When you make changes in the new environment, you are then able to test these new changes in the new environment & merge only the changes that have been made since the new environment was created.

    Let’s take a look at how to create a new environment. In this new environment, we’ll add another field for the restaurant owner to the GraphQL Schema.

    First, we’ll initialize a new environment using amplify init:

    amplify init

    • Do you want to use an existing environment: N
    • Enter a name for the environment: apiupdate
    • Do you want to use an AWS profile: Y

    Once the new environment is initialized, we should be able to see some information about our environment setup by running:

    amplify env list
    
    | Environments |
    | ------------ |
    | dev |
    | *apiupdate |

    Now, add the owner field to the GraphQL Schema in

    amplify/backend/api/RestaurantAPI/schema.graphql:

    type Restaurant @model {
      ...
      owner: String
    }

    Run the push command to create a new stack:

    amplify push

    After testing it out, it can be merged into our original dev environment:

    amplify env checkout dev
    amplify status
    amplify push

    • Do you want to update code for your updated GraphQL API: Y
    • Do you want to generate GraphQL statements: Y

    Removing Services

    If at any time, you would like to delete a service from your project & your account, you can do this by running the amplify remove command:

    amplify remove auth
    amplify push

    If you are unsure of what services you have enabled at any time, amplify status will give you the list of resources that are currently enabled in your app.

    Sample code

    The sample code for this blog post with an end to end working app is available here.

    Summary

    Once you’ve worked through all the sections above, your app should now have all the capabilities of a modern app, and building GraphQL + React apps should now be easier and faster with Amplify.

  • Building Dynamic Forms in React Using Formik

    Every day we see a huge number of web applications that allow customization. This involves drag & drop or metadata-driven UI interfaces that support multiple layouts on a single backend. A feedback-collection system is one of the simplest examples of such a product: on the admin side, one can manage the layout, and on the consumer side, users are shown that layout to capture data. This post focuses on building a microframework to support such use cases with the help of React and Formik.

    Building big forms in React can be extremely time-consuming and tedious when structural changes are requested. Handling their validations also takes too much time in the development life cycle. If we use Redux-based solutions like Redux Form to simplify this, we see a lot of performance bottlenecks. So here comes Formik!

    Why Formik?

    “Why” is one of the most important questions while solving any problem. There are quite a few reasons to lean towards Formik for the implementation of such systems, such as:

    • Simplicity
    • Advanced validation support with Yup
    • Good community support with a lot of people helping on Github

    That being said, it’s one of the easiest frameworks for quick form building. Formik’s clean API lets us use it without worrying about a lot of state management.

    Yup is probably the best library out there for validation, and Formik provides out-of-the-box support for Yup validations, which makes it even more programmer-friendly!

    API Responses:

    We need to follow certain API structures to let our React code understand which component to render where.

    Let’s assume we will be getting responses from the backend API in the following fashion.

    [{
       "type": "text",
       "field": "name",
       "name": "User's name",
       "style": {
          "width": "50%"
       }
    }]

    We can have any number of fields, but each one will have two mandatory properties, type and field (with field being unique). We will use those properties to build the UI as well as the response.
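For example, one way to turn such a response into Formik’s initialValues is to key an object by each item’s field property; the helper name and empty-string defaults below are my own sketch, not part of the article’s code:

```typescript
// Shape of one item in the (mocked) backend response.
type FieldDef = {
  type: string;
  field: string; // unique key used for both the UI and the submitted values
  name: string;
  style?: Record<string, string>;
};

// Build an initialValues object with one empty entry per field definition.
function buildInitialValues(defs: FieldDef[]): Record<string, string> {
  return defs.reduce((acc, def) => ({ ...acc, [def.field]: '' }), {});
}

const response: FieldDef[] = [
  { type: 'text', field: 'name', name: "User's name", style: { width: '50%' } },
];

console.log(buildInitialValues(response)); // { name: '' }
```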

    So let’s start with building the simplest form with React and Formik.

    import React from 'react';
    import { useFormik } from 'formik';
    
    const SignupForm = () => {
      const formik = useFormik({
        initialValues: {
          email: '',
        },
        onSubmit: values => {
          alert(JSON.stringify(values, null, 2));
        },
      });
      return (
        <form onSubmit={formik.handleSubmit}>
          <label htmlFor="email">Email Address</label>
          <input
            id="email"
            name="email"
            type="email"
            onChange={formik.handleChange}
            value={formik.values.email}
          />
          <button type="submit">Submit</button>
        </form>
      );
    };
    
    export default SignupForm;

    You can view the fiddle of the above code here to see the live demo.

    We will use the latest functional components to build this form. You can find more information on the useFormik hook in the useFormik hook documentation.

    It's nothing more than a thin wrapper around Formik's functionality.

    Adding dynamic nature

    So let’s first create and import the mocked API response to build the UI dynamically.

    import React from 'react';
    import { useFormik } from 'formik';
    import response from "./apiresponse"
    
    const SignupForm = () => {
      const formik = useFormik({
        initialValues: {
          email: '',
        },
        onSubmit: values => {
          alert(JSON.stringify(values, null, 2));
        },
      });
      return (
        <form onSubmit={formik.handleSubmit}>
          <label htmlFor="email">Email Address</label>
          <input
            id="email"
            name="email"
            type="email"
            onChange={formik.handleChange}
            value={formik.values.email}
          />
          <button type="submit">Submit</button>
        </form>
      );
    };
    
    export default SignupForm;

    You can view the fiddle here.

    We simply imported the file and made it available for processing. Now we need to write the logic to build components dynamically, so let's visualize the possible DOM hierarchy of components:

    <Container>
    	<TextField />
    	<NumberField />
    	<Container>
    		<TextField />
    		<BooleanField />
    	</Container>
    </Container>

    Containers can nest recursively within containers, so let's address this by adding a children attribute to the API response.

    export default [
      {
        "type": "text",
        "field": "name",
        "label": "User's name"
      },
      {
        "type": "number",
        "field": "number",
        "label": "User's age",
      },
      {
        "type": "array",
        "field": "none",
        "children": [
          {
            "type": "text",
            "field": "user.hobbies",
            "label": "User's hobbies"
          }
        ]
      }
    ]

    You can see the fiddle with response processing and a live demo here.

    To process the recursive nature, we will create a separate component.

    import React from 'react';
    
    const RecursiveContainer = ({ config, formik }) => {
      const builder = (individualConfig) => {
        switch (individualConfig.type) {
          case 'text':
            return (
              <div key={individualConfig.field}>
                <label htmlFor={individualConfig.field}>{individualConfig.label}</label>
                <input
                  type="text"
                  id={individualConfig.field}
                  name={individualConfig.field}
                  onChange={formik.handleChange}
                  style={{ ...individualConfig.style }}
                />
              </div>
            );
          case 'number':
            return (
              <div key={individualConfig.field}>
                <label htmlFor={individualConfig.field}>{individualConfig.label}</label>
                <input
                  type="number"
                  id={individualConfig.field}
                  name={individualConfig.field}
                  onChange={formik.handleChange}
                  style={{ ...individualConfig.style }}
                />
              </div>
            );
          case 'array':
            return (
              <RecursiveContainer
                key={individualConfig.field}
                config={individualConfig.children || []}
                formik={formik}
              />
            );
          default:
            return <div key={individualConfig.field}>Unsupported field</div>;
        }
      };
    
      return <>{config.map((c) => builder(c))}</>;
    };
    
    export default RecursiveContainer;

    You can view the complete fiddle of the recursive component here.

    What we do here is pretty simple. We pass config, a JSON object retrieved from the API response, iterate through it, and build each component based on its type. When the type is array, we render the same RecursiveContainer component, which is basic recursion.

    We can harden it by passing the current depth and restricting recursion to an nth possible depth to avoid stack-overflow errors at runtime. Capping the depth ultimately makes it less prone to runtime errors. There is no standard limit; it varies from use case to use case. A system based on a compliance questionnaire might go to a max depth of 5 to 7, while a basic signup form often needs only 2.
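    As a sketch of that depth guard (MAX_DEPTH and flattenConfig are illustrative names, not part of the article's component), the recursion can simply refuse to descend past a configured depth:

```javascript
// Illustrative depth cap for a recursive config walker (not the article's
// component itself): stop descending once MAX_DEPTH is exceeded.
const MAX_DEPTH = 5;

function flattenConfig(config, depth = 0) {
  if (depth > MAX_DEPTH) return []; // refuse to recurse further
  return config.flatMap((c) =>
    c.type === 'array'
      ? flattenConfig(c.children || [], depth + 1)
      : [{ ...c, depth }]
  );
}

const fields = flattenConfig([
  { type: 'text', field: 'name' },
  { type: 'array', children: [{ type: 'number', field: 'age' }] },
]);
console.log(fields.map((f) => f.field)); // [ 'name', 'age' ]
```

    The same guard can live directly in the component by threading a depth prop through each recursive render.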

    So we generated the forms but how do we validate them? How do we enforce required, min, max checks on the form?

    For this, Yup is very helpful. Yup is an object schema validation library that validates an object and gives us the results back. Its chainable syntax makes it much easier to build incremental validation functions.

    Yup provides us with a vast variety of existing validations. We can combine them, specify error or warning messages to be thrown and much more.
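    To see why chaining matters, here is the idea in miniature. This toy validator is not Yup's actual API, just an illustration: each chained call returns a new validator carrying one more check plus its message.

```javascript
// Miniature model of Yup-style chaining (illustrative, not Yup's API):
// every .test() call returns a new validator with one additional check.
const makeValidator = (checks = []) => ({
  test: (fn, message) => makeValidator([...checks, { fn, message }]),
  // Run all accumulated checks and collect the messages of the failing ones.
  validate: (value) => checks.filter((c) => !c.fn(value)).map((c) => c.message),
});

const age = makeValidator()
  .test((v) => typeof v === 'number', 'Age must be a number')
  .test((v) => v >= 18, 'Must be at least 18');

console.log(age.validate(15)); // [ 'Must be at least 18' ]
console.log(age.validate(21)); // []
```

    Real Yup works the same way conceptually: each chained method returns a new schema, so validations compose incrementally.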

    You can find more information on Yup at Yup Official Documentation

    To build a validation function, we need to pass a Yup schema to Formik.

    Here is a simple example: 

    import React from 'react';
    import { useFormik } from 'formik';
    import response from "./apiresponse"
    import RecursiveContainer from './RecursiveContainer';
    import * as yup from 'yup';
    
    const SignupForm = () => {
      const signupSchema = yup.object().shape({
          name: yup.string().required()
      });
    
      const formik = useFormik({
        initialValues: {
        },
        onSubmit: values => {
          alert(JSON.stringify(values, null, 2));
        },
        validationSchema: signupSchema
      });
      console.log(formik, response)
      return (
        <form onSubmit={formik.handleSubmit}>
          <RecursiveContainer config={response} formik={formik} />
          <button type="submit">Submit</button>
        </form>
      );
    };
    
    export default SignupForm;

    You can see the schema usage example here.

    In this example, we simply created a schema and passed it to the useFormik hook. Notice that until the user fills in the name field, the form cannot be submitted.

    Here is a simple hack to make the button disabled until all necessary fields are filled.

    import React from 'react';
    import { useFormik } from 'formik';
    import response from "./apiresponse"
    import RecursiveContainer from './RecursiveContainer';
    import * as yup from 'yup';
    
    const SignupForm = () => {
      const signupSchema = yup.object().shape({
          name: yup.string().required()
      });
    
      const formik = useFormik({
        initialValues: {
        },
        onSubmit: values => {
          alert(JSON.stringify(values, null, 2));
        },
        validationSchema: signupSchema
      });
      console.log(formik, response)
      return (
        <form onSubmit={formik.handleSubmit}>
          <RecursiveContainer config={response} formik={formik} />
          <button type="submit" disabled={!formik.isValid}>Submit</button>
        </form>
      );
    };
    
    export default SignupForm;

    You can see how to use submit validation in the live fiddle here.

    Formik exposes a vast variety of state while the form is being rendered, and we can use it however it suits us. You can find the full API of Formik in the official Formik documentation.

    So existing validations are fine but we often get into cases where we would like to build our own validations. How do we write them and integrate them with Yup validations?

    For this, there are two different approaches with Formik + Yup: either extend Yup to support the additional validation, or pass a validation function to Formik. The validation-function approach is much simpler; you just write a function that returns an error object to Formik. As simple as it sounds, it does get messy at times.

    So we will see an example of adding custom validation to Yup. Yup provides us an addMethod interface to add our own user-defined validations in the application.

    Let's say we want to create aliases for existing validations to handle casing, because that's the most common mistake we see: Url instead of url, or trim coming from the backend as Trim. These method names are case-sensitive, so yup.Url fails, while yup.url gives us a function. These are just examples; you can also alias methods for readability, e.g. alias required to read as NotEmpty.

    The usage is very simple and straightforward as follows: 

    yup.addMethod(yup.string, "URL", function (...args) {
      return this.url(...args);
    });

    This will create an alias for url as URL.

    Here is an example of custom method validation which takes Y and N as boolean values.

    import { isEmpty } from 'lodash';
    
    const validator = function (message) {
      return this.test('is-string-boolean', message, function (value) {
        // Treat empty values as valid; use required() to enforce presence.
        if (isEmpty(value)) {
          return true;
        }
        return ['Y', 'N'].indexOf(value) !== -1;
      });
    };

    After registering it with yup.addMethod under both casings, we will be able to execute yup.string().stringBoolean() and yup.string().StringBoolean().
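    Under the hood, addMethod just attaches the function to the schema prototype so it becomes chainable. Here is that mechanism in miniature, with a toy StringSchema standing in for Yup's:

```javascript
// How addMethod works conceptually (toy classes, not Yup's internals).
class StringSchema {
  constructor(tests = []) { this.tests = tests; }
  test(fn) { return new StringSchema([...this.tests, fn]); }
  isValidSync(value) { return this.tests.every((t) => t(value)); }
}

// addMethod: attach a custom, chainable method to the schema prototype.
function addMethod(schemaClass, name, method) {
  schemaClass.prototype[name] = method;
}

// Register the Y/N validator under both casings, as in the article.
const stringBoolean = function () {
  return this.test((v) => v == null || v === '' || ['Y', 'N'].includes(v));
};
addMethod(StringSchema, 'stringBoolean', stringBoolean);
addMethod(StringSchema, 'StringBoolean', stringBoolean);

console.log(new StringSchema().stringBoolean().isValidSync('Y')); // true
console.log(new StringSchema().StringBoolean().isValidSync('X')); // false
```

    Real Yup's addMethod does essentially this, which is why the alias trick from the previous section works.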

    It’s a pretty handy syntax that lets users create their own validations. You can create many more validations in your project to be used with Yup and reuse them wherever required.

    Writing the schema by hand is also cumbersome, and it is wasted effort if the form is dynamic. When the form is dynamic, the validations also need to be dynamic. Yup's chainable syntax lets us achieve this very easily.

    We will assume that the backend sends us the following additional properties in the metadata.

    [{
       "type": "text",
       "field": "name",
       "label": "User's name",
       "style": {
          "width": "50%"
       },
       "validationType": "string",
       "validations": [{
          "type": "required",
          "params": ["Name is required"]
       }]
    }]

    validationType holds one of Yup's data types (string, number, date, etc.), and validations holds the validations that need to be applied to that field.

    So let’s have a look at the following snippet which utilizes the above structure and generates dynamic validation.

    import * as yup from 'yup';
    import { isEmpty } from 'lodash';
    
    /** Adding just additional methods here */
    
    yup.addMethod(yup.string, "URL", function(...args) {
        return this.url(...args);
    });
    
    
    const validator = function (message) {
        return this.test('is-string-boolean', message, function (value) {
          if (isEmpty(value)) {
            return true;
          }
    
          if (['Y', 'N'].indexOf(value) !== -1) {
            return true;
          } else {
            return false;
          }
        });
      };
    
    yup.addMethod(yup.string, "stringBoolean", validator);
    yup.addMethod(yup.string, "StringBoolean", validator);
    
    
    
    
    export function createYupSchema(schema, config) {
      const { field, validationType, validations = [] } = config;
      if (!yup[validationType]) {
        return schema;
      }
      let validator = yup[validationType]();
      validations.forEach((validation) => {
        const { params, type } = validation;
        if (!validator[type]) {
          return;
        }
        validator = validator[type](...params);
      });
      if (field.indexOf('.') !== -1) {
        // Nested fields are not covered in this example, but are easy to handle, though.
      } else {
        schema[field] = validator;
      }
    
      return schema;
    }
    
    export const getYupSchemaFromMetaData = (
      metadata,
      additionalValidations,
      forceRemove
    ) => {
      const yepSchema = metadata.reduce(createYupSchema, {});
      const mergedSchema = {
        ...yepSchema,
        ...additionalValidations,
      };
    
      forceRemove.forEach((field) => {
        delete mergedSchema[field];
      });
    
      const validateSchema = yup.object().shape(mergedSchema);
    
      return validateSchema;
    };

    You can see the complete live fiddle of dynamic validations with Formik here.

    The snippets above show how easily we can add a new method to Yup. Alongside them, two functions, createYupSchema and getYupSchemaFromMetaData, drive the whole logic for building the dynamic schema: we pass the validations in the response and build the validation from them.

    createYupSchema builds a Yup validation from the validations array and validationType. getYupSchemaFromMetaData iterates over the response array, builds a Yup validation for each field, and at the end wraps everything in an object schema. In this way, we can generate dynamic validations. One can even go further and create nested validations with recursion.
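    The core trick in createYupSchema is that the method names arrive as strings, so the chain is built dynamically via validator[type](...params). A stub validator (illustrative, not Yup) makes the mechanism visible by just recording which methods get called:

```javascript
// A recording stub that accepts any method name and returns itself,
// so we can watch how the dynamic chain is assembled (not real Yup).
const makeStub = () => {
  const applied = [];
  const v = new Proxy({ applied }, {
    get: (target, prop) =>
      prop === 'applied'
        ? target.applied
        : (...args) => { target.applied.push([prop, args]); return v; },
  });
  return v;
};

const validations = [
  { type: 'required', params: ['Name is required'] },
  { type: 'min', params: [2] },
];

// Exactly the loop from createYupSchema: chain by string-keyed lookup.
let validator = makeStub();
validations.forEach(({ type, params }) => {
  validator = validator[type](...params);
});

console.log(validator.applied);
// [ [ 'required', [ 'Name is required' ] ], [ 'min', [ 2 ] ] ]
```

    With real Yup, the same loop produces yup.string().required('Name is required').min(2), which is why the metadata-driven schema works.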

    Conclusion

    In the traditional approach of writing large boilerplate for forms, even adding one more field is time-consuming. This approach eliminates the need for hardcoding fields and lets them be backend-driven.

    Formik provides well-optimized state management, which reduces the performance issues we generally see when Redux state is updated frequently.

    As we saw above, it's very easy to build dynamic forms with Formik. We can save the templates and even create template libraries, which are very common in question-and-answer systems. Utilized correctly, the templates can be stored in a NoSQL database like MongoDB, letting us quickly generate a vast number of forms along with their validations.

    To learn more and build optimized solutions, you can also refer to the FastField and Field component APIs in the official Formik documentation. Thanks for reading!

  • Building a Collaborative Editor Using Quill and Yjs

    “Hope this email finds you well” is how 2020-2021 has been in a nutshell. Since we’ve all been working remotely since last year, actively collaborating with teammates became one notch harder, from activities like brainstorming a topic on a whiteboard to building documentation.

    Tools powered by collaborative systems have become a necessity. To explore this, following the principle of "build fast, fail fast," I started building a collaborative editor from existing open-source tools, one that can eventually be extended for needs across different projects.

    Conflicts, as they say, are inevitable when multiple users work on the same document and constantly modify it, especially the same block of content. Ultimately, the end-user experience is defined by how such conflicts are resolved.

    There are various conflict resolution mechanisms, but two of the most commonly discussed ones are Operational Transformation (OT) and Conflict-Free Replicated Data Type (CRDT). So, let’s briefly talk about those first.

    Operational Transformation

    The order of operations matters in OT: each user has their own local copy of the document, and mutations are atomic, such as insert V at index 4 or delete X at index 2. If the order of these operations changes, the end result will be different. That's why all the operations are synchronized through a central server, which can transform the indices of incoming operations before forwarding them to the clients. For example, in the image below, User2 issues a delete(0) operation, but since the OT server knows that User1 has made an insert, User2's operation needs to be changed to delete(1) before being applied on User1's side.
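    That delete(0) → delete(1) transformation can be sketched in a few lines of plain-text OT. This is a toy transform for the single case described above (a concurrent insert shifting a delete), not a full OT implementation:

```javascript
// Toy OT transform: if a concurrent insert lands at or before a delete's
// position, the delete's index must shift right by one.
function transformDelete(del, ins) {
  return ins.index <= del.index ? { ...del, index: del.index + 1 } : del;
}

const user1Insert = { type: 'insert', index: 0, char: 'V' };
const user2Delete = { type: 'delete', index: 0 };

console.log(transformDelete(user2Delete, user1Insert));
// { type: 'delete', index: 1 }
```

    A real OT server must handle every pair of operation types (insert/insert, delete/delete, and so on), which is where the complexity explodes.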

    OT with a central server is typically easier to implement. In its basic form, plain-text OT has only three defined operations: insert, delete, and apply.

    Source: Conclave

    “Fully distributed OT and adding rich text operations are very hard, and that’s why there’s a million papers.”

    CRDT

    Instead of performing operations directly on characters like in OT, CRDT uses a complex data structure to which it can then add/update/remove properties to signify transformation, enabling scope for commutativity and idempotency. CRDTs guarantee eventual consistency.

    There are different algorithms, but in general, a CRDT has two requirements: globally unique characters and globally ordered characters. Basically, each object carries a global reference instead of a positional index, and ordering is derived from the neighboring objects. Fractional indices can be used to assign an index to an object.

    Source: Conclave

    As all the objects have their own unique reference, the delete operation becomes idempotent. And fractional indices are one way to assign unique references during insertion and updates.
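    Fractional indexing can be illustrated in a few lines. In this toy version (real CRDTs use more robust identifier schemes to avoid precision limits), a new element's index is simply the midpoint of its neighbors, so every object gets a stable, globally ordered position:

```javascript
// Toy fractional indexing: insert between two neighbors by taking their
// midpoint, so existing positions never need to be renumbered.
function between(left, right) {
  return (left + right) / 2;
}

const a = between(0, 1);  // first character, between the 0/1 sentinels
const b = between(a, 1);  // inserted after `a`
const c = between(a, b);  // later inserted between `a` and `b`

console.log([b, c, a].sort((x, y) => x - y)); // [ 0.5, 0.625, 0.75 ]
```

    Because each position is unique and totally ordered, inserts from different replicas can be merged without shifting anyone else's indices.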

    There are two types of CRDT. One is state-based, where the whole state (or a delta) is shared between instances and merged continuously. The other is operation-based, where only individual operations are sent between replicas. If you want to dive deep into CRDT, here's a nice resource.
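    The state-based flavor can be made concrete with the classic grow-only counter, a sketch of the general idea rather than Yjs's internals: merging takes the per-replica maximum, which makes merges commutative, associative, and idempotent.

```javascript
// Minimal state-based CRDT (G-Counter): each replica counts its own
// increments; merge takes the per-replica max, so order doesn't matter.
const inc = (state, replica) => ({ ...state, [replica]: (state[replica] || 0) + 1 });

const merge = (a, b) => {
  const out = { ...a };
  for (const k of Object.keys(b)) out[k] = Math.max(out[k] || 0, b[k]);
  return out;
};

const value = (state) => Object.values(state).reduce((sum, n) => sum + n, 0);

const r1 = inc(inc({}, 'A'), 'A'); // replica A incremented twice
const r2 = inc({}, 'B');           // replica B incremented once

console.log(value(merge(r1, r2)));            // 3
console.log(value(merge(merge(r1, r2), r2))); // 3 -- re-merging changes nothing
```

    Text CRDTs like Yjs are far more elaborate, but they rest on the same property: merges converge regardless of delivery order.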

    For our purposes, we choose CRDT since it can also support peer-to-peer networks. If you directly want to jump to the code, you can visit the repo here.

    Tools used for this project:

    As our goal was a quick implementation, we targeted off-the-shelf tools for the editor and the backend that manages collaborative operations.

    • Quill.js is an API-driven WYSIWYG rich text editor built for compatibility and extensibility. We chose Quill as our editor because of how easily it plugs into an application and the availability of extensions.
    • Yjs is a framework that provides shared editing capabilities by exposing different shared data types (Array, Map, Text, etc.) that are synced automatically. It's also network-agnostic, so changes are synced whenever a client comes online. We used it because it's a CRDT implementation, and it conveniently has readily available bindings for Quill.js.

    Prerequisites:

    To keep it simple, we’ll set up a client and server both in the same code base. Initialize a project with npm init and install the below dependencies:

    npm i quill quill-cursors webpack webpack-cli webpack-dev-server y-quill y-websocket yjs

    • Quill: Quill is the WYSIWYG rich text editor we will use as our editor.
    • quill-cursors is an extension that helps us to display cursors of other connected clients to the same editor room.
    • Webpack, webpack-cli, and webpack-dev-server are developer utilities, webpack being the bundler that creates a deployable bundle for your application.
    • The y-quill module provides bindings between Yjs and Quill using the shared type Y.Text. For more information, you can check out the module's source on GitHub.
    • y-websocket provides a WebsocketProvider to communicate with the Yjs server in a client-server manner, exchanging awareness information and data.
    • Yjs is the CRDT framework that orchestrates conflict resolution between multiple clients.

    Code to use

    const path = require('path');
    
    module.exports = {
      mode: 'development',
      devtool: 'source-map',
      entry: {
        index: './index.js'
      },
      output: {
        globalObject: 'self',
        path: path.resolve(__dirname, './dist/'),
        filename: '[name].bundle.js',
        publicPath: '/quill/dist'
      },
      devServer: {
        contentBase: path.join(__dirname),
        compress: true,
        publicPath: '/dist/'
      }
    }

    This is a basic webpack config where we have provided which file is the starting point of our frontend project, i.e., the index.js file. Webpack then uses that file to build the internal dependency graph of your project. The output property is to define where and how the generated bundles should be saved. And the devServer config defines necessary parameters for the local dev server, which runs when you execute “npm start”.

    We’ll first create an index.html file to define the basic skeleton:

    <!DOCTYPE html>
    <html>
      <head>
        <title>Yjs Quill Example</title>
        <script src="./dist/index.bundle.js" async defer></script>
        <link rel="stylesheet" href="//cdn.quilljs.com/1.3.6/quill.snow.css">
      </head>
      <body>
        <button type="button" id="connect-btn">Disconnect</button>
        <div id="editor" style="height: 500px;"></div>
      </body>
    </html>

    The index.html has a pretty basic structure. In <head>, we’ve provided the path of the bundled js file that will be created by webpack, and the css theme for the quill editor. And for the <body> part, we’ve just created a button to connect/disconnect from the backend and a placeholder div where the quill editor will be plugged.

    • Here, we’ve just made the imports, registered quill-cursors extension, and added an event listener for window load:
    import Quill from "quill";
    import * as Y from 'yjs';
    import { QuillBinding } from 'y-quill';
    import { WebsocketProvider } from 'y-websocket';
    import QuillCursors from "quill-cursors";
    
    // Register QuillCursors module to add the ability to show multiple cursors on the editor.
    Quill.register('modules/cursors', QuillCursors);
    
    window.addEventListener('load', () => {
      // We'll add more blocks as we continue
    });

    • Let’s initialize the Yjs document, socket provider, and load the document:
    window.addEventListener('load', () => {
      const ydoc = new Y.Doc();
      const provider = new WebsocketProvider('ws://localhost:3312', 'velotio-demo', ydoc);
      const type = ydoc.getText('Velotio-Blog');
    });

    • We’ll now initialize and plug the Quill editor with its bindings:
    window.addEventListener('load', () => {
      // ### ABOVE CODE HERE ###
    
      const editorContainer = document.getElementById('editor');
      const toolbarOptions = [
        ['bold', 'italic', 'underline', 'strike'],  // toggled buttons
        ['blockquote', 'code-block'],
        [{ 'header': 1 }, { 'header': 2 }],               // custom button values
        [{ 'list': 'ordered' }, { 'list': 'bullet' }],
        [{ 'script': 'sub' }, { 'script': 'super' }],      // superscript/subscript
        [{ 'indent': '-1' }, { 'indent': '+1' }],          // outdent/indent
        [{ 'direction': 'rtl' }],                         // text direction
        // array for drop-downs, empty array = defaults
        [{ 'size': [] }],
        [{ 'header': [1, 2, 3, 4, 5, 6, false] }],
        [{ 'color': [] }, { 'background': [] }],          // dropdown with defaults from theme
        [{ 'font': [] }],
        [{ 'align': [] }],
        ['image', 'video'],
        ['clean']                                         // remove formatting button
      ];
    
      const editor = new Quill(editorContainer, {
        modules: {
          cursors: true,
          toolbar: toolbarOptions,
          history: {
            userOnly: true  // only user changes will be undone or redone.
          }
        },
        placeholder: "collab-edit-test",
        theme: "snow"
      });
    
      const binding = new QuillBinding(type, editor, provider.awareness);
    });

    • Finally, let’s implement the Connect/Disconnect button and complete the callback:
    window.addEventListener('load', () => {
      // ### ABOVE CODE HERE ###
    
      const connectBtn = document.getElementById('connect-btn');
      connectBtn.addEventListener('click', () => {
        if (provider.shouldConnect) {
          provider.disconnect();
          connectBtn.textContent = 'Connect';
        } else {
          provider.connect();
          connectBtn.textContent = 'Disconnect';
        }
      });
    
      window.example = { provider, ydoc, type, binding, Y }
    });

    Steps to run:

    • Server:

    For simplicity, we’ll directly use the y-websocket-server out of the box.
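    The exact command isn't shown here; assuming the server script bundled with the y-websocket package and the port our WebsocketProvider connects to (3312), launching it typically looks like the following (verify against the y-websocket README for your installed version):

```shell
# Run the y-websocket demo server on the port the client connects to.
# PORT must match the WebsocketProvider URL (ws://localhost:3312).
PORT=3312 node ./node_modules/y-websocket/bin/server.js
```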

    NOTE: You can either let it run and open a new terminal for the next commands, or let it run in the background using `&` at the end of the command.

    • Client:

    Start the client by npm start. On successful compilation, it should open on your default browser, or you can just go to http://localhost:8080.

    Show me the repo

    You can find the repository here.

    Conclusion:

    Conflict resolution approaches are not new, but with the trend toward remote culture, it is important to have good collaborative systems in place to enhance productivity.

    Although this example was just on rich text editing capabilities, we can extend existing resources to build more features and structures like tabular data, graphs, charts, etc. Yjs shared types can be used to define your own data format based on how your custom editor represents data internally.

  • Implementing gRPC In Python: A Step-by-step Guide

    In the last few years, we have seen a great shift in technology, with projects moving towards a microservice architecture instead of the old monolithic architecture. This approach has done wonders for us.
    
    As we say, "smaller things are much easier to handle," so here we have microservices that can be handled conveniently. But these microservices need to interact with each other. I handled that using HTTP API calls, which seemed great and worked for me.

    But is this the perfect way to do things?

    The answer is a resounding, “no,” because we compromised both speed and efficiency here. 

    Then the gRPC framework came into the picture, and it has been a game-changer.

    What is gRPC?

    Quoting the official documentation:

    "gRPC or Google Remote Procedure Call is a modern open-source high-performance RPC framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking and authentication."

     

    Credit: gRPC

    A remote procedure call (RPC) is a request that a client sends to a remote system to get a task (or subroutine) executed as if it were local.

    Google’s RPC is designed to facilitate smooth and efficient communication between the services. It can be utilized in different ways, such as:

    • Efficiently connecting polyglot services in microservices style architecture
    • Connecting mobile devices, browser clients to backend services
    • Generating efficient client libraries

    Why gRPC? 

    HTTP/2-based transport – gRPC uses the HTTP/2 protocol instead of HTTP/1.1, which provides multiple benefits. One major benefit is that multiple bidirectional streams can be created and sent over a TCP connection in parallel, making it swift.

    Auth, tracing, load balancing and health checking – gRPC provides all these features, making it a secure and reliable option to choose.

    Language-independent communication – Two services may be written in different languages, say Python and Golang. gRPC ensures smooth communication between them.

    Use of Protocol Buffers – gRPC uses protocol buffers for defining the type of data (also called Interface Definition Language (IDL)) to be sent between the gRPC client and the gRPC server. It also uses it as the message interchange format. 

    Let's dig a little deeper into what Protocol Buffers are.

    Protocol Buffers

    Protocol Buffers, like XML, are an efficient and automated mechanism for serializing structured data. They provide a way to define the structure of the data to be transmitted. Google says that protocol buffers are better than XML, as they are:

    • simpler
    • three to ten times smaller
    • 20 to 100 times faster
    • less ambiguous
    • able to generate data access classes that make the data easier to use programmatically

    Protobufs are defined in .proto files, and it is easy to define them.

    Types of gRPC implementation

    1. Unary RPCs:- This is a simple gRPC which works like a normal function call. It sends a single request declared in the .proto file to the server and gets back a single response from the server.

    rpc HelloServer(RequestMessage) returns (ResponseMessage);

    2. Server streaming RPCs:- The client sends a message declared in the .proto file to the server and gets back a stream of messages to read. The client reads from that stream until there are no more messages.

    rpc HelloServer(RequestMessage) returns (stream ResponseMessage);

    3. Client streaming RPCs:- The client writes a message sequence using a write stream and sends the same to the server. After all the messages are sent to the server, the client waits for the server to read all the messages and return a response.

    rpc HelloServer(stream RequestMessage) returns (ResponseMessage);

    4. Bidirectional streaming RPCs:- Both gRPC client and the gRPC server use a read-write stream to send a message sequence. Both operate independently, so gRPC clients and gRPC servers can write and read in any order they like, i.e. the server can read a message then write a message alternatively, wait to receive all messages then write its responses, or perform reads and writes in any other combination.

    rpc HelloServer(stream RequestMessage) returns (stream ResponseMessage);

    Note: gRPC guarantees the ordering of messages within an individual RPC call. In the case of bidirectional streaming, the order of messages is preserved within each stream.

    Implementing gRPC in Python

    Currently, gRPC provides support for many languages, like Golang, C++, and Java. I will be focusing on its implementation using Python.

    mkdir grpc_example
    cd grpc_example
    virtualenv -p python3 env
    source env/bin/activate
    pip install grpcio grpcio-tools

    This will install all the required dependencies to implement gRPC.

    Unary gRPC 

    For implementing gRPC services, we need to define three files:

    • Proto file – The proto file comprises the declaration of the service and is used to generate the stubs (<package_name>_pb2.py and <package_name>_pb2_grpc.py), which are used by both the gRPC client and the gRPC server.
    • gRPC client – The client makes a gRPC call to the server to get the response as per the proto file.
    • gRPC Server – The server is responsible for serving requests to the client.
    syntax = "proto3";
    
    package unary;
    
    service Unary{
      // A simple RPC.
      //
      // Obtains the MessageResponse at a given position.
     rpc GetServerResponse(Message) returns (MessageResponse) {}
    
    }
    
    message Message{
     string message = 1;
    }
    
    message MessageResponse{
     string message = 1;
     bool received = 2;
    }

    In the above code, we have declared a service named Unary. A service is a collection of RPC methods; for now, I have implemented a single method, GetServerResponse(). It takes an input of type Message and returns a MessageResponse. Below the service declaration, I have declared the Message and MessageResponse message types.

    Once we are done with the creation of the .proto file, we need to generate the stubs. For that, we will execute the below command:-

    python -m grpc_tools.protoc --proto_path=. ./unary.proto --python_out=. --grpc_python_out=.

    Two files are generated named unary_pb2.py and unary_pb2_grpc.py. Using these two stub files, we will implement the gRPC server and the client.

    Implementing the Server

    import grpc
    from concurrent import futures
    import unary.unary_pb2_grpc as pb2_grpc
    import unary.unary_pb2 as pb2
    
    
    class UnaryService(pb2_grpc.UnaryServicer):
    
        def __init__(self, *args, **kwargs):
            pass
    
        def GetServerResponse(self, request, context):
    
            # get the string from the incoming request
            message = request.message
            result = f'Hello I am up and running received "{message}" message from you'
            result = {'message': result, 'received': True}
    
            return pb2.MessageResponse(**result)
    
    
    def serve():
        server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
        pb2_grpc.add_UnaryServicer_to_server(UnaryService(), server)
        server.add_insecure_port('[::]:50051')
        server.start()
        server.wait_for_termination()
    
    
    if __name__ == '__main__':
        serve()

    In the gRPC server file, there is a GetServerResponse() method which takes a `Message` from the client and returns a `MessageResponse`, as defined in the proto file.
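Stripped of the gRPC machinery, the logic inside GetServerResponse() is ordinary Python: format a reply string, then unpack a dict into the generated message class. A minimal sketch (build_reply is a hypothetical helper name, not part of the generated code):

```python
def build_reply(message: str) -> dict:
    # Mirrors the formatting done inside GetServerResponse()
    text = f'Hello I am up and running received "{message}" message from you'
    return {'message': text, 'received': True}

reply = build_reply("Hello Server you there?")
print(reply['message'])
# In the real servicer, this dict is unpacked into the generated
# class: pb2.MessageResponse(**reply)
```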

    The serve() function is called from the main block and keeps the server listening at all times. We will run unary_server.py to start the server:

    python3 unary_server.py

    Implementing the Client

    import grpc
    import unary.unary_pb2_grpc as pb2_grpc
    import unary.unary_pb2 as pb2
    
    
    class UnaryClient(object):
        """
        Client for gRPC functionality
        """
    
        def __init__(self):
            self.host = 'localhost'
            self.server_port = 50051
    
            # instantiate a channel
            self.channel = grpc.insecure_channel(
                '{}:{}'.format(self.host, self.server_port))
    
            # bind the client and the server
            self.stub = pb2_grpc.UnaryStub(self.channel)
    
        def get_url(self, message):
            """
            Client function to call the rpc for GetServerResponse
            """
            message = pb2.Message(message=message)
            print(f'{message}')
            return self.stub.GetServerResponse(message)
    
    
    if __name__ == '__main__':
        client = UnaryClient()
        result = client.get_url(message="Hello Server you there?")
        print(f'{result}')

    In the __init__ method, we initialize the stub using `self.stub = pb2_grpc.UnaryStub(self.channel)`. We also have a get_url function, which calls the server using the stub initialized above.

    This completes the implementation of Unary gRPC service.

    Let’s check the output:-

    Run -> python3 unary_client.py 

    Output:-

    message: "Hello Server you there?"

    message: "Hello I am up and running received \"Hello Server you there?\" message from you"

    received: true

    Bidirectional Implementation

    The proto file (bidirectional.proto) looks like this:-

    syntax = "proto3";
    
    package bidirectional;
    
    service Bidirectional {
      // A Bidirectional streaming RPC.
      //
      // Accepts a stream of Messages and returns a stream of Messages.
      rpc GetServerResponse(stream Message) returns (stream Message) {}
    }
    
    message Message {
      string message = 1;
    }

    In the above code, we have declared a service named Bidirectional. For now, I have implemented a single RPC method, GetServerResponse(). It takes a stream of Message as input and returns a stream of Message. Below the service declaration, I have declared the Message type.

    Once we are done with the creation of the .proto file, we need to generate the stubs. To generate them, we execute the below command:-

    python -m grpc_tools.protoc --proto_path=. ./bidirectional.proto --python_out=. --grpc_python_out=.

    Two files are generated named bidirectional_pb2.py and bidirectional_pb2_grpc.py. Using these two stub files, we will implement the gRPC server and client.

    Implementing the Server

    from concurrent import futures
    
    import grpc
    import bidirectional.bidirectional_pb2_grpc as bidirectional_pb2_grpc
    
    
    class BidirectionalService(bidirectional_pb2_grpc.BidirectionalServicer):
    
        def GetServerResponse(self, request_iterator, context):
            for message in request_iterator:
                yield message
    
    
    def serve():
        server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
        bidirectional_pb2_grpc.add_BidirectionalServicer_to_server(BidirectionalService(), server)
        server.add_insecure_port('[::]:50051')
        server.start()
        server.wait_for_termination()
    
    
    if __name__ == '__main__':
        serve()

    In the gRPC server file, there is a GetServerResponse() method which takes a stream of `Message` from the client and returns a stream of `Message`, with the two streams independent of each other. The serve() function is called from the main block and keeps the server listening at all times.
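Because the servicer method is a Python generator, each value it yields becomes one response message on the stream. The echo behaviour can be sketched without any gRPC machinery at all (echo here is a hypothetical stand-in for GetServerResponse):

```python
def echo(request_iterator):
    # Same body as GetServerResponse(): yield every request back
    for message in request_iterator:
        yield message

requests = iter(["First message", "Second message", "Third message"])
responses = list(echo(requests))
print(responses)
```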

    We will run the bidirectional_server to start the server:

    python3 bidirectional_server.py

    Implementing the Client

    from __future__ import print_function
    
    import grpc
    import bidirectional.bidirectional_pb2_grpc as bidirectional_pb2_grpc
    import bidirectional.bidirectional_pb2 as bidirectional_pb2
    
    
    def make_message(message):
        return bidirectional_pb2.Message(
            message=message
        )
    
    
    def generate_messages():
        messages = [
            make_message("First message"),
            make_message("Second message"),
            make_message("Third message"),
            make_message("Fourth message"),
            make_message("Fifth message"),
        ]
        for msg in messages:
            print("Hello Server Sending you the %s" % msg.message)
            yield msg
    
    
    def send_message(stub):
        responses = stub.GetServerResponse(generate_messages())
        for response in responses:
            print("Hello from the server received your %s" % response.message)
    
    
    def run():
        with grpc.insecure_channel('localhost:50051') as channel:
            stub = bidirectional_pb2_grpc.BidirectionalStub(channel)
            send_message(stub)
    
    
    if __name__ == '__main__':
        run()

    In the run() function, we initialize the stub using `stub = bidirectional_pb2_grpc.BidirectionalStub(channel)`.

    We then have a send_message function, to which the stub is passed; it streams multiple messages to the server over a single call and receives the responses as they arrive.
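A useful property to notice is that the request generator is consumed lazily. The idea can be sketched with plain generators (events, requests, and echo are hypothetical names for illustration; with a real gRPC channel, the library consumes the request generator on a separate thread, so the observed ordering of sends and receives can differ):

```python
events = []

def requests():
    # Stands in for generate_messages(): record each send as it happens
    for i in (1, 2, 3):
        events.append(f"sent {i}")
        yield i

def echo(stream):
    # Stands in for the bidirectional server: echo each message back
    for item in stream:
        yield item

for r in echo(requests()):
    events.append(f"received {r}")

print(events)
# In this in-process sketch, each message is sent only when the
# consumer asks for the next response, so sends and receives alternate
```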

    This completes the implementation of Bidirectional gRPC service.

    Let’s check the output:-

    Run -> python3 bidirectional_client.py 

    Output:-

    Hello Server Sending you the First message

    Hello Server Sending you the Second message

    Hello Server Sending you the Third message

    Hello Server Sending you the Fourth message

    Hello Server Sending you the Fifth message

    Hello from the server received your First message

    Hello from the server received your Second message

    Hello from the server received your Third message

    Hello from the server received your Fourth message

    Hello from the server received your Fifth message

    For code reference, please visit here.

    Conclusion

    gRPC is an emerging RPC framework that makes communication between microservices smooth and efficient. I believe gRPC is currently confined mostly to inter-microservice communication, but it has many other uses that we will see in the coming years. To know more about modern data communication solutions, check out this blog.