Building High-performance Apps: A Checklist To Get It Right
An app is only as good as the problem it solves. But your app’s performance can be extremely critical to its success as well. A slow-loading web app can make users quit and try out an alternative in no time. Testing an app’s performance should thus be an integral part of your development process and not an afterthought.

In this article, we will talk about how you can proactively monitor and boost your app’s performance as well as fix common issues that are slowing down the performance of your app.

I’ll use the following tools for this blog.
- Lighthouse – A performance audit tool, developed by Google
- Webpack – A JavaScript bundler
You can find similar tools online, both free and paid. So let’s give our Vue a new Angular perspective to make our apps React faster.

Performance Metrics

First, we need to understand which metrics play an important role in determining an app’s performance. Lighthouse helps us calculate a score based on a weighted average of the below metrics:
1. First Contentful Paint (FCP) – 15%
2. Speed Index (SI) – 15%
3. Largest Contentful Paint (LCP) – 25%
4. Time to Interactive (TTI) – 15%
5. Total Blocking Time (TBT) – 25%
6. Cumulative Layout Shift (CLS) – 5%
By taking the above stats into account, Lighthouse gauges your app’s performance as such:
- 0 to 49 (slow): Red
- 50 to 89 (moderate): Orange
- 90 to 100 (fast): Green
I would recommend going through Lighthouse performance scoring to learn more. Once you understand Lighthouse, you can audit websites of your choosing.
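
To make the weighted average concrete, here is a minimal TypeScript sketch. It assumes each metric has already been converted to a 0 to 100 score (Lighthouse does that conversion from the raw timings itself), and the example scores are made up for illustration.

```typescript
// Metric weights as listed above (they sum to 100).
const WEIGHTS = {
  FCP: 15,
  SI: 15,
  LCP: 25,
  TTI: 15,
  TBT: 25,
  CLS: 5,
} as const;

type Metric = keyof typeof WEIGHTS;
type MetricScores = Record<Metric, number>; // each per-metric score is 0-100

// Weighted average of the per-metric scores.
function performanceScore(scores: MetricScores): number {
  let total = 0;
  for (const metric of Object.keys(WEIGHTS) as Metric[]) {
    total += scores[metric] * WEIGHTS[metric];
  }
  return Math.round(total / 100);
}

// Map an overall score to the color bands described above.
function scoreBand(score: number): "red" | "orange" | "green" {
  if (score >= 90) return "green";
  if (score >= 50) return "orange";
  return "red";
}

// Hypothetical per-metric scores, for illustration only.
const exampleScores: MetricScores = { FCP: 80, SI: 70, LCP: 40, TTI: 60, TBT: 30, CLS: 100 };
const overall = performanceScore(exampleScores);
console.log(overall, scoreBand(overall));
```

Because LCP and TBT carry the largest weights, a poor score on either one drags the overall number down the most.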

I gathered audit scores for a few websites, including Walmart, Zomato, Reddit, and British Airways. Almost all of them had a performance score below 30. A few even scored in the single digits.

To attract more customers, businesses fill their apps with many attractive features. But they ignore the most important thing: performance, which degrades with the addition of each such feature.

As I said earlier, it’s all about the user experience. You can read more about why performance matters and how it impacts the overall experience.

Now, with that being said, I want to challenge you to conduct a performance test on your favorite app. Let me know if it receives a good score. If not, then don’t feel bad.

Follow along with me.

Let’s get your app fixed!

Exploring Opportunities

If you’re still reading this blog, I expect that your app received a low score, or maybe, you’re just curious.

Whatever the reason, let’s get started.

Below your scores are the possible opportunities suggested by Lighthouse. Fixing these affects the performance metrics above and eventually boosts your app’s performance. So let’s check them out one-by-one.

Here are all the possible opportunities listed by Lighthouse:
1. Eliminate render-blocking resources
2. Properly size images
3. Defer offscreen images
4. Minify CSS & JavaScript
5. Serve images in the next-gen formats
6. Enable text compression
7. Preconnect to required origins
8. Avoid multiple page redirects
9. Use video formats for animated content

A few other opportunities won’t be covered in this blog, but they are just an extension of the above points. Feel free to read them under the further reading section.

Eliminate Render-blocking Resources

This section lists all the render-blocking resources. The main goal is to reduce their impact by:
- removing unnecessary resources,
- deferring non-critical resources, and
- in-lining critical resources.
To do that, we need to understand what a render-blocking resource is.

Render-blocking resource and how to identify

As the name suggests, it’s a resource that prevents a browser from rendering processed content. Lighthouse identifies the following as render-blocking resources:
- A <script> tag in the <head> that doesn’t have a defer or async attribute
- A <link rel="stylesheet"> tag that doesn’t have a media attribute matching the user’s device, or a disabled attribute hinting the browser not to download it when unnecessary
- A <link rel="import"> tag that doesn’t have an async attribute
To reduce the impact, you need to identify what’s critical and what’s not. You can read how to identify critical resources using the Chrome dev tool.

Classify Resources

Classify resources as critical and non-critical based on the following color code:
- Green (critical): Needed for the first paint.
- Red (non-critical): Not needed for the first paint but will be needed later.

Solution

Now, to eliminate render-blocking resources:

Extract the critical part into an inline resource and add the correct attributes to the non-critical resources. These attributes will indicate to the browser what to download asynchronously. This can be done manually or by using a JS bundler.

Webpack users can use the libraries below to do it in a few easy steps:
- For extracting critical CSS, you can use html-critical-webpack-plugin or critters-webpack-plugin. It generates an inline <style> tag in the <head> containing the critical CSS extracted from the main CSS chunk, and preloads the main file
- For extracting CSS depending on media queries, use media-query-splitting-plugin or media-query-plugin
- The first paint doesn’t need to depend on the JavaScript files. Use lazy loading and code splitting so that resources are downloaded only when they are actually needed. Webpack’s magic comments make this easier
- And finally, for the main chunk, vendor chunk, or any other external scripts (included in index.html), you can defer them using script-ext-html-webpack-plugin
There are many more libraries for inlining CSS and deferring external scripts. Pick whichever fits your use case.
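
Here is a minimal TypeScript sketch of the lazy-loading idea. The lazy helper and the fake editor module are hypothetical names used for illustration. In a real webpack project the loader would be a dynamic import() call, optionally with a webpackChunkName magic comment, as shown in the code comment.

```typescript
// Wrap a loader so the underlying module is requested at most once,
// and only when it is first needed.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = loader();
    }
    return cached;
  };
}

// In a webpack project, the loader would typically look like this:
//   const loadEditor = lazy(() => import(/* webpackChunkName: "editor" */ "./editor"));
// Here a fake module stands in so the sketch runs on its own.
let loaderCalls = 0;
const loadEditor = lazy(async () => {
  loaderCalls += 1;
  return { render: () => "editor ready" };
});

// Nothing is loaded until the first call, e.g. inside a click handler.
loadEditor().then((editor) => console.log(editor.render()));
```

The point of the wrapper is that the heavy chunk stays out of the first paint and is fetched only when the user actually needs it.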

Use Properly Sized Images

This section lists all the images used in a page that aren’t properly sized, along with the stats on potential savings for each image.

How Does Lighthouse Calculate Oversized Images?

Lighthouse calculates potential savings by comparing the rendered size of each image on the page with its actual size. The rendered image varies based on the device pixel ratio. If the size difference is at least 25 KB, the image will fail the audit.
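
The comparison can be sketched roughly as follows. This is a simplified estimate and not Lighthouse’s actual code: it assumes the wasted bytes are proportional to the pixels that are never displayed, and the ImageInfo shape and the 1 KB = 1024 bytes conversion are assumptions of mine.

```typescript
interface ImageInfo {
  naturalWidth: number;   // actual image width in pixels
  naturalHeight: number;  // actual image height in pixels
  renderedWidth: number;  // displayed width in CSS pixels
  renderedHeight: number; // displayed height in CSS pixels
  transferBytes: number;  // size of the image file
}

// Threshold quoted in this article, assuming 1 KB = 1024 bytes.
const OVERSIZE_THRESHOLD_BYTES = 25 * 1024;

// Estimate how many bytes are spent on pixels that never reach the screen.
function estimateWastedBytes(img: ImageInfo, devicePixelRatio: number): number {
  const usedPixels =
    img.renderedWidth * devicePixelRatio * (img.renderedHeight * devicePixelRatio);
  const actualPixels = img.naturalWidth * img.naturalHeight;
  if (actualPixels === 0) return 0;
  const wastedRatio = Math.max(0, 1 - usedPixels / actualPixels);
  return Math.round(img.transferBytes * wastedRatio);
}

function failsOversizeAudit(img: ImageInfo, devicePixelRatio: number): boolean {
  return estimateWastedBytes(img, devicePixelRatio) >= OVERSIZE_THRESHOLD_BYTES;
}

// Hypothetical hero image: 2000x1000 on disk, shown at 500x250 CSS pixels.
const hero: ImageInfo = {
  naturalWidth: 2000,
  naturalHeight: 1000,
  renderedWidth: 500,
  renderedHeight: 250,
  transferBytes: 400_000,
};
console.log(estimateWastedBytes(hero, 2), failsOversizeAudit(hero, 2));
```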

Solution

DO NOT serve images that are larger than their rendered versions! The wasted size just hampers the load time.

Instead:
- Use responsive images. With this technique, create multiple versions of the images to be used in the application and serve them depending on the media queries, viewport dimensions, etc
- Use image CDNs to optimize images. These are like a web service API for transforming images
- Use vector images, like SVG. These are built on simple primitives and can scale without losing data or change in the file size
You can resize images online or on your system using tools. Learn how to serve responsive images.
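
As a small illustration of the responsive images technique, this TypeScript sketch builds the srcset attribute value for a set of pre-generated image widths. The file naming scheme (name-400w.jpg and so on) is a hypothetical convention, so adapt it to however your build or image CDN names its variants.

```typescript
// Build a srcset value such as "/img/hero-400w.jpg 400w, /img/hero-800w.jpg 800w".
// Assumes one file per width already exists, named "<basePath>-<width>w.<ext>".
function buildSrcset(basePath: string, ext: string, widths: number[]): string {
  return [...widths]
    .sort((a, b) => a - b)
    .map((w) => `${basePath}-${w}w.${ext} ${w}w`)
    .join(", ");
}

const heroSrcset = buildSrcset("/img/hero", "jpg", [1200, 400, 800]);
console.log(heroSrcset);
```

The browser then picks the smallest candidate that satisfies the layout width and the device pixel ratio, so small screens never download the 1200px file.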

Learn more about replacing complex icons with SVG. For browsers that don’t support SVG format, here’s A Complete Guide to SVG fallbacks.

Defer Offscreen Images

An offscreen image is an image located outside of the visible browser viewport.

The audit fails if the page has offscreen images. Lighthouse lists all offscreen or hidden images in your page, along with the potential savings.

Solution

Load offscreen images only when the user scrolls them into (or close to) the viewport. To achieve this, lazy-load these images after loading all critical resources.

There are many libraries available online that will load images depending on the visible viewport. Feel free to use them as per the use case.
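
The core decision those libraries make can be sketched in a few lines of TypeScript. This is only the geometry check, under the assumption that positions are measured relative to the top of the viewport (as getBoundingClientRect() reports them); the 200px margin and the type names are hypothetical.

```typescript
// Vertical position of an image relative to the top of the viewport.
interface Box {
  top: number;
  bottom: number;
}

interface TrackedImage {
  src: string;
  box: Box;
  loaded: boolean;
}

// An image counts as "near" when it overlaps the viewport extended by a margin,
// so it starts loading slightly before it scrolls into view.
function isNearViewport(box: Box, viewportHeight: number, marginPx = 200): boolean {
  return box.bottom >= -marginPx && box.top <= viewportHeight + marginPx;
}

// Pick the images that should start loading now.
function pickImagesToLoad(
  images: TrackedImage[],
  viewportHeight: number,
  marginPx = 200
): TrackedImage[] {
  return images.filter((img) => !img.loaded && isNearViewport(img.box, viewportHeight, marginPx));
}

const tracked: TrackedImage[] = [
  { src: "/img/a.jpg", box: { top: 100, bottom: 400 }, loaded: false },
  { src: "/img/b.jpg", box: { top: 1000, bottom: 1200 }, loaded: false },
  { src: "/img/c.jpg", box: { top: 1500, bottom: 1800 }, loaded: false },
];
console.log(pickImagesToLoad(tracked, 800).map((img) => img.src));
```

In the browser you would run this check on scroll, or let an IntersectionObserver or the native loading="lazy" attribute do the equivalent work for you.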

Minify CSS and JavaScript

Lighthouse identifies all the CSS and JS files that are not minified. It will list all of them along with potential savings.

Solution

Do as the heading says!

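Minifiers can do it for you. Webpack users can use css-minimizer-webpack-plugin and terser-webpack-plugin to minify CSS and JS, respectively. (mini-css-extract-plugin, often used alongside them, extracts CSS into separate files but does not minify it.)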

Serve Images in Next-gen Formats

Following are the next-gen image formats:
- WebP
- JPEG 2000
- JPEG XR
The image formats we use regularly (i.e., JPEG and PNG) have inferior compression and quality characteristics compared to next-gen formats. Encoding images in these formats can load your website faster and consume less cellular data.

Lighthouse converts each image in an older format to WebP and reports those with potential savings of more than 8 KB.

Solution

Convert all, or at least the images Lighthouse recommends, into the above formats. Use your converted images with the fallback technique below to support all browsers.
```
<picture>
  <source type="image/jp2" srcset="my-image.jp2">
  <source type="image/jxr" srcset="my-image.jxr">
  <source type="image/webp" srcset="my-image.webp">
  <source type="image/jpeg" srcset="my-image.jpg">
  <img src="my-image.jpg" alt="">
</picture>
```
Enable Text Compression

This technique of compressing the original textual information uses compression algorithms to find repeated sequences and replace them with shorter representations. It’s done to further minimize the total network bytes.

Lighthouse lists all the text-based resources that are not compressed.

It computes the potential savings by identifying text-based resources that do not include a content-encoding header set to br, gzip or deflate and compresses each of them with gzip.

If the potential compression savings is more than 10% of the original size, then the file fails the audit.
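
You can reproduce this kind of check yourself with Node’s built-in zlib module. The sketch below compresses a text resource with gzip and Brotli and applies the 10% rule described above; the report shape and the sample stylesheet are made up for illustration.

```typescript
import { gzipSync, brotliCompressSync } from "zlib";

interface CompressionReport {
  originalBytes: number;
  gzipBytes: number;
  brotliBytes: number;
  gzipSavingsRatio: number; // share of the original size saved by gzip
  wouldFailAudit: boolean;  // true when gzip would save more than 10%
}

function compressionReport(text: string): CompressionReport {
  const original = Buffer.from(text, "utf8");
  const gzipBytes = gzipSync(original).length;
  const brotliBytes = brotliCompressSync(original).length;
  const gzipSavingsRatio =
    original.length === 0 ? 0 : 1 - gzipBytes / original.length;
  return {
    originalBytes: original.length,
    gzipBytes,
    brotliBytes,
    gzipSavingsRatio,
    wouldFailAudit: gzipSavingsRatio > 0.1,
  };
}

// Hypothetical uncompressed stylesheet with lots of repetition.
const sampleCss = "body { margin: 0; padding: 0; }\n".repeat(200);
const report = compressionReport(sampleCss);
console.log(report);
```

Repetitive text such as CSS, HTML, and JavaScript usually compresses very well, which is why serving it uncompressed tends to fail this audit.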

Solution

Webpack users can use compression-webpack-plugin for text compression.

The best part about this plugin is that it supports Google’s Brotli compression algorithm, which generally compresses better than gzip. Alternatively, you can also use brotli-webpack-plugin. You also need to configure your server to serve the compressed files with Content-Encoding set to br.

Brotli typically produces smaller files than gzip (up to 20% smaller). At its highest quality settings it is slower to compress than gzip, so it works best for assets that are compressed ahead of time during the build. Brotli is supported by all major modern browsers, but not by Internet Explorer.

Don’t worry. You can still use gzip as a fallback.
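
On the server side, the fallback boils down to reading the request’s Accept-Encoding header. Here is a minimal TypeScript sketch of that decision. The pickEncoding name is hypothetical, and for brevity it ignores q-values, which a production server should respect.

```typescript
type Encoding = "br" | "gzip" | "identity";

// Prefer Brotli, fall back to gzip, otherwise send the file uncompressed.
// Simplification: q-values (e.g. "br;q=0") are ignored in this sketch.
function pickEncoding(acceptEncoding: string | undefined): Encoding {
  const offered = (acceptEncoding ?? "")
    .split(",")
    .map((part) => part.trim().split(";")[0].toLowerCase());
  if (offered.includes("br")) return "br";
  if (offered.includes("gzip")) return "gzip";
  return "identity";
}

console.log(pickEncoding("gzip, deflate, br"));
console.log(pickEncoding("gzip, deflate"));
```

The server would then serve the matching precompressed file (for example main.js.br or main.js.gz) and set the Content-Encoding header accordingly.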

Preconnect to Required Origins

This section lists all the key fetch requests that are not yet prioritized using <link rel="preconnect">.

Establishing connections often takes significant time, especially when it comes to secure connections. It can involve DNS lookups, redirects, and several round trips to the final server handling the user’s request.

Solution

Establish an early connection to required origins. Doing so will improve the user experience without affecting bandwidth usage.

To achieve this connection, use preconnect or dns-prefetch. This informs the browser that the app wants to establish a connection to the third-party origin as soon as possible.

Use preconnect for most critical connections. For non-critical connections, use dns-prefetch. Check out the browser support for preconnect. You can use dns-prefetch as the fallback.
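
The rule of thumb above can be written down as a tiny TypeScript helper that produces the link tags for an origin. The helper name and the example origins are hypothetical. Emitting dns-prefetch alongside preconnect is one common way to provide the fallback.

```typescript
// Build the <link> tags for one third-party origin.
// Critical origins get preconnect plus dns-prefetch as a fallback;
// non-critical origins get dns-prefetch only.
function resourceHints(origin: string, critical: boolean): string[] {
  const hints: string[] = [];
  if (critical) {
    hints.push(`<link rel="preconnect" href="${origin}">`);
  }
  hints.push(`<link rel="dns-prefetch" href="${origin}">`);
  return hints;
}

// Hypothetical origins for illustration.
const headTags: string[] = [
  ...resourceHints("https://fonts.example.com", true),
  ...resourceHints("https://analytics.example.com", false),
];
console.log(headTags.join("\n"));
```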

Avoid Multiple Page Redirects

This section lists the requested resources that go through multiple redirects. Avoid multiple redirects, especially on the final landing pages.

A browser encounters this response from a server in case of HTTP-redirect:
```
HTTP/1.1 301 Moved Permanently
Location: /path/to/new/location
```
A typical example of a redirect looks like this:
example.com → www.example.com → m.example.com – very slow mobile experience.

This eventually makes your page load more slowly.
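
To see how quickly hops add up, here is a small TypeScript sketch that walks a redirect chain. It simulates the server with a plain lookup table instead of making real HTTP requests, and the table contents mirror the example above.

```typescript
// Maps a URL to the Location it redirects to (a stand-in for real 301 responses).
type RedirectTable = Record<string, string>;

// Follow redirects from a starting URL and return every URL visited.
// maxHops guards against redirect loops.
function followRedirects(start: string, table: RedirectTable, maxHops = 10): string[] {
  const chain: string[] = [start];
  let current = start;
  while (table[current] !== undefined && chain.length <= maxHops) {
    current = table[current];
    chain.push(current);
  }
  return chain;
}

const redirects: RedirectTable = {
  "example.com": "www.example.com",
  "www.example.com": "m.example.com",
};

const chain = followRedirects("example.com", redirects);
console.log(`${chain.length - 1} redirects: ${chain.join(" -> ")}`);
```

Every hop in the chain is an extra round trip before the browser can even start downloading the page.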

Solution

Don’t leave them hanging!

Point all your flagged resources to their current location. It’ll help you optimize your pages’ Critical Rendering Path.

Use Video Formats for Animated Content

This section lists all the animated GIFs on your page, along with the potential savings.

Large GIFs are inefficient when delivering animated content. You can save a significant amount of bandwidth by using videos over GIFs.

Solution

Consider using MPEG4 or WebM videos instead of GIFs. Many tools can convert a GIF into a video, such as FFmpeg.

Use the code below to replicate a GIF’s behavior using MPEG4 and WebM. The video plays silently and automatically in an endless loop, just like a GIF. Listing both sources ensures that a browser that doesn’t support one format can fall back to the other.
```
<video autoplay loop muted playsinline>  
  <source src="my-funny-animation.webm" type="video/webm">
  <source src="my-funny-animation.mp4" type="video/mp4">
</video>
```
Note: Do not use video formats for a small batch of GIF animations. It’s not worth doing it. It comes in handy when your website makes heavy use of animated content.

Final Thoughts

I saw a great improvement in my app’s performance after trying out the techniques above.

While they may not all fit your app, try them and see what works and what doesn’t. I have compiled a list of some resources that will help you enhance performance. Hopefully, they help.

Do share your starting and final audit scores with me.

Happy optimized coding!

Further Reading

Learn more – web.dev

Other opportunities to explore:
1. Remove unused CSS
2. Efficiently encode images
3. Reduce server response times (TTFB)
4. Preload key requests
5. Reduce the impact of third-party code
December 12, 2022
Building Scalable and Efficient React Applications Using GraphQL and Relay
Building a React application is not only about creating a user interface. It also has tricky parts like data fetching, re-render performance, and scalability. Many libraries and frameworks try to solve these problems, like Redux, Sagas, etc. But these tools come with their own set of difficulties.

Redux gives you a single data source, but all the data fetching and rendering logic is handled by developers. Immer gives you immutable data structures, but one needs to handle the re-render performance of applications.

GraphQL helps developers design and expose APIs on the backend, but no tool on the client side could utilize the full advantage of the single endpoint and data schema provided by GraphQL.

In this article, we will learn about Relay as a GraphQL client. What are the advantages of using Relay in your application, and what conventions are required to integrate it? We’ll also cover how following those conventions will give you a better developer experience and a performant app. We will also see how applications built with Relay are modular, scalable, efficient, and, by default, resilient to change.

About Relay

Relay is a JavaScript framework to declaratively fetch and manage your GraphQL data inside a React application. Relay uses static queries and ahead-of-time compilation to help you build a high-performance app.

But as the great saying goes, “With great power comes great responsibility.” Relay comes with a set of costs (conventions), which are well worth it when compared with the benefits you get. We will explore the trade-offs in this article.

The Relay framework is built of multiple modules:

1. The compiler: This is a set of modules designed to extract GraphQL code from across the codebase and do validations and optimizations during build time.

2. Relay runtime: A high-performance GraphQL runtime that features a normalized cache for objects and highly optimized read/write operations, simplified abstractions over fetching data fields, garbage collection, subscriptions, and more.

3. React-relay: This provides the high-level APIs to integrate React with the Relay runtime.

The Relay compiler runs as a separate process, like how webpack works for React. It keeps watching and compiling the GraphQL code, and in case of errors, it simply does not build your code, which prevents bugs from going into higher environments.
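
To give a feel for what a normalized cache means, here is a toy TypeScript sketch. It is not Relay’s implementation. It assumes every entity carries a globally unique string id, and the __ref marker and all names are hypothetical.

```typescript
type StoredRecord = Record<string, unknown>;
type NormalizedStore = Map<string, StoredRecord>;

// Flatten a nested response: every object with an id is stored once under that id,
// and nested occurrences are replaced by a { __ref: id } pointer.
function normalize(value: unknown, store: NormalizedStore): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => normalize(item, store));
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const flat: StoredRecord = {};
    for (const [key, fieldValue] of Object.entries(obj)) {
      flat[key] = normalize(fieldValue, store);
    }
    const id = obj["id"];
    if (typeof id === "string") {
      // Merge with whatever is already known about this entity.
      store.set(id, { ...(store.get(id) ?? {}), ...flat });
      return { __ref: id };
    }
    return flat;
  }
  return value;
}

// Hypothetical response in which the same user appears twice.
const response = {
  id: "post:1",
  title: "Hello Relay",
  author: { id: "user:1", firstName: "Ada" },
  likedBy: [{ id: "user:1", lastName: "Lovelace" }],
};

const cache: NormalizedStore = new Map();
normalize(response, cache);
console.log(cache.get("user:1"));
```

Because each entity lives in exactly one place, an update to a user is seen by every view that reads that user.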

Fragments

Fragments are at the heart of how Relay blends with GraphQL. A fragment is a selection of fields on a GraphQL type.
```
fragment Avatar_user on User {
  avatarImgUrl
  firstName
  lastName
  userName
}
```
If we look at the sample fragment definition above, the fragment name, Avatar_user, is not just a random name. One of the Relay framework’s important conventions is that fragment names are globally unique and follow the structure <ModuleName>_<propertyName>. In the example above, Avatar is the module name and user is the property name.

This fragment can then be reused throughout the queries instead of selecting the fields manually to render the avatar in each view.

In the query below, both the author and the first two users who liked the blog post are selected with the same fields, so both selections can use the Avatar_user fragment definition:
```
    query GetBlogPost($postId: ID!) {
      blogPostById(id: $postId) {
        author {
          firstName
          lastName
          avatarImgUrl
          userName
        }
        likedBy(first: 2) {
          edges {
            node {
              firstName
              lastName
              avatarImgUrl
              userName
            }
          }
        }
      }
    }
```
Now, our new query with fragments looks like this:
```
    query GetBlogPost($postId: ID!) {
      blogPostById(id: $postId) {
        author {
          ...Avatar_user
        }
        likedBy(first: 2) {
          edges {
            node {
              ...Avatar_user
            }
          }
        }
      }
    }
```
Fragments not only allow us to reuse the definitions but more essentially, they let us add or remove fields needed to render our avatar as we evolve our application.

Another highly important client-side convention is colocation. This means the data required for a component lives inside the component. This makes maintenance and extending much easier. Just like how React allows us to break our UI elements into components and group/compose different views, fragments in Relay allow us to split the data definitions and colocate the data and the view definitions.

So, a good practice is to define one or more fragments that contain the data a component needs to render. This means that a component depends on some fields from the User type, irrespective of the parent component. In the example above, the <Avatar /> component will render an avatar using the fields specified in the Avatar_user fragment.

How Relay leverages the GraphQL Fragment

Relay wants every component to list all the data it needs to render, right alongside the component itself. Relay uses fragments to tie a component to its data requirements. This convention mandates that every component lists the fields it needs access to.

Other advantages of the above are:
1. Components are not dependent on data they don’t explicitly request.
2. Components are modular and self-contained.
3. Reusing and refactoring the components becomes easier.
Performance

In Relay, a component re-renders only when the exact fields it selects change, and this feature is available out of the box. The fragment subscribes to updates specifically for the data the component selects. This lets Relay optimize how the view is updated, so performance is not affected as the codebase scales.

Now, let’s look at an example of components in a single post of a blog application. Here is a wireframe of a sample post to give an idea of the data and view required.

Now, let’s write a plain query without Relay, which will fetch all the data in a single query. It will look like this for the above wireframe:
```
    query GetBlogPost($postId: ID!) {
      blogPostById(id: $postId) {
        author {
          firstName
          lastName
          avatarUrl
          shortBio
        }
        title
        coverImgUrl
        createdAt
        tags {
          slug
          shortName
        }
        body
        likedByMe
        likedBy(first: 2) {
          totalCount
          edges {
            node {
              firstName
              lastName
              avatarUrl
            }
          }
        }
      }
    }
```
This one query has all the necessary data. Let’s also write down a sample structure of UI components for the query above:
```
<BlogPostContainer>
    <BlogPostHead>
      <BlogPostAuthor>
        <Avatar />
      </BlogPostAuthor>
    </BlogPostHead>
    <BlogPostBody>
      <BlogPostTitle />
      <BlogPostMeta>
        <CreatedAtDisplayer />
        <TagsDisplayer />
      </BlogPostMeta>
      <BlogPostContent />
      <LikeButton>
        <LikedByDisplayer />
      </LikeButton>
    </BlogPostBody>
</BlogPostContainer>
```
In the implementation above, we have a single query that will be managed by the top-level component. It will be the top-level component’s responsibility to fetch the data and pass it down as props. Now, we will look at how we would build this in Relay:
```
    import * as React from "react";
    import { GetBlogPost } from "./__generated__/GetBlogPost.graphql";
    import { useLazyLoadQuery } from "react-relay/hooks";
    import { BlogPostHead } from "./BlogPostHead";
    import { BlogPostBody } from "./BlogPostBody";
    import { graphql } from "react-relay";


    interface BlogPostProps {
      postId: string;
    }

    export const BlogPost = ({ postId }: BlogPostProps) => {
      const { blogPostById } = useLazyLoadQuery<GetBlogPost>(
        graphql`
          query GetBlogPost($postId: ID!) {
            blogPostById(id: $postId) {
              ...BlogPostHead_blogPost
              ...BlogPostBody_blogPost
            }
          }
        `,
        // Relay expects the variables object itself as the second argument.
        { postId }
      );

      if (!blogPostById) {
        return null;
      }

      return (
        <div>
          <BlogPostHead blogPost={blogPostById} />
          <BlogPostBody blogPost={blogPostById} />
        </div>
      );
    };
```
First, let’s look at the query used inside the component:
```
const { blogPostById } = useLazyLoadQuery<GetBlogPost>(
graphql`
  query GetBlogPost($postId: ID!) {
    blogPostById(id: $postId) {
      ...BlogPostHead_blogPost
      ...BlogPostBody_blogPost
    }
  }
`,
// Relay expects the variables object itself as the second argument.
{ postId }
);
```
The useLazyLoadQuery React hook from Relay will start fetching the GetBlogPost query as soon as the component renders.

NOTE: useLazyLoadQuery is used here because it follows the common mental model of fetching data after the page is loaded. However, Relay encourages fetching data as early as possible using the usePreloadedQuery hook.

For type safety, we are annotating the useLazyLoadQuery with the type GetBlogPost, which is imported from ./__generated__/GetBlogPost.graphql. This file is auto-generated and synced by the Relay compiler. It contains all the information about the types needed to be queried, along with the return type of data and the input variables for the query.

The Relay compiler takes all the declared fragments in the codebase and generates the type files, which can then be used to annotate a particular component.

The GetBlogPost query is defined by composing multiple fragments. Another great aspect of Relay is that there is no need to import the fragments manually. They are automatically included by the Relay compiler. Building the query by composing fragments, just like how we compose our component, is the key here.
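
To illustrate why globally unique fragment names make this possible, here is a toy TypeScript sketch of the idea. It is not how the Relay compiler is implemented. It just looks up every spread by name in a registry and appends the matching definitions, and the registry contents are hypothetical.

```typescript
// Fragment name -> fragment definition text. Unique names make this lookup unambiguous.
type FragmentRegistry = Record<string, string>;

// Find the names of all fragment spreads ("...Name") in a piece of GraphQL text.
function collectSpreads(text: string): string[] {
  const names: string[] = [];
  const spreadPattern = /\.\.\.\s*([A-Za-z_][A-Za-z0-9_]*)/g;
  let match: RegExpExecArray | null;
  while ((match = spreadPattern.exec(text)) !== null) {
    // "... on Type" is an inline fragment, not a named spread.
    if (match[1] !== "on") names.push(match[1]);
  }
  return names;
}

// Append every fragment the query depends on, directly or through other fragments.
function buildDocument(query: string, registry: FragmentRegistry): string {
  const included = new Set<string>();
  const pending = collectSpreads(query);
  const parts: string[] = [query.trim()];
  while (pending.length > 0) {
    const name = pending.shift() as string;
    if (included.has(name)) continue;
    const definition = registry[name];
    if (definition === undefined) throw new Error(`Unknown fragment: ${name}`);
    included.add(name);
    parts.push(definition.trim());
    pending.push(...collectSpreads(definition));
  }
  return parts.join("\n\n");
}

const registry: FragmentRegistry = {
  BlogPostHead_blogPost:
    "fragment BlogPostHead_blogPost on BlogPost { title coverImgUrl ...BlogPostAuthor_blogPost }",
  BlogPostBody_blogPost: "fragment BlogPostBody_blogPost on BlogPost { body }",
  BlogPostAuthor_blogPost: "fragment BlogPostAuthor_blogPost on BlogPost { author { firstName } }",
};

const rootQuery =
  "query GetBlogPost($postId: ID!) { blogPostById(id: $postId) { ...BlogPostHead_blogPost ...BlogPostBody_blogPost } }";
const fullDocument = buildDocument(rootQuery, registry);
console.log(fullDocument);
```

The root query never mentions BlogPostAuthor_blogPost, yet it ends up in the document because BlogPostHead_blogPost spreads it.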

Another approach can be to define queries per component, which takes full responsibility for its data requirements. But this approach has two problems:

1. Multiple queries are sent to the server instead of one.

2. The loading will be slower as components would have to wait till they render to start fetching the data.

In the above example, the GetBlogPost query only deals with including the fragments for its child components, BlogPostHead and BlogPostBody. The actual data fields of the child components are hidden from it.

When using Relay, components define their data requirement by themselves. These components can then be composed along with other components that have their own separate data.

At the same time, no component knows what data another component needs, only the GraphQL type on which that component’s fragment is defined. Relay makes sure the right data is passed to the respective component and that all input for a query is sent to the server.

This allows developers to think about a component and its fragments as a single unit while Relay does all the heavy lifting in the background. Relay minimizes the round-trips to the server by placing the fragments from multiple components into optimized and efficient batches.

As we said earlier, the two fragments, BlogPostHead_blogPost and BlogPostBody_blogPost, which we referenced in the query, are not imported manually. This is because Relay imposes unique fragment names globally so that the compiler can include the definitions in queries sent to the server. This eliminates the chances of errors and takes away the laborious task of referencing the fragments by hand.
```
  if (!blogPostById) {
    return null;
  }

  return (
    <div>
      <BlogPostHead blogPost={blogPostById} />
      <BlogPostBody blogPost={blogPostById} />
    </div>
  );
```
Now, in the rendering logic above, we render the <BlogPostHead/> and <BlogPostBody/> and pass the blogPostById object as prop. It’s passed because it is the object inside the query that spreads the fragment needed by the two components. This is how Relay transfers fragment data. Because we spread both fragments on this object, it is guaranteed to satisfy both components.

To put it in simpler terms, to pass the fragment data, we pass the object where the fragment is spread, and the component then uses this object to get the real fragment data. Relay, through its robust type system, makes sure that the right object is passed with the required fragment spread on it.

The previous component, BlogPost, was the parent component, i.e., the component with the root query object. The root query is necessary because Relay cannot fetch a fragment in isolation. Fragments must be included in the root query in a parent component. The parent can, in turn, be a fragment as long as the root query exists in the hierarchy. Now, we will build the BlogPostHead component using fragments:
```
    import * as React from "react";
    import { useFragment } from "react-relay/hooks";
    import { graphql } from "react-relay";
    import {
      BlogPostHead_blogPost$key, BlogPostHead_blogPost
    } from "./__generated__/BlogPostHead_blogPost.graphql";
    import { BlogPostAuthor } from "./BlogPostAuthor";
    import { BlogPostLikeControls } from "./BlogPostLikeControls";

    interface BlogPostHeadProps {
      blogPost: BlogPostHead_blogPost$key;
    }

    export const BlogPostHead = ({ blogPost }: BlogPostHeadProps) => {
      const blogPostData = useFragment<BlogPostHead_blogPost>(
        graphql`
          fragment BlogPostHead_blogPost on BlogPost {
            title
            coverImgUrl
            ...BlogPostAuthor_blogPost
            ...BlogPostLikeControls_blogPost
          }
        `,
        blogPost
      );

      return (
        <div>
          <img src={blogPostData.coverImgUrl} />
          <h1>{blogPostData.title}</h1>
          <BlogPostAuthor blogPost={blogPostData} />
          <BlogPostLikeControls blogPost={blogPostData} />
        </div>
      );
    };
```
NOTE: In our example, BlogPostHead and BlogPostBody define only one fragment each, but in general, a component can have any number of fragments on any number of GraphQL types, and even more than one fragment on the same type.

In the component above, two type definitions, namely BlogPostHead_blogPost$key and BlogPostHead_blogPost, are imported from the file BlogPostHead_blogPost.graphql, generated by the Relay compiler. The compiler extracts the fragment code from the component’s source file and generates the types. This process is followed for all the GraphQL code: queries, mutations, fragments, and subscriptions.

The BlogPostHead_blogPost type holds the fragment’s type definitions and is passed to the useFragment hook to ensure type safety when using the data from the fragment. The other import, BlogPostHead_blogPost$key, is used in the BlogPostHeadProps interface, and this type definition makes sure that we pass the right object to useFragment. Otherwise, the type system will throw errors at build time.

In the child component above, the blogPost object is received as a prop and is passed to useFragment as the second parameter. If the blogPost object did not have the correct fragment, i.e., BlogPostHead_blogPost, spread on it, we would receive a type error. Even if another fragment with the exact same data selection were spread on it, Relay makes sure that only the right fragment is used with useFragment. This allows you to change or update fragment definitions without affecting other components.

Data masking

In our example, the fragment BlogPostHead_blogPost explicitly selects two fields for the component:
1. title
2. coverImgUrl
This is because we access only these two fields in the view for the <BlogPostHead /> component. So, even if we define another fragment, BlogPostAuthor_blogPost, which selects title and coverImgUrl, we don’t receive access to them unless we ask for them in the same fragment. This is enforced by Relay’s type system both at compile time and at runtime. This safety feature of Relay makes it impossible for components to depend on data they do not explicitly select. So, developers can refactor the components without risking other components. To reiterate, all components and their data dependencies are self-contained.

The data for this component, i.e., title and coverImgUrl, will not be accessible on the parent component, BlogPost, even though the props object is sent by the parent. The data becomes available only through the useFragment React hook. This hook can consume the fragment definition. The useFragment takes in the fragment definition and the object where the fragment is spread to get the data listed for the particular fragment.

Just like how we spread the fragment for the BlogPostHead component in the BlogPost root query, we can also extend this to the child components of BlogPostHead. We spread the fragments, i.e., BlogPostAuthor_blogPost and BlogPostLikeControls_blogPost, since we are rendering <BlogPostAuthor /> and <BlogPostLikeControls />.

NOTE: The useFragment hook does not fetch the data. It can be thought of as a selector that grabs only what is needed from the data definitions.

Performance

When using a fragment for a component, the component subscribes only to the data it depends on. In our example, the component BlogPostHead will only automatically re-render when the fields “coverImgUrl” or “title” change for a specific blog post the component renders. Since the BlogPostAuthor_blogPost fragment does not select those fields, it will not re-render. Subscription to any updates is made on fragment level. This is an essential feature that works out of the box with Relay for performance.

Let us now see how general data and components are updated in a different GraphQL framework than Relay. The data that gets rendered on view actually comes from an operation that requests data from the server, i.e., a query or mutation. We write the query that fetches data from the server, and that data is passed down to different components as per their needs as props. The data flows from the root component, i.e., the component with the query, down to the components.

Let’s look at a graphical representation of the data flow in other GraphQL frameworks:

Image source: Dev.to

NOTE: Here, the framework data store is usually referred to as cache in most frameworks:

1. The Profile component executes the operation ProfileQuery to a GraphQL server.

2. The data return is kept in some framework-specific representation of the data store.

3. The data is passed to the view rendering it.

4. The view then passes the data on to all the child components that need it, for example Name, Avatar, and Bio. Finally, React renders the view.

In contrast, the Relay framework takes a different approach:

Image source: Dev.to

Let’s break down the approach taken by Relay:
- For the initial part, nothing changes. We still have a query that is sent to the GraphQL server, and the data is fetched and stored in the Relay data store.
- What Relay does after this is different. The components get the data directly from the data store (the cache). This is because the fragments help Relay integrate deeply with the component data requirements. The component fragments get the data straight from the framework data store and do not rely on data being passed down as props. Although some information is passed from the query to the fragments so that they can look up the particular data needed in the data store, the data is read by the fragment itself.
To conclude the above comparison, in other frameworks (like Apollo), the component uses the query as the data source. The implementation details of how the root component executing the query sends data to its descendants are left to us. But Relay takes a different approach of letting each component take care of the data it needs from the data store.

In the approach used by other GraphQL frameworks, the query is the data source, and updates in the data store force the component holding the query to re-render. This re-render cascades down to any number of components, even if those components have nothing to do with the updated data other than acting as a layer that passes data from parent to child. In the Relay approach, the components subscribe directly to updates for the data they use. This keeps re-renders to a minimum as our app scales in size and complexity.

Developer Experience

Relay removes the developer’s responsibility to route the data down from the query to the components that need it. This eliminates the chances of developer error. A component cannot accidentally or deliberately depend on data that it should just be passing down the component tree, because it cannot access that data. All the hard work is taken care of by the Relay framework if we follow the conventions discussed.

Conclusion

To summarize, we detailed all the work Relay does for us and the effects:
- The type system of the Relay framework makes sure the right components get the right data they need. Everything in Relay revolves around fragments.
- In Relay, fragments are coupled and colocated with components, which allows it to mask the data requirements from the outside world. This increases the readability and modularity.
- By default, Relay takes care of performance as components only re-render when the exact data they use change in the data store.
- Type generation is a main feature of the Relay compiler. Through type generation, interactions with the fragment’s data are typesafe.
Conventions enforced by Relay’s philosophy and architecture allow it to take advantage of the information available about your component. It knows the exact data dependencies and types. It uses all this information to do a lot of work that developers would otherwise have to deal with.

December 12, 2022
Eliminate Render-blocking Resources using React and Webpack
In the previous blog, we learned how a browser downloads many scripts and useful resources to render a webpage. But not all of them are necessary to show a page’s content. Because of this, the page rendering is delayed. However, most of them will be needed as the user navigates through the website’s various pages.

In this article, we’ll learn to identify such resources and classify them as critical and non-critical. Once identified, we’ll inline the critical resources and defer the non-critical resources.

For this blog, we’ll use the following tools:
- Google Lighthouse and other Chrome DevTools to identify render-blocking resources.
- Webpack and CRACO to fix it.
Demo Configuration

For the demo, I have added the JavaScript below to the <head> of index.html as a render-blocking JS resource. This script loads two more CSS resources on the page.

https://use.fontawesome.com/3ec06e3d93.js

Other configurations are as follows:
- Create React App v4.0
- Formik and Yup for handling form validations
- Font Awesome and Bootstrap
- Lazy loading and code splitting using Suspense, React lazy, and dynamic import
- CRACO
- html-critical-webpack-plugin
- ngrok and serve for serving build
Render-Blocking Resources

A render-blocking resource typically refers to a script or link that prevents a browser from rendering the processed content.

Lighthouse will flag the below as render-blocking resources:
- A <script> tag in the <head> that doesn’t have a defer or async attribute.
- A <link rel="stylesheet"> tag that doesn’t have a media attribute to match a user’s device or a disabled attribute to hint the browser not to download it if unnecessary.
- A <link rel="import"> tag that doesn’t have an async attribute.
Identifying Render-Blocking Resources

To reduce the impact of render-blocking resources, find out what’s critical for loading and what’s not.

To do that, we’re going to use the Coverage Tab in Chrome DevTools. Follow the steps below:

1. Open the Chrome DevTools (press F12)

2. Go to the Sources tab and open the Run command menu (Ctrl+Shift+P, or Cmd+Shift+P on macOS)

The below screenshot is taken on macOS.

3. Search for Show Coverage and select it, which will show the Coverage tab below. Expand the tab.

4. Click on the reload button on the Coverage tab to reload the page and start instrumenting the coverage of all the resources loading on the current page.

5. After capturing the coverage, the resources loaded on the page will get listed (refer to the screenshot below). This will show you the code being used vs. the code loaded on the page.

The list will display coverage in 2 colors:

a. Green (critical) – The code needed for the first paint

b. Red (non-critical) – The code not needed for the first paint.

After checking each file and the generated index.html after the build, I found three primary non-critical files –

a. 5.20aa2d7b.chunk.css – 98% non-critical code

b. https://use.fontawesome.com/3ec06e3d93.js – 69.8% non-critical code. This script loads below CSS –

1. font-awesome-css.min.css – 100% non-critical code

2. https://use.fontawesome.com/3ec06e3d93.css – 100% non-critical code

c. main.6f8298b5.chunk.css – 58.6% non-critical code

The above resources satisfy the conditions of a render-blocking resource and hence are flagged by the Lighthouse Performance report as an opportunity to eliminate render-blocking resources (refer to the screenshot). You can reduce the page size by shipping only the code that you need.

Solution

Once you’ve identified critical and non-critical code, it is time to extract the critical part as an inline resource in index.html and defer the non-critical part by using the webpack plugin configuration.

For Inlining and Preloading CSS:

Use html-critical-webpack-plugin to inline the critical CSS into index.html. This will generate a <style> tag in the <head> containing the critical CSS extracted from the main CSS chunk, and it will preload the main file.
```
const path = require('path');
const { whenProd } = require('@craco/craco');
const HtmlCriticalWebpackPlugin = require('html-critical-webpack-plugin');

module.exports = {
  webpack: {
    configure: (webpackConfig) => {
      return {
        ...webpackConfig,
        plugins: [
          ...webpackConfig.plugins,
          ...whenProd(
            () => [
              new HtmlCriticalWebpackPlugin({
                base: path.resolve(__dirname, 'build'),
                src: 'index.html',
                dest: 'index.html',
                inline: true,
                minify: true,
                extract: true,
                width: 320,
                height: 565,
                penthouse: {
                  blockJSRequests: false,
                },
              }),
            ],
            []
          ),
        ],
      };
    },
  },
};
```
Once done, create a build and deploy. Here’s a screenshot of the improved opportunities:

To use CRACO, refer to its README file.

NOTE: If you’re planning to use the critters-webpack-plugin, please check these issues first: Could not find HTML asset and Incompatible with html-webpack-plugin v4.

For Deferring Routes/Pages:

Use lazy-loading and code-splitting techniques along with webpack’s magic comments as below to preload or prefetch a route/page according to your use case.
```
import { Suspense, lazy } from 'react';
import { Redirect, Route, Switch } from 'react-router-dom';
import Loader from '../../components/Loader';

import './style.scss';

const Login = lazy(() =>
  import(
    /* webpackChunkName: "login" */ /* webpackPreload: true */ '../../containers/Login'
  )
);
const Signup = lazy(() =>
  import(
    /* webpackChunkName: "signup" */ /* webpackPrefetch: true */ '../../containers/Signup'
  )
);

const AuthLayout = () => {
  return (
    <Suspense fallback={<Loader />}>
      <Switch>
        <Route path="/auth/login" component={Login} />
        <Route path="/auth/signup" component={Signup} />
        <Redirect from="/auth" to="/auth/login" />
      </Switch>
    </Suspense>
  );
};

export default AuthLayout;
```
The magic comments enable webpack to add correct attributes to defer the scripts according to the use-case.

For Deferring External Scripts:

For those who are using a version of webpack lower than 5, use script-ext-html-webpack-plugin or resource-hints-webpack-plugin.

I would recommend following the simple way given below to defer an external script.
```
<!-- Add the defer/async attribute to the external render-blocking script -->
<script async defer src="https://use.fontawesome.com/3ec06e3d93.js"></script>
```
The defer and async attributes can both be specified on an external script. When both are present, async takes precedence in browsers that support it, and older browsers that do not support async fall back to the defer behaviour.

If you want to know more about async/defer, see the Further Reading section.

Along with defer/async, we can also use media attributes to load CSS conditionally.

It’s also suggested to load fonts locally instead of pulling the full stylesheet from a CDN when we don’t need all the font-face rules added by font providers.

Now, let’s create and deploy the build once more and check the results.

The opportunity to eliminate render-blocking resources no longer appears in the list.

We have finally achieved our goal!

Final Thoughts

The above configuration is a basic one. You can read the libraries’ docs for more complex implementation.

Let me know if this helps you eliminate render-blocking resources from your app.

If you want to check out the full implementation, here’s the link to the repo. I have created two branches—one with the problem and another with the solution. Read the further reading section for more details on the topics.

Hope this helps.

Happy Coding!

Further Reading
- Eliminate render-blocking resources
- Scripts: async, defer
December 12, 2022
Getting Started With Golang Channels! Here’s Everything You Need to Know
We live in a world where speed is important. With cutting-edge technology coming into the telecommunications and software industry, we expect to get things done quickly. We want to develop applications that are fast, can process high volumes of data and requests, and keep the end-user happy.

This is great, but of course, it’s easier said than done. That’s why concurrency and parallelism are important in application development. We must process data as fast as possible. Every programming language has its own way of dealing with this, and we will see how Golang does it.

Now, many of us choose Golang because of its concurrency support, and goroutines and channels are at the heart of it.

This blog will cover channels and how they work internally, as well as their key components. To benefit the most from this content, it will help to know a little about goroutines and channels as this blog gets into the internals of channels. If you don’t know anything, then don’t worry, we’ll be starting off with an introduction to channels, and then we’ll see how they operate.

What are channels?

Normally, when we talk about channels, we think of the ones in applications like RabbitMQ, Redis, AWS SQS, and so on. Anyone with no or only a small amount of Golang knowledge would think like this. But Channels in Golang are different from a work queue system. In the work queue system like above, there are TCP connections to the channels, but in Go, the channel is a data structure or even a design pattern, which we’ll explain later. So, what are the channels in Golang exactly?

Channels are the medium through which goroutines can communicate with each other. In simple terms, a channel is a pipe that allows a goroutine to either put or read the data.

What are goroutines?

So, a channel is a communication medium for goroutines. Now, let’s give a quick overview of what goroutines are. If you know this already, feel free to skip this section.

Technically, a goroutine is a function that executes independently in a concurrent fashion. In simple terms, it’s a lightweight thread that’s managed by the Go runtime.

You can create a goroutine by using the go keyword before a function call.

Let’s say there’s a function called PrintHello, like this:
```
func PrintHello() {
   fmt.Println("Hello")
}
```
You can run this as a goroutine simply by prefixing the function call with the go keyword, as below:
```
//create goroutine
 go PrintHello()
```
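One caveat is worth sketching here: a Go program exits as soon as main returns, so a goroutine started this way may never get the chance to print. The following is a minimal sketch, assuming we wait with sync.WaitGroup from the standard library. The runAndWait helper and its counter are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// PrintHello is the function from the example above.
func PrintHello() {
	fmt.Println("Hello")
}

// runAndWait (a hypothetical helper) starts n goroutines, waits for all of
// them to finish, and returns how many actually ran.
func runAndWait(n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	ran := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			PrintHello()
			mu.Lock() // protect the shared counter
			ran++
			mu.Unlock()
		}()
	}
	wg.Wait() // block until every goroutine has called Done
	return ran
}

func main() {
	fmt.Println("goroutines finished:", runAndWait(3))
}
```

Without the wg.Wait() call, main could return before any of the goroutines had been scheduled.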
Now, let’s head back to channels, as that’s the important topic of this blog.

How to define a channel?

Let’s see a syntax that will declare a channel. We can do so by using the chan keyword provided by Go.

You must specify the element type, because a channel can only carry values of that one type.
```
//create channel
 var c chan int
```
Very simple! But this is not useful since it would create a nil channel. Let’s print it and see.
```
fmt.Println(c)
fmt.Printf("Type of channel: %T", c)
<nil>
Type of channel: chan int
```
As you can see, we have just declared the channel, but we can’t transport data through it. So, to create a useful channel, we must use the make function.
```
//create channel
c := make(chan int)
fmt.Printf("Type of `c`: %T\n", c)
fmt.Printf("Value of `c` is %v\n", c)
 
Type of `c`: chan int
Value of `c` is 0xc000022120
```
As you may notice here, the value of c is a memory address. Keep in mind that channels are nothing but pointers. That’s why we can pass them to goroutines, and we can easily put the data or read the data. Now, let’s quickly see how to read and write the data to a channel.

Read and write operations on a channel:

Go provides an easy way to read and write data to a channel by using the left arrow.
```
c <- 10
```
This is a simple syntax to put the value in our created channel. The same syntax is used to define the “send” only type of channels.

And to get/read the data from channel, we do this:
```
<-c
```
This is also the way to define the “receive” only type of channels.
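To make the “send” only and “receive” only channel types concrete, here is a small sketch. The function names produce and sum are made up for this illustration; the directional types chan<- int and <-chan int are standard Go syntax.

```go
package main

import "fmt"

// produce can only send on out because its type is chan<- int.
func produce(out chan<- int, values ...int) {
	for _, v := range values {
		out <- v
	}
	close(out) // signal that no more values will be sent
}

// sum can only receive from in because its type is <-chan int.
func sum(in <-chan int) int {
	total := 0
	for v := range in { // loop ends when the channel is closed
		total += v
	}
	return total
}

func main() {
	c := make(chan int) // a bidirectional channel converts to either direction
	go produce(c, 1, 2, 3)
	fmt.Println("sum:", sum(c)) // sum: 6
}
```

Trying to receive inside produce, or to send inside sum, is rejected by the compiler, which is the main reason to use directional types in function signatures.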

Let’s see a simple program to use the channels.
```
func printChannelData(c chan int) {
   fmt.Println("Data in channel is: ", <-c)
}
```
This simple function just prints whatever data is in the channel. Now, let’s see the main function that will push the data into the channel.
```
func main() {
   fmt.Println("Main started...")
   //create channel of int
   c := make(chan int)
   // call to goroutine
   go printChannelData(c)
   // put the data in channel
   c <- 10
   fmt.Println("Main ended...")
}
```
This produces the output:
```
Main started...
Data in channel is:  10
Main ended...
```
Let’s talk about the execution of the program.

1. We declared a printChannelData function, which accepts a channel c of data type integer. In this function, we are just reading data from channel c and printing it.

2. Now, the main function first prints “Main started…” to the console.

3. Then, we create the channel c of data type integer using the built-in make function.

4. We now pass the channel to the function printChannelData, and as we saw earlier, it’s a goroutine.

5. At this point, there are two goroutines. One is the main goroutine, and the other is what we have declared.

6. Now, we are putting 10 as data in the channel, and at this point, our main goroutine is blocked and waiting for some other goroutine to read the data. The reader, in this case, is the printChannelData goroutine, which was previously blocked because there was no data in the channel. Now that we’ve pushed the data onto the channel, the Go scheduler (more on this later in the blog) now schedules printChannelData goroutine, and it will read and print the value from the channel.

7. After that, the main goroutine again activates and prints “main ended…” and the program stops.

So, what’s happening here? Basically, the Go scheduler blocks and unblocks goroutines. You can’t read from a channel unless there’s data in it, which is why our printChannelData goroutine was blocked in the first place. Likewise, data written to an unbuffered channel has to be read before the writer can resume, which is what happened to our main goroutine.

With this, let’s see how channels operate internally.

Internals of channels:

Until now, we have seen how to define a goroutine, how to declare a channel, and how to read and write data through a channel with a very simple example. Now, let’s look at how Go handles this blocking and unblocking nature internally. But before that, let’s quickly see the types of channels.

Types of channels:

There are two basic types of channels: buffered channels and unbuffered channels. The above example illustrates the behaviour of unbuffered channels. Let’s quickly see the definition of these:
- Unbuffered channel: This is what we have seen above. An unbuffered channel has no capacity to hold data, so a send completes only when another goroutine is ready to receive the value. That’s why our main goroutine got blocked when we added data into the channel.
- Buffered channel: In a buffered channel, we specify the data capacity of the channel. The syntax is very simple: c := make(chan int, 10). The second argument of the make function is the capacity of the channel. So, we can put up to ten elements in the channel without a receiver. When the buffer is full, the sending goroutine is blocked until a receiver goroutine consumes an element.
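The difference between the two types can be observed without blocking the program. The sketch below uses a select statement with a default case to attempt a non-blocking send. The trySend helper is a hypothetical name used only for this illustration.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether it succeeded.
func trySend(c chan int, v int) bool {
	select {
	case c <- v:
		return true
	default:
		return false // a plain send would have blocked here
	}
}

func main() {
	unbuffered := make(chan int)
	buffered := make(chan int, 2)

	fmt.Println(cap(unbuffered), cap(buffered)) // 0 2

	fmt.Println(trySend(unbuffered, 1)) // false: no receiver is waiting
	fmt.Println(trySend(buffered, 1))   // true
	fmt.Println(trySend(buffered, 2))   // true
	fmt.Println(trySend(buffered, 3))   // false: the buffer is full
	fmt.Println(len(buffered))          // 2
}
```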
Properties of a channel:

A channel does a lot of things internally, and it has the properties below:
- Channels are goroutine-safe.
- Channels can store and pass values between goroutines.
- Channels provide FIFO semantics.
- Channels cause goroutines to block and unblock, which we just learned about.
As we see the internals of a channel, you’ll learn about the first three properties.
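The FIFO property is easy to check from the outside. In this short sketch, drain is a hypothetical helper that collects values in the order they are received.

```go
package main

import "fmt"

// drain receives every value from c until it is closed and returns the
// values in the order they arrived.
func drain(c <-chan int) []int {
	var got []int
	for v := range c {
		got = append(got, v)
	}
	return got
}

func main() {
	c := make(chan int, 3)
	for _, v := range []int{10, 20, 30} {
		c <- v // the buffer has room, so these sends do not block
	}
	close(c)
	fmt.Println(drain(c)) // [10 20 30]
}
```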

Channel Structure:

As we learned in the definition, a channel is a data structure. Now, looking at the properties above, we want a mechanism that handles goroutines in a synchronized manner and with FIFO semantics. This can be solved using a queue with a lock. So, the channel internally behaves in that fashion. It has a circular queue, a lock, and some other fields.

When we write c := make(chan int, 10), Go creates a channel using the hchan struct, which has the following fields:
```
type hchan struct {
   qcount   uint           // total data in the queue
   dataqsiz uint           // size of the circular queue
   buf      unsafe.Pointer // points to an array of dataqsiz elements
   elemsize uint16
   closed   uint32
   elemtype *_type // element type
   sendx    uint   // send index
   recvx    uint   // receive index
   recvq    waitq  // list of recv waiters
   sendq    waitq  // list of send waiters
 
   // lock protects all fields in hchan, as well as several
   // fields in sudogs blocked on this channel.
   //
   // Do not change another G's status while holding this lock
   // (in particular, do not ready a G), as this can deadlock
   // with stack shrinking.
   lock mutex
}
```
(Above info taken from Golang.org)

This is what a channel is internally. Let’s see one-by-one what these fields are.

qcount holds the count of items/data in the queue.

dataqsiz is the size of the circular queue. This is used in case of buffered channels and is the second parameter passed to the make function.

elemsize is the size of a single element in the channel.

buf is the actual circular queue where the data is stored when we use buffered channels.

closed indicates whether the channel is closed. The syntax to close the channel is close(<channel_name>). The default value of this field is 0, which is set when the channel gets created, and it’s set to 1 when the channel is closed.

sendx and recvx indicate the current send and receive indexes in the buffer, i.e., the circular queue. As we add data into the buffered channel, sendx increases, and as we start receiving, recvx increases.

recvq and sendq are the waiting queue for the blocked goroutines that are trying to either read data from or write data to the channel.

lock is a mutex that is held for each read or write operation so that concurrent goroutines cannot corrupt the channel’s state.

These are the important fields of the hchan struct, which comes into the picture when we create a channel. This hchan struct resides on the heap, and the make function gives us a pointer to that location. There’s another struct known as sudog, which also comes into the picture, but we’ll learn more about that later. Now, let’s see what happens when we write and read the data.

Read and write operations on a channel:

We are considering buffered channels here. When one goroutine, let’s say G1, wants to write data onto a channel, it does the following:
- Acquire the lock: As we saw before, if we want to modify the channel, or hchan struct, we must acquire a lock. So, G1 in this case, will acquire a lock before writing the data.
- Perform enqueue operation: We now know that buf is actually a circular queue that holds the data. But before enqueuing the data, goroutine does a memory copy operation on the data and puts the copy into the buffer slot. We will see an example of this.
- Release the lock: After performing an enqueue operation, it just releases the lock and goes on performing further executions.
When another goroutine, let’s say G2, reads the above data, it performs the same steps, except that instead of an enqueue it performs a dequeue, again with a memory copy operation. This means that the goroutines share nothing except the hchan struct, which is protected by the mutex. Everything else is a copy.
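The lock, enqueue, unlock sequence described above can be sketched with a toy model. This is not the runtime’s implementation: toyChan is a hypothetical type whose field names merely mirror hchan, it stores only int values, and it returns an error where a real channel would park the goroutine.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// toyChan is a simplified, non-blocking model of a buffered channel.
type toyChan struct {
	lock   sync.Mutex
	buf    []int // circular queue
	qcount int   // number of elements currently stored
	sendx  int   // next slot to write
	recvx  int   // next slot to read
}

func newToyChan(size int) *toyChan {
	return &toyChan{buf: make([]int, size)}
}

// send copies v into the buffer while holding the lock.
func (c *toyChan) send(v int) error {
	c.lock.Lock()
	defer c.lock.Unlock()
	if c.qcount == len(c.buf) {
		return errors.New("buffer full: a real channel would park the sender")
	}
	c.buf[c.sendx] = v // the value is copied into the slot
	c.sendx = (c.sendx + 1) % len(c.buf)
	c.qcount++
	return nil
}

// recv copies the oldest element out of the buffer while holding the lock.
func (c *toyChan) recv() (int, error) {
	c.lock.Lock()
	defer c.lock.Unlock()
	if c.qcount == 0 {
		return 0, errors.New("buffer empty: a real channel would park the receiver")
	}
	v := c.buf[c.recvx]
	c.recvx = (c.recvx + 1) % len(c.buf)
	c.qcount--
	return v, nil
}

func main() {
	c := newToyChan(2)
	_ = c.send(1)
	_ = c.send(2)
	fmt.Println(c.send(3)) // prints the "buffer full" error
	v, _ := c.recv()
	fmt.Println(v) // 1
	_ = c.send(3)  // wraps around to slot 0
	a, _ := c.recv()
	b, _ := c.recv()
	fmt.Println(a, b) // 2 3
}
```

The wrap-around of sendx and recvx is what makes the buffer a circular queue, and the single mutex is what makes concurrent access safe in this model.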

This satisfies the famous Golang quote: “Do not communicate by sharing memory; instead, share memory by communicating.”

Now, let’s look at a small example of this memory copy operation.
```
package main

import (
   "fmt"
   "time"
)

func printData(c chan int) {
   time.Sleep(time.Second * 3)
   data := <-c
   fmt.Println("Data in channel is: ", data)
}

func main() {
   fmt.Println("Main started...")
   var a = 10
   // create a buffered channel so that the send below does not block
   c := make(chan int, 1)
   go printData(c)
   fmt.Println("Value of a before putting into channel", a)
   c <- a // the value of a is copied into the channel buffer
   a = 20
   fmt.Println("Updated value of a:", a)
   // wait long enough for printData to read and print
   time.Sleep(time.Second * 4)
   fmt.Println("Main ended...")
}
```
And the output of this is:
```
Main started...
Value of a before putting into channel 10
Updated value of a: 20
Data in channel is:  10
Main ended...
```
So, as you can see, we put the value of variable a into the channel and then modified a before the receiver read from the channel. However, the value in the channel stays the same, i.e., 10, because the main goroutine copied the value into the channel’s buffer when it performed the send. So, even if you change the variable later, the value in the channel does not change. Note that only the value being sent is copied: if you send a pointer, the pointer is copied, and both goroutines still refer to the same underlying variable.

Write in case of buffer overflow:

We’ve seen that a goroutine can add data up to the buffer capacity, but what happens when the buffer capacity is reached? When the buffer has no more space and a goroutine, let’s say G1, wants to write data, the Go scheduler blocks/pauses G1, which then waits until a receive happens from another goroutine, say G2. Once G2 receives from the channel and frees up space in the buffer, the Go scheduler makes G1 runnable again. Remember this scenario, as we’ll use G1 and G2 frequently from here onwards.

We know that goroutines work in a pause and resume fashion, but who controls it? As you might have guessed, the Go scheduler does the magic here. There are a few things that the Go scheduler does, and they are very important for goroutines and channels.
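The blocking behaviour on a full buffer can be observed from user code. The following is a rough sketch: blockedUntilReceive is a hypothetical helper, and the timeouts are arbitrary values chosen only to make the blocked state visible.

```go
package main

import (
	"fmt"
	"time"
)

// blockedUntilReceive fills a buffered channel, starts one extra send, and
// reports whether that send was still pending before one receive and
// whether it completed after it.
func blockedUntilReceive() (pendingBefore, doneAfter bool) {
	c := make(chan int, 2)
	c <- 1
	c <- 2 // the buffer is now full

	sent := make(chan struct{})
	go func() {
		c <- 3 // blocks: there is no free slot
		close(sent)
	}()

	select {
	case <-sent:
		pendingBefore = false
	case <-time.After(100 * time.Millisecond):
		pendingBefore = true // the sender is still blocked
	}

	<-c // a single receive frees one slot

	select {
	case <-sent:
		doneAfter = true // the sender was resumed
	case <-time.After(time.Second):
		doneAfter = false
	}
	return pendingBefore, doneAfter
}

func main() {
	fmt.Println(blockedUntilReceive()) // true true
}
```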

Go Runtime Scheduler

You may already know this, but goroutines are user-space threads. The OS can schedule and manage threads, but OS threads are comparatively expensive to create and to switch between, considering the state that each thread carries.

That’s why the Go scheduler handles the goroutines, and it basically multiplexes the goroutines on the OS threads. Let’s see how.

There are scheduling models, like 1:1, N:1, etc., but the Go scheduler uses the M:N scheduling model.

Basically, this means that there are a number of goroutines and OS threads, and the scheduler basically schedules the M goroutines on N OS threads. For example:

OS Thread 1:

OS Thread 2:

As you can see, there are two OS threads, and the scheduler is running six goroutines by swapping them as needed. The Go scheduler has three structures as below:
- M: M represents the OS thread, which is entirely managed by the OS, and it’s similar to a POSIX thread. M stands for machine.
- G: G represents the goroutine. A goroutine has a resizable stack and also includes information about scheduling, any channel it’s blocked on, etc.
- P: P is a context for scheduling. You can think of it as a local scheduler that runs Go code and multiplexes M goroutines onto N OS threads. This is the important part, and P stands for processor.
Diagrammatically, we can represent the scheduler as:

(This diagram is referenced from The Go scheduler)

The P processor basically holds the queue of runnable goroutines—or simply run queues.

So, anytime a goroutine (G) needs to run on an OS thread (M), that OS thread first gets hold of a P, i.e., the context. This comes into play when a goroutine needs to be paused and some other goroutine must run. One such case is a buffered channel. When the buffer is full, we pause the sender goroutine and activate the receiver goroutine.

Imagine the above scenario: G1 is a sender that tries to send to a full buffered channel, and G2 is a receiver goroutine. Now, when G1 tries to send to the full channel, it calls into the Go runtime scheduler through gopark. The scheduler then changes the state of G1 from running to waiting and schedules another goroutine from the run queue, say G2.

This transition diagram might help you better understand:

As you can see, after the gopark call, G1 is in a waiting state and G2 is running. We haven’t paused the OS thread (M); instead, we’ve blocked the goroutine and scheduled another one. So, we are using maximum throughput of an OS thread. The context switching of goroutine is handled by the scheduler (P), and because of this, it adds complexity to the scheduler.

This is great. But how do we resume G1 now because it still wants to add the data/task on a channel, right? So, before G1 sends the gopark signal, it actually sets a state of itself on a hchan struct, i.e., our channel in the sendq field. Remember the sendq and recvq fields? They’re waiting senders and receivers.

Now, G1 stores its state as a sudog struct. A sudog simply represents a goroutine that is waiting on an element. The sudog struct has these fields:
```
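type sudog struct {
	g        *g             // the waiting goroutine
	isSelect bool           // whether g is taking part in a select
	next     *sudog         // next sudog in the wait queue
	prev     *sudog         // previous sudog in the wait queue
	elem     unsafe.Pointer // data element
	...
}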
```
g is the waiting goroutine, next and prev point to the next and previous sudog in the wait queue (if there are any), and elem is the actual element it’s waiting on.
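To see what the next and prev pointers are for, here is a toy FIFO wait queue. It is a sketch of the idea only: node and queue are made-up stand-ins for sudog and for the sendq/recvq fields, and a string replaces the *g pointer.

```go
package main

import "fmt"

// node is a toy stand-in for sudog: just the links and an element.
type node struct {
	name       string // stands in for g *g
	elem       int    // stands in for elem unsafe.Pointer
	next, prev *node
}

// queue is a FIFO doubly linked list of waiting nodes.
type queue struct{ first, last *node }

// enqueue appends n at the tail of the queue.
func (q *queue) enqueue(n *node) {
	n.next = nil
	n.prev = q.last
	if q.last == nil {
		q.first = n
	} else {
		q.last.next = n
	}
	q.last = n
}

// dequeue removes and returns the head of the queue, or nil if it is empty.
func (q *queue) dequeue() *node {
	n := q.first
	if n == nil {
		return nil
	}
	q.first = n.next
	if q.first == nil {
		q.last = nil
	} else {
		q.first.prev = nil
	}
	n.next, n.prev = nil, nil
	return n
}

func main() {
	var sendq queue
	sendq.enqueue(&node{name: "G1", elem: 10})
	sendq.enqueue(&node{name: "G3", elem: 20})

	first := sendq.dequeue()
	fmt.Println(first.name, first.elem) // G1 10
	second := sendq.dequeue()
	fmt.Println(second.name, second.elem) // G3 20
	fmt.Println(sendq.dequeue() == nil)   // true
}
```

The goroutine that has waited the longest is served first, so blocked senders and receivers are woken in the order they arrived.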

So, considering our example, G1 is waiting to write the data, so it records its state as a sudog, as below:

Cool. Now we know what operations G1 performs before going into the waiting state. Currently, G2 is in a running state, and it will start consuming the channel data.

As soon as G2 receives the first data/task, it checks the sendq attribute of the hchan struct for waiting goroutines and finds that G1 is waiting to push data or a task. Now, here is the interesting thing: G2 itself copies that data/task into the buffer and then calls the scheduler, which moves G1 from the waiting state to runnable, adds G1 to the run queue, and returns to G2. This call from G2 is known as goready, and it is made on behalf of G1. Impressive, right? Golang behaves like this so that when G1 runs again, it doesn’t have to acquire the lock and push the data/task itself. That extra overhead is handled by G2. That’s why the sudog holds both the data/task and the details of the waiting goroutine. So, the state of G1 is like this:

As you can see, G1 is placed on a run queue. Now we know what the goroutine and the Go scheduler do in the case of buffered channels.

In this example, the sender goroutine came first, but what if the receiver goroutine comes first? What if there’s no data in the channel and the receiver goroutine is executed first? The receiver goroutine (G2) will create a sudog in recvq on the hchan struct. Things get a little twisted when the G1 goroutine activates. It checks whether any goroutines are waiting in the recvq, and if there are, it copies the task to the waiting goroutine’s (G2) memory location, i.e., the elem attribute of the sudog.

This is incredible! Instead of writing to the buffer, G1 writes the task/data straight into the waiting goroutine’s space, simply to spare G2 that work when it resumes. We know that each goroutine has its own resizable stack, and goroutines never use each other’s space except in the case of channels. Until now, we have seen how send and receive happen in a buffered channel.

This may have been confusing, so let me give you the summary of the send operation.

Summary of a send operation for buffered channels:
1. Acquire a lock on the entire channel, i.e., the hchan struct.
2. Check if there’s any sudog, i.e., a waiting goroutine, in the recvq. If so, then put the element directly into its stack. We saw this just now with G1 writing to G2’s stack.
3. If recvq is empty, then check whether the buffer has space. If yes, then do a memory copy of the data into the buffer.
4. If the buffer is full, then create a sudog under sendq of the hchan struct, which holds details such as the currently executing goroutine and the data to put on the channel.
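The four steps can be sketched as a toy model. This is not the runtime's code: toyChan, waiter, and the returned path names are made up for illustration, a slice stands in for the circular buffer, and nothing is actually parked or woken.

```go
package main

import (
	"fmt"
	"sync"
)

// waiter is an illustrative stand-in for sudog.
type waiter struct {
	name string // stands in for the *g pointer
	elem *int   // where to write to (receiver) or read from (sender)
}

// toyChan is a simplified model of hchan, not the runtime's definition.
type toyChan struct {
	mu       sync.Mutex
	buf      []int
	capacity int
	recvq    []*waiter
	sendq    []*waiter
}

// send walks through the four steps and reports which path was taken.
func (c *toyChan) send(sender string, v int) string {
	c.mu.Lock() // step 1: lock the whole channel
	defer c.mu.Unlock()

	if len(c.recvq) > 0 { // step 2: a receiver is already waiting
		r := c.recvq[0]
		c.recvq = c.recvq[1:]
		*r.elem = v // write directly into the receiver's slot
		return "direct"
	}
	if len(c.buf) < c.capacity { // step 3: the buffer has room
		c.buf = append(c.buf, v)
		return "buffered"
	}
	// Step 4: the buffer is full, so record the sender and its data in sendq.
	// The real runtime would park the goroutine at this point.
	c.sendq = append(c.sendq, &waiter{name: sender, elem: &v})
	return "parked"
}

func main() {
	c := &toyChan{capacity: 1}

	var slot int // G2's receive slot
	c.recvq = append(c.recvq, &waiter{name: "G2", elem: &slot})

	path := c.send("G1", 10)
	fmt.Println(path, slot)       // direct 10
	fmt.Println(c.send("G1", 20)) // buffered
	fmt.Println(c.send("G1", 30)) // parked
}
```

The same channel takes a different path on each call depending on its state: a waiting receiver wins over the buffer, and the buffer wins over waiting in sendq.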
We have seen all the above steps in detail, but concentrate on the last point.

That last step is kind of similar to how an unbuffered channel works. We know that for unbuffered channels, every send must be paired with a receive and vice versa: whichever side arrives first waits for the other.

So, keep in mind that an unbuffered channel always works like a direct send. A summary of the read and write operations on an unbuffered channel could be:
- Sender first: At this point, there’s no receiver, so the sender creates a sudog of itself in sendq, and when the receiver arrives, it takes the value from that sudog.
- Receiver first: The receiver creates a sudog in recvq, and the sender puts the data directly in the receiver’s stack.
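Both orderings can be tried from ordinary Go code. In this minimal sketch the helper names senderFirst and receiverFirst and the 50 ms waits are my own assumptions. The sudog bookkeeping stays hidden inside the runtime, so all we can observe is that the side arriving first waits and that the value gets through either way.

```go
package main

import (
	"fmt"
	"time"
)

// senderFirst starts the sender before any receiver exists. It reports
// whether the send was still pending after a short wait, and the value
// that was eventually received.
func senderFirst() (pending bool, got string) {
	ch := make(chan string) // unbuffered
	sent := make(chan struct{})
	go func() {
		ch <- "task" // waits until a receiver shows up
		close(sent)
	}()

	select {
	case <-sent:
		// Not reachable here: no receive has happened yet.
	case <-time.After(50 * time.Millisecond):
		pending = true
	}

	got = <-ch // the receiver arrives and takes the value
	<-sent
	return pending, got
}

// receiverFirst starts the receiver first and sends afterwards.
func receiverFirst() string {
	ch := make(chan string) // unbuffered
	result := make(chan string)
	go func() {
		result <- <-ch // waits until a sender shows up
	}()

	time.Sleep(50 * time.Millisecond) // best effort: let the receiver start waiting
	ch <- "task"
	return <-result
}

func main() {
	pending, got := senderFirst()
	fmt.Println(pending, got)    // true task
	fmt.Println(receiverFirst()) // task
}
```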
With this, we have covered the basics of channels. We’ve learned how reads and writes operate in buffered and unbuffered channels, and we talked about the Go runtime scheduler.

Conclusion:

Channels are a very interesting Golang topic. They seem difficult to understand at first, but once you learn the mechanism, they’re very powerful and help you achieve concurrency in your applications. Hopefully, this blog helps your understanding of the fundamental concepts and the operations of channels.
December 12, 2022