Blog

  • Prow + Kubernetes – A Perfect Combination To Execute CI/CD At Scale

    Intro

    Kubernetes is currently the de facto standard for deploying workloads in the cloud. It’s well-suited for companies and vendors that need self-healing, highly available, cloud-agnostic deployments that are easy to extend.

    On another front, a gap has emerged in the CI/CD domain: since teams are using Kubernetes as the underlying orchestrator, they need a robust CI/CD tool that is entirely Kubernetes-native.

    Enter Prow

    Prow complements the Kubernetes family in the realm of automation and CI/CD.

    In fact, it is arguably the project that best exemplifies why and how Kubernetes is such a superb platform for executing CI/CD at scale.

    Prow (meaning: the portion of a ship’s bow, its front end, that sits above water) is a Kubernetes-native CI/CD system, and it has been used by many projects over the past few years, such as Kyma, Istio, Kubeflow, and OpenShift.

    Where did it come from?

    Kubernetes is one of the largest and most successful open-source projects on GitHub. At the time of Prow’s conception, the Kubernetes community was trying hard to keep its head above water in matters of CI/CD. Their needs included executing more than 10,000 CI/CD jobs per day, spanning 100+ repositories across various GitHub organizations, and other automation technology stacks were simply not capable of handling everything at this scale.

    So, the Kubernetes Testing SIG created their own tooling, with Prow at its center. Because Prow currently resides inside the Kubernetes test-infra project, one might underestimate its true prowess. I would personally like to see Prow receive a dedicated repo, out from under the umbrella of test-infra.

    What is Prow?

    Prow is not too complex to understand, but it is still vast in a subtle way. It is designed and built as a distributed microservice architecture native to Kubernetes.

    It has many components that integrate with one another (plank, hook, etc.) and a bunch of standalone ones that are more of a plug-n-play nature (trigger, config-updater, etc.).

    This blog will not cover Prow’s entire architecture, but feel free to dive into it on your own later.

    To name the main building blocks of Prow:

    • Hook – acts as an API gateway that intercepts all webhook requests from GitHub; it creates a ProwJob custom resource from the job configuration and calls any specific plugin if needed.
    • Plank – the ProwJob controller; after Hook creates a ProwJob, Plank processes it and creates a Kubernetes pod to run the tests.
    • Deck – serves as the UI showing the history of jobs that ran in the past or are currently running.
    • Horologium – the component that processes periodic jobs only.
    • Sinker – responsible for cleaning up old jobs and pods from the cluster.
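    To make the flow between Hook and Plank concrete, here is a sketch of the ProwJob custom resource that Hook creates. The fields are abbreviated and the values are hypothetical; check the ProwJob CRD in test-infra for the full schema:

```yaml
apiVersion: prow.k8s.io/v1
kind: ProwJob
metadata:
  name: 1a2b3c4d-example        # generated ID; hypothetical value
spec:
  type: presubmit               # presubmit | postsubmit | periodic
  job: pull-community-verify    # which job config this run uses
  refs:                         # what the test pod will clone and test
    org: kubernetes
    repo: community
    base_ref: master
status:
  state: triggered              # Plank picks the job up from this state
```

    Plank watches for ProwJob resources in this state, spins up a pod for each one, and then mirrors the pod’s outcome back into the ProwJob status.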

    More can be found here: Prow Architecture. Note that this link is not the official doc from Kubernetes but from another great open-source project that uses Prow extensively day in, day out: Kyma.

    This is how Prow can be pictured:

    Here is a list of things Prow can do and why it was conceived in the first place.

    • GitHub automation over a wide range of tasks

      – ChatOps via slash commands like “/foo”
      – Fine-tuned policies and permission management in GitHub via OWNERS files
      – tide – PR/merge automation
      – ghProxy – a GitHub API request cache that helps avoid hitting API rate limits
      – label plugin – label management
      – branchprotector – branch protection configuration
      – releasenote – release notes management
    • Job execution engine – Plank
    • Job status reporting to the CI/CD dashboard – crier
    • Dashboards for comprehensive job/PR history, merge status, real-time logs, and other statuses – Deck
    • Plug-n-play service to interact with GCS and show job artifacts on the dashboard – Spyglass
    • An easily pluggable Prometheus stack for observability – metrics
    • Config-as-Code for Prow itself – updateconfig
    • And many more, like sinker, branchprotector, etc.
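    As an aside, the OWNERS mechanism mentioned above is just a YAML file checked into a repository or directory. A minimal sketch (the usernames and label here are purely illustrative) might look like:

```yaml
# OWNERS file at the root of a repo or directory
approvers:
  - alice        # an /approve from alice can approve changes under this path
reviewers:
  - bob          # bob is suggested as a reviewer and can /lgtm
labels:
  - area/ci      # applied automatically to PRs touching this path
```

    Prow’s permission plugins read these files to decide who may approve, review, and merge changes in each part of the tree.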

    Possible Jobs in Prow

    Here, a job means any “task that is executed on a trigger.” This trigger can be anything from a GitHub commit or a new PR to a periodic cron schedule. Possible jobs in Prow include:

    • Presubmit – triggered when a new GitHub PR is created.
    • Postsubmit – triggered when a new commit lands on a branch.
    • Periodic – triggered on a cron-style schedule.

    Possible states for a job

    • triggered – a new ProwJob custom resource has been created from the job config
    • pending – a pod is created in response to the ProwJob to run the scripts/tests; the ProwJob stays pending while the pod is being created and running
    • success – if the pod succeeds, the ProwJob status changes to success
    • failure – if the pod fails, the ProwJob status is marked failure
    • aborted – if the same job is retriggered while an earlier run is in progress, the first ProwJob execution is aborted, its status changes to aborted, and the new one is marked pending

    What a job config looks like:

    presubmits:
      kubernetes/community:
      - name: pull-community-verify  # convention: (job type)-(repo name)-(suite name)
        branches:
        - master
        decorate: true
        always_run: true
        spec:
          containers:
          - image: golang:1.12.5
            command:
            - /bin/bash
            args:
            - -c
            - "export PATH=$GOPATH/bin:$PATH && make verify"

    • This job is of the “presubmit” type, meaning it is executed when a PR is created against the “master” branch of the “kubernetes/community” repo.
    • As shown in the spec, a pod is created from the golang:1.12.5 image, the repo is cloned into it, and the given command is executed when the container starts.
    • The exit status of that command decides whether the pod succeeded or failed, which in turn decides whether the Prow job completed successfully.
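    For comparison, the other two job types follow the same shape. Here is a sketch of what they might look like; the job names and make targets are hypothetical, so treat this as illustrative rather than copy-paste config:

```yaml
postsubmits:
  kubernetes/community:
  - name: post-community-build    # runs after a commit lands on master
    branches:
    - master
    decorate: true
    spec:
      containers:
      - image: golang:1.12.5
        command: ["make", "build"]

periodics:
- name: periodic-community-nightly
  cron: "0 2 * * *"               # every day at 02:00 UTC
  decorate: true
  spec:
    containers:
    - image: golang:1.12.5
      command: ["make", "nightly"]
```

    Note that periodics are a top-level list rather than being keyed by repo, since they are driven by the clock (via Horologium) instead of by repository events.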

    More job configs, used by Kubernetes itself, can be found here – Jobs

    Getting a minimalistic Prow cluster up and running on the local system in minutes.

    Pre-reqs:

    • Knowledge of Kubernetes 
    • Knowledge of Google Cloud and IAM

    For the context of this blog, I have created a sample GitHub repo containing all the basic manifests and config files. Basic CI has also been configured for this repo. Feel free to clone/fork it and use it as a getting-started guide.

    Let’s look at the directory structure for the repo:

    .
    ├── docker/     # Contains docker image in which all the CI jobs will run
    ├── hack/       # Contains small hack scripts used in a wide range of jobs 
    ├── hello.go
    ├── hello_test.go
    ├── Dockerfile
    ├── Makefile
    ├── prow
    │   ├── cluster/       # Install prow on k8s cluster
    │   ├── jobs/          # CI jobs config
    │   ├── labels.yaml    # Prow label config for managing github labels
    │   ├── config.yaml    # Prow config
    │   └── plugins.yaml   # Prow plugins config
    └── README.md

    1. Create a bot account. For info, look here. Add this bot as a collaborator in your repo. 

    2. Create an OAuth2 token from the GitHub GUI for the bot account.

    $ echo "PUT_TOKEN_HERE" > oauth
    $ kubectl create secret generic oauth --from-file=oauth=oauth

    3. Generate an HMAC token (via openssl) to be used by Hook to validate GitHub webhooks.

    $ openssl rand -hex 20 > hmac
    $ kubectl create secret generic hmac --from-file=hmac=hmac

    4. Install all the Prow components mentioned in prow-starter.yaml.

    $ make deploy-prow

    5. Update all the jobs and plugins needed for the CI (the rules are in the Makefile):

    • Updates in plugins.yaml and presubmits.yaml:
      – Change the repo name (velotio-tech/k8s-prow-guide) to the repo the jobs should be configured for
    • Updates in config.yaml:
      – Create a GCS bucket
      – Update the name of the GCS bucket (GCS_BUCKET_NAME) in config.yaml
      – Create a service account with GCS storage permissions and download its key file as service_account.json
      – Create a secret from the service_account.json above:

    $ kubectl create secret generic gcs-sa --from-file=service-account.json=service-account.json

      – Update the secret name (GCS_SERVICE_ACC) in config.yaml

    Then apply the changes:

    $ make update-config
    $ make update-plugins
    $ make update-jobs

    6. To expose a webhook from the GitHub repo and point it at your local machine, use Ultrahook. Installing Ultrahook gives you a publicly accessible endpoint; in my case, the result looked like this: http://github.sanster23.ultrahook.com.

    $ echo "api_key: <API_KEY_ULTRAHOOK>" > ~/.ultrahook
    $ ultrahook github http://<MINIKUBE_IP>:<HOOK_NODE_PORT>/hook

    7. Create a webhook in your repo so that all events can be published to Hook via the public URL above:

    • Set the webhook URL from Step 6
    • Set the content type to application/json
    • Set the secret to the same hmac token created in Step 3
    • Check the “Send me everything” box

    8. Create a new PR and see the magic.

    9. The dashboard for Prow (Deck) will be accessible at http://<MINIKUBE_IP>:<DECK_NODE_PORT>

    • MINIKUBE_IP: 192.168.99.100 (run “minikube ip”)
    • DECK_NODE_PORT: 32710 (run “kubectl get svc deck”)

    I will leave you with an official reference of the Prow dashboard:

    What’s Next

    The above is an effort to give you just a taste of what Prow can do and how easy it is to set up, for infrastructure of any scale and a project of any complexity.

    P.S. – Content about Prow is scarce, making it a bit unexplored in certain ways, but I found the #prow channel on the Kubernetes Slack helpful. Hopefully, this helps you explore the uncharted waters of Kubernetes-native CI/CD.

  • A Primer To Flutter

    In this blog post, we will explore the basics of cross-platform mobile application development using Flutter, compare it with existing cross-platform solutions, and create a simple to-do application to demonstrate how quickly we can build apps with Flutter.

    Brief introduction

    Flutter is a free and open source UI toolkit for building natively compiled applications for mobile platforms like Android and iOS, and for the web and desktop as well. Some of the prominent features are native performance, single codebase for multiple platforms, quick development, and a wide range of beautifully designed widgets.

    Flutter apps are written in the Dart programming language, a very intuitive language with a C-like syntax. Dart is optimized for performance and developer friendliness. Apps written in Dart can be as fast as native applications because Dart code compiles down to machine instructions for ARM and x64 processors, and to JavaScript for the web platform. This, along with the Flutter engine, makes Flutter apps platform-agnostic.

    Other interesting Dart features used in Flutter apps are the just-in-time (JIT) compiler, used during development and debugging, which powers the hot-reload functionality, and the ahead-of-time (AOT) compiler, which is used when building applications for target platforms such as Android or iOS, resulting in native performance.

    Everything composed on the screen with Flutter is a widget, including things like padding, alignment, and opacity. The Flutter engine draws and controls each pixel on the screen using its own graphics engine, Skia.

    Flutter vs React-Native

    Flutter apps are truly native and hence offer great performance, whereas apps built with React Native require a JavaScript bridge to interact with OEM widgets. Flutter apps are also much faster to develop thanks to a wide range of built-in widgets, a good amount of documentation, hot reload, and several other developer-friendly choices made by Google while building Dart and Flutter.

    React Native, on the other hand, has the advantage of being older and hence has a large community of businesses and developers with experience in building React Native apps. It also has more third-party libraries and packages than Flutter. That said, Flutter is catching up and rapidly gaining momentum, as evident from Stack Overflow’s 2019 developer survey, where it scored 75.4% under “Most Loved Frameworks, Libraries and Tools”.

     

    All in all, Flutter is a great tool to have in our arsenal as mobile developers in 2020.

    Getting started with a sample application

    Flutter’s official docs are really well written and include getting started guides for different OS platforms, API documentation, widget catalogue along with several cookbooks and codelabs that one can follow along to learn more about Flutter.

    To get started with development, we will follow the official guide, which is available here. Flutter requires the Flutter SDK as well as native build tools to be installed on the machine before development can begin. To write apps, one may use Android Studio, VS Code, or any text editor together with Flutter’s command-line tools. A good rule of thumb is to install Android Studio regardless, because it offers better support for managing the Android SDK, build tools, and virtual devices. It also includes several built-in tools such as the icons and assets editor.

    Once done with the setup, we will start by creating a project. Open VS Code and create a new Flutter project:

    We should see the main file main.dart with some sample code (the counter application). We will start editing this file to create our to-do app.

    Some of the features we will add to our to-do app:

    • Display a list of to-do items
    • Mark to-do items as completed
    • Add new item to the list

    Let’s start by creating a widget to hold our list of to-do items. This is going to be a StatefulWidget, which is a type of widget with some state. Flutter tracks changes to the state and redraws the widget when a new change in the state is detected.

    After creating the TodoList widget, our main.dart file looks like this:

    /// imports widgets from the material design 
    import 'package:flutter/material.dart';
    
    void main() => runApp(TodoApp());
    
    /// Stateless widgets must implement the build() method and return a widget. 
    /// The first parameter passed to build function is the context in which this widget is built
    class TodoApp extends StatelessWidget {
      @override
      Widget build(BuildContext context) {
        return MaterialApp(
          title: 'TODO',
          theme: ThemeData(
            primarySwatch: Colors.blue,
          ),
          home: TodoList(),
        );
      }
    }
    
    /// Stateful widgets must implement the createState() method
    /// The State of a stateful widget, in turn, has a build() method with a context
    class TodoList extends StatefulWidget {
      @override
      State<StatefulWidget> createState() => TodoListState();
    }
    
    class TodoListState extends State<TodoList> {
      @override
      Widget build(BuildContext context) {
        return Scaffold(
          appBar: AppBar(
            title: Text('Todo'),
          ),
          body: Text('Todo List'),
        );
      }
    }

    The TodoApp class here extends StatelessWidget, i.e., a widget without any state, whereas TodoList extends StatefulWidget. All Flutter apps are a combination of these two types of widgets. StatelessWidgets must implement the build() method, whereas StatefulWidgets must implement the createState() method.

    Some built-in widgets used here are the MaterialApp widget, the Scaffold widget, and the AppBar and Text widgets. All of these are imported from Flutter’s implementation of Material Design, available in the material.dart package. Similarly, to use native-looking iOS widgets in applications, we can import widgets from the flutter/cupertino.dart package.

    Next, let’s create a model class that represents an individual to-do item. We will keep it simple, i.e., only store the label and completed status of the to-do item.

    class Todo {
      final String label;
      bool completed;
      Todo(this.label, this.completed);
    }

    The constructor in the code above uses one of Dart’s pieces of syntactic sugar to assign a constructor argument directly to an instance variable. For more such interesting tidbits, take the Dart language tour.

    Now let’s modify the TodoListState class to store a list of to-do items in its state and display it as a list. We will use ListView.builder to create a dynamic list of to-do items, and the Checkbox and Text widgets to display each item.

    /// State is composed of all the variables declared in the State implementation of a Stateful widget
    class TodoListState extends State<TodoList> {
      final List<Todo> todos = List<Todo>();
      @override
      Widget build(BuildContext context) {
        return Scaffold(
          appBar: AppBar(
            title: Text('Todo'),
          ),
          body: Padding(
            padding: EdgeInsets.all(16.0),
            child: todos.length > 0
                ? ListView.builder(
                    itemCount: todos.length,
                    itemBuilder: _buildRow,
                  )
                : Text('There is nothing here yet. Start by adding some Todos'),
          ),
        );
      }
    
      /// build a single row of the list
      Widget _buildRow(context, index) => Row(
            children: <Widget>[
              Checkbox(
                  value: todos[index].completed,
                  onChanged: (value) => _changeTodo(index, value)),
              Text(todos[index].label,
                  style: TextStyle(
                      decoration: todos[index].completed
                          ? TextDecoration.lineThrough
                          : null))
            ],
          );
    
      /// toggle the completed state of a todo item
      _changeTodo(int index, bool value) =>
          setState(() => todos[index].completed = value);
    }

    A few things to note here: identifiers starting with an underscore are private, functions with a single-expression body can be written using fat arrows (=>), and, most importantly, to change any state held by a StatefulWidget, one must call the setState() method.

    The ListView.builder constructor allows us to work with very large lists, since list items are created lazily, only when they are scrolled into view.

    Another takeaway is how intuitive Dart is: it is easy to read, and you can start writing Dart code almost immediately.

    Everything on a screen, like padding, alignment or opacity, is a widget. Notice in the code above, we have used Padding as a widget that wraps the list or a text widget depending on the number of to-do items. If there’s nothing in the list, a text widget is displayed with some default message.

    Also note how we haven’t used the new keyword when creating instances of a class, say Text. That’s because the new keyword is optional in Dart, and its use is discouraged by the Effective Dart guidelines.

    Running the application

    At this point, let’s run the code and see how the app looks on a device. Press F5, then select a virtual device and wait for the app to get installed. If you haven’t created a virtual device yet, refer to the getting started guide.

    Once the virtual device launches, we should see the following screen in a while. During development, the first launch always takes a while because the entire app gets built and installed on the virtual device, but subsequent changes to code are instantly reflected on the device, thanks to Flutter’s amazing hot reload feature. This reduces development time and also allows developers and designers to experiment more frequently with the interface changes.

    As we can see, there are no to-dos here yet. Now let’s add a floating action button that opens a dialog which we will use to add new to-do items.

    Adding the FAB is as easy as passing the floatingActionButton parameter to the Scaffold widget.

    floatingActionButton: FloatingActionButton(
      child: Icon(Icons.add),                                /// uses the built-in icons
      onPressed: () => _promptDialog(context),
    ),

    Then declare a function inside TodoListState that displays a popup (AlertDialog) with a text input box.

    /// display a dialog that accepts text
      _promptDialog(BuildContext context) {
        String _todoLabel = '';
        return showDialog(
            context: context,
            builder: (context) {
              return AlertDialog(
                title: Text('Enter TODO item'),
                content: TextField(
                    onChanged: (value) => _todoLabel = value,
                    decoration: InputDecoration(hintText: 'Add new TODO item')),
                actions: <Widget>[
                  FlatButton(
                    child: new Text('CANCEL'),
                    onPressed: () => Navigator.of(context).pop(),
                  ),
                  FlatButton(
                    child: new Text('ADD'),
                    onPressed: () {
                      setState(() => todos.add(Todo(_todoLabel, false)));
                      /// dismisses the alert dialog
                      Navigator.of(context).pop();
                    },
                  )
                ],
              );
            });
      }

    At this point, saving changes to the file should result in the application getting updated on the virtual device (hot reload), so we can just click on the new floating action button that appeared on the bottom right of the screen and start testing how the dialog looks.

    We used a few more built-in widgets here:

    • AlertDialog: a dialog prompt that opens up when clicking on the FAB
    • TextField: text input field for accepting user input
    • InputDecoration: a widget that adds style to the input field
    • FlatButton: a variation of button with no border or shadow
    • FloatingActionButton: a floating icon button, used to trigger primary action on the screen

    Here’s a quick preview of how the application should look and function at this point:

    And just like that, in less than 100 lines of code, we’ve built the user interface of a simple, cross-platform to-do application.

    The source code for this application is available here.

    A few links to further explore Flutter:

    Conclusion:

    To conclude, Flutter is an extremely powerful toolkit for building cross-platform applications that have native performance and are beautiful to look at. Dart, the language behind Flutter, is designed with the nuances of user-interface development in mind, and Flutter offers a wide range of built-in widgets. This makes development fun and development cycles shorter, something we experienced while building the to-do app. With Flutter, time to market is also greatly reduced, which enables teams to experiment more often, collect more feedback, and ship applications faster. And finally, Flutter has a very enthusiastic and thriving community of designers and developers who are always experimenting and adding to the Flutter ecosystem.

  • How to Test the Performance of Flutter Apps – A Step-by-step Guide

    The rendering performance of a mobile app is crucial in ensuring a smooth and delightful user experience. 

    We will explore various ways of measuring the rendering performance of a mobile application, automate this process, and understand the intricacies of rendering performance metrics.

    Internals of Flutter and the Default Performance

    Before we begin exploring performance pitfalls and optimizations, we first need to understand the default performance of a basic hello-world Flutter app. Flutter apps are already highly optimized for speed and are known to perform better than existing cross-platform application development platforms, such as React Native or Apache Cordova.

    By default, Flutter apps aim to render at 60 frames per second on most devices and up to 120 frames per second on devices that support a 120 Hz refresh rate. This is made possible because of Flutter’s unique rendering mechanism.

    Flutter doesn’t render UI components the way a traditional mobile application framework does, by composing native widgets on the screen. Instead, it uses a high-performance graphics engine, Skia, and renders all components on the screen as if they were part of a single two-dimensional scene.

    Skia is a highly optimized, two-dimensional graphics engine used by a variety of software, such as Google Chrome, Fuchsia, Chrome OS, and Flutter. This game-like rendering behavior gives Flutter an advantage over existing application SDKs when it comes to default performance.

    Common Performance Pitfalls:

    Now, let’s understand some common performance issues or pitfalls seen in mobile applications. Some of them are listed below: 

    • Latency introduced because of network and disk IO 
    • When heavy computations are done on the main UI thread
    • Frequent and unnecessary state updates
    • Jittery UX due to lack of progressive or lazy loading of images and assets
    • Unoptimized or very large assets can take a lot of time to render

    To identify and fix these performance bottlenecks, mobile apps can be instrumented and analyzed for time and/or space complexity.

    Most of these issues can be identified by profiling. Profiling an app means dynamically analyzing the application’s code, in a runtime environment, for CPU and memory usage, and sometimes for the usage of other resources such as network and battery. Performance profiling entails analyzing CPU usage for time complexity to identify parts of the application where CPU usage is high and beyond a certain threshold. Let’s see how profiling works in the Flutter ecosystem.

    How to Profile a Flutter App

    Below are a set of steps that you may follow along to set up profiling on a Flutter app.

    1. Launch the application in profile mode. To do so, we can run the app using the command
    flutter run --profile

    on the terminal, or set up a launch configuration for the IDE or code editor. Testing the performance of Flutter apps in profile mode, and not in debug (dev) mode, ensures that the true release performance of the application is assessed. Dev mode runs additional pieces of code that aren’t part of release builds.

    2. Some developers may need to activate Flutter ‘devtools’ by executing this command:
    flutter pub global activate devtools

    3. To set up a ‘Profile’ launch configuration for the Flutter app in VS Code, edit or create the file at project_directory/.vscode/launch.json and add a “Profile” configuration as follows:
    {
     "version": "0.2.0",
     "configurations": [
       {
         "name": "Development",
         "request": "launch",
         "type": "dart"
       },
       {
         "name": "Profile",
         "request": "launch",
         "type": "dart",
         "flutterMode": "profile"
       }
     ]
    }

    4. Once the application is running on a real device, go to the timeline view of the DevTools and enable the performance overlay. This lets developers see two graphs stacked on top of each other and overlaid on the application. The top graph represents the raster-thread timeline, and the graph below it represents the UI-thread timeline.

    ⚠️ Caution: It is recommended that performance profiling of a Flutter application should only be done on a real device and not on any simulator or emulator. Simulators are not an exact representation of a real device when it comes to hardware and software capabilities, disk IO latency, display refresh rate, etc. Furthermore, the profiling is best done on the slowest, oldest device that the application targets. This ensures that the application is well-tested for performance pitfalls on target platforms and will offer a smooth user experience to end-users.

    Understanding the Performance Overlays

    Once the timeline view is enabled in profile mode, the application’s running instance gets an overlay on the top area. This overlay has two charts on top of each other.

    Both charts display timeline metrics 300 frames at a time. Any frame going over the horizontal black lines on the chart is taking more than 16 milliseconds to render (the per-frame budget at 60 fps, since 1000 ms / 60 frames ≈ 16.7 ms), which leads to a dropped frame and, eventually, a jittery user experience.

    Fig:- Dart profiler for optimal rendering

    Look at the timeline above. No frames are going over the black lines,  i.e., no frame takes more than 16 milliseconds to render. This represents an optimal rendering with no frame drops, i.e., no jank for end users.

    Fig:- Dart profiler for suboptimal rendering

    Here, some frames in the timeline above are going over the horizontal black lines, i.e., some frames are taking more than 16 milliseconds to render. That is because the application was trying to load an image from the network while the user was also scrolling through the page. This means there is some performance bottleneck in this part of the application, which can be further optimized to ensure smoother rendering, i.e., a jank-free end-user experience.

    The two graphs mentioned above can be described as:

    1. UI thread: This is the lower chart, and it portrays the timeline view of all Dart code execution. Instructions written by developers are executed on this thread, and a layer tree (for rendering) is created here, which is then sent to the raster thread for rendering.
    2. Raster thread: This is the upper chart. The raster thread runs the Skia engine, talks to the GPU, and is responsible for drawing the screen’s layer tree. Developers cannot directly instruct the raster thread. Most performance optimizations apply to the UI thread, because the raster thread is already optimized by the Flutter team.

    Automatically Testing for Jank:

    Profiling the app gives some idea of which screens and user interactions may be optimized for performance, but it doesn’t give a concrete, reproducible assessment. So, let’s write some code to automate the process of profiling and detecting sources of jank in our Flutter app.

    First, include and enable the Flutter driver extension in the application’s main entry-point file. In most cases, this file is called main.dart and invokes the runApp() method.

    import 'package:flutter_driver/driver_extension.dart';
     
    void main() {
      enableFlutterDriverExtension();
      runApp(MyApp());
    }

    Next, let’s write a Flutter driver script to drive parts of the application that need to be profiled. Any and all user behavior such as navigation, taps, scroll, multipoint touches, and gestures can be simulated by a driver script.

    To measure the app’s rendering performance, we will make sure that we drive and test parts of the application exactly as a user would, i.e., we need to test interactions like taps and scrolls, and transitions like page changes and back navigation. Flutter driver makes this simple by providing a rich set of methods such as find(), tap(), scroll(), etc.

    The driver script will also have to account for and mock any sources of latency, such as time taken during API calls or while reading a file from the local file system.

    We also need to run these automated tests multiple times to draw conclusions from average render times.

    The following test driver script checks for a simple user interaction:

    • Launches the app
    • Waits for a list of items
    • Finds and clicks on the first list item, which takes users to a different page
    • Views some information on the page
    • Presses the back button to go back to the list

    The script also does the following:

    • Tracks time taken during each user interaction by wrapping interactions inside the driver.traceAction() method
    • Records and writes the UI thread and the raster thread timelines to a file (ui_timeline.timeline.json)
    import 'package:flutter_driver/flutter_driver.dart';
    import 'package:test/test.dart';

    void main() {
      group('App name - home', () {
        FlutterDriver driver;

        setUpAll(() async {
          driver = await FlutterDriver.connect();
        });

        tearDownAll(() async {
          if (driver != null) {
            await driver.close();
          }
        });

        test('list has row items', () async {
          final timeline = await driver.traceAction(() async {
            // wait for list items
            await driver.waitFor(find.byValueKey('placesList'));

            // get the first row in the list
            final firstRow = find.descendant(
                of: find.byValueKey('placesList'),
                matching: find.byType('PlaceRow'),
                firstMatchOnly: true);

            // tap on the first row
            await driver.tap(firstRow);

            // wait for place details
            await driver.waitFor(find.byValueKey('placeDetails'));

            // go back to the list
            await driver.tap(find.byTooltip('Back'));
          });

          // write the summary and the raw timeline to files
          final summary = TimelineSummary.summarize(timeline);
          await summary.writeSummaryToFile('ui_timeline', pretty: true);
          await summary.writeTimelineToFile('ui_timeline', pretty: true);
        });
      });
    }

    To run the script, the following command can be executed on the terminal:

    flutter drive -t lib/main.dart --driver test_driver/main_test.dart --profile

    The test driver creates a release-like (profile-mode) app bundle that is installed on the target device and driven by the driver script. It is recommended to run this test on a real device, preferably the slowest device the app targets.

    Once the script finishes execution, two JSON files are written to the build directory.

    ./build/ui_timeline.timeline_summary.json
    ./build/ui_timeline.timeline.json

    Viewing the Results:

    Launch the Google Chrome web browser and go to URL: chrome://tracing. Click on the load button on the top left and load the file ui_timeline.timeline.json.

    The timeline summary when loaded into the tracing tool can be used to walk through the hierarchical timeline of the application and exposes various metrics, such as CPU duration, start time, etc., to better understand sources of performance issues in the app. The tracing tool is versatile and displays methods invoked under the hood in a hierarchical view that can be navigated through by mouse or by pressing A, S, D, F keys. 

    Fig: Chrome tracing in action

    The other file, i.e., the timeline_summary file, can be opened in a code editor and inspected for performance data. It provides a set of metrics related to the performance of the application. For example, the flutter_driver script above outputs the following timeline on a single run:

    {
     "average_frame_build_time_millis": 1.6940195121951216,
     "90th_percentile_frame_build_time_millis": 2.678,
     "99th_percentile_frame_build_time_millis": 7.538,
     "worst_frame_build_time_millis": 14.687,
     "missed_frame_build_budget_count": 0,
     "average_frame_rasterizer_time_millis": 6.147395121951226,
     "90th_percentile_frame_rasterizer_time_millis": 9.029,
     "99th_percentile_frame_rasterizer_time_millis": 15.961,
     "worst_frame_rasterizer_time_millis": 21.476,
     "missed_frame_rasterizer_budget_count": 2,
     "frame_count": 205,
     "frame_rasterizer_count": 205,
     "average_vsync_transitions_missed": 1.5,
     "90th_percentile_vsync_transitions_missed": 2.0,
     "99th_percentile_vsync_transitions_missed": 2.0
    }

    Each of these metrics can be inspected, analyzed, and optimized. For example, the value of average_frame_build_time_millis should always stay below 16 milliseconds — the per-frame budget needed to run at 60 frames per second.
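    Such checks can also be automated, e.g., as a CI gate on every run of the driver script. A minimal Python sketch (the parsing logic and thresholds are our own, not part of Flutter's tooling; in CI the dict would come from json.load on build/ui_timeline.timeline_summary.json):

    ```python
    FRAME_BUDGET_MS = 16.0  # per-frame budget at 60 fps

    def check_frame_budget(summary):
        """Return human-readable budget violations from a timeline_summary dict."""
        violations = []
        if summary["average_frame_build_time_millis"] > FRAME_BUDGET_MS:
            violations.append("average frame build time exceeds the 16 ms budget")
        if summary["missed_frame_build_budget_count"] > 0:
            violations.append("some frames missed the build budget")
        if summary["missed_frame_rasterizer_budget_count"] > 0:
            violations.append("some frames missed the rasterizer budget")
        return violations

    # Reusing the numbers from the run above instead of reading the JSON file.
    metrics = {
        "average_frame_build_time_millis": 1.6940195121951216,
        "missed_frame_build_budget_count": 0,
        "missed_frame_rasterizer_budget_count": 2,
    }
    print(check_frame_budget(metrics))
    ```

    A CI job can fail the build whenever the returned list is non-empty, which catches regressions like the two missed rasterizer frames above.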

    More details about each of these fields can be found here.

    Conclusion

    In this blog post, we explored how to profile and measure the performance of a Flutter application. We also explored ways to identify and fix performance pitfalls, if any.

    We then created a Flutter driver script to automate performance testing of Flutter apps and produce a summary of rendering timelines as well as various performance metrics such as average_frame_build_time_millis.

    The automated performance tests ensure that the app is tested for performance in a reproducible way against different devices and can be run multiple times to calculate a running average and draw various insights. These metrics can be objectively looked at to measure the performance of an application and fix any bottlenecks in the application.

    A performant app means faster rendering and optimal resource utilization, which is essential to ensuring a jank-free and smooth user experience. It also contributes greatly to an app’s popularity. Do try profiling and analyzing the performance of some of your Flutter apps!

    Related Articles

    1. A Primer To Flutter

    2. Building High-performance Apps: A Checklist To Get It Right

  • Parallelizing Heavy Read and Write Queries to SQL Datastores using Spark and more!

    The amount of data in our world is exploding day by day, and processing and analyzing this Big Data has become key to making informed, data-driven decisions. Spark is a unified, distributed data processing engine built for this task. It lets you process Big Data faster by splitting the work into chunks and assigning those chunks to computation resources across nodes, and it can handle up to petabytes of data — millions of gigabytes. Because it processes data in memory, it is fast.

    We talked about processing Big Data in Spark, but Spark doesn’t store any data itself the way file systems do. So, to process data in Spark, we must read data from different data sources, clean or transform it, and store the result in one of the target data sources. Data sources can be files, APIs, databases, or streams.

    Database management systems have been around for decades. Many applications generate huge amounts of data and store it in database management systems. And a lot of the time, we need to connect Spark to the database and process that data.

    In this blog, we are going to discuss how to use Spark to read from and write to databases in parallel. Our focus will be on reading and writing data using different methods that help us handle terabytes of data efficiently.

    Reading/Writing Data from/to a Database Using Spark:

    To read data or write data from/to the database, we will need to perform a few basic steps regardless of any programming language or framework we are using. What follows is an overview of the steps to read data from databases.

    Step 1: Register Driver or Use Connector

    Get the respective driver of your database and register the driver, or use the connector to connect to the database.

    Step 2: Make a connection

    Next, the driver or connector makes a connection to the database.

    Step 3: Run query statement

    Using the connection created in the previous step, execute the query, which will return the result.

    Step 4: Process result

    Process the result obtained in the previous step as per your requirements.

    Step 5: Close the connection

    Finally, close the cursor and the connection so that database resources are released.
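    The five steps above can be sketched generically in Python. Here, sqlite3 (Python’s built-in driver) stands in for whichever database and driver you actually use, and the table and values are made up for illustration:

    ```python
    import sqlite3  # stand-in driver; for Postgres you would use psycopg2, etc.

    # Steps 1 & 2: the driver module makes a connection to the database
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Step 3: run query statements through the connection's cursor
    cur.execute("CREATE TABLE covid_data (state TEXT, confirmed INTEGER)")
    cur.execute("INSERT INTO covid_data VALUES ('Goa', 10)")
    cur.execute("SELECT state, confirmed FROM covid_data")

    # Step 4: process the result as required
    rows = cur.fetchall()
    print(rows)  # [('Goa', 10)]

    # Step 5: close the cursor and connection to free database resources
    cur.close()
    conn.close()
    ```

    Every driver-based approach in the rest of this post — psycopg2, Spark’s JDBC source — follows this same shape under the hood.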

    Dataset we are using:

    Covid data

    This dataset contains details of COVID patients across all states. It has different information such as State, Confirmed, Recovered, Deceased, Other, Tested, and Date.

    You can load this dataset in any of the databases you work with and can try out the entire discussion practically.

    The following image shows ten records of the entire dataset.

    Single-partition Spark program:

    from pyspark.sql import SparkSession

    ## Creating a Spark session and adding the Postgres driver to Spark.
    spark_session = SparkSession.builder \
        .master("local") \
        .appName("Databases") \
        .config("spark.jars.packages", "org.postgresql:postgresql:42.2.8") \
        .getOrCreate()

    hostname = "localhost"
    jdbc_port = 5432
    dbname = "aniket"
    username = "postgres"
    password = "pass@123"

    jdbc_url = "jdbc:postgresql://{0}:{1}/{2}".format(hostname, jdbc_port, dbname)

    ## reading data
    table_data_df = spark_session.read \
        .format("jdbc") \
        .option("url", jdbc_url) \
        .option("dbtable", "aniket") \
        .option("user", username) \
        .option("password", password) \
        .option("driver", "org.postgresql.Driver") \
        .load()

    ## writing data
    table_data_df.write \
        .format("jdbc") \
        .option("url", jdbc_url) \
        .option("dbtable", "spark_schema.zipcode_table") \
        .option("user", username) \
        .option("password", password) \
        .option("driver", "org.postgresql.Driver") \
        .save()

    Spark provides an API to read data from a database, and it is very simple to use. First, we create a Spark session and add the JDBC driver to Spark. The driver can be added through the program itself, or via the shell.

    The first line of code imports the SparkSession class, the entry point to programming Spark with the Dataset and DataFrame API.

    In the session-creation block, we create a Spark session with a local master, which will be used for interaction with our Spark application. We specify the name for our application using appName(), which in our case is ‘Databases’; this app name is shown on the web UI for our cluster. Next, we can specify any configuration for the Spark application using config(). In our case, we configure the driver package for the Postgres database, which is used to create a connection to it. You can specify the driver of any of the available databases.

    To connect to the database, we must have a hostname, port, database name, username, and password. Those details come next and are combined into the JDBC URL.

    Now look at the reading block in the snippet above. At this point, we have our Spark session and all the information needed to connect to the database. Using the Spark read API, we read the data from the database. This creates a connection to Postgres from one of the cores allocated to the Spark application, and through that connection the data is read into the table_data_df dataframe. Even if we have multiple cores for our application, only one connection is created from one of the cores; the rest stay unutilized. We will discuss how to utilize all the cores shortly.

    The writing block works the same way. Now that we have the data, let’s write it to the database using the Spark write API. This also creates only one connection to the database from one of the allocated cores; even with more cores available, the code above uses only one.

    Output of Program:

    /usr/bin/python3.8 /home/aniketrajput/aniket_work/Spark/main.py
    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/home/aniketrajput/.local/lib/python3.8/site-packages/pyspark/jars/spark-unsafe_2.12-3.2.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
    WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release
    :: loading settings :: url = jar:file:/home/aniketrajput/.local/lib/python3.8/site-packages/pyspark/jars/ivy-2.5.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
    Ivy Default Cache set to: /home/aniketrajput/.ivy2/cache
    The jars for the packages stored in: /home/aniketrajput/.ivy2/jars
    org.postgresql#postgresql added as a dependency
    :: resolving dependencies :: org.apache.spark#spark-submit-parent-83662a49-0573-46c3-be8e-0a280f96c7d8;1.0
    	confs: [default]
    	found org.postgresql#postgresql;42.2.8 in central
    :: resolution report :: resolve 113ms :: artifacts dl 3ms
    	:: modules in use:
    	org.postgresql#postgresql;42.2.8 from central in [default]
    	---------------------------------------------------------------------
    	|                  |            modules            ||   artifacts   |
    	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    	---------------------------------------------------------------------
    	|      default     |   1   |   0   |   0   |   0   ||   1   |   0   |
    	---------------------------------------------------------------------
    :: retrieving :: org.apache.spark#spark-submit-parent-83662a49-0573-46c3-be8e-0a280f96c7d8
    	confs: [default]
    	0 artifacts copied, 1 already retrieved (0kB/5ms)
    22/04/22 11:55:33 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    +-------------+-----------------+---------+---------+--------+-----+------+----------+
    |        state|         district|confirmed|recovered|deceased|other|tested|      date|
    +-------------+-----------------+---------+---------+--------+-----+------+----------+
    |Uttar Pradesh|         Varanasi|    23512|    23010|     456|    0|595510|2021-02-24|
    |  Uttarakhand|           Almora|     3259|     3081|      25|  127| 84443|2021-02-24|
    |  Uttarakhand|        Bageshwar|     1534|     1488|      17|   26| 55626|2021-02-24|
    |  Uttarakhand|          Chamoli|     3486|     3373|      15|   88| 90390|2021-02-24|
    |  Uttarakhand|        Champawat|     1819|     1790|       9|    7| 95068|2021-02-24|
    |  Uttarakhand|         Dehradun|    29619|    28152|     962|  439|401496|2021-02-24|
    |  Uttarakhand|         Haridwar|    14137|    13697|     158|  175|369542|2021-02-24|
    |  Uttarakhand|         Nainital|    12636|    12254|     237|   79|204422|2021-02-24|
    |  Uttarakhand|    Pauri Garhwal|     5145|     5033|      60|   24|138878|2021-02-24|
    |  Uttarakhand|      Pithoragarh|     3361|     3291|      47|   11| 72686|2021-02-24|
    |  Uttarakhand|      Rudraprayag|     2270|     2251|      10|    7| 52378|2021-02-24|
    |  Uttarakhand|    Tehri Garhwal|     4227|     4026|      16|  170|105111|2021-02-24|
    |  Uttarakhand|Udham Singh Nagar|    11538|    11267|     117|  123|337292|2021-02-24|
    |  Uttarakhand|       Uttarkashi|     3789|     3645|      17|  118|120026|2021-02-24|
    |  West Bengal|       Alipurduar|     7705|     7616|      86|    0|  null|2021-02-24|
    |  West Bengal|          Bankura|    11940|    11788|      92|    0|  null|2021-02-24|
    |  West Bengal|          Birbhum|    10035|     9876|      89|    0|  null|2021-02-24|
    |  West Bengal|      Cooch Behar|    11835|    11756|      72|    0|  null|2021-02-24|
    |  West Bengal| Dakshin Dinajpur|     8179|     8099|      74|    0|  null|2021-02-24|
    |  West Bengal|       Darjeeling|    18423|    18155|     203|    0|  null|2021-02-24|
    +-------------+-----------------+---------+---------+--------+-----+------+----------+
    only showing top 20 rows
    
    
    Process finished with exit code 0

    Multi-partition Spark program:

    from pyspark.sql import SparkSession

    ## Creating a Spark session and adding the Postgres driver to Spark.
    spark_session = SparkSession.builder \
        .master("local[4]") \
        .appName("Databases") \
        .config("spark.jars.packages", "org.postgresql:postgresql:42.2.8") \
        .getOrCreate()

    hostname = "localhost"
    jdbc_port = 5432
    dbname = "postgres"
    username = "postgres"
    password = "pass@123"

    jdbc_url = "jdbc:postgresql://{0}:{1}/{2}".format(hostname, jdbc_port, dbname)

    partition_column = 'date'
    lower_bound = '2021-02-20'
    upper_bound = '2021-02-28'
    num_partitions = 4

    ## reading data
    table_data_df = spark_session.read \
        .format("jdbc") \
        .option("url", jdbc_url) \
        .option("dbtable", "covid_data") \
        .option("user", username) \
        .option("password", password) \
        .option("driver", "org.postgresql.Driver") \
        .option("partitionColumn", partition_column) \
        .option("lowerBound", lower_bound) \
        .option("upperBound", upper_bound) \
        .option("numPartitions", num_partitions) \
        .load()

    table_data_df.show()

    ## writing data
    table_data_df.write \
        .format("jdbc") \
        .option("url", jdbc_url) \
        .option("dbtable", "covid_data_output") \
        .option("user", username) \
        .option("password", password) \
        .option("driver", "org.postgresql.Driver") \
        .option("numPartitions", num_partitions) \
        .save()

    As promised in the last section, let’s discuss how we can optimize resource utilization. Previously, we had only one connection, utilizing very limited resources and leaving the rest idle. To get past this, the Spark read and write APIs accept a few extra options: partitionColumn, lowerBound, and upperBound. These options must all be specified if any of them is specified, and numPartitions must be specified as well. Together, they describe how to partition the table when reading in parallel from multiple workers. Each partition gets an individual core with its own connection performing the reads or writes, making the database operations parallel.

    This is a much more efficient way of reading and writing data from databases in Spark than doing it with one partition.

    Partitions are decided by the Spark API in the following way.

    Let’s consider an example where:

    lowerBound: 0

    upperBound: 1000

    numPartitions: 10

    The stride is (1000 − 0) / 10 = 100, and the partitions correspond to the following queries:

    SELECT * FROM table WHERE partitionColumn < 100 OR partitionColumn IS NULL

    SELECT * FROM table WHERE partitionColumn >= 100 AND partitionColumn < 200

    ...

    SELECT * FROM table WHERE partitionColumn >= 900

    Each range is inclusive on its lower bound and exclusive on its upper bound, and the first and last partitions are left open-ended so that rows outside [lowerBound, upperBound) are not dropped.

    Now we have data in multiple partitions. Each executor can own one or more partitions, depending on the cluster configuration. Suppose we have 10 cores and 10 partitions: one partition of data can be fetched by one core, so all 10 partitions can be fetched in parallel. Each of these tasks creates its own connection to the database and reads its slice of the data.

    Note: lowerBound and upperBound do not filter the data. They only help Spark decide the stride.

         partitionColumn must be a numeric, date, or timestamp column from the table.
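    The way Spark derives these per-partition ranges can be sketched in plain Python — a simplified model of the stride logic for a numeric column, not Spark’s actual source:

    ```python
    def partition_ranges(lower_bound, upper_bound, num_partitions):
        """Mimic how Spark's JDBC source splits [lower_bound, upper_bound)
        into num_partitions strides over partitionColumn."""
        stride = (upper_bound - lower_bound) // num_partitions
        clauses = []
        for i in range(num_partitions):
            start = lower_bound + i * stride
            if i == 0:
                # first partition is open-ended below and catches NULLs
                clauses.append(f"partitionColumn < {start + stride} OR partitionColumn IS NULL")
            elif i == num_partitions - 1:
                # last partition is open-ended above
                clauses.append(f"partitionColumn >= {start}")
            else:
                clauses.append(f"partitionColumn >= {start} AND partitionColumn < {start + stride}")
        return clauses

    for clause in partition_ranges(0, 1000, 10):
        print("SELECT * FROM table WHERE", clause)
    ```

    Running this with the example bounds reproduces the ten queries sketched above, one per connection.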

    There are also attributes that can optimize the write operation. One is batchsize, the JDBC batch size, which determines how many rows to insert per round trip and can help the performance of JDBC drivers; this option applies only to writing. Another is truncate, a JDBC-writer option: when SaveMode.Overwrite is enabled, it causes Spark to truncate an existing table instead of dropping and recreating it. This can be more efficient and prevents the table metadata (e.g., indices) from being removed.

    Output of Program:

    /usr/bin/python3.8 /home/aniketrajput/aniket_work/Spark/main.py
    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/home/aniketrajput/.local/lib/python3.8/site-packages/pyspark/jars/spark-unsafe_2.12-3.2.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
    WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release
    :: loading settings :: url = jar:file:/home/aniketrajput/.local/lib/python3.8/site-packages/pyspark/jars/ivy-2.5.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
    Ivy Default Cache set to: /home/aniketrajput/.ivy2/cache
    The jars for the packages stored in: /home/aniketrajput/.ivy2/jars
    org.postgresql#postgresql added as a dependency
    :: resolving dependencies :: org.apache.spark#spark-submit-parent-8047b5cc-11e8-4efb-8a38-70edab0d5404;1.0
    	confs: [default]
    	found org.postgresql#postgresql;42.2.8 in central
    :: resolution report :: resolve 104ms :: artifacts dl 3ms
    	:: modules in use:
    	org.postgresql#postgresql;42.2.8 from central in [default]
    	---------------------------------------------------------------------
    	|                  |            modules            ||   artifacts   |
    	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    	---------------------------------------------------------------------
    	|      default     |   1   |   0   |   0   |   0   ||   1   |   0   |
    	---------------------------------------------------------------------
    :: retrieving :: org.apache.spark#spark-submit-parent-8047b5cc-11e8-4efb-8a38-70edab0d5404
    	confs: [default]
    	0 artifacts copied, 1 already retrieved (0kB/4ms)
    22/04/22 12:20:32 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    +-------------+-----------------+---------+---------+--------+-----+------+----------+
    |        state|         district|confirmed|recovered|deceased|other|tested|      date|
    +-------------+-----------------+---------+---------+--------+-----+------+----------+
    |Uttar Pradesh|         Varanasi|    23512|    23010|     456|    0|595510|2021-02-24|
    |  Uttarakhand|           Almora|     3259|     3081|      25|  127| 84443|2021-02-24|
    |  Uttarakhand|        Bageshwar|     1534|     1488|      17|   26| 55626|2021-02-24|
    |  Uttarakhand|          Chamoli|     3486|     3373|      15|   88| 90390|2021-02-24|
    |  Uttarakhand|        Champawat|     1819|     1790|       9|    7| 95068|2021-02-24|
    |  Uttarakhand|         Dehradun|    29619|    28152|     962|  439|401496|2021-02-24|
    |  Uttarakhand|         Haridwar|    14137|    13697|     158|  175|369542|2021-02-24|
    |  Uttarakhand|         Nainital|    12636|    12254|     237|   79|204422|2021-02-24|
    |  Uttarakhand|    Pauri Garhwal|     5145|     5033|      60|   24|138878|2021-02-24|
    |  Uttarakhand|      Pithoragarh|     3361|     3291|      47|   11| 72686|2021-02-24|
    |  Uttarakhand|      Rudraprayag|     2270|     2251|      10|    7| 52378|2021-02-24|
    |  Uttarakhand|    Tehri Garhwal|     4227|     4026|      16|  170|105111|2021-02-24|
    |  Uttarakhand|Udham Singh Nagar|    11538|    11267|     117|  123|337292|2021-02-24|
    |  Uttarakhand|       Uttarkashi|     3789|     3645|      17|  118|120026|2021-02-24|
    |  West Bengal|       Alipurduar|     7705|     7616|      86|    0|  null|2021-02-24|
    |  West Bengal|          Bankura|    11940|    11788|      92|    0|  null|2021-02-24|
    |  West Bengal|          Birbhum|    10035|     9876|      89|    0|  null|2021-02-24|
    |  West Bengal|      Cooch Behar|    11835|    11756|      72|    0|  null|2021-02-24|
    |  West Bengal| Dakshin Dinajpur|     8179|     8099|      74|    0|  null|2021-02-24|
    |  West Bengal|       Darjeeling|    18423|    18155|     203|    0|  null|2021-02-24|
    +-------------+-----------------+---------+---------+--------+-----+------+----------+
    only showing top 20 rows
    
    
    Process finished with exit code 0

    We have seen how to read and write data in Spark. But Spark is not the only way to connect to databases, right? There are multiple ways to access databases and achieve parallel reads and writes. We will discuss those in the following sections, focusing mainly on reading and writing from Python.

    Single-thread Python Program:

    import traceback
    import psycopg2
    import pandas as pd
    
    class PostgresDbClient:
        def __init__(self, postgres_hostname, postgres_jdbcport, postgres_dbname, username, password):
            self.db_host = postgres_hostname
            self.db_port = postgres_jdbcport
            self.db_name = postgres_dbname
            self.db_user = username
            self.db_pass = password
    
        def create_conn(self):
            conn = None
            try:
                print('Connecting to the Postgres database...')
                conn = psycopg2.connect("dbname={} user={} host={} password={} port={}".format(self.db_name, self.db_user, self.db_host, self.db_pass, self.db_port))
                print('Successfully connected to the Postgres database...')
            except Exception as e:
                print("Cannot connect to Postgres.")
                print(f'Error: {str(e)}\nTrace: {traceback.format_exc()}')
            return conn
    
        def read(self, query):
            try:
                conn = self.create_conn()
                cursor = conn.cursor()
                print(f"Reading data !!!")
                cursor.execute(query)
                data = cursor.fetchall()
                print(f"Read Data !!!")
                cursor.close()
                conn.close()
                return data
            except Exception as e:
                print(f'Error: {str(e)}\nTrace: {traceback.format_exc()}')
    
    
    if __name__ == "__main__":
        hostname = "localhost"
        jdbc_port = 5432
        dbname = "postgres"
        username = "postgres"
        password = "pass@123"
        table_name = "covid_data"
        query = f"select * from {table_name}"
    
        db_client = PostgresDbClient(postgres_hostname=hostname, postgres_jdbcport=jdbc_port, postgres_dbname=dbname, username=username,password=password)
        data = pd.DataFrame(db_client.read(query))
        print(data)

    To integrate Postgres with Python, there are different libraries or adapters we can use, but Psycopg is the most widely used one. First of all, you will need to install the psycopg2 library (Psycopg2 is the successor to the original Psycopg adapter). You can install it using pip or any other way you are comfortable with.

    To connect to the Postgres database, we need a hostname, port, database name, username, and password; we store all of these details as attributes of the class. The create_conn method forms a connection to Postgres using the connect() method of the psycopg2 module and returns a connection object.

    In the read method, we call create_conn to get a connection object and use it to create a cursor. The cursor is bound to the connection for its lifetime and executes all commands or queries against the database. We execute a read query through the cursor, fetch the returned data using the fetchall() method, and then close the cursor and the connection.

    To run the program, we specify the database details and the query, create a PostgresDbClient object, and call its read method. The read method returns the data, which we convert into tabular form using pandas.
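    One detail worth noting: fetchall() returns plain tuples, so the resulting DataFrame gets numeric column headers (0 through 7 in the program output). The cursor’s description attribute carries the column names and can be used to label the DataFrame instead. A small sketch using Python’s built-in sqlite3 driver with a made-up table (psycopg2 cursors expose the same description attribute):

    ```python
    import sqlite3

    # Hypothetical in-memory table standing in for the Postgres covid_data table.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE covid_data (state TEXT, district TEXT, confirmed INTEGER)")
    cur.execute("INSERT INTO covid_data VALUES ('Bihar', 'Araria', 1)")

    cur.execute("SELECT * FROM covid_data")
    rows = cur.fetchall()
    # Each entry of cursor.description is a 7-tuple whose first field is the column name.
    cols = [d[0] for d in cur.description]
    print(cols)  # ['state', 'district', 'confirmed']

    # With pandas, pd.DataFrame(rows, columns=cols) would then give named
    # columns instead of 0..7.
    conn.close()
    ```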

    This implementation is very straightforward: the program runs as one process in our system and fetches all the data using the system’s resources — CPU, memory, etc. The drawback of this approach is that if the program uses, say, 30 percent of the CPU and memory, the remaining 70 percent sits idle. We can maximize usage by other means, such as multithreading or multiprocessing.

    Output of Program:

    Connecting to the Postgres database...
    Successfully connected to the Postgres database...
    Reading data !!!
    Read Data !!!
                                  0              1    2   3  4  5     6           7
    0   Andaman and Nicobar Islands        Unknown   33  11  0  0  None  2020-04-26
    1                Andhra Pradesh      Anantapur   53  14  4  0  None  2020-04-26
    2                Andhra Pradesh       Chittoor   73  13  0  0  None  2020-04-26
    3                Andhra Pradesh  East Godavari   39  12  0  0  None  2020-04-26
    4                Andhra Pradesh         Guntur  214  29  8  0  None  2020-04-26
    ..                          ...            ...  ...  .. .. ..   ...         ...
    95                        Bihar         Araria    1   0  0  0  None  2020-04-30
    96                        Bihar          Arwal    4   0  0  0  None  2020-04-30
    97                        Bihar     Aurangabad    8   0  0  0  None  2020-04-30
    98                        Bihar          Banka    3   0  0  0  None  2020-04-30
    99                        Bihar      Begusarai   11   5  0  0  None  2020-04-30
    
    [100 rows x 8 columns]
    
    Process finished with exit code 0

    Multi-thread Python Program:

    In the previous section, we discussed the drawbacks of a single-process, single-thread implementation. Let’s get started with how to maximize resource usage. Before getting into multithreading, let’s understand a few basic but important concepts.

    What is a process?

    When you execute any program, the operating system loads it in memory and then starts executing the program. This instance of the program being executed is called a process. Computing and memory resources are associated with each process separately.

    What is a thread?

    A thread is a sequential flow of execution. Every process has at least one thread, usually called the main thread. Unlike separate processes, multiple threads can share the same computing and memory resources.

    What is multithreading?

    This is when a process has multiple threads, alongside the main thread, running independently but concurrently while sharing the computing and memory resources associated with the process. Such a program is called a multithreaded program. Multithreading uses resources efficiently, which helps maximize performance.

    What is multiprocessing?

    When multiple processes run independently, with separate resources associated with each process, it is called multiprocessing. Multiprocessing is typically achieved with multiple processors, each running a separate process.

    Let’s get back to our program. We again have a connection method and a read method, exactly the same as in the previous section. What’s new is the get_thread() function. Note carefully that it is a function, not a method: get_thread() is global and is not part of the class. It acts as a wrapper for calling the read method of PostgresDbClient, so that the thread target is a plain function rather than a class method. Don’t worry if that seems odd; it is just how the code is structured.

    To run the program, we have specified the Postgres database details and queries. In the previous approach, we fetched all the data from the table with one thread only. In this approach, the plan is to fetch one day of data per thread so that we can maximize resource utilization. Each query reads one day’s worth of data from the table, so with 5 queries, 5 threads run concurrently and fetch 5 days of data.
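
    The per-day partitioning described above could also be generated programmatically. This helper is a hypothetical sketch (the function name and signature are not part of the original program):

```python
from datetime import date, timedelta

def day_queries(table, column, start, days):
    # Build one single-day query per thread, covering `days`
    # consecutive dates starting from `start`.
    return [
        f"select * from {table} where {column} = '{start + timedelta(days=i)}'"
        for i in range(days)
    ]

queries = day_queries("covid_data", "date", date(2020, 4, 26), 5)
```

    Each resulting query targets one date between 2020-04-26 and 2020-04-30, mirroring the five hand-written queries below.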

    To create a thread in Python, we use the Thread() class from the threading module. We pass in the function we want to run along with its arguments. Thread() creates a new thread and returns its object, but the thread does not start running until we call its start() method. In our program, we start 5 threads. If you execute this program multiple times, you will get the output in a different order each time: some data is fetched earlier, some later, and on the next execution, the order changes again. This is because thread scheduling is handled by the operating system; the output order depends on which thread the OS gives resources to and when. Understanding exactly how this works requires going deep into operating systems concepts.

    In our use case, we are just printing the data to the console. There are multiple ways to store the data instead. One simple way is to define a global variable and store the result in it, but then we need synchronization, since multiple threads accessing the same global variable can lead to race conditions. Another way is to extend the Thread class into a custom class and define a variable on it to save the data. Again, wherever state is shared between threads, you will need to make sure access is synchronized.
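
    As a sketch of the second option, here is a hypothetical Thread subclass (not from the original program). Storing the result on the instance, rather than on a shared variable, means each thread owns its own data, and the main thread reads the results only after join():

```python
import threading

class ReaderThread(threading.Thread):
    def __init__(self, query):
        super().__init__()
        self.query = query
        self.result = None  # per-instance, so no state is shared between threads

    def run(self):
        # Stand-in for db_client.read(self.query); a real implementation
        # would issue the query here.
        self.result = f"rows for {self.query}"

threads = [ReaderThread(q) for q in ("day-1", "day-2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
collected = [t.result for t in threads]
```

    Because each result lives on its own thread object, no lock is needed; the trade-off is that results are only available once every thread has finished.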

    So, whenever you store data in a shared variable, by any of the available methods, you need synchronization, and synchronization forces threads to take turns, pushing execution back toward sequential processing, which is not what we are looking for.
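
    A minimal sketch of such lock-based synchronization (illustrative names, not from the original program):

```python
import threading

rows = []
rows_lock = threading.Lock()

def collect(batch):
    # Only one thread may extend the shared list at a time; the others
    # block on the lock, which is exactly what serializes execution.
    with rows_lock:
        rows.extend(batch)

threads = [threading.Thread(target=collect, args=([i, i + 1],)) for i in (0, 10, 20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```
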
    To avoid synchronization, we can write the data directly to the target: the same thread that reads the data also writes it back to the target database. This way, we avoid shared state entirely and still persist the data for future use. The function can look as below, where db_client.write(data) is a function that writes the data to a database.

    def get_thread(thread_id, db_client, query):
        print(f"Starting thread id {thread_id}")
        data = pd.DataFrame(db_client.read(query))
        print(f"Thread {thread_id} data ", data, sep="\n")
        db_client.write(data)
        print(f"Stopping thread id {thread_id}")

    Python Program:

    import threading
    import traceback
    import psycopg2
    import pandas as pd
    
    class PostgresDbClient:
        def __init__(self, postgres_hostname, postgres_jdbcport, postgres_dbname, username, password):
            self.db_host = postgres_hostname
            self.db_port = postgres_jdbcport
            self.db_name = postgres_dbname
            self.db_user = username
            self.db_pass = password
    
        def create_conn(self):
            conn = None
            try:
                print('Connecting to the Postgres database...')
                conn = psycopg2.connect("dbname={} user={} host={} password={} port={}".format(self.db_name, self.db_user, self.db_host, self.db_pass, self.db_port))
                print('Successfully connected to the Postgres database...')
            except Exception as e:
                print("Cannot connect to Postgres.")
                print(f'Error: {str(e)}\nTrace: {traceback.format_exc()}')
            return conn
    
        def read(self, query):
            try:
                conn = self.create_conn()
                cursor = conn.cursor()
                print(f"Reading data !!!")
                cursor.execute(query)
                data = cursor.fetchall()
                print(f"Read Data !!!")
                cursor.close()
                conn.close()
                return data
            except Exception as e:
                print(f'Error: {str(e)}\nTrace: {traceback.format_exc()}')
    
    
    def get_thread(thread_id, db_client, query):
        print(f"Starting thread id {thread_id}")
        data = pd.DataFrame(db_client.read(query))
        print(f"Thread {thread_id} data ", data, sep="\n")
        print(f"Stopping thread id {thread_id}")
    
    
    if __name__ == "__main__":
        hostname = "localhost"
        jdbc_port = 5432
        dbname = "postgres"
        username = "postgres"
        password = "pass@123"
        table_name = "covid_data"
        query = f"select * from {table_name}"
    
        partition_column = 'date'
        lower_bound = '2020-04-26'
        upper_bound = '2020-04-30'
        num_partitions = 5
    
    
        # Each query fetches one day's worth of data, so each thread reads a distinct partition.
        query1 = f"select * from {table_name} where {partition_column} = '2020-04-26'"
        query2 = f"select * from {table_name} where {partition_column} = '2020-04-27'"
        query3 = f"select * from {table_name} where {partition_column} = '2020-04-28'"
        query4 = f"select * from {table_name} where {partition_column} = '2020-04-29'"
        query5 = f"select * from {table_name} where {partition_column} = '2020-04-30'"
    
        db_client = PostgresDbClient(postgres_hostname=hostname, postgres_jdbcport=jdbc_port, postgres_dbname=dbname, username=username, password=password)
        x1 = threading.Thread(target=get_thread, args=(1, db_client, query1))
        x1.start()
        x2 = threading.Thread(target=get_thread, args=(2, db_client, query2))
        x2.start()
        x3 = threading.Thread(target=get_thread, args=(3, db_client, query3))
        x3.start()
        x4 = threading.Thread(target=get_thread, args=(4, db_client, query4))
        x4.start()
        x5 = threading.Thread(target=get_thread, args=(5, db_client, query5))
        x5.start()
        # Wait for all threads to finish before the program exits.
        for t in (x1, x2, x3, x4, x5):
            t.join()

    Output of Program:

    Starting thread id 1
    Connecting to the Postgres database...
    Starting thread id 2
    Connecting to the Postgres database...
    Starting thread id 3
    Connecting to the Postgres database...
    Starting thread id 4
    Connecting to the Postgres database...
    Starting thread id 5
    Connecting to the Postgres database...
    Successfully connected to the Postgres database...Successfully connected to the Postgres database...Successfully connected to the Postgres database...
    Reading data !!!
    
    Reading data !!!
    
    Reading data !!!
    Successfully connected to the Postgres database...
    Reading data !!!
    Successfully connected to the Postgres database...
    Reading data !!!
    Read Data !!!
    Read Data !!!
    Read Data !!!
    Read Data !!!
    Read Data !!!
    Thread 2 data 
    Thread 3 data 
    Thread 1 data 
    Thread 5 data 
    Thread 4 data 
                                  0               1    2  ...  5     6           7
    0   Andaman and Nicobar Islands         Unknown   33  ...  0  None  2020-04-27
    1                Andhra Pradesh       Anantapur   53  ...  0  None  2020-04-27
    2                Andhra Pradesh        Chittoor   73  ...  0  None  2020-04-27
    3                Andhra Pradesh   East Godavari   39  ...  0  None  2020-04-27
    4                Andhra Pradesh          Guntur  237  ...  0  None  2020-04-27
    5                Andhra Pradesh         Krishna  210  ...  0  None  2020-04-27
    6                Andhra Pradesh         Kurnool  292  ...  0  None  2020-04-27
    7                Andhra Pradesh        Prakasam   56  ...  0  None  2020-04-27
    8                Andhra Pradesh  S.P.S. Nellore   79  ...  0  None  2020-04-27
    9                Andhra Pradesh      Srikakulam    4  ...  0  None  2020-04-27
    10               Andhra Pradesh   Visakhapatnam   22  ...  0  None  2020-04-27
    11               Andhra Pradesh   West Godavari   54  ...  0  None  2020-04-27
    12               Andhra Pradesh   Y.S.R. Kadapa   58  ...  0  None  2020-04-27
    13            Arunachal Pradesh           Lohit    1  ...  0  None  2020-04-27
    14                        Assam         Unknown   36  ...  0  None  2020-04-27
    15                        Bihar           Arwal    4  ...  0  None  2020-04-27
    16                        Bihar      Aurangabad    7  ...  0  None  2020-04-27
    17                        Bihar           Banka    2  ...  0  None  2020-04-27
    18                        Bihar       Begusarai    9  ...  0  None  2020-04-27
    19                        Bihar       Bhagalpur    5  ...  0  None  2020-04-27
    
    [20 rows x 8 columns]
    Stopping thread id 2
                                  0               1    2  ...  5     6           7
    0   Andaman and Nicobar Islands         Unknown   33  ...  0  None  2020-04-26
    1                Andhra Pradesh       Anantapur   53  ...  0  None  2020-04-26
    2                Andhra Pradesh        Chittoor   73  ...  0  None  2020-04-26
    3                Andhra Pradesh   East Godavari   39  ...  0  None  2020-04-26
    4                Andhra Pradesh          Guntur  214  ...  0  None  2020-04-26
    5                Andhra Pradesh         Krishna  177  ...  0  None  2020-04-26
    6                Andhra Pradesh         Kurnool  279  ...  0  None  2020-04-26
    7                Andhra Pradesh        Prakasam   56  ...  0  None  2020-04-26
    8                Andhra Pradesh  S.P.S. Nellore   72  ...  0  None  2020-04-26
    9                Andhra Pradesh      Srikakulam    3  ...  0  None  2020-04-26
    10               Andhra Pradesh   Visakhapatnam   22  ...  0  None  2020-04-26
    11               Andhra Pradesh   West Godavari   51  ...  0  None  2020-04-26
    12               Andhra Pradesh   Y.S.R. Kadapa   58  ...  0  None  2020-04-26
    13            Arunachal Pradesh           Lohit    1  ...  0  None  2020-04-26
    14                        Assam         Unknown   36  ...  0  None  2020-04-26
    15                        Bihar           Arwal    4  ...  0  None  2020-04-26
    16                        Bihar      Aurangabad    2  ...  0  None  2020-04-26
    17                        Bihar           Banka    2  ...  0  None  2020-04-26
    18                        Bihar       Begusarai    9  ...  0  None  2020-04-26
    19                        Bihar       Bhagalpur    5  ...  0  None  2020-04-26
    
    [20 rows x 8 columns]
    Stopping thread id 1
                                  0               1    2  ...  5     6           7
    0   Andaman and Nicobar Islands         Unknown   33  ...  0  None  2020-04-28
    1                Andhra Pradesh       Anantapur   54  ...  0  None  2020-04-28
    2                Andhra Pradesh        Chittoor   74  ...  0  None  2020-04-28
    3                Andhra Pradesh   East Godavari   39  ...  0  None  2020-04-28
    4                Andhra Pradesh          Guntur  254  ...  0  None  2020-04-28
    5                Andhra Pradesh         Krishna  223  ...  0  None  2020-04-28
    6                Andhra Pradesh         Kurnool  332  ...  0  None  2020-04-28
    7                Andhra Pradesh        Prakasam   56  ...  0  None  2020-04-28
    8                Andhra Pradesh  S.P.S. Nellore   82  ...  0  None  2020-04-28
    9                Andhra Pradesh      Srikakulam    4  ...  0  None  2020-04-28
    10               Andhra Pradesh   Visakhapatnam   22  ...  0  None  2020-04-28
    11               Andhra Pradesh   West Godavari   54  ...  0  None  2020-04-28
    12               Andhra Pradesh   Y.S.R. Kadapa   65  ...  0  None  2020-04-28
    13            Arunachal Pradesh           Lohit    1  ...  0  None  2020-04-28
    14                        Assam         Unknown   38  ...  0  None  2020-04-28
    15                        Bihar          Araria    1  ...  0  None  2020-04-28
    16                        Bihar           Arwal    4  ...  0  None  2020-04-28
    17                        Bihar      Aurangabad    7  ...  0  None  2020-04-28
    18                        Bihar           Banka    3  ...  0  None  2020-04-28
    19                        Bihar       Begusarai    9  ...  0  None  2020-04-28
    
    [20 rows x 8 columns]
    Stopping thread id 3
                                  0               1    2  ...  5     6           7
    0   Andaman and Nicobar Islands         Unknown   33  ...  0  None  2020-04-30
    1                Andhra Pradesh       Anantapur   61  ...  0  None  2020-04-30
    2                Andhra Pradesh        Chittoor   80  ...  0  None  2020-04-30
    3                Andhra Pradesh   East Godavari   42  ...  0  None  2020-04-30
    4                Andhra Pradesh          Guntur  287  ...  0  None  2020-04-30
    5                Andhra Pradesh         Krishna  246  ...  0  None  2020-04-30
    6                Andhra Pradesh         Kurnool  386  ...  0  None  2020-04-30
    7                Andhra Pradesh        Prakasam   60  ...  0  None  2020-04-30
    8                Andhra Pradesh  S.P.S. Nellore   84  ...  0  None  2020-04-30
    9                Andhra Pradesh      Srikakulam    5  ...  0  None  2020-04-30
    10               Andhra Pradesh   Visakhapatnam   23  ...  0  None  2020-04-30
    11               Andhra Pradesh   West Godavari   56  ...  0  None  2020-04-30
    12               Andhra Pradesh   Y.S.R. Kadapa   73  ...  0  None  2020-04-30
    13            Arunachal Pradesh           Lohit    1  ...  0  None  2020-04-30
    14                        Assam         Unknown   43  ...  0  None  2020-04-30
    15                        Bihar          Araria    1  ...  0  None  2020-04-30
    16                        Bihar           Arwal    4  ...  0  None  2020-04-30
    17                        Bihar      Aurangabad    8  ...  0  None  2020-04-30
    18                        Bihar           Banka    3  ...  0  None  2020-04-30
    19                        Bihar       Begusarai   11  ...  0  None  2020-04-30
    
    [20 rows x 8 columns]
    Stopping thread id 5
                                  0               1    2  ...  5     6           7
    0   Andaman and Nicobar Islands         Unknown   33  ...  0  None  2020-04-29
    1                Andhra Pradesh       Anantapur   58  ...  0  None  2020-04-29
    2                Andhra Pradesh        Chittoor   77  ...  0  None  2020-04-29
    3                Andhra Pradesh   East Godavari   40  ...  0  None  2020-04-29
    4                Andhra Pradesh          Guntur  283  ...  0  None  2020-04-29
    5                Andhra Pradesh         Krishna  236  ...  0  None  2020-04-29
    6                Andhra Pradesh         Kurnool  343  ...  0  None  2020-04-29
    7                Andhra Pradesh        Prakasam   60  ...  0  None  2020-04-29
    8                Andhra Pradesh  S.P.S. Nellore   82  ...  0  None  2020-04-29
    9                Andhra Pradesh      Srikakulam    5  ...  0  None  2020-04-29
    10               Andhra Pradesh   Visakhapatnam   23  ...  0  None  2020-04-29
    11               Andhra Pradesh   West Godavari   56  ...  0  None  2020-04-29
    12               Andhra Pradesh   Y.S.R. Kadapa   69  ...  0  None  2020-04-29
    13            Arunachal Pradesh           Lohit    1  ...  0  None  2020-04-29
    14                        Assam         Unknown   38  ...  0  None  2020-04-29
    15                        Bihar          Araria    1  ...  0  None  2020-04-29
    16                        Bihar           Arwal    4  ...  0  None  2020-04-29
    17                        Bihar      Aurangabad    8  ...  0  None  2020-04-29
    18                        Bihar           Banka    3  ...  0  None  2020-04-29
    19                        Bihar       Begusarai   11  ...  0  None  2020-04-29
    
    [20 rows x 8 columns]
    Stopping thread id 4
    
    Process finished with exit code 0

    Note that in this blog, we have hardcoded the password as a plain string, which is definitely not how passwords should be handled. Use secrets managers, environment variables, .env files, etc., as the source of passwords; we never hardcode passwords in a production environment.
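
    For example, a minimal sketch that reads the password from the environment instead (the variable name POSTGRES_PASSWORD is an assumption, not part of the program above):

```python
import os

# Hypothetical setup: take the database password from an environment
# variable (populated from a .env file or a secrets manager) instead
# of hardcoding it in the source.
password = os.environ.get("POSTGRES_PASSWORD", "")
if not password:
    print("POSTGRES_PASSWORD is not set; refusing to connect")
```
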

    Conclusion 

    After going through the above blog, you should be more familiar with how to perform read and write operations on databases using Spark, Python, and multithreading concepts. You also now know what multiprocessing and multithreading are, and you are able to analyze the best way to carry out read and write operations on a database based on your requirements.

    In general, if you have a small amount of data, a simple Python approach to reading and writing data is enough. If you have a relatively large amount of data, you can use a multithreaded approach or a single-partition Spark approach. If you have a huge amount of data, where reading millions of records per second is a requirement, use the Spark multi-partition approach. In the end, the choice depends on your requirements and the resources available.

  • Game of Hackathon 2019: A Glimpse of Our First Hackathon

    Hackathons for technology startups are like picking up a good book: it may take a long time before you start, but once you do, you wonder why you didn’t do it sooner. Last Friday, on 31st May 2019, we conducted our first Hackathon at Velotio, and it was a grand success!

    Although challenging projects from our clients are always pushing us to learn new things, we saw a whole new level of excitement and enthusiasm among our employees to bring their own ideas to life during the event. The 12-hour Hackathon saw participation from 15 teams, a lot of whom came well-prepared in advance with frameworks and ideas to start building immediately.

    Here are some pictures from the event:

    The intense coding session was then followed by a series of presentations where all the teams showcased their solutions.

    The first prize was bagged by Team Mitron, who worked on a performance review app and were awarded a cash prize of 75,000 Rs.

    The second prize of 50,000 Rs. was awarded to Team WireQ. Their solution was an easy sync up platform that would serve as a single source of truth for the designers, developers, and testers to work seamlessly on projects together — a problem we have often struggled with in-house as well.

    Our QA Team put together a complete test suite framework that would perform all functional and non-functional testing activities, including maintaining consistency in testing, minimal code usage, improvement in test structuring and so on. They won the third prize worth 25,000 Rs.

    Our heartiest congratulations to all the winners!

    This Hackathon has definitely injected a lot of positive energy and innovation into our work culture and got so many of us to collaborate more effectively and learn from each other. We cannot wait to do our next Hackathon and share more with you all.

    Until then, stay tuned!

  • Optimize React App Performance By Code Splitting

    Prerequisites

    This blog post is written assuming you have an understanding of the basics of React and routing inside React SPAs with react-router. We will also be using Chrome DevTools for measuring the actual performance benefits achieved in the example. We will be using Webpack, a well-known bundler for JavaScript projects.

    What is Code Splitting?

    Code splitting is simply dividing huge code bundles into smaller chunks that can be loaded on demand. Usually, as SPAs grow in terms of components and plugins, the need to split the code into smaller-sized chunks arises. Bundlers like Webpack and Rollup provide support for code splitting.

    Several different code splitting strategies can be implemented depending on the application structure. We will be taking a look at an example in which we implement code splitting inside an admin dashboard for better performance.

    Let’s Get Started

    We will be starting with a project configured with Webpack as a bundler having a considerable bundle size. This simple Github repository dashboard will have four routes for showing various details regarding the same repository. The dashboard uses some packages to show details in the app such as react-table, TinyMCE, and recharts.

    Before optimizing the bundle

    Just to get an idea of performance changes, let us note the metrics from the prior bundle of the app. Let’s check loading time in the network tab with the following setup: 

    • Browser incognito tab
    • Cache disabled
    • Throttling enabled to Fast 3G

    Development Build

    As you can see, the development bundle without any optimization has a network transfer size of around 1.3 MB and takes around 7.85 seconds to load for the first time on a fast 3G connection.

    However, we know that we will probably never want to serve this unoptimized development bundle in production. So, let’s figure out metrics for the production bundle with the same setup.

    Production Build

    The project is already configured for generating a Webpack production build. The production bundle is much smaller than the development bundle, with a network transfer size of 534 kB, and takes around 3.54 seconds to load on a fast 3G connection. This is still a problem, as best practice suggests keeping page load times below 3 seconds. Let’s check what happens with a slow 3G connection.

    The production bundle took 12.70 seconds to load for the first time on a slow 3G connection. Now, this can annoy users.

    If we look at the lighthouse report, we see a warning indicating that we’re loading more code than needed:

    As per the warning, we’re loading some unused code during the first render, which we can defer and load later instead. The Lighthouse report indicates that we can save up to 404 KiB on the first page load.

    There’s one more suggestion: splitting the bundle using React.lazy(). Lighthouse also gives us various other metrics that can be measured for improving the application. However, we will be focusing on bundle size in this case.

    The extra unused code inside the bundle is not only bad in terms of download size, but it also impacts the user experience. Let’s use the performance tab for figuring out how this is affecting the user experience. Navigate to the performance tab and profile the page. It shows that it takes around 10 seconds for the user to see actual content on the page reload:

    Webpack Bundle Analyzer Report

    We can visualize the bundles with the webpack bundle analyzer tool, which gives us a way to track and measure the bundle size changes over time. Please follow the installation instructions given here.

    So, this is what our production build bundle report looks like:

    As we can see, our current production build has a giant chunk, main.201d82c8.js, which can be divided into smaller chunks.

    The bundle analyzer report not only gives us information about chunk sizes but also shows which modules each chunk contains and their sizes. This gives us an opportunity to find such modules and split them out for better performance. Here, for example, is a module that adds considerable size to our main bundle:

    Using React.lazy() for Code Splitting

    React.lazy allows us to use dynamically imported components. This means we can load these components only when they’re needed and reduce the initial bundle size. As our dashboard app has four top-level routes wrapped inside react-router’s Switch, we know they will never all be needed at once.

    So, we can split these top-level components into four different bundle chunks and load them on demand. To do that, we need to convert our imports from:

    import Commits from './Commits';
    import Collaborators from './Collaborators';
    import PullRequests from './PullRequests';
    import Statistics from './Statistics';


    To:

    const Commits = React.lazy(() => import('./Commits'));
    const Collaborators = React.lazy(() => import('./Collaborators'));
    const PullRequests = React.lazy(() => import('./PullRequests'));
    const Statistics = React.lazy(() => import('./Statistics'));

    This also requires us to wrap our routes in a Suspense component, which shows a fallback UI until the dynamically imported component is ready.

    <Suspense fallback={<div>Loading...</div>}>
      <Switch>
        <Route path="/" exact component={Commits} />
        <Route path="/collaborators" exact component={Collaborators} />
        <Route path="/prs" exact component={PullRequests} />
        <Route path="/stats" exact component={Statistics} />
      </Switch>
    </Suspense>

    With this change, Webpack recognizes the dynamic imports and splits the main chunk into smaller chunks. In the production build, we can notice the following bundles being downloaded. We have reduced the load time for the main bundle chunk from 12 seconds to 3.10 seconds, which is quite good. This is an improvement because we’re no longer loading unnecessary JS on the first load.

    As we can see in the waterfall view of the requests tab, the other required chunks are loaded in parallel as soon as the main chunk is loaded.

    If we look at the Lighthouse report, the warning about removing unused JS is gone, and we can see the check passing.

    This is good for the landing page. How about the other routes when we visit them? The following shows that we now load additional small chunks when a lazily loaded component is rendered on a menu item click.

    With the current setup, we should be able to see improved performance inside our applications. We can always go ahead and tweak Webpack chunks when needed.

    To measure how this change affects user experience, we can again generate the performance report with Chrome DevTools. We can quickly notice that the idle frame time has dropped to around 1 second—far better than the previous setup. 

    If we read through the timeline, we can see the user gets a blank frame for up to 1 second and then sees the sidebar in the next second. Once the main bundle is loaded, the lazy-loaded commits chunk starts loading; until it arrives, we see our fallback loading component.

    Also, when we navigate to the other routes, we can see the chunks loaded lazily when they’re needed.

    Let’s have a look at the bundle analyzer report generated after the changes. We can easily see that the chunks are divided into smaller chunks. Also, we can notice that the chunks contain only the code they need. For example, the 51.573370a6.js chunk is actually the commits route containing the react-table code. It’s similar for the charts module in the other chunk.

    Conclusion

    Depending on the project structure, we can easily set up code-splitting inside the React applications, which is useful for better-performing applications and leads to a positive impact for the users.

    You can find the referenced code in this repo.

  • Node.js vs Deno: Is Deno Really The Node.js Alternative We All Didn’t Know We Needed?

    Ryan Dahl gave an interesting talk at JSConf EU in 2018 on the 10 regrets he had after creating Node.js. He spoke about flaws that developers don’t usually think about, such as how the entire package management system was an afterthought. In addition, he was not completely comfortable with npm’s central role in package management, or with how he might have jumped to async/await too early, ignoring some potential advantages of promises.

    However, the thing that caught most people’s attention was his pet project (Deno), which he started to answer most of these issues. The project came as no surprise since no one discusses problems at length unless they are planning to solve them.

    What first appeared to be a clever play on the word “node” turned out to be much more than that. Dahl was trying to build a secure V8 runtime with TypeScript and module management more in line with what we already have on the front-end. That was his original focus, but fast-forward to May 2020, and Deno 1.0 launched with many improvements. Let’s see what it is all about.

    What is Deno?

    Let’s start with a simple explanation: Deno executes TypeScript on your system like Node.js executes JavaScript. Just like Node.js, you use it to code async desktop apps and servers. The first visible difference is that you will be coding in TypeScript. However, if you can easily integrate the TypeScript compiler into Node.js for static type-checking, then why should you use Deno? First of all, Deno isn’t just a combination of Node.js and TypeScript—it’s an entirely new system designed from scratch. Below is a high-level comparison of Node.js and Deno. Bear in mind that these are just paper specs.

    As you can see, on paper, Deno seems promising and future-proof, but let’s take a more in-depth look.

    Let’s Install Deno

    As Deno is new, it intends to avoid a lot of the things that add complexity to the Node ecosystem. To install it, you just need to run the following command (these examples were done on Linux):

    curl -fsSL https://deno.land/x/install/install.sh | sh

    This simple shell script downloads the Deno binary to a .deno directory in your home directory. That’s all. It’s a single binary without any dependencies, and it includes the V8 JavaScript engine, the TypeScript compiler, Rust binding crates, etc. After that, add the ‘deno’ binary to your PATH variable using:

    echo 'export DENO_INSTALL="/home/siddharth/.deno"' >> ~/.zshrc
    echo 'export PATH="$DENO_INSTALL/bin:$PATH"' >> ~/.zshrc

    Let’s Play

    Now that we’ve finished our installation, we can start executing TypeScript with it. Just use ‘deno run’ to execute a script.

    So, here is the “hello world” program as per the tradition:

    echo 'console.log("Hello Deno")' > script.ts && deno run script.ts

    It can also run a script from a URL. For instance, there is a “hello world” example hosted in the standard library. You can run it directly by entering:

    deno run https://deno.land/std/examples/welcome.ts

    Similar to an HTML script tag, Deno can fetch and execute scripts from anywhere. You can do it in code, too. There is no concept of a “module” like there is with npm. Deno just needs a URL that points to a valid TypeScript file, and it will run or import it. You will also see import statements like:

    import { serve } from "https://deno.land/std@0.57.0/http/server.ts";

    This may seem counterintuitive and chaotic at first, but it makes a lot of sense. Since libraries/modules are just TypeScript files over the Internet, you don’t need a module system like npm to handle your dependencies. You don’t need a package.json file, either. Your project will not suddenly blow up if something goes wrong with npm’s registry.

    Going Further

    Let’s do something more meaningful. Here is a basic server using the HTTP server available with the standard library.

    import { serve } from "https://deno.land/std@0.57.0/http/server.ts";
    const server = serve({ port: 5000 });
    console.log("http://localhost:5000/");
    for await (const request of server) {  
      request.respond({ body: "Hello World\n" });
    }

    Take notice of the URL import that we were talking about. Next, make a file named server.ts, input this code, and try to run it with:

    deno run server.ts

    We get this:

    error: Uncaught PermissionDenied: network access to "0.0.0.0:5000", run again with the --allow-net flag
    at unwrapResponse ($deno$/ops/dispatch_json.ts:43:11)
    at Object.sendSync ($deno$/ops/dispatch_json.ts:72:10)
    at Object.listen ($deno$/ops/net.ts:51:10)
    at listen ($deno$/net.ts:154:22)
    at serve (https://deno.land/std@0.57.0/http/server.ts:260:20)
    at file:///home/siddharth/server.ts:2:11

    This seems like a permission issue, which we should be able to solve with ‘sudo’, right? Well, not exactly. As the message says, you have to run it with the ‘--allow-net’ flag to explicitly permit it to access the network. Let’s try again with:

    deno run --allow-net server.ts

    Now it runs. So, what’s happening here? 

    Security

    This is another aspect that is completely missing in Node.js. Deno, by default, does not allow access to system resources like network, disk, etc. for any script. You have to explicitly give it permission for these resources which adds a layer of security and consent.

    If you are using a lesser-known library from a small developer, which we often do, you can always limit its scope to ensure that nothing shady is happening in the background. This also complements the “free and distributed” nature of Deno when it comes to adding dependencies. As there is no centralized authority to watch over and audit all the modules, everyone needs to have their own security tools.

    There are also other flags available that provide granular control over a system’s resources, like ‘--allow-env’ for accessing the environment. But if you trust the script entirely, or it is something you have written from scratch, you can use the ‘-A’ flag to give it access to all resources.

    Plugins and Experimental Features

    Let’s look at an example of interacting with a database. Connecting a MongoDB instance with Deno is another common thing that developers usually do.

    For a MongoDB instance, let’s spin up a basic MongoDB container with:

    docker run --name some-mongo -p 27017:27017 -d mongo

    Now, when we make a file named database.ts and put the following code into it, it will insert a simple document into a new collection named “cities”:

    import { MongoClient } from "https://deno.land/x/mongo@v0.8.0/mod.ts";
    
    const client = new MongoClient();
    client.connectWithUri("mongodb://localhost:27017");
    
    const db = client.database("test");
    const users = db.collection("cities");
    
    // insert
    const insertId = await users.insertOne({
      name: "Delhi",
      country: "India",
      population: 10000
    });
    console.log(`Successfully inserted with the id: ${JSON.stringify(insertId)}`)

    Now, as you can see, this looks pretty similar to Node.js code. In fact, most of the programming style remains the same, and you can follow similar patterns. Next, let’s run this to insert that document into our MongoDB container:

    deno run -A database.ts

    Instead, you get an error that looks something like this:

    INFO load deno plugin "deno_mongo" from local "/home/siddharth/.deno_plugins/deno_mongo_2970fbc7cebff869aa12ecd5b8a1e7e4.so"
    error: Uncaught TypeError: Deno.openPlugin is not a function
    return Deno.openPlugin(localPath);
    ^
    at prepare (https://deno.land/x/plugin_prepare@v0.6.0/mod.ts:64:15)
    at async init (https://deno.land/x/mongo@v0.8.0/ts/util.ts:41:3)
    at async https://deno.land/x/mongo@v0.8.0/mod.ts:13:1

    This happened even though we gave the ‘-A’ flag, which allows access to all kinds of resources.

    Let’s rerun this with the ‘--unstable’ flag:

    deno run -A --unstable database.ts

    Now, this seems to run, and the output should look something like:

    INFO load deno plugin "deno_mongo" from local "/home/siddharth/.deno_plugins/deno_mongo_2970fbc7cebff869aa12ecd5b8a1e7e4.so"
    Successfully inserted with the id: {"$oid":"5ef0ba4a000d214a00c4367f"}

    This happens because the MongoDB driver uses some extra capabilities (“ops,” to be precise). They are not present in Deno’s runtime, so the driver adds a plugin. While Deno has a plugin system, the interface itself is not finalized and is hidden behind the ‘--unstable’ flag. By default, Deno doesn’t allow scripts to use unstable APIs, but this flag forces it.

    The Bigger Picture

    Why do we need a different take? Are the problems with Node so big that we need a new system? Well, no. Many people won’t even consider them to be problems, but there is a central idea behind Deno that makes its existence reasonable and design choices understandable:

    Node deviates significantly from the browser’s way of doing things.

    Take permissions, for example: when a website wants to record audio, the browser asks the user for consent. These kinds of permissions were absent from Node, but Deno brings them back.

    Also, regarding dependencies, a browser doesn’t understand a Node module; it just understands scripts that can be linked from anywhere around the web. Node.js is different. You have to make and publish a module so that it can be imported and reused globally. Take the fetch API, for example. To use fetch in Node, there is a different node-fetch module. Deno goes back to simple scripts and tries its best to do things similar to a browser.

    This is the overall theme even with the implementational details. Deno tries to be as close to the browser as possible so that there is minimal friction while porting libraries from front-end to back-end or vice-versa. This can be better in the long term.

    This All Looks Great, but I Have Several Questions

    Like every new take on an already-established system, Deno also raises several questions. Here are some answers to some common ones:

    If it runs TypeScript natively, then what about the speed? Node is fast because of V8.

    The important question is whether Deno actually ‘runs’ TypeScript.

    Well, yes, but actually no.

    Deno executes TypeScript, but it also uses V8 to run the code. All type checks are done beforehand; at runtime, it’s only JavaScript. Everything is abstracted away from the developer’s side, and you don’t have to install and configure tsc.

    So yes, it’s fast because it also runs on V8, and there are no runtime types.
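A quick way to see the “no runtime types” point: in the following sketch, the type annotations exist only for the ahead-of-time check, and the JavaScript that actually runs carries no trace of them.

```typescript
// Types exist only at compile time; the emitted JavaScript erases them.
function greet(name: string): string {
  return `Hello ${name}`;
}

console.log(greet("Deno"));   // "Hello Deno"
console.log(typeof greet);    // "function" -- no signature info survives at runtime
```

If you passed a number to greet, the check would fail before the script ever starts; nothing type-related is re-checked while it runs.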

    The URL imports look ugly and fragile. What happens when the website of one of the dependencies goes down?

    The first thing is that Deno downloads and caches every dependency, and the Deno team recommends checking these into your project so that they are always available. 

    And if you don’t want to see URL imports in your code, you can do two things:

    1. Re-export the dependencies locally: To re-export the standard HTTP server, make a file named ‘local_http.ts’ with the line ‘export { serve } from "https://deno.land/std@0.57.0/http/server.ts";’ and then import from this file in the original code.

    2. Use an import map: Create a JSON file that maps the URLs to the name you want to use in code. So, create a file named ‘importmap.json’ and add the following content to it:

    {
      "imports": {
         "http/": "https://deno.land/std/http/"
      }
    }

    Now, you just need to provide this as the importmap to use when you run the script:

    deno run -A --importmap=importmap.json script.ts

    And you can import the serve function from the HTTP name like:

    import { serve } from "http/server.ts";
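Under the hood, import-map resolution is mostly prefix mapping: keys that end in “/” remap every specifier that starts with them. Here is an illustrative sketch of that idea (not Deno’s actual implementation; the resolve helper is our own):

```typescript
// Illustrative import-map resolver mirroring the importmap.json above.
// Keys ending in "/" act as prefix mappings; other keys must match exactly.
const importMap: Record<string, string> = {
  "http/": "https://deno.land/std/http/",
};

function resolve(specifier: string): string {
  for (const [key, target] of Object.entries(importMap)) {
    if (key.endsWith("/") && specifier.startsWith(key)) {
      return target + specifier.slice(key.length); // remap the prefix
    }
    if (specifier === key) return target;          // exact match
  }
  return specifier; // not mapped: use as-is
}

console.log(resolve("http/server.ts"));
// -> https://deno.land/std/http/server.ts
```

This is why the short "http/server.ts" import works: the runtime rewrites it to the full URL before fetching.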

    Is it safe to rely on the URLs for versioning? What happens if the developer pushes the latest build on the same URL and not a new one?

    Well, then it’s the developer’s fault, and this can also happen with the npm module system. But if you are unsure whether you have cached the latest dependencies, there is an option to reload some or all of them. 

    Conclusion

    Deno is an interesting project, to say the least. We only have the first stable version, and it has a long way to go. For instance, the team is actively working on improving the performance of the TypeScript compiler, and a good number of APIs are still hidden behind the ‘--unstable’ flag and may change in upcoming releases. Ideas like TypeScript-first support and browser-compatible modules are certainly appealing, which makes Deno worth keeping an eye on.

  • MQTT Protocol Overview – Everything You Need To Know

    MQTT is an open protocol used for asynchronous message queuing that has been developed and matured over several years. It is a machine-to-machine protocol, widely used with embedded devices, and Microsoft has its own MQTT tooling with extensive support. Here, we are going to give an overview of the MQTT protocol and its details.

    MQTT Protocol:

    MQTT is a very simple publish/subscribe protocol. It allows you to send messages on topics (channels) passed through a centralized message broker.

    The MQTT module of the API takes care of the publish/subscribe mechanism along with additional features like authentication, retaining messages, and sending duplicate messages to unreachable clients.

    There are three parts of MQTT architecture –

    • MQTT Broker – All messages passed from the client to the server should be sent via the broker.
    • MQTT Server – The API acts as an MQTT server. The MQTT server will be responsible for publishing the data to the clients.
    • MQTT Client – Any third-party client that wishes to subscribe to data published by the API is considered an MQTT client.

    The MQTT client and the MQTT server need to connect to the broker in order to publish or subscribe to messages.

    MQTT Communication Program

    To get a better idea of MQTT, suppose our API is sending sensor data.
    The API gathers the sensor data through the Monitoring module, and the MQTT module publishes the data on different channels. On a successful connection of an external client to the MQTT module of the API, the client receives sensor data on the subscribed channel.

    The diagram below shows the flow of data from the API module to the external clients.

    MQTT Broker – EMQTT:

    EMQTT (Erlang MQTT Broker) is a massively scalable and clusterable MQTT V3.1/V3.1.1 broker, written in Erlang/OTP.

    Main responsibilities of a Broker are-

    • Receive all messages
    • Filter messages
    • Determine which clients are interested
    • Publish messages to all the subscribed clients

    All messages published are passed through the broker. The broker generates the Client ID and Message ID, maintains the message queue, and publishes the message.

    There are several brokers that can be used; the default is EMQTT, developed in Erlang.

    MQTT Topics:

    A topic is a UTF-8 string. Using this string, the broker filters messages for all connected clients. A topic may consist of one or more topic levels, with a forward slash (the topic-level separator) separating each level, for example ‘building1/floor2/zone3/temperature’.

     

    When API starts, the Monitoring API will monitor the sensor data and publish it in a combination of topics. The third party client can subscribe to any of those topics, based on the requirement.

    The topics are framed in such a way that it provides options for the user to subscribe at level 1, level 2, level 3, level 4, or individual sensor level data.

    While subscribing to each level of sensor data, the client needs to specify the hierarchy of the IDs. For example, to subscribe to level 4 sensor data, the client needs to specify level1_id/level2_id/level3_id/level4_id.

    The user can subscribe to any type of sensor by specifying the sensor role as the last part of the topic.

    If the user doesn’t specify the role, the client will be subscribed to all types of sensors on that particular level.

    The user can also specify the sensor id that they wish to subscribe to. In that case, they need to specify the whole hierarchy of the sensor, starting from project id and ending with sensor id.

    Following is the list of topics exposed by API on startup.

     

    Features supported by MQTT:

    1. Authentication:

    EMQTT provides authentication of every user who intends to publish or subscribe to particular data. The user ID and password are stored in the API database, in a separate collection called ‘mqtt’.

    While connecting to the EMQTT broker, we provide the username and password, and the MQTT broker validates the credentials against the values present in the database.

    2. Access Control:

    EMQTT determines which user is allowed to access which topics. This information is stored in MongoDB, in the collection ‘mqtt_acl’.

    By default, all users are allowed to access all topics by specifying ‘#’ as the allowed topic to publish and subscribe for all users.

    3. QoS:

    The Quality of Service (QoS) level defines the delivery guarantee for messages between the sending and receiving parties. There are 3 QoS levels in MQTT:

    • At most once (0) – The message is delivered at most once, or it is not delivered at all.
    • At least once (1) – The message is always delivered at least once.
    • Exactly once (2) – The message is always delivered exactly once.

    4. Last Will Message:

    MQTT uses the Last Will & Testament (LWT) mechanism to notify other clients of a client’s ungraceful disconnection. In this mechanism, when connecting to a broker, each client specifies its last will message, which is a normal MQTT message with a QoS, topic, retained flag, and payload. This message is stored by the broker until it detects that the client has disconnected ungracefully.

    5. Retain Message:

    MQTT also has a message retention feature, enabled by setting the retain flag to TRUE. The broker then retains the last message and its QoS for the topic. When a client subscribes to a topic, the broker matches the topic against retained messages; if one matches, the client receives it immediately. A broker stores only one retained message per topic.

    6. Duplicate Message:

    If a publisher doesn’t receive the acknowledgement of a published packet, it resends the packet with the DUP flag set to true. A duplicate message contains the same Message ID as the original message.
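At the wire level, the DUP flag is just one bit in the first byte of a PUBLISH packet’s fixed header (bit layout per the MQTT 3.1.1 specification; the helper function below is our own illustration, not a library API):

```typescript
// First byte of an MQTT PUBLISH fixed header (MQTT 3.1.1):
// bits 7-4 = packet type (3 for PUBLISH), bit 3 = DUP, bits 2-1 = QoS, bit 0 = RETAIN.
const PUBLISH = 0x3;
const DUP_BIT = 0x08;

function publishHeaderByte(qos: number, dup: boolean, retain: boolean): number {
  return (PUBLISH << 4) | (dup ? DUP_BIT : 0) | (qos << 1) | (retain ? 1 : 0);
}

const original = publishHeaderByte(1, false, false); // 0x32: QoS 1 PUBLISH
const resend = original | DUP_BIT;                   // 0x3a: same packet, DUP flag set
console.log(original.toString(16), resend.toString(16)); // "32 3a"
```

The rest of the packet, including the Message ID, is byte-for-byte identical on a resend; only this one bit tells the receiver it may have seen the message before.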

    7. Session:

    In general, when a client connects to a broker for the first time, it needs to create subscriptions for all topics from which it wants to receive data/messages. If no persistent session is maintained, or the client loses its connection to the broker, the client has to resubscribe to all topics after reconnecting. For clients with limited resources, subscribing to all topics again would be very tedious, so brokers use a persistent session mechanism in which they save all information relevant to the client. The ‘clientId’ provided by the client is used as the session identifier when the client establishes a connection with the broker.

    Features not-supported by MQTT:

    1. Not RESTful:

    MQTT does not allow a client to expose RESTful API endpoints. The only way to communicate is through the publish/subscribe mechanism.

    2. Obtaining Subscription List:

    The MQTT broker doesn’t expose the client IDs or the topics the clients have subscribed to. Hence, the API needs to publish all data to all possible combinations of topics, which can lead to network congestion in the case of large data.

    MQTT Wildcards:

    MQTT clients can subscribe to one or more topics, but each subscription names only a single topic filter. The following two wildcards let one filter match many topics:

    1. Plus sign(+):

    This is a single-level wildcard, used to match any value at one specific topic level.

    Example: Suppose we want to subscribe to all floor-level ‘AL’ (ambient light) sensors. We can use the plus (+) wildcard instead of a specific zone level, with the following topic:

    <project_id>/<building_id>/<floor_id>/+/AL

    2. Hash Sign(#):

    This is a multi-level wildcard, which can be used only at the end of a topic filter. The client receives all data/messages whose topics match everything to the left of the ‘#’ wildcard.

    Example: In case we want to receive all the messages related to all sensors for floor1, we can use the hash sign (#) multi-level wildcard after the floor name and the slash (/). We can use the following topic:

    <level1_id>/<level2_id>/<level3_id>/#
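The matching rules for both wildcards can be sketched in a few lines. This is an illustrative implementation of the rules described above, not any particular broker’s code:

```typescript
// Returns true if an MQTT topic matches a subscription filter that may contain
// "+" (matches exactly one level) and "#" (matches all remaining levels).
function topicMatches(filter: string, topic: string): boolean {
  const f = filter.split("/");
  const t = topic.split("/");
  for (let i = 0; i < f.length; i++) {
    if (f[i] === "#") return true;                    // multi-level: matches the rest
    if (i >= t.length) return false;                  // topic too short
    if (f[i] !== "+" && f[i] !== t[i]) return false;  // literal level must match
  }
  return f.length === t.length;                       // no trailing levels left over
}

console.log(topicMatches("p1/b1/+/AL", "p1/b1/z9/AL"));    // true: "+" matches zone z9
console.log(topicMatches("p1/b1/f2/#", "p1/b1/f2/z1/AL")); // true: "#" matches the rest
console.log(topicMatches("p1/b1/+/AL", "p1/b1/z9/TP"));    // false: sensor role differs
```

The topic names here are made-up placeholders in the spirit of the hierarchy above; the matching logic itself follows the standard wildcard semantics.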

    MQTT Test tools:

    Following are some popular open source testing tools for MQTT.

    1. MQTT Lens
    2. MQTT SPY
    3. MQTT FX

    Difference between MQTT & AMQP:

    MQTT is designed for lightweight devices like embedded systems, where bandwidth is costly and minimum overhead is required. MQTT exchanges data and control information as a byte stream, with an optimized 2-byte fixed header, which is preferred for IoT.
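To make the 2-byte fixed header concrete: it is one control byte plus a variable-length “remaining length” field (7 bits of length per byte, high bit as a continuation flag, per the MQTT 3.1.1 specification). The smallest complete packet, PINGREQ, is exactly those 2 bytes. A sketch with our own helper:

```typescript
// Build an MQTT fixed header: control byte + variable-length "remaining length"
// (7 bits per byte, high bit set when more length bytes follow).
function fixedHeader(controlByte: number, remainingLength: number): number[] {
  const bytes = [controlByte];
  let n = remainingLength;
  do {
    let digit = n % 128;
    n = Math.floor(n / 128);
    if (n > 0) digit |= 0x80; // continuation bit: more length bytes follow
    bytes.push(digit);
  } while (n > 0);
  return bytes;
}

console.log(fixedHeader(0xc0, 0));   // [192, 0] -- a complete 2-byte PINGREQ packet
console.log(fixedHeader(0x30, 321)); // [48, 193, 2] -- PUBLISH with 321 remaining bytes
```

Compare this with HTTP, where even a minimal request line and headers run to dozens of bytes.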

    AMQP is designed with more advanced features and uses more system resources. It provides more advanced features related to messaging, topic-based publish & subscribe messaging, reliable queuing, transactions, flexible routing and security.

    Difference between MQTT & HTTP:

    MQTT is data-centric, whereas HTTP is document-centric. HTTP is a request-response protocol for client-server communication; MQTT, on the other hand, uses a publish-subscribe mechanism. The publish/subscribe model gives clients an existence independent of one another and enhances the reliability of the whole system: even if a client drops off the network, the system keeps itself up and running.

    Compared to HTTP, MQTT is lightweight: it has a very short message header and a minimum packet size of 2 bytes, whereas HTTP typically composes lengthy headers and messages.

    MQTT Protocol ensures high delivery guarantees compared to HTTP.

    There are 3 levels of Quality of Services:

    at most once: It guarantees that the message will be delivered on a best-effort basis.

    at least once: It guarantees that the message will be delivered at least one time, but the message may also be delivered again.

    exactly once: It guarantees that the message will be delivered one and only one time.

    Last Will & Testament and retained messages are options MQTT provides to users. With Last Will & Testament, in case of an unexpected disconnection of a client, all subscribed clients get a message from the broker. Newly subscribed clients get immediate status updates via retained messages.

    HTTP Protocol has none of these abilities.

    Conclusion:

    MQTT is a one-of-a-kind message queuing protocol, best suited for embedded hardware devices. On the software level, it supports all major operating systems and platforms. It has proven itself as an ISO standard in IoT platforms because of its pragmatic security and message reliability.

  • Monitoring a Docker Container with Elasticsearch, Kibana, and Metricbeat

    Since you are on this page, you have probably already started using Docker to deploy your applications and are enjoying it compared to virtual machines, because it is lightweight, easy to deploy, and has exceptional security management features.

    And, once the applications are deployed, monitoring your containers and tracking their activities in real time is essential. Imagine a scenario where you are managing one or a few virtual machines; your pre-configured session will handle everything, including monitoring. If you face any problems in production, then, with a handful of commands such as top, htop, and iotop, and with flags like -o, %CPU, and %MEM, you are good to troubleshoot the issue.

    On the other hand, consider a scenario where the same workloads are spread across 100-200 containers. You will need to see all activity in one place and query for information about what happened. This is where monitoring comes into the picture. We will discuss more benefits as we move along.

    This blog will cover Docker monitoring with Elasticsearch, Kibana, and Metricbeat. Basically, Elasticsearch is a platform that allows us to have distributed search and analysis of data in real-time along with visualization. We’ll be discussing how all these work interdependently as we move ahead. Like Elasticsearch, Kibana is also open-source software. Kibana is an interface mainly used to visualize the data sent from Elasticsearch. Metricbeat is a lightweight shipper of collected metrics from your system to the desired target (Elasticsearch in this case). 

    What is Docker Monitoring?

    In simple terms, monitoring containers means keeping track of container metrics (CPU, memory, network, disk I/O) and analyzing them to ensure the performance of applications built on microservices and to keep track of issues so that they can be solved more easily. This monitoring is vital for performance improvement and optimization and for finding the RCA of various issues.

    There is a lot of software available for monitoring Docker containers, both open-source and proprietary, like Prometheus, AppOptics, Metricbeat, Datadog, Sumo Logic, etc.

    You can choose any of these based on convenience. 

    Why is Docker Monitoring needed?

    1. Monitoring helps detect and fix issues early to avoid a breakdown in production.
    2. New feature additions/updates can be implemented safely, as the entire application is monitored.
    3. Docker monitoring is beneficial for developers, IT pros, and enterprises alike.
    • For developers, Docker monitoring tracks bugs and helps to resolve them quickly along with enhancing security.
    • For IT pros, it helps with flexible integration of existing processes and enterprise systems and satisfies all the requirements.
    • For enterprises, it helps to build the application within a certified container within a secured ecosystem that runs smoothly. 

    Elasticsearch is free and open-source software that allows distributed search and analysis of data in real time, along with visualization. It works well with a huge number of technologies, like Metricbeat, Kibana, etc. Let’s move on to the installation of Elasticsearch.

    Installation of Elasticsearch:

    Prerequisite: Elasticsearch is built with Java, so make sure that your system has at least Java 8 to run Elasticsearch.

    For installing Elasticsearch for your OS, please follow the steps at Installing Elasticsearch | Elasticsearch Reference [7.11].

    After installing, check the status of Elasticsearch by sending an HTTP request to port 9200 on localhost:

    http://localhost:9200/

    This will give you a response as below:

    You can configure Elasticsearch by editing $ES_HOME/config/elasticsearch.yml 

    Learn more about configuring Elasticsearch here.

    Now, we are done with the Elasticsearch setup and are ready to move onto Kibana.

    Kibana:

    Like Elasticsearch, Kibana is also open-source software. Kibana is an interface mainly used to visualize the data from Elasticsearch. It lets you query your data and generate numerous visualizations as per your requirements, from line graphs to gauges, even for enormous amounts of data.

    Let’s cover the installation steps of Kibana.

    Installing Kibana

    Prerequisites: 

    • Must have Java 1.8+ installed 
    • Elasticsearch v1.4.4+
    • Web browser such as Chrome, Firefox

    For installing Kibana with respect to your OS, please follow the steps at Install Kibana | Kibana Guide [7.11]

    Kibana runs on default port number 5601. Just send an HTTP request to port 5601 on localhost with http://localhost:5601/ 

    You should land on the Kibana dashboard, and it is now ready to use:

    You can configure Kibana by editing $KIBANA_HOME/config. For more about configuring Kibana, visit here.

    Let’s move on to the final part: setting up Metricbeat.

    Metricbeat

    Metricbeat collects and sends metrics periodically; we can say it’s a lightweight shipper of metrics collected from your system.

    You can simply install Metricbeat on your system or servers to periodically collect metrics from the OS and from the microservices running on them. The collected metrics are shipped to the output you specify, e.g., Elasticsearch or Logstash. 
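For this setup, a minimal metricbeat.yml might look like the following sketch, with the Docker module enabled and Elasticsearch as the output (the metricsets, socket path, period, and host are assumptions to adjust for your environment):

```yaml
# Enable the Docker module: poll the local Docker socket every 10 seconds
metricbeat.modules:
  - module: docker
    metricsets: ["container", "cpu", "memory", "network"]
    hosts: ["unix:///var/run/docker.sock"]
    period: 10s

# Ship the collected metrics to the local Elasticsearch instance
output.elasticsearch:
  hosts: ["localhost:9200"]
```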

    Installing Metricbeat

    For installing Metricbeat according to your OS, follow the steps at Install Metricbeat | Metricbeat Reference [7.11]

    As soon as we start the Metricbeat service, it sends Docker metrics to the Elasticsearch index, which can be confirmed by curling Elasticsearch indexes with the command:

    curl -XGET 'localhost:9200/_cat/indices?v&pretty'

    How Are They Internally Connected?

    We have now installed all three, and they are up and running. At the configured period, Metricbeat’s Docker module (docker.yml) hits the Docker API and sends the Docker metrics to Elasticsearch. Those metrics are now available in different Elasticsearch indexes. As mentioned earlier, Kibana queries the data in Elasticsearch and visualizes it in the form of graphs. In this way, all three are connected. 

    Please refer to the flow chart for more clarification:

    How to Create Dashboards?

    Now that we are aware of how these three tools work interdependently, let’s create dashboards to monitor our containers and understand those. 

    First of all, open the Dashboards section on Kibana (localhost:5601/) and click the Create dashboard button:

     

    You will be directed to the next page:

    Choose the type of visualization you want from all options:

    For example, let’s go with Lens

    (Learn more about Kibana Lens)

    Here, we will be looking for the number of containers vs. timestamps by selecting the timestamp on X-axis and the unique count of docker.container.created on Y-axis.

    As soon as we have selected both parameters, it will generate a graph as shown in the snapshot, and we get the count of created containers (here, Count=1). If you create more containers on your system, the graph and the counter will update when that data is sent to Kibana. In this way, you can monitor how many containers are created over time. In similar fashion, depending on your monitoring needs, you can choose a parameter from the left panel showing available fields like: 

    activemq.broker.connections.count

    docker.container.status

    docker.container.tags

    Now, we will show one more example of how to create a bar graph:

    As mentioned above, to create a bar graph, just choose “vertical bar” from the snapshot above. Here, I’m trying to get a bar graph for the count of documents vs. metricset names, such as network, file system, cpu, etc. So, as shown on the left of the snapshot, choose the Y-axis parameter as count, and choose the X-axis parameter as metricset.name, as shown on the right side of the snapshot.

    After hitting enter, a graph will be generated: 

    Similarly, you can try it out with multiple parameters with different types of graphs to monitor. Now, we will move onto the most important and widely used monitoring tool to track warnings, errors, etc., which is DISCOVER.

    Discover for Monitoring:

    Basically, Discover provides deep insights into data and lets you apply searches and filters as well. With it, you can find which processes are taking more time and show only those, filter out errors by setting a message filter with the value ERROR, check the health of a container, or check for logged-in users. These kinds of queries can be sent and the desired results achieved, leading to good monitoring of containers, much like SQL queries. 

    [More about Discover here.]

    To apply filters, just click on the “filter by type” from the left panel, and you will see all available filtering options. From there, you can select one as per your requirements, and view those on the central panel. 

    Similar to filter, you can choose fields to be shown on the dashboard from the left panel with “Selected fields” right below the filters. (Here, we have only selected info for Source.)

    Now, if you take a look at the top part of the snapshot, you will find the search bar. This is the most useful part of Discover for monitoring.

    In that bar, you just need to put a query, and according to that query, logs will be filtered. For example, I will put in a query for error messages equal to “No memory stats data available”.

    When we hit the update button on the right side, only logs containing that error message will be there and highlighted for differentiation, as shown in the snapshot. All other logs will not be shown. In this way, you can track a particular error and ensure that it does not exist after fixing it.

    In addition to query, it also provides keyword search. So, if you input a word like warning, error, memory, or user, then it will provide logs for that word, like “memory” in the snapshot:

     

    Similar to Kibana, we also receive logs in the terminal. For example, the following highlighted portion is about the state of your cluster. In the terminal, you can put a simple grep command for required logs. 

    With this, you can monitor Docker containers with multiple queries, such as nested queries for the Discover facility. There are many different graphs you can try depending on your requirements to keep your application running smoothly.

    Conclusion

    Monitoring requires a lot of time and effort. What we have seen here is a drop in the ocean. For some next steps, try:

    1. Monitoring network
    2. Aggregating logs from your different applications
    3. Aggregating logs from multiple containers
    4. Alerts setting and monitoring
    5. Nested queries for logs
  • Build ML Pipelines at Scale with Kubeflow

    Setting up an ML stack requires many tools for analyzing data and training models in the ML pipeline, and it is even harder to set up the same stack across multi-cloud environments. This is where Kubeflow comes into the picture, making it easy to develop, deploy, and manage ML pipelines.

    In this article, we are going to learn how to install Kubeflow on Kubernetes (GKE), train an ML model on Kubernetes, and publish the results. This introductory guide will be helpful for anyone who wants to understand how to use Kubernetes to run an ML pipeline in a simple, portable, and scalable way.

    Kubeflow Installation on GKE

    You can install Kubeflow onto any Kubernetes cluster no matter which cloud it is, but the cluster needs to fulfill the following minimum requirements:

    • 4 CPU
    • 50 GB storage
    • 12 GB memory

    The recommended Kubernetes version is 1.14 and above.

    You need to download kfctl from the Kubeflow website and untar the file:
    tar -xvf kfctl_v1.0.2_<platform>.tar.gz -C /home/velotio/kubeflow

    Also, install kustomize using these instructions.

    Start by exporting the following environment variables:

    export PATH=$PATH:/home/velotio/kubeflow/
    export KF_NAME=kubeml
    export BASE_DIR=/home/velotio/kubeflow/
    export KF_DIR=${BASE_DIR}/${KF_NAME}
    export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.2.yaml"

    After we’ve exported these variables, we can build the Kubeflow deployment configuration and customize everything according to our needs. Run the following commands:

    cd ${KF_DIR}
    kfctl build -V -f ${CONFIG_URI}

    This will download the file kfctl_k8s_istio.v1.0.2.yaml and a kustomize folder. If you want to expose the UI with LoadBalancer, change the file $KF_DIR/kustomize/istio-install/base/istio-noauth.yaml and edit the service istio-ingressgateway from NodePort to LoadBalancer.

    Now, you can install Kubeflow using the following commands:

    export CONFIG_FILE=${KF_DIR}/kfctl_k8s_istio.v1.0.2.yaml
    kfctl apply -V -f ${CONFIG_FILE}

    This will install a bunch of services that are required to run the ML workflows.

    Once successfully deployed, you can access the Kubeflow UI dashboard through the istio-ingressgateway service. You can find its IP using the following command:

    kubectl get svc istio-ingressgateway -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
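    If the jsonpath syntax is hard to remember, the same field can be extracted from the service's JSON output with a few lines of Python. This is a minimal sketch, assuming you feed it the output of kubectl get svc istio-ingressgateway -n istio-system -o json; the sample payload below is illustrative:

    ```python
    import json

    def ingress_ip(svc_json: str) -> str:
        """Return the first LoadBalancer ingress IP of a Service,
        equivalent to jsonpath {.status.loadBalancer.ingress[0].ip}."""
        svc = json.loads(svc_json)
        ingress = svc.get("status", {}).get("loadBalancer", {}).get("ingress", [])
        return ingress[0].get("ip", "") if ingress else ""

    # Illustrative payload; in practice pipe in kubectl's -o json output.
    sample = '{"status": {"loadBalancer": {"ingress": [{"ip": "35.200.10.1"}]}}}'
    print(ingress_ip(sample))  # 35.200.10.1
    ```

    An empty string is returned while the cloud provider is still provisioning the load balancer, which makes the function safe to poll in a wait loop.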

    ML Workflow

    Developing your ML application consists of several stages:

    1. Gathering and analyzing data
    2. Researching the model for the type of data collected
    3. Training and testing the model
    4. Tuning the model
    5. Deploying the model

    Every ML problem you’re trying to solve goes through these stages, but where does Kubeflow fit into this workflow?

    Kubeflow provides its own pipelines to solve this problem. A Kubeflow pipeline consists of a description of the ML workflow, the different stages of the workflow, and how they combine in the form of a graph.

    Kubeflow gives you the ability to run your ML pipeline on any hardware, be it your laptop, a cloud, or a multi-cloud environment. Wherever you can run Kubernetes, you can run your ML pipeline.
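    To make the graph idea concrete, here is a minimal, Kubeflow-independent sketch that models the five workflow stages above as a dependency graph and resolves an execution order with the standard library. The stage names and edges are illustrative only; they are not part of the Kubeflow API:

    ```python
    from graphlib import TopologicalSorter  # Python 3.9+

    # Each stage maps to the set of stages it depends on (illustrative edges).
    pipeline = {
        "gather_data": set(),
        "research_model": {"gather_data"},
        "train_and_test": {"research_model"},
        "tune": {"train_and_test"},
        "deploy": {"tune"},
    }

    # Resolve a valid execution order: gather_data first, deploy last.
    order = list(TopologicalSorter(pipeline).static_order())
    print(order)
    ```

    Kubeflow Pipelines does essentially this at a larger scale: each node of the graph becomes a containerized step, and steps whose dependencies are satisfied can run in parallel on the cluster.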

    Training your ML Model on Kubeflow

    Once you’ve deployed Kubeflow as described above, you should be able to access the Kubeflow UI dashboard.

    The first step is to upload your pipeline, which means you need to prepare one first. We are going to train our model on a financial time series dataset. You can find the example code here:

    git clone https://github.com/kubeflow/examples.git
    cd examples/financial_time_series/
    export TRAIN_PATH=gcr.io/<project>/<image-name>/cpu:v1
    gcloud builds submit --tag $TRAIN_PATH .

    The command above builds the Docker image. Next, we create a bucket to store our data and model artifacts:

    # create storage bucket that will be used to store models
    BUCKET_NAME=<your-bucket-name>
    gsutil mb gs://$BUCKET_NAME/

    Once we have our image ready on the GCR repo, we can start our training job on Kubernetes. Please have a look at the tfjob resource in CPU/tfjob1.yaml and update the image and bucket reference.

    kubectl apply -f CPU/tfjob1.yaml
    POD_NAME=$(kubectl get pods -n kubeflow --selector=tf-job-name=tfjob-flat \
          --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}')
    kubectl logs -f $POD_NAME -n kubeflow

    Kubeflow Pipelines needs our pipeline compiled into a domain-specific language (DSL). We can compile our Python 3 file with a tool called dsl-compile, which comes with the Python 3 SDK. So, first, install that SDK:

    pip3 install python-dateutil kfp==0.1.36

    Next, inspect ml_pipeline.py and update it with the CPU image path that you built in the previous steps. Then, compile the DSL using:

    python3 ml_pipeline.py

    Now, a file ml_pipeline.py.tar.gz is generated, which we can upload to the Kubeflow Pipelines UI.
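    Before uploading, you can sanity-check the compiled package by listing its contents; with the kfp 0.1.x SDK the archive typically contains the pipeline rendered as an Argo workflow YAML, though the exact member name may differ. A small stdlib sketch:

    ```python
    import tarfile

    def list_pipeline_package(path: str) -> list[str]:
        """Return the member names inside a compiled pipeline .tar.gz."""
        with tarfile.open(path, "r:gz") as tar:
            return tar.getnames()

    # Usage, assuming the file produced by the compile step above:
    # print(list_pipeline_package("ml_pipeline.py.tar.gz"))
    ```

    If the listing fails or the archive is empty, the compile step did not finish cleanly and the upload to the UI will be rejected.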

    Once the pipeline is uploaded, you can see the stages in a graph-like format.

    Next, we can click on the pipeline and create a run. For each run, you need to specify the parameters that you want to use. While the pipeline is running, you can inspect the logs of each step.

    Run Jupyter Notebook in your ML Pipeline

    You can also interactively define your pipeline from the Jupyter notebook:

    1. Navigate to the Notebook Servers through the Kubeflow UI

    2. Select the namespace and click on “new server.”

    3. Give the server a name and provide the Docker image for the TensorFlow version on which you want to train your model. I used the TensorFlow 1.15 image.

    4. Once a notebook server is available, click on “connect” to connect to the server.

    5. This will open up a new window and a Jupyter terminal.

    6. Input the following command: pip install -U kfp.

    7. Download the notebook using following command: 

    curl -O https://raw.githubusercontent.com/kubeflow/examples/master/github_issue_summarization/pipelines/example_pipelines/pipelines-notebook.ipynb

    8. Now that you have the notebook, you can replace the environment variables like WORKING_DIR, PROJECT_NAME, and GITHUB_TOKEN. Once you do that, you can run the notebook step-by-step (one cell at a time) by pressing Shift+Enter, or run the whole notebook by selecting “Run All” from the menu.

    Conclusion

    The ML world has its own challenges: environments are tightly coupled, and the tools needed to build an ML stack were extremely hard to set up and configure. This becomes even harder in production environments, where you have to be extremely cautious not to break components that are already present.

    Kubeflow makes getting started with ML highly accessible. You can run your ML workflows anywhere you can run Kubernetes. Kubeflow makes it possible to run your ML stack in multi-cloud environments, which lets ML engineers easily train their models at scale by leveraging the scalability of Kubernetes.

    Related Articles

    1. The Ultimate Beginner’s Guide to Jupyter Notebooks
    2. Demystifying High Availability in Kubernetes Using Kubeadm