A Progressive Web Application (PWA) is a web application built to look and behave like a native app: it operates offline-first and is optimized for a variety of viewports, from mobile phones and tablets to FHD desktop monitors and beyond. PWAs are built using front-end technologies such as HTML, CSS and JavaScript and bring a native-like user experience to the web platform. PWAs can also be installed on devices just like native apps.
For an application to be classified as a PWA, it must tick all of these boxes:
PWAs must implement service workers. A service worker acts as a proxy between the web browser and API servers, allowing the web app to manage and cache network requests and assets
PWAs must be served over a secure network, i.e. the application must be served over HTTPS
PWAs must have a web manifest, a JSON file that provides basic information about the PWA, such as its name, icons, look and feel, splash screen, version, description, author, etc
Why build a PWA?
Businesses and engineering teams should consider building a progressive web app instead of a traditional web app. Here are some of the most prominent arguments in favor of PWAs:
PWAs are responsive. The mobile-first design approach enables PWAs to support a variety of viewports and orientations
PWAs can work in slow-Internet or no-Internet environments. App developers can choose how a PWA behaves when there’s no Internet connectivity, whereas traditional web apps or websites simply stop working without an active Internet connection
PWAs are secure because they are always served over HTTPS
PWAs can be installed on the home screen, making the application more accessible
PWAs bring in rich features, such as push notifications, application updates and more
PWA and React
There are various ways to build a progressive web application. One can just use Vanilla JS, HTML and CSS or pick up a framework or library. Some of the popular choices in 2020 are Ionic, Vue, Angular, Polymer, and of course React, which happens to be my favorite front-end library.
Building PWAs with React
To get started, let’s create a PWA which lists all the users in a system.
npm init react-app users
cd users
yarn add react-router-dom
yarn run start
Next, we will replace the default App.js file with our own implementation.
The default behavior here is to not set up a service worker, i.e. the CRA boilerplate lets users opt in to the offline-first experience.
2. Update the manifest file
The CRA boilerplate provides a manifest file out of the box. This file is located at /public/manifest.json and needs to be modified to include the name of the PWA, description, splash screen configuration and much more. You can read more about available configuration options in the manifest file here.
Here the display mode selected is “standalone,” which tells web browsers to give this PWA the same look and feel as a standalone app. Other display options include “browser,” which is the default mode and launches the PWA like a traditional web app, and “fullscreen,” which opens the PWA in fullscreen mode, hiding all other elements such as the navigation, the address bar and the status bar.
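As an illustration, a trimmed-down manifest.json using the “standalone” display mode might look like the following (the names, icon entry and colors are placeholder values based on the CRA default, not a prescribed configuration):

```json
{
  "short_name": "Users",
  "name": "Users PWA",
  "description": "Lists all the users in a system",
  "icons": [
    {
      "src": "favicon.ico",
      "sizes": "64x64 32x32 24x24 16x16",
      "type": "image/x-icon"
    }
  ],
  "start_url": ".",
  "display": "standalone",
  "theme_color": "#000000",
  "background_color": "#ffffff"
}
```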
The manifest can be inspected using Chrome dev tools > Application tab > Manifest.
1. Test the PWA:
To test a progressive web app, build it completely first. This is because PWA features such as caching aren’t enabled while running the app in dev mode, to ensure hassle-free development.
Create a production build with: npm run build
Change into the build directory: cd build
Host the app locally: http-server or python3 -m http.server 8080
Test the application by visiting http://localhost:8080
2. Audit the PWA: If you are testing the app for the first time on a desktop or laptop browser, the PWA may look like just another website. To test and audit various aspects of the PWA, let’s use Lighthouse, which is a tool built by Google specifically for this purpose.
PWA on mobile
At this point, we already have a simple PWA which can be published on the Internet and made available to billions of devices. Now let’s try to enhance the app by improving its offline viewing experience.
1. Offline indication: Since service workers can operate without the Internet as well, let’s add an offline indicator banner to let users know the current state of the application. We will use navigator.onLine along with the “online” and “offline” window events to detect the connection status.
The easiest way to test this is to just turn off the Wi-Fi on your dev machine. Chrome dev tools also provide an option to test this without actually going offline. Head over to Dev tools > Network and then select “Offline” from the dropdown in the top section. This should bring up the banner when the app is offline.
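The connection-status logic can be sketched framework-agnostically. In this hedged sketch, `target` stands in for the browser `window` object so the logic can be unit-tested; in the React app you would wrap this in a hook or component state and use `onChange` to toggle the banner:

```javascript
// Track online/offline status using navigator.onLine plus the
// "online"/"offline" events, reporting every change via onChange.
function createOnlineStatusTracker(target, onChange) {
  // Seed from navigator.onLine, defaulting to "online" when unavailable.
  let online = target.navigator ? target.navigator.onLine : true;
  const update = (value) => {
    online = value;
    if (onChange) onChange(online);
  };
  target.addEventListener('online', () => update(true));
  target.addEventListener('offline', () => update(false));
  return { isOnline: () => online };
}
```

In the browser, `target` is simply `window`; the same listeners drive the offline banner described above.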
2. Let’s cache a network request using service worker
CRA comes with its own service-worker.js file which caches all static assets such as JavaScript and CSS files that are a part of the application bundle. To put custom logic into the service worker, let’s create a new file called ‘custom-service-worker.js’ and combine the two.
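The custom request-caching logic can be sketched as a cache-first strategy. This is a hedged illustration, not CRA's generated worker: `cache` and `fetchFn` stand in for the service worker's Cache object and `fetch`, so the strategy can be shown (and tested) in isolation:

```javascript
// Cache-first strategy: serve from the cache when possible, otherwise
// fetch over the network and store the response for next time.
async function cacheFirst(request, cache, fetchFn) {
  const cached = await cache.match(request);
  if (cached) return cached;
  const response = await fetchFn(request);
  // Real service workers must cache a clone, since a Response body
  // can only be consumed once; the mock fallback keeps this testable.
  await cache.put(request, response.clone ? response.clone() : response);
  return response;
}
```

Inside a real custom-service-worker.js, this would be wired up in a `self.addEventListener('fetch', …)` handler.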
Install react-app-rewired and update package.json:
Your app should now work correctly in the offline mode.
Distributing and publishing a PWA
PWAs can be published just like any other website, with only one additional requirement: they must be served over HTTPS. When a user visits the PWA from a mobile or tablet, a pop-up is displayed asking the user if they’d like to install the app to their home screen.
Conclusion
Building PWAs with React enables engineering teams to develop, deploy and publish progressive web apps for billions of devices using technologies they’re already familiar with. Existing React apps can also be converted to PWAs. PWAs are fun to build, easy to ship and distribute, and add a lot of value for customers by providing a native-like experience and better engagement via features such as add to home screen, push notifications and more, without any installation process.
In my previous blog, I explained how to get started with Amazon Lex and build simple bots. This blog aims at exploring the Lambda functions used by Amazon Lex for code validation and fulfillment. We will continue with the same example we created in the first blog, i.e. purchasing a book, and will see in detail how the dots are connected.
This blog is divided into the following sections:
Lambda function input format
Response format
Managing conversation context
An example (a demonstration to better understand how context is maintained so that data can flow between two different intents)
NOTE: The input to a Lambda function will change according to the language you use to create the function. Since we have used Node.js for our example, everything will be explained using it.
Section 1: Lambda function input format
When communication with a bot starts, Amazon Lex passes control to the Lambda function we defined while creating the bot.
Amazon Lex passes the following arguments to a Lambda function:
1. Event:
event is a JSON variable containing all the details of a bot conversation. Every time the Lambda function is invoked, Amazon Lex sends the event JSON, which contains the details of the message the user sent to the bot.
currentIntent: Contains information about the intent of the message sent by the user to the bot. It contains the following keys:
name: intent name (e.g. orderBook, which we defined in our previous blog).
slots: It will contain a map of slot names configured for that particular intent, populated with values recognized by Amazon Lex during the conversation. Default values are null.
confirmationStatus: Provides the user’s response to a confirmation prompt, if there is one. Possible values for this variable are:
None: Default value
Confirmed: When the user accepts the confirmation prompt.
Denied: When the user rejects the confirmation prompt.
inputTranscript: Text input by the user for processing. In the case of audio input, the text is extracted from the audio. This is the text that is actually processed to recognize intents and slot values.
invocationSource: Its value indicates the reason for invoking the Lambda function. It can have one of the following two values:
DialogCodeHook: This value directs the Lambda function to initiate validation of the user’s data input. If the intent is not clear, Amazon Lex can’t invoke the Lambda function.
FulfillmentCodeHook: This value is set to fulfill the intent. If the intent is configured to invoke a Lambda function as a fulfillment code hook, Amazon Lex sets invocationSource to this value only after it has all the slot data needed to fulfill the intent.
bot: Details of bot that processed the request. It consists of below information:
name: name of the bot.
alias: alias of the bot version.
version: the version of the bot.
userId: Its value is defined by the client application. Amazon Lex passes it to the Lambda function.
outputDialogMode: Depends on how you have configured your bot; its value can be Text or Voice.
messageVersion: The version of the message that identifies the format of the event data going into the Lambda function and the expected format of the response from a Lambda function. In the current implementation, only message version 1.0 is supported. Therefore, the console assumes the default value of 1.0 and doesn’t show the message version.
sessionAttributes: Application-specific session attributes that the client sent in the request. It is optional.
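Putting the fields above together, a hypothetical event for our book-ordering bot might look like this (the slot values and user ID are made up for illustration):

```json
{
  "currentIntent": {
    "name": "orderBook",
    "slots": {
      "bookName": "The Alchemist",
      "bookType": "fiction"
    },
    "confirmationStatus": "None"
  },
  "inputTranscript": "I want to buy a fiction book",
  "invocationSource": "DialogCodeHook",
  "bot": {
    "name": "PurchaseBot",
    "alias": "$LATEST",
    "version": "$LATEST"
  },
  "userId": "user-1234",
  "outputDialogMode": "Text",
  "messageVersion": "1.0",
  "sessionAttributes": {}
}
```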
2. Context:
AWS Lambda uses this parameter to provide the runtime information of the executing Lambda function. Some useful information we can get from the context object:
The time remaining before AWS Lambda terminates the Lambda function.
The CloudWatch log stream associated with the Lambda function that is executing.
The AWS request ID returned to the client that invoked the Lambda function which can be used for any follow-up inquiry with AWS support.
Section 2: Response Format
Amazon Lex expects a response from a Lambda function in the following format:
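The general shape of the response is sketched below; the exact contents of dialogAction depend on its type, which must be one of Close, ConfirmIntent, Delegate, ElicitIntent or ElicitSlot (the attribute name and value here are placeholders):

```json
{
  "sessionAttributes": {
    "attributeName": "attributeValue"
  },
  "dialogAction": {
    "type": "Close"
  }
}
```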
The response consists of two fields: the sessionAttributes field is optional, while the dialogAction field is required. The contents of the dialogAction field depend on the value of its type field.
sessionAttributes: This is an optional field and can be empty. If the function has to send something back to the client, it should be passed under sessionAttributes. We will see its use case in Section 4.
dialogAction (Required): The type of this field defines the next course of action. The five types of dialogAction are explained below:
1) Close: Informs Amazon Lex not to expect a response from the user. This is the case when all slots get filled. If you don’t specify a message, Amazon Lex uses the goodbye message or the follow-up message configured for the intent.
dialogAction: {
  type: "Close",
  fulfillmentState: "Fulfilled / Failed", // (required)
  message: { // (optional)
    contentType: "PlainText or SSML",
    content: "Message to convey to the user"
  }
}
2) ConfirmIntent: Informs Amazon Lex that the user is expected to give a yes or no answer to confirm or deny the current intent. The slots field must contain an entry for each of the slots configured for the specified intent. If the value of a slot is unknown, you must set it to null. The message and responseCard fields are optional.
dialogAction: {
  type: "ConfirmIntent",
  intentName: "orderBook",
  slots: {
    bookName: "value",
    bookType: "value"
  },
  message: { // (optional)
    contentType: "PlainText or SSML",
    content: "Message to convey to the user"
  }
}
3) Delegate: Directs Amazon Lex to choose the next course of action based on the bot configuration. The response must include any session attributes, and the slots field must include all of the slots specified for the requested intent. If the value of a field is unknown, you must set it to null. You will get a DependencyFailedException if your fulfillment function returns the Delegate dialog action without removing any slots.
4) ElicitIntent: Informs Amazon Lex that the user is expected to respond with an utterance that includes an intent. For example, “I want to buy a book,” which indicates the orderBook intent. The utterance “book,” on the other hand, is not sufficient for Amazon Lex to infer the user’s intent.
dialogAction: {
  type: "ElicitIntent",
  message: { // (optional)
    contentType: "PlainText or SSML",
    content: "Message to convey to the user"
  }
}
5) ElicitSlot: Informs Amazon Lex that the user is expected to provide a slot value in the response. In the below structure, we are informing Amazon Lex that the user’s response should provide a value for the slot named ‘bookName’.
dialogAction: {
  type: "ElicitSlot",
  intentName: "orderBook",
  slots: {
    bookName: "",
    bookType: "fiction"
  },
  slotToElicit: "bookName",
  message: { // (optional)
    contentType: "PlainText or SSML",
    content: "Message to convey to the user"
  }
}
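To tie the pieces together, here is a hedged sketch (not the blog's exact code) of a Node.js helper that branches on invocationSource and returns responses in the format described above; the confirmation message text is made up:

```javascript
// Build a Lex-formatted response: delegate during dialog validation,
// close the intent once Lex calls us for fulfillment.
function buildResponse(event) {
  const { currentIntent, invocationSource, sessionAttributes } = event;
  if (invocationSource === 'DialogCodeHook') {
    // Validation pass: hand control back to Lex to pick the next step.
    return {
      sessionAttributes,
      dialogAction: {
        type: 'Delegate',
        slots: currentIntent.slots,
      },
    };
  }
  // FulfillmentCodeHook: all slots are filled, so close the intent.
  return {
    sessionAttributes,
    dialogAction: {
      type: 'Close',
      fulfillmentState: 'Fulfilled',
      message: {
        contentType: 'PlainText',
        content: `Order placed for ${currentIntent.slots.bookName}`,
      },
    },
  };
}
```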
Section 3: Managing Conversation Context
Conversation context is the information that a user, your application, or a Lambda function provides to an Amazon Lex bot to fulfill an intent. Conversation context includes slot data that the user provides, request attributes set by the client application, and session attributes that the client application and Lambda functions create.
1. Setting session timeout
Session timeout is the length of time that a conversation session lasts. For in-progress conversations, Amazon Lex retains the context information, slot data, and session attributes until the session ends. The default session duration is 5 minutes, but it can be increased up to 24 hours while creating the bot in the Amazon Lex console.
2. Setting session attributes
Session attributes contain application-specific information that is passed between a bot and a client application during a session. Amazon Lex passes session attributes to all Lambda functions configured for a bot. If a Lambda function adds or updates session attributes, Amazon Lex passes the new information back to the client application.
Session attributes persist for the duration of the session. Amazon Lex stores them in an encrypted data store until the session ends.
3. Sharing information between intents
If you have created a bot with more than one intent, information can be shared between them using session attributes. Attributes defined while fulfilling one intent can be used in another intent.
For example, a user of the book-ordering bot starts by ordering a book. The bot engages in a conversation with the user, gathering slot data such as book name and quantity. When the user places an order, the Lambda function that fulfills it sets the lastConfirmedReservation session attribute, containing information about the ordered book, and currentReservationPrice, containing the price of the book. Later, when the user fulfills the orderMagazine intent, the final price is calculated on the basis of currentReservationPrice.
Section 4: Example
The details of the example bot are below:
Bot Name: PurchaseBot
Intents :
orderBook – bookName, bookType
orderMagazine – magazineName, issueMonth
Session attributes set while fulfilling the intent “orderBook” are:
lastConfirmedReservation: In this variable, we are storing slot values corresponding to intent orderBook.
currentReservationPrice: Book price is calculated and stored in this variable
When the intent orderBook is fulfilled, we ask the user whether they also want to order a magazine. If the user responds with a confirmation, the bot starts fulfilling the intent “orderMagazine”.
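A sketch of how this hand-off might look in Node.js is below. The prices, message text, and function names are made up for illustration; the point is that session attributes set while fulfilling orderBook are read back while fulfilling orderMagazine:

```javascript
// Fulfill orderBook: stash the reservation details in session attributes
// and ask the user (via ConfirmIntent) whether they also want a magazine.
function fulfillOrderBook(event, bookPrice) {
  const sessionAttributes = {
    ...event.sessionAttributes,
    lastConfirmedReservation: JSON.stringify(event.currentIntent.slots),
    currentReservationPrice: String(bookPrice),
  };
  return {
    sessionAttributes,
    dialogAction: {
      type: 'ConfirmIntent',
      intentName: 'orderMagazine',
      slots: { magazineName: null, issueMonth: null },
      message: {
        contentType: 'PlainText',
        content: 'Would you also like to order a magazine?',
      },
    },
  };
}

// Fulfill orderMagazine: read the previous price back out of the
// session attributes and compute the combined total.
function fulfillOrderMagazine(event, magazinePrice) {
  const previous = Number(event.sessionAttributes.currentReservationPrice || 0);
  return { totalPrice: previous + magazinePrice };
}
```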
Conclusion
AWS Lambda functions are used as code hooks for your Amazon Lex bot. In your intent configuration, you can identify Lambda functions to perform initialization and validation, fulfillment, or both. This blog brought more technical insight into how Amazon Lex works and how it communicates with Lambda functions, and explained how conversation context is maintained using session attributes. I hope you find the information useful.
GitOps is a Continuous Deployment model for cloud-native applications. In GitOps, the Git repositories that contain the declarative descriptions of the infrastructure are considered the single source of truth for the desired state of the system, and an automated mechanism ensures that the deployed state of the system always matches the state defined in the Git repository. All changes (deployment/upgrade/rollback) to the environment are triggered by changes (commits) made to the Git repository.
“The artifacts that we run on any environment always have a corresponding code for them on some Git repositories. Can we say the same thing for our infrastructure code?”
Infrastructure-as-code tools and completely declarative orchestration tools like Kubernetes allow us to represent the entire state of our system declaratively. GitOps intends to make use of this ability and make infrastructure-related operations more developer-centric.
Role of Infrastructure as Code (IaC) in GitOps
The ability to represent the infrastructure as code is at the core of GitOps. But just having version-controlled infrastructure as code isn’t GitOps; we also need a mechanism in place to keep (or try to keep) our deployed state in sync with the state we define in the Git repository.
“Infrastructure as Code is necessary but not sufficient to achieve GitOps”
GitOps does pull-based deployments
Most of the deployment pipelines we see today push changes to the deployed environment. For example, to upgrade our application to a newer version, we update its Docker image tag in some repository, which triggers a deployment pipeline and updates the deployed application; here the changes were pushed to the environment. In GitOps, we just update the image tag in the Git repository for that environment, and the changes are pulled into the environment to match the updated state in the Git repository. The magic of keeping the deployed state in sync with the state defined in Git is achieved with the help of operators/agents. The operator is a control loop that identifies differences between the deployed state and the desired state and makes sure they are the same.
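The operator's control loop can be sketched in a few lines. This is a toy illustration (not how any particular operator is implemented), with both states modeled as plain objects keyed by resource name:

```javascript
// Diff the desired state (from Git) against the deployed state and
// return the actions needed to converge them.
function reconcile(desiredState, deployedState) {
  const actions = [];
  for (const [name, desired] of Object.entries(desiredState)) {
    if (!(name in deployedState)) {
      actions.push({ op: 'create', name, spec: desired });
    } else if (JSON.stringify(deployedState[name]) !== JSON.stringify(desired)) {
      actions.push({ op: 'update', name, spec: desired });
    }
  }
  for (const name of Object.keys(deployedState)) {
    // Resources not described in Git are pruned.
    if (!(name in desiredState)) actions.push({ op: 'delete', name });
  }
  return actions; // an empty array means the states are in sync
}
```

A real operator runs this diff continuously and applies the resulting actions against the cluster.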
Key benefits of GitOps:
All the changes are verifiable and auditable as they make their way into the system through Git repositories.
Easy and consistent replication of the environment, as the Git repository is the single source of truth. This makes disaster recovery much quicker and simpler.
More developer-centric experience for operating infrastructure. Also a smaller learning curve for deploying dev environments.
Consistent rollback of application as well as infrastructure state.
Introduction to Argo CD
Argo CD is a continuous delivery tool that works on the principles of GitOps and is built specifically for Kubernetes. The product was developed and open-sourced by Intuit and is currently a part of CNCF.
Key components of Argo CD:
API Server: Just like K8s, Argo CD also has an API server that exposes APIs that other systems can interact with. The API server is responsible for managing the application, repository and cluster credentials, enforcing authentication and authorization, etc.
Repository server: The repository server keeps a local cache of the Git repository, which holds the K8s manifest files for the application. This service is called by other services to get the K8s manifests.
Application controller: The application controller continuously watches the deployed state of the application, compares it with the desired state, reports to the API server whenever the two are not in sync, and can optionally take corrective action as well. It is also responsible for executing user-defined hooks for various lifecycle events of the application.
Key objects/resources in Argo CD:
Application: Argo CD allows us to represent the instance of the application which we want to deploy in an environment by creating Kubernetes objects of a custom resource definition(CRD) named Application. In the specification of Application type objects, we specify the source (repository) of our application’s K8s manifest files, the K8s server where we want to deploy those manifests, namespace, and other information.
AppProject: Just like Application, Argo CD provides another CRD named AppProject. AppProjects are used to logically group related applications.
Repo Credentials: In the case of private repositories, we need to provide access credentials. For credentials, Argo CD uses the K8s secrets and config map. First, we create objects of secret types and then we update a special-purpose configuration map named argocd-cm with the repository URL and the secret which contains the credentials.
Cluster Credentials: Along with Git repository credentials, we also need to provide the K8s cluster credentials. These credentials are also managed using K8s secrets; we are required to add the label argocd.argoproj.io/secret-type: cluster to these secrets.
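Following the declarative-setup convention in the Argo CD documentation, such a cluster-credential secret might look like this (the names, server address, and token are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-secret        # hypothetical name
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: my-cluster
  server: https://1.2.3.4
  config: |
    {
      "bearerToken": "<token>",
      "tlsClientConfig": { "insecure": false }
    }
```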
Demo:
Enough of theory, let’s try out the things we discussed above. For the demo, I have created a simple app named message-app. This app reads a message set in the environment variable named MESSAGE. We will populate the values of this environment variable using a K8s config map. I have kept the K8s manifest files for the app in a separate repository. We have the application and the K8s manifest files ready. Now we are all set to install Argo CD and deploy our application.
Installing Argo CD:
For installing Argo CD, we first need to create a namespace named argocd.
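Per the Argo CD getting-started guide, the installation comes down to two commands; the second applies the install manifest straight from the argo-cd repository:

```shell
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
```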
Applying the files from the argo-cd repo directly is fine for demo purposes, but in actual environments, you should copy the files into your own repository before applying them.
We can see that this command has created the core components and CRDs we discussed earlier in the blog. There are some additional resources as well but we can ignore them for the time being.
Accessing the Argo CD GUI
We have Argo CD running in our cluster. Argo CD also provides a GUI which gives us a graphical representation of our K8s objects and allows us to view events, pod logs, and other configurations.
By default, the GUI service is not exposed outside the cluster. Let us update its service type to LoadBalancer so that we can access it from outside.
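One way to do this (assuming the default service name argocd-server from the standard install) is with kubectl patch:

```shell
kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'
```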
After this, we can use the external IP of the argocd-server service and access the GUI.
The initial username is admin and the password is the name of the API server pod. The password can be obtained by listing the pods in the argocd namespace, or directly with this command:
kubectl get pods -n argocd -l app.kubernetes.io/name=argocd-server -o name | cut -d'/' -f 2
Deploy the app:
Now let’s go ahead and create our application for the staging environment for our message app.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: message-app-staging
  namespace: argocd
  labels:
    environment: staging
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  # Source of the application manifests
  source:
    repoURL: https://github.com/akash-gautam/message-app-manifests.git
    targetRevision: HEAD
    path: manifests/staging
  # Destination cluster and namespace to deploy the application
  destination:
    server: https://kubernetes.default.svc
    namespace: staging
  syncPolicy:
    automated:
      prune: false
      selfHeal: false
In the application spec, we have specified the repository, where our manifest files are stored and also the path of the files in the repository.
We want to deploy our app in the same K8s cluster where Argo CD is running, so we have specified the local K8s service URL in the destination. We want the resources to be deployed in the staging namespace, so we have set it accordingly.
In the sync policy, we have enabled automated sync. We have kept the project as default.
Adding the resources-finalizer.argocd.argoproj.io ensures that all the resources created for the application are deleted when the Application is deleted. This is fine for our demo setup but might not always be desirable in real-life scenarios.
Our git repos are public so we don’t need to create secrets for git repo credentials.
We are deploying in the same cluster where Argo CD itself is running. As this is a demo setup, we can use the admin user created by Argo CD, so we don’t need to create secrets for cluster credentials either.
Now let’s go ahead and create the application and see the magic happen.
kubectl apply -f message-app-staging.yaml
As soon as the application is created, we can see it on the GUI.
By clicking on the application, we can see all the Kubernetes objects created for it.
It also shows the objects which are indirectly created by the objects we create. In the above image, we can see the replica set and endpoint object which are created as a result of creating the deployment and service respectively.
We can also click on the individual objects and see their configuration. For pods, we can see events and logs as well.
As our app is deployed now, we can grab the public IP of the message-app service and access it in the browser.
We can see that our app is deployed and accessible.
Updating the app
For updating our application, all we need to do is commit our changes to the GitHub repository. We know the message-app just displays the message we pass to it via a config map, so let’s update the message and push it to the repository.
apiVersion: v1
kind: ConfigMap
metadata:
  name: message-configmap
  labels:
    app: message-app
data:
  MESSAGE: "This too shall pass" # Put the message you want to display here.
Once the commit is done, Argo CD will start to sync again.
Once the sync is done, we will restart our message app pod, so that it picks up the latest values in the config map. Then we need to refresh the browser to see updated values.
As we discussed earlier, for making any changes to the environment, we just need to update the repo which is being used as the source for the environment and then the changes will get pulled in the environment.
We can follow exactly the same approach to deploy the application to the production environment as well. We just need to create a new Application object and set the manifest path and deployment namespace accordingly.
Conclusion:
It’s still early days for GitOps, but it has already been successfully implemented at scale by many organizations. As GitOps tools mature along with the ever-growing adoption of Kubernetes, I think many organizations will consider adopting GitOps soon. GitOps is not limited to Kubernetes, but the completely declarative nature of Kubernetes makes it simpler to achieve GitOps. Argo CD is a deployment tool that’s tailored for Kubernetes and allows us to do deployments in a Kubernetes-native way while following the principles of GitOps. I hope this blog helped you understand the how, what and why of GitOps and gave some insight into Argo CD.
WebSocket is an effective way for full-duplex, real-time communication between a web server and a client. It is widely used for building real-time web applications along with helper libraries that offer better features. Implementing WebSockets requires a persistent connection between two parties. Serverless functions are known for short execution time and non-persistent behavior. However, with the API Gateway support for WebSocket endpoints, it is possible to implement a Serverless service built on AWS Lambda, API Gateway, and DynamoDB.
Prerequisites
A basic understanding of real-time web applications will help with this implementation. Throughout this article, we will be using Serverless Framework for developing and deploying the WebSocket service. Also, Node.js is used to write the business logic.
Behind the scenes, Serverless uses Cloudformation to create various required resources, like API Gateway APIs, AWS Lambda functions, IAM roles and policies, etc.
Why Serverless?
Serverless Framework abstracts the complex syntax needed for creating the Cloudformation stacks and helps us focus on the business logic of the services. Along with that, there are a variety of plugins available that make developing serverless applications easier.
Why DynamoDB?
We need persistent storage for WebSocket connection data, along with AWS Lambda. DynamoDB, a serverless key-value database from AWS, offers low latency, making it a great fit for storing and retrieving WebSocket connection details.
Overview
In this application, we’ll be creating an AWS Lambda service that accepts the WebSocket connections coming via API Gateway. The connections and subscriptions to topics are persisted using DynamoDB. We will be using ws to implement basic WebSocket clients for the demonstration. The implementation has a Lambda function consuming WebSocket events, receiving the connections and handling the communication.
Base Setup
We will be using the default Node.js boilerplate offered by Serverless as a starting point.
serverless create --template aws-nodejs
A few of the Serverless plugins are installed and used to speed up the development and deployment of the Serverless stack. We also add the webpack config given here to support the latest JS syntax.
Adding Lambda role and policies:
The Lambda function requires a role attached to it that has enough permissions to access DynamoDB and Execute API. These are the links for the configuration files:
We need to create a lambda function that accepts WebSocket events from API Gateway. As you can see, we’ve defined 3 WebSocket events for the lambda function.
$connect
$disconnect
$default
These 3 events stand for the default routes that come with the API Gateway WebSocket offering. $connect and $disconnect are used for initialization and termination of the socket connection, while the $default route is for data transfer.
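In serverless.yml, wiring a Lambda to those three routes looks roughly like this (the function and handler names are hypothetical):

```yaml
functions:
  defaultHandler:
    handler: src/handler.defaultHandler   # hypothetical handler path
    events:
      - websocket:
          route: $connect
      - websocket:
          route: $disconnect
      - websocket:
          route: $default
```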
We can go ahead and update how data is sent and add custom WebSocket routes to the application.
The lambda needs to establish a connection with the client and handle the subscriptions. The logic for updating the DynamoDB is written in a utility class client. Whenever a connection is received, we create a record in the topics table.
The client.unsubscribe method internally removes the connection entry from the DynamoDB table. Here, the getTopics method fetches all the topics the particular client has subscribed to.
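A hedged sketch of that connection bookkeeping follows, with `db` standing in for a DynamoDB DocumentClient-style wrapper so the logic can be shown (and tested) in isolation; the table and attribute names are illustrative, not the blog's exact schema:

```javascript
// Record a new connection in the topics table on $connect.
async function handleConnect(connectionId, db) {
  await db.put({ TableName: 'topics', Item: { connectionId, topic: 'global' } });
  return { statusCode: 200, body: 'Connected.' };
}

// On $disconnect, remove every subscription row this connection created.
// `topics` is what a getTopics-style lookup returned for the connection.
async function handleDisconnect(connectionId, topics, db) {
  await Promise.all(
    topics.map((topic) => db.delete({ TableName: 'topics', Key: { connectionId, topic } }))
  );
  return { statusCode: 200, body: 'Disconnected.' };
}
```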
Now comes the default route part of the lambda, where we customize message handling. In this implementation, we’re basing our message handling on event.body.type, which indicates what kind of message was received from the client. The subscribe type here is used to subscribe to new topics. Similarly, the message type is used to receive a message from one client and then publish it to the other clients that have subscribed to the same topic as the sender.
console.log(`Route ${route} - data from ${connectionId}`);
if (!event.body) {
  return response;
}
let body = JSON.parse(event.body);
const topic = body.topic;
if (body.type === 'subscribe') {
  connection.subscribe({ topic });
  console.log(`Client subscribing for topic: ${topic}`);
}
if (body.type === 'message') {
  await new Topic(topic).publishMessage({ data: body.message });
  console.error(`Published messages to subscribers`);
  return response;
}
return response;
Similar to $connect, the subscribe type of payload, when received, creates a new subscription for the mentioned topic.
Publishing the messages
Here is the interesting part of this lambda. When a client sends a payload with type message, the lambda calls the publishMessage method with the data received. The method gets the active subscribers for the topic and publishes messages using another utility, TopicSubscriber.sendMessage.
The sendMessage executes the API endpoint, which is the API Gateway URL after deployment. As we’re using serverless-offline for the local development, the IS_OFFLINE env variable is automatically set.
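The fan-out step can be sketched as follows. This is a hedged illustration: `apiClient.postToConnection` mirrors the call shape of the API Gateway Management API, injected here so the logic stands alone and can be tested with a mock:

```javascript
// Fan a message out to every active subscriber of a topic.
async function publishToSubscribers(subscriberIds, message, apiClient) {
  const payload = JSON.stringify({ message });
  await Promise.all(
    subscriberIds.map((connectionId) =>
      apiClient.postToConnection({ ConnectionId: connectionId, Data: payload })
    )
  );
  return subscriberIds.length; // number of deliveries attempted
}
```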
Instead of manually invoking the API endpoint, we can also use DynamoDB streams to trigger a lambda asynchronously and publish messages to topics.
Implementing the client
To test the socket implementation, we will use a Node.js script, ws-client.js. It creates two WebSocket clients: one that sends data and another that receives it.
Once the service is running, we should see the following output, with the two clients successfully sending and receiving messages through our socket server.
Conclusion
With API Gateway WebSocket support and DynamoDB, we’re able to implement persistent socket connections using serverless functions. The implementation can be improved and can be as complex as needed.
WebSocket is an effective protocol for full-duplex, real-time communication between a web server and a client. It is widely used for building real-time web applications, along with helper libraries that offer richer features. Implementing WebSockets requires a persistent connection between the two parties, while serverless functions are known for short execution times and non-persistent behavior. However, with API Gateway support for WebSocket endpoints, it is possible to implement a serverless service built on AWS Lambda, API Gateway, and DynamoDB.
In our previous blog, we saw that Elasticsearch is a highly scalable, open-source, full-text search and analytics engine built on top of Apache Lucene. Elasticsearch allows you to store, search, and analyze huge volumes of data quickly and in near real time.
Basic Concepts –
Index – Large collection of JSON documents. Can be compared to a database in relational databases. Every document must reside in an index.
Shards – Since there is no limit on the number of documents that can reside in an index, indices are often horizontally partitioned into shards that reside on nodes in the cluster. The maximum number of documents allowed in a shard is 2,147,483,519 (as of now).
Type – Logical partition of an index. Similar to a table in relational databases.
Fields – Similar to a column in relational databases.
Analyzers – Used while indexing/searching the documents. These contain “tokenizers” that split phrases/text into tokens and “token-filters”, that filter/modify tokens during indexing & searching.
Mappings – Combination of Field + Analyzers. It defines how your fields can be stored & indexed.
Inverted Index
ES uses inverted indexes under the hood. An inverted index maps terms to the documents containing them.
Let’s say we have 3 documents:
Food is great
It is raining
Wind is strong
An inverted index for these documents can be constructed as –

Term    | Documents
------- | ---------
food    | 1
great   | 1
is      | 1, 2, 3
it      | 2
raining | 2
strong  | 3
wind    | 3

The terms in the dictionary are stored in sorted order so that they can be found quickly.
Searching multiple terms is done by performing a lookup on the terms in the index. It performs either UNION or INTERSECTION on them and fetches relevant matching documents.
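As an illustrative sketch (not Elasticsearch’s actual implementation), the construction and intersection lookup described above can be expressed in a few lines of JavaScript:

```javascript
// Build an inverted index: term -> set of document ids (1-based here).
function buildInvertedIndex(docs) {
  const index = new Map();
  docs.forEach((text, i) => {
    const id = i + 1;
    for (const term of text.toLowerCase().split(/\s+/)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(id);
    }
  });
  return index;
}

// INTERSECTION of the posting lists: documents containing every term.
function searchAll(index, terms) {
  const postings = terms.map((t) => index.get(t.toLowerCase()) || new Set());
  return [...postings.reduce((acc, s) => new Set([...acc].filter((d) => s.has(d))))];
}

const docs = ['Food is great', 'It is raining', 'Wind is strong'];
const index = buildInvertedIndex(docs);
console.log([...index.get('is')]);           // [1, 2, 3]
console.log(searchAll(index, ['is', 'raining'])); // [2]
```

A UNION query would merge the posting sets instead of intersecting them.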
An ES index is spread across multiple shards; while indexing, each document is routed to a shard based on a hash of its routing value (the document ID by default). We can customize which shard a document is routed to, and which shards search requests are sent to.
An ES index is made up of multiple Lucene indexes, which in turn are made up of index segments. These are write-once, read-many indices, i.e., the index files Lucene writes are immutable (except for deletions).
Analyzers –
Analysis is the process of converting text into tokens or terms, which are added to the inverted index for searching. Analysis is performed by an analyzer. An analyzer can be either built-in or custom.
We can define a single analyzer for both indexing and searching, or a separate search analyzer and index analyzer for a mapping.
Building blocks of analyzer-
Character filters – receive the original text as a stream of characters and can transform the stream by adding, removing, or changing characters.
Tokenizers – receive the stream of characters and break it up into individual tokens.
Token filters – receive the token stream and may add, remove, or change tokens.
Some Commonly used built-in analyzers –
1. Standard –
Divides text into terms on word boundaries, lower-cases all terms, and removes punctuation and stopwords (if specified; default = none).
Text: The 2 QUICK Brown-Foxes jumped over the lazy dog’s bone.
Output: [the, 2, quick, brown, foxes, jumped, over, the, lazy, dog’s, bone]
2. Keyword –
Emits the entire input text as a single term.
Text: The 2 QUICK Brown-Foxes jumped over the lazy dog’s bone.
Output: [The 2 QUICK Brown-Foxes jumped over the lazy dog’s bone.]
Some Commonly used built-in tokenizers –
1. Standard –
Divides text into terms on word boundaries, removes most punctuation.
2. Letter –
Divides text into terms whenever it encounters a non-letter character.
3. Lowercase –
Letter tokenizer which lowercases all tokens.
4. Whitespace –
Divides text into terms whenever it encounters any white-space character.
5. UAX-URL-EMAIL –
Standard tokenizer which recognizes URLs and email addresses as single tokens.
6. N-Gram –
Divides text into terms when it encounters anything from a list of specified characters (e.g. whitespace or punctuation), and returns n-grams of each word: a sliding window of continuous letters, e.g. quick → [qu, ui, ic, ck, qui, quic, quick, uic, uick, ick].
7. Edge-N-Gram –
It is similar to the N-Gram tokenizer, with the n-grams anchored to the start of the word (prefix-based n-grams), e.g. quick → [q, qu, qui, quic, quick].
8. Keyword –
Emits exact same text as a single term.
Make your mappings right –
Analyzers, if not designed right, can increase your search time significantly.
Avoid using regular expressions in queries as much as possible. Let your analyzers handle them.
ES provides multiple tokenizers (standard, whitespace, ngram, edge-ngram, etc) which can be directly used, or you can create your own tokenizer.
Consider a simple use-case where we had to search for a user who either has “brad” in their name or “brad_pitt” in their email (substring-based search). Without proper analyzers on the mapping, one would simply write a regex for this query.
This took 109 ms for us to fetch 1 lakh (100,000) out of 60 million documents.
Thus, the earlier search query, which took 10–25 s, got reduced to 800–900 ms for the same set of records.
Had the use-case been to search for results where the name starts with “brad” or the email starts with “brad_pitt” (prefix-based search), it would be better to go for an edge-n-gram analyzer or suggesters.
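As an illustrative sketch (not the exact mapping from this use-case), an index with a custom n-gram analyzer for such substring searches could be defined like this; the index name, field names, and gram sizes are assumptions that would need tuning:

```
PUT /users
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "substring_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 4,
          "token_chars": ["letter", "digit"]
        }
      },
      "analyzer": {
        "substring_analyzer": {
          "type": "custom",
          "tokenizer": "substring_tokenizer",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name":  { "type": "text", "analyzer": "substring_analyzer", "search_analyzer": "standard" },
      "email": { "type": "text", "analyzer": "substring_analyzer", "search_analyzer": "standard" }
    }
  }
}
```

Using a plain standard search_analyzer avoids n-gramming the query string itself at search time.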
Performance Improvement with Filter Queries –
Use Filter queries whenever possible.
ES usually scores documents and returns them sorted by score. This can hurt performance when scoring is not relevant to our use-case. In such scenarios, use “filter” queries, which treat matching as a boolean yes/no, skip scoring, and can be cached by ES.
This will reduce query-time by a few milliseconds.
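As a hedged sketch, a bool query that matches in filter context might look like the following (index and field names are illustrative):

```
GET /users/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term":  { "city": "mumbai" } },
        { "range": { "age": { "gte": 21 } } }
      ]
    }
  }
}
```

Clauses in the filter array contribute no relevance score, which is what makes them cacheable and cheaper.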
Re-indexing made faster –
Before creating any mappings, know your use-case well.
ES does not allow us to alter existing mappings, unlike the “ALTER” command in relational databases, although we can keep adding new mappings to the index.
The only way to change existing mappings is to create a new index, re-index the existing documents into it, and alias the new index to the required name, with zero downtime in production. Note – this process can take days if you have millions of records to re-index.
To re-index faster, we can change a few settings –
1. Disable swapping – Since no requests are directed to the new index until indexing is done, we can safely disable swap. Command for Linux machines –
sudo swapoff -a
2. Disable refresh_interval for ES – The default refresh_interval is 1s; it can safely be disabled while documents are being re-indexed.
3. Change bulk size while indexing – Documents are usually indexed in bulk chunks of around 1k. It is preferable to increase this default to approximately 5–10k, although we need to find the sweet spot while re-indexing to avoid overloading the current index.
4. Set the replica count to 0 – ES creates at least 1 replica per shard by default. We can set this to 0 while indexing and restore the required value afterwards.
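Settings 2–4 can be applied with a single settings call before bulk indexing and reverted afterwards (the index name and restored values below are illustrative):

```
PUT /new-index/_settings
{
  "index": {
    "refresh_interval": "-1",
    "number_of_replicas": 0
  }
}

# ...bulk index in batches of ~5-10k documents, then restore:

PUT /new-index/_settings
{
  "index": {
    "refresh_interval": "1s",
    "number_of_replicas": 1
  }
}
```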
Conclusion
Elasticsearch is a very powerful database for text-based searches. The Elastic ecosystem is widely used for reporting, alerting, machine learning, etc. This article gives an overview of Elasticsearch mappings and how creating the right mappings can improve your query performance and accuracy. Providing the right mappings and the right resources to your Elasticsearch cluster can do wonders.
This blog post explores the performance cost of inline functions in a React application. Before we begin, let’s try to understand what inline function means in the context of a React application.
What is an inline function?
Simply put, an inline function is a function that is defined and passed down inside the render method of a React component.
Let’s understand this with a basic example of what an inline function might look like in a React application:
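A minimal sketch of such a component (the component name and state shape are illustrative):

```jsx
class ClickCounter extends React.Component {
  state = { count: 0 };

  render() {
    // The onClick prop receives a function defined inline, inside render.
    return (
      <button onClick={() => this.setState({ count: this.state.count + 1 })}>
        Clicked {this.state.count} times
      </button>
    );
  }
}
```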
The onClick prop, in the example above, is being passed as an inline function that calls this.setState. The function is defined within the render method, often inline with JSX. In the context of React applications, this is a very popular and widely used pattern.
Let’s begin by listing some common patterns and techniques where inline functions are used in a React application:
Render prop: A component prop that expects a function as a value. This function must return a JSX element, hence the name. Render prop is a good candidate for inline functions.
DOM event handlers: DOM event handlers often make a call to setState or invoke some effect in the React application such as sending data to an API server.
Custom functions or event handlers passed to a child: Oftentimes, a child component requires a custom event handler to be passed down as a prop. An inline function is usually used in this scenario.
Bind in constructor: One of the most common patterns is to define the function within the class component and then bind context to the function in constructor. We only need to bind the current context if we want to use this keyword inside the handler function.
Bind in render: Another common pattern is to bind the context inline when the function is passed down. Eventually, this gets repetitive and hence the first approach is more popular.
There are several other approaches the React dev community has come up with, like using a helper method to bind all functions automatically in the constructor.
After understanding inline functions through these examples, and taking a look at a few alternatives, let’s see why inline functions are so popular and widely used.
Why use inline function
Inline function definitions sit right where they are invoked or passed down. This makes them easier to write, especially when the body of the function is just a few instructions, such as calling setState. This works well within loops too.
For example, when rendering a list and assigning a DOM event handler to each list item, passing down an inline function feels much more intuitive. For the same reason, inline functions also make code more organized and readable.
Inline arrow functions preserve context, which means developers can use this without worrying about the current execution context or explicitly binding a context to the function.
Inline functions make values from the parent scope available within the function definition. This results in more intuitive code, and developers need to pass down fewer parameters. Let’s understand this with an example.
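A React-free sketch of this behavior (renderCounter and its handlers are illustrative names; in a real component, count would come from state or props):

```javascript
// The inline handlers close over `count` from the enclosing scope,
// so nothing needs to be passed to them explicitly.
function renderCounter(count) {
  return {
    onClick: () => count + 1, // uses count without receiving it as a parameter
    onReset: () => 0,
  };
}

const handlers = renderCounter(5);
console.log(handlers.onClick()); // 6
console.log(handlers.onReset()); // 0
```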
Here, the value of count is readily available to the onClick event handlers. This behavior is called closing over: each handler forms a closure over count.
For these reasons, React developers make use of inline functions heavily. That said, inline function has also been a hot topic of debate because of performance concerns. Let’s take a look at a few of these arguments.
Arguments against inline functions
A new function is defined every time the render method is called, resulting in frequent garbage collection and hence a performance loss.
There is an ESLint rule, jsx-no-bind, that advises against using inline functions. The idea behind this rule is that when an inline function is passed down to a child component, React’s reference checks see a new prop value on every render: the new inline function never matches the previous one, which can cause the child component to re-render again and again.
Suppose a ListItem component implements the shouldComponentUpdate method and checks the onClick prop by reference. Since a new inline function is created on every render, ListItem receives a function that points to a different location in memory each time. The comparison in shouldComponentUpdate fails and tells React to re-render ListItem, even though the inline function’s behavior hasn’t changed. This results in unnecessary DOM updates and eventually reduces application performance.
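This reference behavior is easy to demonstrate in plain JavaScript:

```javascript
// Each call to render() defines a brand-new function object.
function render() {
  return () => 'clicked'; // same behavior every time...
}

const firstHandler = render();
const secondHandler = render();

console.log(firstHandler() === secondHandler()); // true: identical behavior
console.log(firstHandler === secondHandler);     // false: different references
```

A shallow prop comparison like the one in shouldComponentUpdate would therefore report a change and trigger a re-render.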
Performance concerns around the Function.prototype.bind method: when not using arrow functions, an inline function that uses the this keyword must be bound to a context before being passed down. Calling .bind before passing down an inline function used to raise performance concerns, but modern engines have since optimized it. For older browsers, Function.prototype.bind can be supplemented with a faster polyfill.
Now that we’ve summarized a few arguments in favor of inline functions and a few arguments against it, let’s investigate and see how inline functions really perform.
Premature optimization can often lead to bad code. For instance, suppose we try to get rid of all the inline function definitions in the component above and move them to the constructor because of performance concerns.
We’d then have to define 3 custom event handlers in the class definition and bind context to all three functions in the constructor.
This would increase the initialization time of the component significantly as opposed to inline function declarations where only one or two functions are defined and used at a time based on the result of condition timeThen > timeNow.
Concerns around render props: A render prop is a method that returns a React element and is used to share state among React components.
Render props are meant to be invoked on each render since they share state between parent components and enclosed React elements. Inline functions are a good candidate for use in render prop and won’t cause any performance concern.
Here, the render prop of the ListView component returns a label enclosed in a div. Since the enclosed component can never know what the value of the item variable is, it can never be a PureComponent or have a meaningful implementation of shouldComponentUpdate(). This eliminates the concerns around using inline functions in render props; in fact, it promotes them in most cases.
In my experience, inline render props can sometimes be harder to maintain, especially when the render prop returns a larger, more complicated component. In such cases, breaking the component down further, or passing a separate method down as the render prop, has worked well for me.
Concerns around PureComponents and shouldComponentUpdate(): Pure components and the various implementations of shouldComponentUpdate both do a shallow, strict-equality comparison of props and state. These act as performance enhancers by letting React know when to trigger a render based on changes to state and props. Since inline functions are created on every render, passing one to a pure component, or to a component that implements shouldComponentUpdate, can lead to an unnecessary render. This is because the reference of the inline function changes each time.
To overcome this, consider skipping checks on function props in shouldComponentUpdate(). This assumes that inline functions passed to the component differ only in reference, not in behavior. If the behavior of a passed function does differ, this will result in a missed render and eventually lead to bugs in the component’s state and effects.
Conclusion
A common rule of thumb is to measure the app’s performance and optimize only when needed. The performance impact of inline functions, often categorized as a micro-optimization, is a trade-off between code readability, code organization, and performance gains that must be weighed carefully on a case-by-case basis; premature optimization should be avoided.
In this blog post, we observed that inline functions don’t necessarily carry a large performance cost. They are widely used because they are easy to write, read, and organize, especially when the definitions are short and simple.
A Kubernetes admission controller is a great way of handling an incoming request: adding or modifying fields, or denying the request according to the rules/configuration defined. To extend the native functionality, these admission webhook controllers call a custom-configured HTTP callback (a webhook server) for additional checks. But the API server communicates with admission webhook servers only over HTTPS and needs the CA information for the server’s TLS certificate. This raises the question of how to handle the webhook server certificate and how to pass the CA information to the API server automatically.
One way to handle the TLS certificate and CA is to use cert-manager. However, cert-manager is itself a large application and consists of many CRDs to handle its operation. It is not a good idea to install cert-manager just to handle an admission webhook’s TLS certificate and CA. The second, and possibly easier, way is to use a self-signed certificate and handle the CA ourselves using an init container. This eliminates the dependency on other applications, like cert-manager, and gives us the flexibility to control our application flow.
How is a custom admission webhook written? We will not cover this in depth; only a basic overview of admission controllers and how they work will be given. The main focus of this blog is to cover the second approach step by step: handling the admission webhook server’s TLS certificate and CA ourselves using an init container, so that the API server can communicate with our custom webhook.
To understand the in-depth working of Admission Controllers, these articles are great:
In-depth introduction to Kubernetes admission webhooks
Knowledge of Kubernetes admission controllers, MutatingAdmissionWebhook, ValidatingAdmissionWebhook
Knowledge of Kubernetes resources like pods and volumes
Basic Overview:
Admission controllers intercept requests to the Kubernetes API server before the persistence of objects in etcd. These controllers are bundled and compiled into the kube-apiserver binary. Among the list of controllers, there are two special ones: MutatingAdmissionWebhook and ValidatingAdmissionWebhook. MutatingAdmissionWebhook, as the name suggests, mutates/adds/modifies fields in the request object by creating a patch, and ValidatingAdmissionWebhook validates the request by checking, per custom logic, whether the request object’s fields are valid, whether the operation is allowed, etc.
The main reason for these types of controllers is to dynamically add new checks along with the native existing checks in Kubernetes to allow a request, just like the plug-in model. To understand this more clearly, let’s say we want all the deployments in the cluster to have certain required labels. If the deployment does not have required labels, then the create deployment request should be denied. This functionality can be achieved in two ways:
1) Add these extra checks natively in the Kubernetes API server codebase, compile a new binary, and run with it. This is a tedious process: every time new checks are needed, a new binary is required.
2) Create a custom admission webhook, a simple HTTP server, for these additional checks, and register it with the API server using the AdmissionRegistration API. Registration is done with two configurations: MutatingWebhookConfiguration and ValidatingWebhookConfiguration. This second approach is recommended and quite easy as well; we will discuss it here in detail.
Custom Admission Webhook Server:
As mentioned earlier, a custom admission webhook server is a simple HTTP server with TLS that exposes endpoints for mutation and validation. Depending on the endpoint hit, the corresponding handler processes mutation or validation. Once the custom webhook server is ready and deployed in the cluster as a deployment, along with a webhook service, the next step is to register it with the API server so the two can communicate. To register, MutatingWebhookConfiguration and ValidatingWebhookConfiguration are used. These configurations have a section for the custom webhook’s information.
Here, the service field gives the name, namespace, and endpoint path of the running webhook server. An important field to note is the CA bundle. A custom admission webhook must run its HTTP server with TLS, because the API server communicates only over HTTPS. So the webhook server runs with a server certificate and key, and the caBundle field in the configuration carries the CA (Certificate Authority) information so that the API server can recognize the server certificate.
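For illustration, a minimal MutatingWebhookConfiguration with these fields might look like the following (the names, namespace, path, and rules are assumptions, not this post’s actual config; caBundle holds the base64-encoded PEM of the CA certificate):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: example-mutating-webhook
webhooks:
  - name: example.webhook.example.com
    clientConfig:
      service:
        name: webhook-svc        # the webhook Service name
        namespace: webhook-ns    # namespace the Service runs in
        path: /mutate            # endpoint handled by the webhook server
      caBundle: "<base64-encoded-CA-PEM>"
    rules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
    admissionReviewVersions: ["v1"]
    sideEffects: None
```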
The problem is how to handle this server certificate and key, and how to obtain the CA bundle and pass it to the API server via MutatingWebhookConfiguration or ValidatingWebhookConfiguration. This will be the main focus of the following part.
Here, we are going to use a self-signed certificate for the webhook server. Now, this self-signed certificate can be made available to the webhook server using different ways. Two possible ways are:
Create a Kubernetes secret containing certificate and key and mount that as volume on to the server pod
Somehow create certificate and key in a volume, e.g., emptyDir volume and server consumes those from that volume
However, even with either of the two approaches above, the remaining important part is adding the CA bundle to the mutation/validation configs.
So, instead of doing all these steps manually, we will make use of a Kubernetes init container to perform them for us.
Custom Admission Webhook Server Init Container:
The main function of this init container will be to create a self-signed webhook server certificate and provide the CA bundle to the API server via mutation/validation configs. How the webhook server consumes this certificate (via secret volume or emptyDir volume), depends on the use-case. This init container will run a simple Go binary to perform all these functions.
The steps to generate self-signed CA and sign webhook server certificate using this CA in Golang:
Create a config for the CA, ca in the code above.
Create an RSA private key for this CA, caPrivKey in the code above.
Generate a self-signed CA, caBytes, and caPEM above. Here caPEM is the PEM encoded caBytes and will be the CA bundle given to the API server.
Create a config for webhook server certificate, cert in the code above. The important field in this configuration is the DNSNames and commonName. This name must be the full webhook service name of the webhook server to reach the webhook pod.
Create an RSA private key for the webhook server, serverPrivKey in the code above.
Create server certificate using ca and caPrivKey, serverCertBytes in the code above.
Now, PEM encode the serverPrivKey and serverCertBytes. This serverPrivKeyPEM and serverCertPEM is the TLS certificate and key and will be consumed by the webhook server.
At this point, we have generated the required certificate, key, and CA bundle using init container. Now we will share this server certificate and key with the actual webhook server container in the same pod.
One approach is to create an empty Secret resource beforehand and create the webhook deployment with the Secret name passed as an environment variable. The init container generates the server certificate and key and populates the empty Secret with them. This Secret is then mounted onto the webhook server container to start the HTTP server with TLS.
The second approach (used in the code above) is to use Kubernetes’ native pod-scoped emptyDir volume, shared between both containers. In the code above, the init container writes the certificate and key to files on a particular path. That path is where the emptyDir volume is mounted, and the webhook server container reads the certificate and key for its TLS configuration from there and starts the HTTP webhook server. Refer to the diagram below:
The only part remaining is to give this CA bundle information to the API server using mutation/validation configs. This can be done in two ways:
Patch the CA bundle in the existing MutatingWebhookConfiguration or ValidatingWebhookConfiguration using Kubernetes go-client in the init container.
Create MutatingWebhookConfiguration or ValidatingWebhookConfiguration in the init container itself with CA bundle information in configs.
Here, we will create the configs through the init container. To obtain certain parameters dynamically, like the mutation config name, webhook service name, and webhook namespace, we can read them from the init container’s env:
The code above is just sample code to create a MutatingWebhookConfiguration. First, we import the required packages. Then we read environment variables such as webhookNamespace. Next, we define the MutatingWebhookConfiguration struct with the CA bundle information (created earlier) and the other required fields. Finally, we create the configuration using the go-client. The same approach can be followed for creating the ValidatingWebhookConfiguration. For cases of pod restart or deletion, we can add extra logic in the init container, such as deleting existing configs before creating new ones, or updating only the CA bundle if the configs already exist.
For certificate rotation, the process differs depending on how the certificate is served to the server container:
If we are using emptyDir volume, then the approach will be to just restart the webhook pod. As emptyDir volume is ephemeral and bound to the lifecycle of the pod, on restart, a new certificate will be generated and served to the server container. A new CA bundle will be added in configs if configs already exist.
If we are using secret volume, then, while restarting the webhook pod, the expiration of the existing certificate from the secret can be checked to decide whether to use the existing certificate for the server or create a new one.
In both cases, a webhook pod restart is required to trigger the certificate rotation/renewal process. When and how the webhook pod is restarted will vary by use-case; a few possible ways are cron jobs, controllers, etc.
Now, our custom webhook is registered, the API server can read CA bundle information through configs, and the webhook server is ready to serve the mutation/validation requests as per rules defined in configs.
Conclusion:
We covered how to add additional mutation/validation checks by registering our own custom admission webhook server. We also covered how to automatically handle the webhook server’s TLS certificate and key using init containers, and how to pass the CA bundle information to the API server through the mutation/validation configs.
The Internet of Things (IoT) is maturing rapidly and finding applications across various industries. Every common device we use is turning into a smart device, and smart devices are essentially IoT devices. These devices capture various parameters in and around their environment, generating a huge amount of data. This data needs to be collected, processed, stored, and analyzed in order to extract actionable insights. To do so, we need to build a data pipeline. In this blog we will build such a pipeline using Mosquitto, Kinesis, InfluxDB, and Grafana. We will discuss each component of the pipeline and the steps to build it.
Why the Analysis of IoT data is different
In an IoT setup, data is generated by sensors distributed across various locations. In order to use the data they generate, we should first bring it to a common location from which the various applications that want to process it can read it.
Network Protocol
IoT devices have low computational and network resources. Moreover, these devices write data at very short intervals, so high throughput is expected on the network. For transferring IoT data, it is desirable to use lightweight network protocols. A protocol like HTTP uses a complex structure for communication, which consumes more resources and makes it unsuitable for IoT data transfer. One lightweight protocol suitable for IoT data is MQTT, which we are using in our pipeline. MQTT is designed for machine-to-machine (M2M) connectivity. It uses a publisher/subscriber communication model and helps clients distribute telemetry data with very low network resource consumption. Beyond IoT, MQTT has been found useful in other fields as well.
Other similar protocols include the Constrained Application Protocol (CoAP), the Advanced Message Queuing Protocol (AMQP), etc.
Datastore
IoT devices generally collect telemetry about their environment, usually through sensors. In most IoT scenarios, we try to analyze how things have changed over a period of time. Storing this data in a time series database makes our analysis simpler and better. InfluxDB is a popular time series database, which we will use in our pipeline. More about time series databases can be read here.
Pipeline Overview
The first thing we need for a data pipeline is data. As shown in the image above, the data generated by various sensors is written to a topic in the MQTT message broker. To mimic sensors, we will use a program that uses an MQTT client to write data to the MQTT broker.
The next component is Amazon Kinesis, which is used for streaming data analysis. It closely resembles Apache Kafka, an open-source tool used for similar purposes. Kinesis brings the data generated by a number of clients to a single location from which different consumers can pull it for processing. We are using Kinesis so that multiple consumers can read data from a single location. This approach scales well even with multiple message brokers.
Once the data is written to the MQTT broker, a Kinesis producer subscribed to the topic pulls the data and writes it to the Kinesis stream. From the Kinesis stream, the data is pulled by Kinesis consumers, which process it and write it to InfluxDB, a time series database.
Finally, we use Grafana, a well-known tool for analytics and monitoring. It can connect to many popular databases to perform analytics and monitoring. Another popular tool in this space is Kibana (the K of the ELK stack).
Setting up a MQTT Message Broker Server:
For the MQTT message broker, we will use Mosquitto, a popular open-source message broker that implements MQTT. The details of downloading and installing Mosquitto for various platforms are available here.
For Ubuntu, it can be installed using the following commands
```shell
sudo apt-add-repository ppa:mosquitto-dev/mosquitto-ppa
sudo apt-get update
sudo apt-get install mosquitto
service mosquitto status
```
Setting up InfluxDB and Grafana
The simplest way to set up both components is to use their Docker images directly:
```shell
docker run --name influxdb -p 8083:8083 -p 8086:8086 influxdb:1.0
docker run --name grafana -p 3000:3000 --link influxdb grafana/grafana:3.1.1
```
For InfluxDB we have mapped two ports: port 8086 is the HTTP API endpoint, while 8083 is the administration web UI. We need to create a database where we will write our data.
To create a database, we can go directly to the console at <influxdb-ip>:8083 and run the CREATE DATABASE <db-name> command.
In Kinesis, we create streams where the Kinesis producers write the data coming from various sources and the Kinesis consumers read the data from the stream. Within a stream, the data is stored in shards. For our purpose, one shard is enough.
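Assuming you have the AWS CLI configured, a one-shard stream can be created with something like the following (the stream name `iot-sensor-stream` is just an illustration, not a name used elsewhere in this article):

```
# create a Kinesis stream with a single shard (stream name is illustrative)
aws kinesis create-stream --stream-name iot-sensor-stream --shard-count 1

# block until the stream becomes ACTIVE before writing to it
aws kinesis wait stream-exists --stream-name iot-sensor-stream
```

The same stream can of course also be created from the AWS web console.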
Creating the MQTT client
We will use the Golang client available in this repository to connect to our message broker server and write data to a specific topic. We first create a new MQTT client; here we can see the list of options available for configuring it.
Once we create the options object, we can pass it to the NewClient() method, which returns the MQTT client. Now we can write data to the MQTT server. We have defined the structure of the data in the SensorData struct. To mimic two sensors writing telemetry data to the MQTT broker, we run two goroutines that push data to the MQTT server every five seconds.
package publisher

import (
	"config"
	"encoding/json"
	"fmt"
	"log"
	"math/rand"
	"os"
	"time"

	"github.com/eclipse/paho.mqtt.golang"
)

type SensorData struct {
	Id          string  `json:"id"`
	Temperature float64 `json:"temperature"`
	Humidity    float64 `json:"humidity"`
	Timestamp   int64   `json:"timestamp"`
	City        string  `json:"city"`
}

func StartMQTTPublisher() {
	fmt.Println("MQTT publisher Started")
	mqtt.DEBUG = log.New(os.Stdout, "", 0)
	mqtt.ERROR = log.New(os.Stdout, "", 0)

	opts := mqtt.NewClientOptions().AddBroker(config.GetMqttServerurl()).SetClientID("MqttPublisherClient")
	opts.SetKeepAlive(2 * time.Second)
	opts.SetPingTimeout(1 * time.Second)

	c := mqtt.NewClient(opts)
	if token := c.Connect(); token.Wait() && token.Error() != nil {
		panic(token.Error())
	}

	go func() {
		t := 20.04
		h := 32.06
		for i := 0; i < 100; i++ {
			sensordata := SensorData{
				Id:          "CITIMUM",
				Temperature: t,
				Humidity:    h,
				Timestamp:   time.Now().Unix(),
				City:        "Mumbai",
			}
			requestBody, err := json.Marshal(sensordata)
			if err != nil {
				fmt.Println(err)
			}
			token := c.Publish(config.GetMQTTTopicName(), 0, false, requestBody)
			token.Wait()
			if i < 50 {
				t = t + 1*rand.Float64()
				h = h + 1*rand.Float64()
			} else {
				t = t - 1*rand.Float64()
				h = h - 1*rand.Float64()
			}
			time.Sleep(5 * time.Second)
		}
	}()

	go func() {
		t := 16.02
		h := 24.04
		for i := 0; i < 100; i++ {
			sensordata := SensorData{
				Id:          "CITIPUN",
				Temperature: t,
				Humidity:    h,
				Timestamp:   time.Now().Unix(),
				City:        "Pune",
			}
			requestBody, err := json.Marshal(sensordata)
			if err != nil {
				fmt.Println(err)
			}
			token := c.Publish(config.GetMQTTTopicName(), 0, false, requestBody)
			token.Wait()
			if i < 50 {
				t = t + 1*rand.Float64()
				h = h + 1*rand.Float64()
			} else {
				t = t - 1*rand.Float64()
				h = h - 1*rand.Float64()
			}
			time.Sleep(5 * time.Second)
		}
	}()

	time.Sleep(1000 * time.Second)
	c.Disconnect(250)
}
Create a Kinesis Producer
Now we will create a Kinesis producer that subscribes to the topic our MQTT client writes to, pulls the data from the broker, and pushes it to the Kinesis stream. Just like in the previous section, we first create an MQTT client that connects to the message broker and subscribes to the topic our clients/publishers write to. In the client options, we can define a callback function that is invoked whenever data is written to this topic. We have created a function postDataTokinesisStream() that connects to Kinesis using the Kinesis client and writes the data to the Kinesis stream every time a message is pushed to the topic.
Now that the data is available in our Kinesis stream, we can pull it for processing. In the Kinesis consumer, we create a Kinesis client just like we did in the previous section and then pull data from the stream. We first call the DescribeStream method, which returns the shardId; we then use this shardId to get a ShardIterator, and finally we fetch the records by passing the ShardIterator to the GetRecords() method. GetRecords() also returns a NextShardIterator, which we can use to continuously poll for records in the shard until NextShardIterator becomes null.
Next, we do some simple processing to filter the data. The data we get from the sensor has the fields sensorId, temperature, humidity, city, and timestamp, but we are only interested in the temperature and humidity values for a city, so we have created a new struct, SensorDataFiltered, which contains only the fields we need.
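As a sketch of that filtering step (the full-record field names follow the SensorData struct shown earlier; the exact JSON tags of SensorDataFiltered are an assumption for illustration), the transformation is a plain unmarshal/remarshal:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SensorData mirrors the full record written by the publisher.
type SensorData struct {
	Id          string  `json:"id"`
	Temperature float64 `json:"temperature"`
	Humidity    float64 `json:"humidity"`
	Timestamp   int64   `json:"timestamp"`
	City        string  `json:"city"`
}

// SensorDataFiltered keeps only the fields we care about downstream.
type SensorDataFiltered struct {
	Temperature float64 `json:"temperature"`
	Humidity    float64 `json:"humidity"`
	City        string  `json:"city"`
}

// filterRecord drops the fields we do not need from a raw Kinesis record.
func filterRecord(raw []byte) ([]byte, error) {
	var full SensorData
	if err := json.Unmarshal(raw, &full); err != nil {
		return nil, err
	}
	filtered := SensorDataFiltered{
		Temperature: full.Temperature,
		Humidity:    full.Humidity,
		City:        full.City,
	}
	return json.Marshal(filtered)
}

func main() {
	raw := []byte(`{"id":"CITIMUM","temperature":21.5,"humidity":33.1,"timestamp":1592222222,"city":"Mumbai"}`)
	out, err := filterRecord(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"temperature":21.5,"humidity":33.1,"city":"Mumbai"}
}
```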
For every record the Kinesis consumer receives, it creates an instance of the SensorDataFiltered type and calls the PostDataToInfluxDB() method, where the record received from the Kinesis stream is unmarshaled into the SensorDataFiltered type and sent to InfluxDB. Here we need to provide the name of the database we created earlier in the variable dbName, and the InfluxDB host and port values in dbHost and dbPort respectively.
In the InfluxDB request body, the first value we provide is used as the measurement, which is an InfluxDB construct for storing similar data together. Next come the tags; we have used `city` as our tag so that we can filter the data by city. Finally come the actual field values. For more details on the InfluxDB data write format, please refer here.
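To make the write format concrete, here is a small sketch that builds one point in InfluxDB 1.x line protocol from the filtered data (the measurement name `sensor_data` is an assumption for illustration; the layout is `measurement,tag=value field=value,... timestamp`):

```go
package main

import (
	"fmt"
	"strconv"
)

// toLineProtocol renders one point in InfluxDB 1.x line protocol:
//   measurement,tagKey=tagValue field1=v1,field2=v2 timestamp
func toLineProtocol(measurement, city string, temperature, humidity float64, ts int64) string {
	return measurement +
		",city=" + city + // tag used for filtering in Grafana queries
		" temperature=" + strconv.FormatFloat(temperature, 'f', -1, 64) +
		",humidity=" + strconv.FormatFloat(humidity, 'f', -1, 64) +
		" " + strconv.FormatInt(ts, 10)
}

func main() {
	line := toLineProtocol("sensor_data", "Mumbai", 21.5, 33.1, 1592222222)
	fmt.Println(line) // sensor_data,city=Mumbai temperature=21.5,humidity=33.1 1592222222
	// This body would be POSTed to http://<dbHost>:<dbPort>/write?db=<dbName>&precision=s
}
```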
Once the data is written to InfluxDB, we can see it in the web console by querying the measurement created in our database.
Putting everything together in our main function
Now we simply call the functions discussed above and run our main program. Note that we have used `go` before the first two function calls, which makes them goroutines so that they execute concurrently.
On running the code, you will see logs for all the stages of our pipeline written to stdout. This closely resembles real-life scenarios where data written by IoT devices gets processed in near real time.
package main

import (
	"time"

	"velotio.com/consumer"
	"velotio.com/producer"
	"velotio.com/publisher"
)

func main() {
	go producer.StartKinesisProducer()
	go publisher.StartMQTTPublisher()
	time.Sleep(5 * time.Second)
	consumer.StartKinesisConsumer()
}
Visualization through Grafana
We can access the Grafana web console at port 3000 of the machine on which it is running. First, we need to add our InfluxDB instance under the data sources option.
For creating dashboard go to the dashboard option and choose new. Once the dashboard is created we can start by adding a panel.
We then select the InfluxDB data source we added earlier as the panel data source and write queries as shown in the image below.
We can repeat the same process for adding another panel to the dashboard this time choosing a different city in our query.
Conclusion:
IoT data analytics is a fast-evolving and interesting space. The number of IoT devices is growing rapidly, and there is a great opportunity to extract valuable insights from the huge amount of data these devices generate. In this blog, I tried to help you grab that opportunity by building a near-real-time data pipeline for IoT data. If you liked it, please share and subscribe to our blog.
Are your EC2 instances under-utilized? Have you been wondering if there was an easy way to maximize the EC2/VM usage?
Are you investing too much in your Control Plane and wish you could divert some of that investment towards developing more features in your applications (business logic)?
Is your Configuration Management system overwhelming you and seems to have got a life of its own?
Do you have legacy applications that do not need Docker at all?
Would you like to simplify your deployment toolchain to streamline your workflows?
Have you been recommended Kubernetes as a panacea to fix all your woes, but you aren’t sure if Kubernetes is actually going to help you?
Do you feel you are moving towards Docker, just so that Kubernetes can be used?
If you answered “Yes” to any of the questions above, do read on, this article is just what you might need.
There are steps to create a simple setup on your laptop at the end of the article.
Introduction
In this article, we will present the typical components of a multi-tier application and how it is set up and deployed.
We shall then see how the same application deployment can be remodeled for scale using any cloud infrastructure. (The same software toolchain can be used to deploy the application on your on-premise infrastructure as well.)
The tools that we propose are Nomad and Consul. We shall focus more on how to use these tools, rather than deep-dive into the specifics of the tools. We will briefly see the features of the software which would help us achieve our goals.
Nomad is a distributed workload manager, not only for Docker containers but also for various other types of workloads like legacy applications, Java, LXC, etc.
Consul is a distributed service mesh, with features like service registry and a key-value store, among others.
Using these tools, the application/startup workflow would be as follows:
Nomad will be responsible for starting the service.
Nomad will publish the service information in Consul. The service information will include details like:
Where is the application running (IP:PORT) ?
What “service-name” is used to identify the application?
What “tags” (metadata) does this application have?
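The service definition Nomad registers answers these questions; in Consul's JSON form it looks roughly like this (all names, addresses, and ports here are illustrative):

```json
{
  "ID": "service-a-instance-1",
  "Name": "service-a",
  "Tags": ["urlprefix-/path-a"],
  "Address": "192.168.1.201",
  "Port": 23382
}
```

Any Consul-aware component can query this registry to find where "service-a" currently lives.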
A Typical Application
A typical application deployment consists of a certain fixed set of processes, usually coupled with a database and a set of few (or many) peripheral services.
These services could be primary (must-have) or support (optional) features of the application.
Note: We are aware of what a proper “service-oriented architecture” should look like, though we will skip that discussion for now and focus instead on how real-world applications are set up and deployed.
Simple Multi-tier Application
In this section, let’s see the components of a multi-tier application along with typical access patterns from outside the system and within the system.
Load Balancer/Web/Front End Tier
Application Services Tier
Database Tier
Utility (or Helper Servers): To run background, cron, or queued jobs.
Using a proxy/loadbalancer, the services (Service-A, Service-B, Service-C) could be accessed using distinct hostnames:
a.example.tld
b.example.tld
c.example.tld
For an equivalent path-based routing approach, the setup would be similar. Instead of distinct hostnames, the communication mechanism would be:
common-proxy.example.tld/path-a/
common-proxy.example.tld/path-b/
common-proxy.example.tld/path-c/
Problem Scenario 1
Some of the basic problems with the deployment of the simple multi-tier application are:
What if the service process crashes during its runtime?
What if the host on which the services run shuts down, reboots or terminates?
This is where Nomad’s “always keep the service running” feature would be useful.
In spite of this auto-restart feature, there could be issues if the service restarts on a different machine (i.e. different IP address).
In case of Docker and ephemeral ports, the service could start on a different port as well.
To solve this, we will use the service discovery feature provided by Consul, combined with a Consul-aware load-balancer/proxy to redirect traffic to the appropriate service.
The order of the operations within the Nomad job will thus be:
Nomad will launch the job/task.
Nomad will register the task details as a service definition in Consul. (These steps will be re-executed if/when the application is restarted due to a crash/fail-over)
The Consul-aware load-balancer will route the traffic to the service (IP:PORT)
Multi-tier Application With Load Balancer
Using the Consul-aware load-balancer, the diagram will now look like:
The details of the setup now are:
A Consul-aware load-balancer/proxy; the application will access the services via the load-balancer.
3 (three) instances of service A; A1, A2, A3
3 (three) instances of service B; B1, B2, B3
The Routing Question
At this moment, you could be wondering, “Why/How would the load-balancer know that it has to route traffic for service-A to A1/A2/A3 and route traffic for service-B to B1/B2/B3 ?”
The answer lies in the Consul tags which will be published as part of the service definition (when Nomad registers the service in Consul).
The appropriate Consul tags will tell the load-balancer to route traffic for a particular service to the appropriate backend.
Let’s read that statement again (very slowly, just to be sure); The Consul tags, which are part of the service definition, will inform (advertise) the load-balancer to route traffic to the appropriate backend.
This distinction is important to dwell on, as it differs from how classic load-balancer/proxy software like HAProxy or NGINX is configured. For HAProxy/NGINX, the backend routing information resides with the load-balancer instance and is not “advertised” by the backend.
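With Fabio (the Consul-aware load-balancer used in the demo later in this article), this advertising happens through tags in the Nomad job's service stanza; a sketch (service name and URL prefix are illustrative):

```hcl
service {
  name = "foo"
  port = "http"
  # Fabio watches Consul and routes requests for /foo to every healthy
  # instance advertising this tag; no load-balancer reconfiguration needed.
  tags = ["urlprefix-/foo"]

  check {
    type     = "http"
    path     = "/"
    interval = "10s"
    timeout  = "2s"
  }
}
```

When an allocation moves to a different node or port, Nomad re-registers the service with the new address, and Fabio picks up the change automatically.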
Traditional load-balancers like NGINX/HAProxy do not natively support dynamic reloading of backends (when the backends stop, start, or move around). The heavy lifting of regenerating the configuration file and reloading the service is left to an external entity like Consul-Template.
The use of a Consul-aware load-balancer, instead of a traditional load-balancer, eliminates the need of external workarounds.
The setup can thus be termed as a zero-configuration setup; you don’t have to re-configure the load-balancer, it will discover the changing backend services based on the information available from Consul.
Problem Scenario 2
So far we have achieved a method to “automatically” discover the backends, but isn’t the Load-Balancer itself a single-point-of-failure (SPOF)?
It absolutely is, and you should always have redundant load-balancer instances (which is what any cloud-provided load-balancer has).
As there is a certain cost associated with using a cloud-provided load-balancer, we will create the load-balancers ourselves instead.
To provide redundancy for the load-balancer instances, you should configure them using an AutoScaling Group (AWS), VM Scale Sets (Azure), etc.
The same redundancy strategy should also be used for the worker nodes, where the actual services reside, by using AutoScaling Groups/VMSS for the worker nodes.
The Complete Picture
Installation and Configuration
Given that nowadays laptops are pretty powerful, you can easily create a test setup on your laptop using VirtualBox, VMware Workstation Player, VMware Workstation, etc.
As a prerequisite, you will need a few virtual machines which can communicate with each other.
NOTE: Create the VMs with networking set to bridged mode.
The machines needed for the simple setup/demo would be:
1 Linux VM to act as a server (srv1)
1 Linux VM to act as a load-balancer (lb1)
2 Linux VMs to act as worker machines (client1, client2)
*** Each machine can have 2 CPUs and 1 GB of memory.
The configuration files and scripts needed for the demo, which will help you set up the Nomad and Consul cluster are available here.
Setup the Server
Install the binaries on the server:
# install the Consul binary
wget https://releases.hashicorp.com/consul/1.7.3/consul_1.7.3_linux_amd64.zip -O consul.zip
unzip -o consul.zip
sudo chown root:root consul
sudo mv -fv consul /usr/sbin/

# install the Nomad binary
wget https://releases.hashicorp.com/nomad/0.11.3/nomad_0.11.3_linux_amd64.zip -O nomad.zip
unzip -o nomad.zip
sudo chown root:root nomad
sudo mv -fv nomad /usr/sbin/

# install Consul's service file
wget https://raw.githubusercontent.com/shantanugadgil/hashistack/master/systemd/consul.service -O consul.service
sudo chown root:root consul.service
sudo mv -fv consul.service /etc/systemd/system/consul.service

# install Nomad's service file
wget https://raw.githubusercontent.com/shantanugadgil/hashistack/master/systemd/nomad.service -O nomad.service
sudo chown root:root nomad.service
sudo mv -fv nomad.service /etc/systemd/system/nomad.service
Create the Server Configuration
### On the server machine ...

### Consul
sudo mkdir -p /etc/consul/
sudo wget https://raw.githubusercontent.com/shantanugadgil/hashistack/master/config/consul/server.hcl -O /etc/consul/server.hcl

### Edit Consul's server.hcl file and set up the fields 'encrypt' and 'retry_join' as per your cluster.
sudo vim /etc/consul/server.hcl

### Nomad
sudo mkdir -p /etc/nomad/
sudo wget https://raw.githubusercontent.com/shantanugadgil/hashistack/master/config/nomad/server.hcl -O /etc/nomad/server.hcl

### Edit Nomad's server.hcl file and set up the fields 'encrypt' and 'retry_join' as per your cluster.
sudo vim /etc/nomad/server.hcl

### After you are done with the edits ...
sudo systemctl daemon-reload
sudo systemctl enable consul nomad
sudo systemctl restart consul nomad
sleep 10
sudo consul members
sudo nomad server members
Create the Load-Balancer Configuration

### On the load-balancer machine ...

### Consul
sudo mkdir -p /etc/consul/
sudo wget https://raw.githubusercontent.com/shantanugadgil/hashistack/master/config/consul/client.hcl -O /etc/consul/client.hcl

### Edit Consul's client.hcl file and set up the fields 'name', 'encrypt', and 'retry_join' as per your cluster.
sudo vim /etc/consul/client.hcl

### Nomad
sudo mkdir -p /etc/nomad/
sudo wget https://raw.githubusercontent.com/shantanugadgil/hashistack/master/config/nomad/client.hcl -O /etc/nomad/client.hcl

### Edit Nomad's client.hcl file and set up the fields 'name', 'node_class', 'encrypt', and 'retry_join' as per your cluster.
sudo vim /etc/nomad/client.hcl

### After you are done with the edits ...
sudo systemctl daemon-reload
sudo systemctl enable consul nomad
sudo systemctl restart consul nomad
sleep 10
sudo consul members
sudo nomad server members
sudo nomad node status -verbose
Create the Client (Worker) Configuration

### On the client (worker) machine ...

### Consul
sudo mkdir -p /etc/consul/
sudo wget https://raw.githubusercontent.com/shantanugadgil/hashistack/master/config/consul/client.hcl -O /etc/consul/client.hcl

### Edit Consul's client.hcl file and set up the fields 'name', 'encrypt', and 'retry_join' as per your cluster.
sudo vim /etc/consul/client.hcl

### Nomad
sudo mkdir -p /etc/nomad/
sudo wget https://raw.githubusercontent.com/shantanugadgil/hashistack/master/config/nomad/client.hcl -O /etc/nomad/client.hcl

### Edit Nomad's client.hcl file and set up the fields 'name', 'node_class', 'encrypt', and 'retry_join' as per your cluster.
sudo vim /etc/nomad/client.hcl

### After you are sure about your edits ...
sudo systemctl daemon-reload
sudo systemctl enable consul nomad
sudo systemctl restart consul nomad
sleep 10
sudo consul members
sudo nomad server members
sudo nomad node status -verbose
Test the Setup
For the sake of simplicity, we shall assume the following IP addresses for the machines. (You can adapt the IPs as per your actual cluster configuration)
srv1: 192.168.1.11
lb1: 192.168.1.101
client1: 192.168.1.201
client2: 192.168.1.202
You can access the web GUI for Consul and Nomad at the following URLs:
Consul: http://192.168.1.11:8500
Nomad: http://192.168.1.11:4646
Log into the server and start the following watch command:
# watch -n 5 "consul members; echo; nomad server members; echo; nomad node status -verbose; echo; nomad job status"
Output:
Node     Address             Status  Type    Build  Protocol  DC   Segment
srv1     192.168.1.11:8301   alive   server  1.5.1  2         dc1  <all>
client1  192.168.1.201:8301  alive   client  1.5.1  2         dc1  <default>
client2  192.168.1.202:8301  alive   client  1.5.1  2         dc1  <default>
lb1      192.168.1.101:8301  alive   client  1.5.1  2         dc1  <default>

Name         Address       Port  Status  Leader  Protocol  Build  Datacenter  Region
srv1.global  192.168.1.11  4648  alive   true    2         0.9.3  dc1         global

ID           DC   Name     Class   Address        Version  Drain  Eligibility  Status
37daf354...  dc1  client2  worker  192.168.1.202  0.9.3    false  eligible     ready
9bab72b1...  dc1  client1  worker  192.168.1.201  0.9.3    false  eligible     ready
621f4411...  dc1  lb1      lb      192.168.1.101  0.9.3    false  eligible     ready
Submit Jobs
Log into the server (srv1) and download the sample jobs.
Run the load-balancer job
# nomad run fabio_docker.nomad
Output:
==> Monitoring evaluation "bb140467"
    Evaluation triggered by job "fabio_docker"
    Allocation "1a6a5587" created: node "621f4411", group "fabio"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "bb140467" finished with status "complete"
Check the status of the load-balancer
# nomad alloc status 1a6a5587
Output:
ID                  = 1a6a5587
Eval ID             = bb140467
Name                = fabio_docker.fabio[0]
Node ID             = 621f4411
Node Name           = lb1
Job ID              = fabio_docker
Job Version         = 0
Client Status       = running
Client Description  = Tasks are running
Desired Status      = run
Desired Description = <none>
Created             = 1m9s ago
Modified            = 1m3s ago

Task "fabio" is "running"

Task Resources
CPU        Memory          Disk     Addresses
5/200 MHz  10 MiB/128 MiB  300 MiB  lb: 192.168.1.101:9999
                                    ui: 192.168.1.101:9998

Task Events:
Started At     = 2019-06-13T19:15:17Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                  Type        Description
2019-06-13T19:15:17Z  Started     Task started by client
2019-06-13T19:15:12Z  Driver      Downloading image
2019-06-13T19:15:12Z  Task Setup  Building Task Directory
2019-06-13T19:15:12Z  Received    Task received by client
Run the service ‘foo’
# nomad run foo_docker.nomad
Output:
==> Monitoring evaluation "a994bbf0"
    Evaluation triggered by job "foo_docker"
    Allocation "7794b538" created: node "9bab72b1", group "gowebhello"
    Allocation "eecceffc" modified: node "37daf354", group "gowebhello"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "a994bbf0" finished with status "complete"
Check the status of service ‘foo’
# nomad alloc status 7794b538
Output:
ID                  = 7794b538
Eval ID             = a994bbf0
Name                = foo_docker.gowebhello[1]
Node ID             = 9bab72b1
Node Name           = client1
Job ID              = foo_docker
Job Version         = 1
Client Status       = running
Client Description  = Tasks are running
Desired Status      = run
Desired Description = <none>
Created             = 9s ago
Modified            = 7s ago

Task "gowebhello" is "running"

Task Resources
CPU        Memory           Disk     Addresses
0/500 MHz  4.2 MiB/256 MiB  300 MiB  http: 192.168.1.201:23382

Task Events:
Started At     = 2019-06-13T19:27:17Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                  Type        Description
2019-06-13T19:27:17Z  Started     Task started by client
2019-06-13T19:27:16Z  Task Setup  Building Task Directory
2019-06-13T19:27:15Z  Received    Task received by client
Run the service ‘bar’
# nomad run bar_docker.nomad
Output:
==> Monitoring evaluation "075076bc"
    Evaluation triggered by job "bar_docker"
    Allocation "9f16354b" created: node "9bab72b1", group "gowebhello"
    Allocation "b86d8946" created: node "37daf354", group "gowebhello"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "075076bc" finished with status "complete"
Check the status of service ‘bar’
# nomad alloc status 9f16354b
Output:
ID                  = 9f16354b
Eval ID             = 075076bc
Name                = bar_docker.gowebhello[1]
Node ID             = 9bab72b1
Node Name           = client1
Job ID              = bar_docker
Job Version         = 0
Client Status       = running
Client Description  = Tasks are running
Desired Status      = run
Desired Description = <none>
Created             = 4m28s ago
Modified            = 4m16s ago

Task "gowebhello" is "running"

Task Resources
CPU        Memory           Disk     Addresses
0/500 MHz  6.2 MiB/256 MiB  300 MiB  http: 192.168.1.201:23646

Task Events:
Started At     = 2019-06-14T06:49:36Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                  Type        Description
2019-06-14T06:49:36Z  Started     Task started by client
2019-06-14T06:49:35Z  Task Setup  Building Task Directory
2019-06-14T06:49:35Z  Received    Task received by client
Check the Fabio Routes
http://192.168.1.101:9998/routes
Connect to the Services
The services “foo” and “bar” are available at:
http://192.168.1.101:9999/foo
http://192.168.1.101:9999/bar
Output:
gowebhello root page
https://github.com/udhos/gowebhello is a simple golang replacement for 'python -m SimpleHTTPServer'.
Welcome!
gowebhello version 0.7 runtime go1.12.5 os=linux arch=amd64
Keepalive: true
Application banner: Welcome to FOO
......
Pressing F5 to refresh the browser should keep changing the backend instance you are connected to.
Conclusion
This article should give you a fair idea about the common problems of a distributed application and how they can be solved.
Remodeling an existing application deployment as it scales can be quite a challenge. Hopefully the sample/demo setup will help you to explore, design and optimize the deployment workflows of your application, be it On-Premise or any Cloud Environment.
If you are coming from a robust framework, such as Angular or any other major full-stack framework, you have probably asked yourself why a popular library like React (yes, it’s not a framework, hence this blog) has the worst tooling and developer experience.
They’ve done the least amount of work possible to build this library: no routing, no SSR support, no decent design system, no CSS support. Some people might disagree: “The whole idea is to keep it simple so that people can bootstrap their own framework.” (Dan Abramov). However, here’s the catch: most people don’t want to go through the tedious process of setting all that up themselves.
Many just want to install and start building some robust applications, and with the new release of Next.js (12), it’s more production-ready than your own setup can ever be.
Before we get started discussing what Next.js 12 can do for us, let’s get some facts straight:
React is indeed a library that could be used with or without JSX.
Next.js is a framework (not entirely UI) for building full-stack applications.
Next.js is opinionated, so if your plan is to do whatever you want, however you want, maybe Next isn’t the right thing for you (mind that it’s built for production).
Although Next has one of the most actively updated code bases and a massive community supporting it, a huge portion of it is handled by Vercel, and like other frameworks backed by a tech giant, be ready for occasional vendor lock-in (don’t forget React [Meta]).
This is not a Next.js tutorial; I won’t be going over Next.js itself. I will go over the features released with v12 that push it over the inflection point where Next can be considered the primary framework for React apps.
ES module support
ES modules bring a standardized module system to the entire JS ecosystem. They’re supported by all major browsers and Node.js, enabling smaller package sizes in your build. This lets you import any package from a URL, with no installation or build step required, use any CDN that serves ES modules, and use the design tools of the future (Framer already does it: https://www.framer.com/).
import Card from 'https://framer.com/m/Card-3Yxh.js@gsb1Gjlgc5HwfhuD1VId';
import Head from 'next/head';
import Document from 'next/document';

export default class MyDocument extends Document {
  render() {
    return (
      <>
        <Head>
          <title>URL imports for Next 12</title>
        </Head>
        <div>
          <Card variant='R3F' />
        </div>
      </>
    );
  }
}
As you can see, we are importing a Card component directly from the Framer CDN, on the fly, with all its perks. This would in turn be the start of seamless integration with all your developer environments in the not-too-distant future. If you want to learn more about URL imports and how to enable the alpha version, go here.
A new engine for faster dev runs and production builds:
Next.js 12 comes with a new Rust compiler with native infrastructure, built on top of SWC, an open platform for fast tooling. It comes with impressive stats: 3x faster local refresh and 5x faster production builds.
Contrary to most production builds of React apps using webpack, which come with a ton of overhead and don’t really run on the native system, SWC is going to save you a ton of the time you waste on mundane workloads.
Next.js Live:
If you are anything like me, you’ve probably had some changes you aren’t really sure about and just want to go through with the designer, but you don’t really want to push the code to PROD. Taking a call with the designer and sharing your screen isn’t really the best way to do it. If only there were a way to share your workflow on the go with your team, with collaboration features, that wouldn’t take an entire day to set up. Well, Next.js Live lets you do just that.
With the help of the ES module system and native support for WebAssembly, Next.js Live runs entirely in the browser, irrespective of where you host it. The development engine behind it will soon be open source so that more platforms can take advantage of it, but for now, it’s all Next.js.
Middleware:
Middleware is just repetitive pieces of code that could run on their own, outside your actual backend. The best part is that you don’t really need to place it close to your backend. Before a request gets completed, you can rewrite, redirect, add headers, or even stream HTML. Depending on how you host your middleware, using Vercel edge functions or lambdas with AWS, it can potentially handle:
Authentication
Bot protection
Redirects
Browser support
Feature flags
A/B tests
Server-side analytics
Logging
And since this is part of the Next build output, you can technically use any hosting provider with an edge network (no vendor lock-in).
To implement middleware, we create a file named _middleware inside any pages folder; it will run before any request to that particular route (routeName):
pages/routeName/_middleware.ts.
import type { NextFetchEvent } from 'next/server';
import { NextResponse } from 'next/server';

export function middleware(event: NextFetchEvent) {
  // grab the user's location, or default to India
  const country = event.request.geo.country.toLowerCase() || 'IND';
  // rewrite to the static, cached page for each locale
  return event.respondWith(NextResponse.rewrite(`/routeName/${country}`));
}
With this middleware, each request will be cached, and because a rewrite does not change the URL in your client, Next.js can serve the locale-specific page while still showing the right country flag.
Server-side streaming:
React 18 now supports a server-side Suspense API and SSR streaming. One big drawback of SSR was that the whole page had to be rendered on the server before anything was sent to the client, so any page that needed heavy lifting from the server could give you a higher FCP (first contentful paint). Streaming server-rendered pages over HTTP solves this problem of high render times; you can take a look at the alpha version by enabling the experimental flags in your Next.js config.
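At the time of the Next.js 12 release, the alpha could be enabled with experimental flags in next.config.js (flag names may change between releases, so check the docs for your version):

```js
// next.config.js
module.exports = {
  experimental: {
    // enable React 18 streaming SSR / server-side Suspense (alpha)
    concurrentFeatures: true,
    // enable React Server Components (requires concurrentFeatures)
    serverComponents: true,
  },
};
```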
React Server Components:
React Server Components allow us to render almost everything, including the components themselves, on the server. This is fundamentally different from SSR, where you only generate HTML on the server; with server components, zero client-side JavaScript is needed, making the rendering process much faster (basically no hydration step). This could be seen as combining the best parts of server rendering with client-side interactivity.
As you can see in the SSR example above, while we are fetching the stories from the endpoint, our client is actually waiting for a response with a blank page. Depending on how fast your APIs are, this is a pretty big problem, and the reason we don’t just use SSR blindly everywhere.
Now, let’s take a look at a server component example:
Any file ending with .server.js/.ts will be treated as a server component in your Next.js application.
This implementation will stream your components progressively and show your data as it gets generated on the server, component by component. The difference is huge; it is the next level of code-splitting, and it allows you to do data fetching at the component level so you don’t need to worry about making an API call in the browser.
Functions like getStaticProps and getServerSideProps will become a liability of the past.
This also aligns with the React Hooks model, moving toward a decentralized component model. It removes the choice we often have to make between static or dynamic, bringing the best of both worlds. In the future, incremental static regeneration will work at a per-component level, removing the all-or-nothing page caching and allowing intelligent caching based on your needs.
Next.js is internally working on a data component, which is basically the React Suspense API but with surrogate keys, revalidation, and fallbacks, and which will help realize these ideas by letting you define your caching semantics at the component level.
Conclusion:
Although all the features mentioned above are still in development, their inception alone will push the React world, and frontend in general, in a particular direction, and it’s the reason you should keep Next.js as your default go-to production framework.