A Step Towards Machine Learning Algorithms: Univariate Linear Regression
These days the concept of Machine Learning is evolving rapidly. The field is so vast and open that everyone has their own view of it. Here I am sharing mine. This blog describes my experience with learning algorithms. We will look at the basic difference between Artificial Intelligence, Machine Learning, and Deep Learning, and then get to know the foundational Machine Learning algorithm, i.e., Univariate Linear Regression.

Intermediate knowledge of Python and its libraries (NumPy, pandas, Matplotlib) is a good starting point. On the mathematics side, a little knowledge of algebra, calculus, and reading graphs will help you understand the trick behind the algorithm.

A way to Artificial intelligence, Machine Learning, and Deep Learning

These are three buzzwords of today’s Internet world, and many see in them the future of programming. This is the place where the science domain meets programming: we use scientific concepts and mathematics together with a programming language to simulate the decision-making process. Artificial Intelligence is the ability of a machine or program to make decisions the way humans do. Machine Learning is a subset of Artificial Intelligence: it helps the machine observe patterns in data and learn from them in order to make a decision. Here programming helps in observing the patterns, not in making the decisions directly. Machine learning generally needs more information from various sources, covering the variables behind a given pattern, to make more accurate decisions. Deep Learning is in turn a subset of Machine Learning that uses multi-layer neural networks to learn the relevant features directly from the data.

What is Machine Learning

Definition: Machine Learning provides machines with the ability to learn autonomously from experience, observations, and patterns within a given data set, without being explicitly programmed.

This is a two-part process. In the first part, it observes and analyzes the patterns in the given data and makes an educated guess at a mathematical function that closely follows the pattern. There are various families of such functions: linear, non-linear, logistic, etc. We then calculate the error function from the guessed mathematical function and the given data. In the second part, we minimize the error function. The function obtained after this minimization is used to predict the pattern.

Here are the general steps to understand the process of Machine Learning:
1. Plot the given dataset on the x-y axes.
2. By looking at the graph, guess a mathematical function that closely fits the points.
3. Derive the error function from the given dataset and the guessed mathematical function.
4. Try to minimize the error function using an optimization algorithm.
5. The minimized error function gives us a more accurate mathematical function for the given pattern.
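
As a compact illustration of steps 2 to 5, the sketch below guesses a straight line and lets NumPy's least-squares fit (`np.polyfit`) do the minimization. Note that this is a closed-form least-squares fit, which is a different technique from the gradient descent discussed later, and that the data points and the `predict` helper are made up for illustration.

```python
import numpy as np

# Synthetic points lying close to the line y = 2x + 1 (assumed for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Steps 2-4: guess a degree-1 polynomial and minimize the squared error.
w, b = np.polyfit(x, y, deg=1)

# Step 5: the fitted function can now predict values not present in the data.
def predict(value):
    return w * value + b

print(f"f(x) = {w:.2f} * x + {b:.2f}; f(2.5) = {predict(2.5):.2f}")
```
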
Getting Started with the First Algorithm: Univariate Linear Regression

Linear Regression is a very basic algorithm; we can call it the first and foundational algorithm for understanding the concepts of ML. We will try to understand it with an example: a dataset of plot prices for given areas.
```
movieID	title	userID	rating	timestamp
0	1	Toy story	170	3.0	1162208198000
1	1	Toy story	175	4.0	1133674606000
2	1	Toy story	190	4.5	1057778398000
3	1	Toy story	267	2.5	1084284499000
4	1	Toy story	325	4.0	1134939391000
5	1	Toy story	493	3.5	1217711355000
6	1	Toy story	533	5.0	1050012402000
7	1	Toy story	545	4.0	1162333326000
8	1	Toy story	580	5.0	1162374884000
9	1	Toy story	622	4.0	1215485147000
10	1	Toy story	788	4.0	1188553740000
```
With this data, we can directly read off the price of any plot whose area is listed. But what if we want the price of a plot with an area of 5.0 * 10 sq mtr? There is no price for this area in our dataset. So how can we get the price of a plot whose area is not in the dataset? We can do this using Linear Regression.

So at first, we will plot this data on a graph.

The graph below shows the area of the plots (in units of 10 sq mtr) on the x-axis and their prices (in Lakhs INR) on the y-axis.

Definition of Linear Regression

The objective of a linear regression model is to find a relationship between one or more features (independent variables) and a continuous target variable (dependent variable). When there is only one feature, it is called Univariate Linear Regression; if there are multiple features, it is called Multiple Linear Regression.

Hypothesis function:

Here we will try to find the relation between price and area of plots. As this is an example of univariate, we can see that the price is only dependent on the area of the plot.

By observing this pattern we can have our hypothesis function as below:

f(x) = w * x + b

where w is the weight and b is the bias.

Different values of (w, b) give different lines, but for one particular set of values the line will be closest to this pattern.

When we generalize this function to multiple variables, there will be a set of values of w; these constants are also termed model params.

Note: There is a range of mathematical functions that could relate to this pattern, and the selection of the function is totally up to us. But take care that the function neither underfits nor overfits the data, and that it is continuous and differentiable, so that we can easily locate its global minimum or maximum.
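
To make the hypothesis function concrete, here is a minimal Python sketch. The sample areas and the candidate (w, b) pairs are made-up values for illustration and are not taken from the dataset; `hypothesis` is just an illustrative name.

```python
import numpy as np

def hypothesis(x, w, b):
    """Predicted price f(x) = w * x + b for one area or an array of areas."""
    return w * np.asarray(x, dtype=float) + b

# Hypothetical areas (in units of 10 sq mtr), for illustration only.
areas = np.array([1.0, 2.0, 3.0])

# Each (w, b) pair defines a different candidate line through the data.
for w, b in [(1.0, 0.0), (2.0, 0.5), (0.5, 3.0)]:
    print(f"w={w}, b={b} ->", hypothesis(areas, w, b))
```

Only one of these candidate lines will sit closest to the real data points, and finding that pair is what the rest of the article is about.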

Error for a point

As our hypothesis function is continuous, for every area point Xi there is one predicted price Yi = F(Xi), and Y is the corresponding actual price.

So the error at any point,

Ei = Yi – Y = F(Xi) – Y

These errors are also called residuals. A residual can be positive (if the actual point lies below the predicted line) or negative (if the actual point lies above the predicted line). Our goal is to minimize the residual for each of the points.

Note: While observing the pattern, it is possible that a few points lie very far from it. For these far points the residuals will be much larger. If such points are few in number, then we can leave them out, considering them errors in the dataset. Such points are termed outliers.
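
The residuals are easy to compute once the line is fixed. The sketch below uses hypothetical data in which the last point is deliberately placed far from the line y = 2x, so that its residual stands out the way an outlier would.

```python
import numpy as np

def residuals(x, y, w, b):
    """Residual E_i = F(x_i) - y_i for every training point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return (w * x + b) - y

# Hypothetical data: the last point is far away from the line y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 20.0]

e = residuals(x, y, w=2.0, b=0.0)
print(e)                              # negative where the actual point lies above the line
print("largest residual at index", int(np.argmax(np.abs(e))))
```
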

Energy Functions

As there are m training points, we can calculate the average energy function as below:

$$E(w, b) = \frac{1}{m} \sum_{i=1}^{m} E_i$$

Our goal is to minimize this energy function, that is, to find the point (w, b) that gives

$$\min_{w, b} E(w, b)$$

Little Calculus: For any smooth function, the points where the first derivative is zero are the candidate points of minima or maxima. If the second derivative at such a point is negative, it is a point of maxima, and if it is positive, it is a point of minima.
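
A small worked example of this rule, using a function chosen purely for illustration:

```latex
f(x) = x^2 - 4x + 7
\quad\Rightarrow\quad
f'(x) = 2x - 4 = 0 \;\text{ at }\; x = 2,
\qquad
f''(x) = 2 > 0
\quad\Rightarrow\quad
x = 2 \text{ is a minimum, with } f(2) = 3.
```

For g(x) = -x^2 the same steps give g'(x) = -2x = 0 at x = 0 and g''(x) = -2 < 0, so x = 0 is a maximum.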

Here is the trick: we convert our energy function into an upward-opening parabola by squaring the error. This ensures that the energy function has only one global minimum (the point of our concern). It also simplifies our calculation: the point where the first derivative of the energy function is zero is the point we need, and the value of (w, b) at that point is our required result.

So our final Energy function is

$$E(w, b) = \frac{1}{2m} \sum_{i=1}^{m} E_i^2$$

Dividing by 2 doesn’t affect our result, and at the time of differentiation it cancels out; e.g., the first derivative of x^2 is 2x.
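
For reference, differentiating the squared energy function with respect to each parameter gives the two gradients that gradient descent needs. This is the standard derivation, written with E_i = w x_i + b - y_i, where y_i denotes the actual price of the i-th point (a notation introduced here for brevity):

```latex
\frac{\partial E}{\partial w} = \frac{1}{m} \sum_{i=1}^{m} (w x_i + b - y_i)\, x_i,
\qquad
\frac{\partial E}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (w x_i + b - y_i)
```

Note how the factor 2 produced by differentiating the square cancels the 1/2 in front of the sum.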

Gradient Descent Method

Gradient descent is a generic optimization algorithm. It iteratively adjusts the parameters of the model, by a guided trial and error, in order to minimize the energy function.

In the above picture, we can see on the right side:
1. w0 and w1 are the random initialization; by following gradient descent, the point moves towards the global minimum.
2. The number of turns of the black line is the number of iterations; it should be neither too large nor too small.
3. The distance between the turns is alpha, i.e., the learning parameter.
By solving the equation on the left side, we get the model params at the global minimum of the energy function.

Points to consider at the time of Gradient Descent calculations:
1. Random initialization: We start the algorithm at a random point, that is, a random (w, b) value. At each step the algorithm decides in which direction the next trial has to be taken. As we know the energy function is an upward-opening parabola, moving in the right direction (towards the global minimum) gives a smaller value than at the previous point.
2. Number of iterations: The number of iterations should be neither too small nor too large. If it is too small, we will not reach the global minimum; if it is too large, we spend extra calculations around the global minimum.
3. Alpha as the learning parameter: When alpha is too small, gradient descent is slow, as it takes many small steps to reach the global minimum. If alpha is too big, it may overshoot the global minimum; in that case it can oscillate without converging, or even diverge.
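
The effect of alpha can be seen in a small experiment. The sketch below is a minimal, self-contained version of gradient descent on synthetic data (points on the line y = 2x + 1, an assumption made for illustration); the three alpha values are likewise picked only to show the slow, reasonable, and too-large cases for this particular data.

```python
import numpy as np

def gradient_descent(x, y, alpha, num_iter, w=0.1, b=0.1):
    """Minimize E(w, b) = 1/(2m) * sum((w*x + b - y)^2).

    Returns the final w, b and the energy recorded before each update.
    """
    m = len(x)
    history = []
    for _ in range(num_iter):
        error = w * x + b - y
        history.append((error ** 2).sum() / (2 * m))
        grad_w = error.dot(x) / m      # dE/dw
        grad_b = error.sum() / m       # dE/db
        w, b = w - alpha * grad_w, b - alpha * grad_b
    return w, b, history

# Synthetic data on the line y = 2x + 1 (assumed for illustration).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

for alpha in (0.001, 0.05, 0.2):
    w, b, history = gradient_descent(x, y, alpha, num_iter=200)
    print(f"alpha={alpha}: w={w:.3f}, b={b:.3f}, last energy={history[-1]:.3g}")
```

With the smallest alpha the energy is still falling slowly after 200 iterations, with the middle value it gets much closer to the minimum, and with the largest value the energy grows from one iteration to the next instead of shrinking.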
Implementation of Gradient Descent in Python
```
"""Read area/price data from a CSV file with pandas and fit a univariate
linear regression with gradient descent. Requires Python 3.6+ (f-strings)."""

import numpy as np
import pandas


def load_data(file_name):
	"""Return the area and price columns of the CSV file as float arrays."""
	column_names = ['area', 'price']
	# header=0 skips the header row of the file and applies our own column names.
	data = pandas.read_csv(file_name, names=column_names, header=0)
	x_val = data['area'].to_numpy(dtype=float)
	y_val = data['price'].to_numpy(dtype=float)
	return x_val, y_val


# Load the data from a specific file
x_raw, y = load_data('area-price.csv')

# Modeling
w, b = 0.1, 0.1                      # initial guess for the model parameters
num_epoch = 100                      # number of iterations
learning_rate = 1e-3                 # alpha
converge_rate = np.zeros(num_epoch)  # mean squared error at each iteration
for e in range(num_epoch):
	# Calculate the gradient of the loss function with respect to the model parameters manually.
	# The sums are not divided by the number of points m; that constant factor
	# is absorbed into the learning rate.
	y_predicted = w * x_raw + b
	error = y_predicted - y
	grad_w, grad_b = error.dot(x_raw), error.sum()
	# Update parameters.
	w, b = w - learning_rate * grad_w, b - learning_rate * grad_b
	converge_rate[e] = np.mean(np.square(error))

print(w, b)
print(f"predicted function f(x) = x * {w} + {b}")
calculatedprice = (10 * w) + b
print(f"price of plot with area 10 sqmtr = 10 * {w} + {b} = {calculatedprice}")
```
This is a basic implementation of the Gradient Descent algorithm using NumPy and pandas. It reads the area-price.csv file. The area values are expressed in units of 10 sq mtr for better readability of the data points on the graph. We have taken (w, b) = (0.1, 0.1) as the initialization, 100 as the number of iterations, and 0.001 as the learning rate.

In every iteration, we update the w and b values and record the mean squared error to watch the rate of convergence.

We can repeat this calculation of (w, b) for different values of the initialization, the number of iterations, and the learning rate (alpha).

Note: There is another Python library, TensorFlow, which is often preferred for such calculations and ships with built-in gradient descent optimizers. But for better understanding, we have used NumPy and pandas here.

RMSE (Root Mean Square Error)

RMSE: This is a way to measure how accurate our calculated (w, b) is. Below is the basic formula for RMSE, where f is the predicted value, o is the observed value, and n is the number of points:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (f_i - o_i)^2}$$

Note: There is no absolute good or bad threshold for RMSE; it has to be judged against the range of the observed values. For observed values ranging from 0 to 1000, an RMSE of 0.7 is small, but if the range is 0 to 1, it is not that small.
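
A minimal sketch of the RMSE calculation in Python follows. The observed and predicted values are hypothetical and only there to exercise the function.

```python
import numpy as np

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

# Hypothetical values for illustration.
observed = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 7.5]

error = rmse(predicted, observed)
value_range = max(observed) - min(observed)
print(f"RMSE = {error:.3f}, i.e. {error / value_range:.1%} of the observed range")
```

Reporting the RMSE relative to the observed range, as in the last line, is one simple way to apply the note above.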

Conclusion

As part of this article, we have seen a short introduction to Machine Learning and the need for it. Then, with the help of a very basic example, we learned about one of the foundational learning algorithms, i.e., Linear Regression (for the univariate case only). This can be generalized to the multivariate case as well. We then used the Gradient Descent method to calculate the parameters of the Linear Regression model, and went through the basic flow of Gradient Descent. Finally, there is one example in Python that performs Linear Regression via Gradient Descent.
December 12, 2022
Publish APIs For Your Customers: Deploy Serverless Developer Portal For Amazon API Gateway
Amazon API Gateway is a fully managed service that allows you to create, secure, publish, test and monitor your APIs. We often come across scenarios where customers of these APIs expect a platform to learn and discover APIs that are available to them (often with examples).

The Serverless Developer Portal is one such application that is used for developer engagement by making your APIs available to your customers. Further, your customers can use the developer portal to subscribe to an API, browse API documentation, test published APIs, monitor their API usage, and submit their feedback.

This blog is a detailed step-by-step guide for deploying the Serverless Developer Portal for APIs that are managed via Amazon API Gateway.

Advantages

The users of Amazon API Gateway can be broadly categorized as –

API Publishers – They can use the Serverless Developer Portal to expose and secure their APIs for customers which can be integrated with AWS Marketplace for monetary benefits. Furthermore, they can customize the developer portal, including content, styling, logos, custom domains, etc.

API Consumers – They could be Frontend/Backend developers, third party customers, or simply students. They can explore available APIs, invoke the APIs, and go through the documentation to get an insight into how each API works with different requests.

Developer Portal Architecture

First, we need a basic understanding of how the developer portal works. The Serverless Developer Portal is a serverless application built on a microservice architecture using Amazon API Gateway, Amazon Cognito, AWS Lambda, Amazon Simple Storage Service (S3), and Amazon CloudFront.

The developer portal comprises multiple microservices and components as described in the following figure.

Source: AWS

There are a few key pieces in the above architecture –
1. Identity Management: Amazon Cognito is basically the secure user directory of the developer portal responsible for user management. It allows you to configure triggers for registration, authentication, and confirmation, thereby giving you more control over the authentication process.
2. Business Logic: Amazon CloudFront is configured to serve your static content hosted in a private S3 bucket. The static content is built using the React JS framework and interacts with the backend APIs that implement the business logic for various events.
3. Catalog Management: Developer portal uses catalog for rendering the APIs with Swagger specifications on the APIs page. The catalog file (catalog.json in S3 Artifact bucket) is updated whenever an API is published or removed. This is achieved by creating an S3 trigger on AWS Lambda responsible for studying the content of the catalog directory and generating a catalog for the developer portal.
4. API Key Creation: API Key is created for consumers at the time of registration. Whenever you subscribe to an API, associated Usage Plans are updated to your API key, thereby giving you access to those APIs as defined by the usage plan. Cognito User – API key mapping is stored in the DynamoDB table along with other registration related details.
5. Static Asset Uploader: AWS Lambda (Static-Asset-Uploader) is responsible for updating/deploying static assets for the developer portal. Static assets include content, logos, icons, CSS, JavaScript, and other media files.
Let’s move forward to building and deploying a simple Serverless Developer Portal.

Building Your API

Start with deploying an API which can be accessed using API Gateway from
```
https://<api-id>.execute-api.<region>.amazonaws.com/<stage>
```
If you do not have any such API available, create a simple application by jumping to the section, “API Performance Across the Globe,” on this blog.

Setup custom domain name

For professional projects, I recommend that you create a custom domain name as they provide simpler and more intuitive URLs you can provide to your API users.

Make sure your API Gateway domain name is updated in the Route53 record set created after you set up your custom domain name.

See more on Setting up custom domain names for REST APIs – Amazon API Gateway

Enable CORS for an API Resource

There are two ways you can enable CORS on a resource:
1. Enable CORS Using the Console
2. Enable CORS on a resource using the import API from Amazon API Gateway
Let’s discuss the easiest way to do it, using the console.
1. Open API Gateway console.
2. Select the API Gateway for your API from the list.
3. Choose a resource to enable CORS for all the methods under that resource.
  Alternatively, you could choose a method under the resource to enable CORS for just this method.
4. Select Enable CORS from the Actions drop-down menu.
5. In the Enable CORS form, do the following:
  – Leave the Access-Control-Allow-Headers and Access-Control-Allow-Origin headers at their default values.
  – Click on Enable CORS and replace existing CORS headers.
6. Review the changes in the Confirm method changes popup, then choose Yes, overwrite existing values to apply your CORS settings.
Once enabled, you can see a mock integration on the OPTIONS method for the selected resource. You must enable CORS for {proxy+} resources too.

To verify that CORS is enabled on the API resource, try curl on the OPTIONS method:
```
curl -v -X OPTIONS -H "Access-Control-Request-Method: POST" -H "Origin: http://example.com" https://api-id.execute-api.region.amazonaws.com/stage
```
You should see a 200 OK status along with the Access-Control-* headers in the response:
```
< HTTP/1.1 200 OK
< Content-Type: application/json
< Content-Length: 0
< Connection: keep-alive
< Date: Mon, 13 Apr 2020 16:27:44 GMT
< x-amzn-RequestId: a50b97b5-2437-436c-b99c-22e00bbe9430
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Headers: Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token
< x-amz-apigw-id: K7voBHDZIAMFu9g=
< Access-Control-Allow-Methods: DELETE,GET,HEAD,OPTIONS,PATCH,POST,PUT
< X-Cache: Miss from cloudfront
< Via: 1.1 1c8c957c4a5bf1213bd57bd7d0ec6570.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: BOM50-C1
< X-Amz-Cf-Id: OmxFzV2-TH2BWPVyOohNrhNlJ-s1ZhYVKyoJaIrA_zyE9i0mRTYxOQ==
```
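
If you want to check such a response programmatically, the following sketch inspects a dictionary of response headers. It is a simplified check, not a full implementation of the CORS rules (it ignores Access-Control-Allow-Headers and credentials, for example), and `cors_allows` is a hypothetical helper name.

```python
def cors_allows(headers, origin, method):
    """Return True if preflight response headers permit `method` from `origin`."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    allowed_origin = normalized.get("access-control-allow-origin", "")
    allowed_methods = {
        item.strip().upper()
        for item in normalized.get("access-control-allow-methods", "").split(",")
        if item.strip()
    }
    origin_ok = allowed_origin in ("*", origin)
    return origin_ok and method.upper() in allowed_methods

# Header values copied from the sample response above.
sample = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "DELETE,GET,HEAD,OPTIONS,PATCH,POST,PUT",
}
print(cors_allows(sample, "http://example.com", "POST"))
```
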
Deploy Developer Portal

There are two ways to deploy the developer portal for your API.

Using SAR

An easy way is to deploy api-gateway-dev-portal directly from the AWS Serverless Application Repository.

Note: If you intend to upgrade your developer portal to a new major version, refer to the Upgrading Instructions, which are currently under development.

Using AWS SAM
1. Ensure that you have the latest AWS CLI and AWS SAM CLI installed and configured.
2. Download or clone the API Gateway Serverless Developer Portal repository.
3. Update the CloudFormation template file – cloudformation/template.yaml.
Parameters you must configure and verify include:
- ArtifactsS3BucketName
- DevPortalSiteS3BucketName
- DevPortalCustomersTableName
- DevPortalPreLoginAccountsTableName
- DevPortalAdminEmail
- DevPortalFeedbackTableName
- CognitoIdentityPoolName
- CognitoDomainNameOrPrefix
- CustomDomainName
- CustomDomainNameAcmCertArn
- UseRoute53Nameservers
- AccountRegistrationMode
You can view your template file in AWS Cloudformation Designer to get a better idea of all the components/services involved and how they are connected.

See Developer portal settings for more information about parameters.
4. Replace the static files in your project with the ones you would like to use.
  dev-portal/public/custom-content
  lambdas/static-asset-uploader/build
  – api-logo contains the logos you would like to show on the API page (in png format). The portal checks for an api-id_stage.png file when rendering the API page. If it is not found, it chooses the default logo – default.png.
  – content-fragments includes various markdown files comprising the content of the different pages in the portal.
  – Other static assets, including favicon.ico, home-image.png and nav-logo.png, that appear on your portal.
5. Create a ZIP file of your code and dependencies, and upload it to Amazon S3. Running the command below creates an AWS SAM template, packaged.yaml, replacing references to local artifacts with the Amazon S3 location where the command uploaded the artifacts:
```
sam package --template-file ./cloudformation/template.yaml --output-template-file ./cloudformation/packaged.yaml --s3-bucket {your-lambda-artifacts-bucket-name}
```
6. Run the following command from the project root to deploy your portal, replacing:
  – {your-template-bucket-name} with the name of your Amazon S3 bucket.
  – {custom-prefix} with a prefix that is globally unique.
  – {cognito-domain-or-prefix} with a unique string.
```
sam deploy --template-file ./cloudformation/packaged.yaml --s3-bucket {your-template-bucket-name} --stack-name "{custom-prefix}-dev-portal" --capabilities CAPABILITY_NAMED_IAM
```
Note: Ensure that you have required privileges to make deployments, as, during the deployment process, it attempts to create various resources such as AWS Lambda, Cognito User Pool, IAM roles, API Gateway, Cloudfront Distribution, etc.

After your developer portal has been fully deployed, you can get its URL as follows:
1. Open the AWS CloudFormation console.
2. Select the stack you created above.
3. Open the Outputs section. The URL for the developer portal is specified in the WebSiteURL property.
Create Usage Plan

Create a usage plan to list your API under the subscribable APIs category, which allows consumers to access the API using their API keys from the developer portal. Ensure that the API Gateway stage is configured for the usage plan.

Publishing an API

Only Administrators have permission to publish an API. To create an Administrator account for your developer portal –

1. Go to the WebSiteURL obtained after the successful deployment.

2. On the top right of the home page click on Register.

Source: Github

3. Fill in the registration form and hit Sign up.

4. Enter the confirmation code sent to the email address you provided in the previous step.

5. Promote the user as Administrator by adding it to AdminGroup.
- Open Amazon Cognito User Pool console.
- Select the User Pool created for your developer portal.
- From the General Settings > Users and Groups page, select the User you want to promote as Administrator.
- Click on Add to group and then select the Admin group from the dropdown and confirm.
6. Log out and log in again for the Administrator role to take effect. Then click on the Admin Panel and choose the API you wish to publish from the APIs list.

Setting up an account

The signup process depends on the registration mode selected for the developer portal.

For request registration mode, you need to wait for the Administrator to approve your registration request.

For invite registration mode, you can only register on the portal when invited by the portal administrator.

Subscribing to an API
1. Sign in to the developer portal.
2. Navigate to the Dashboard page and Copy your API Key.
3. Go to APIs Page to see a list of published APIs.
4. Select an API you wish to subscribe to and hit the Subscribe button.
Tips
1. When a user subscribes to an API, all the APIs under that usage plan become accessible, whether or not they are published on the portal.
2. Whenever you subscribe to an API, the catalog is exported from the API Gateway resource documentation. You can customize the workflow or override the catalog Swagger definition JSON in the S3 bucket defined in ArtifactsS3BucketName, under /catalog/<apiid>_<stage>.json.
3. For backend APIs, CORS requests are allowed only from custom domain names selected for your developer portal.
4. Make sure the published API sets the CORS response headers so that it can be invoked from the developer portal.
Summary

You’ve seen how to deploy a Serverless Developer Portal and publish an API. If you are creating a serverless application for the first time, you might want to read more on Serverless Computing and Amazon API Gateway before you get started.

Start building your own developer portal. To learn more about distributing your API Gateway APIs to your customers, follow this AWS guide.
December 12, 2022

Setting Up A Robust Authentication Environment For OpenSSH Using QR Code PAM

Do you like WhatsApp Web authentication? Well, WhatsApp Web has always fascinated me with the simplicity of its QR-code-based authentication. Though there are similar authentication UIs available, I always wondered whether a remote secure shell (SSH) session could be authenticated with a QR code with this kind of simplicity while keeping the auth process secure. In this guide, we will see how to write and implement a bare-bones PAM module for OpenSSH on a Linux-based system.

“OpenSSH is the premier connectivity tool for remote login with the SSH protocol. It encrypts all traffic to eliminate eavesdropping, connection hijacking, and other attacks. In addition, OpenSSH provides a large suite of secure tunneling capabilities, several authentication methods, and sophisticated configuration options.”

– openssh.com

Meet PAM!

PAM, short for “Pluggable Authentication Modules,” is middleware that abstracts authentication features on Linux and UNIX-like operating systems. PAM has been around for more than two decades. The authentication process could be cumbersome, with each service authenticating users through a different set of hardware and software, such as username-password, fingerprint module, face recognition, two-factor authentication, LDAP, etc. But the underlying process remains the same, i.e., users must be authenticated as who they say they are. This is where PAM comes into the picture: it provides an API to the application layer, along with built-in functions to implement and extend PAM capability.

Source: Redhat

Understand how OpenSSH interacts with PAM

On the Linux host, OpenSSH (the sshd daemon) begins by reading the PAM configuration defined in /etc/pam.conf or, alternatively, in the /etc/pam.d configuration files. The config files are usually named after services and contain entries for various realms (auth, account, session, password). The “auth” realm is what takes care of authenticating users as who they say they are. A typical sshd PAM service file on Ubuntu can be seen below, and you can relate it to your own flavor of Linux:

@include common-auth
account    required     pam_nologin.so
@include common-account
session [success=ok ignore=ignore module_unknown=ignore default=bad]        pam_selinux.so close
session    required     pam_loginuid.so
session    optional     pam_keyinit.so force revoke
@include common-session
session    optional     pam_motd.so  motd=/run/motd.dynamic
session    optional     pam_motd.so noupdate
session    optional     pam_mail.so standard noenv # [1]
session    required     pam_limits.so
session    required     pam_env.so # [1]
session    required     pam_env.so user_readenv=1 envfile=/etc/default/locale
session [success=ok ignore=ignore module_unknown=ignore default=bad]        pam_selinux.so open
@include common-password

The common-auth file has an “auth” realm with the pam_unix.so PAM module, which is responsible for authenticating the user with a password. Our goal is to write a PAM module that replaces pam_unix.so with our own version.

When OpenSSH makes calls to the PAM module, the very first function it looks for is “pam_sm_authenticate,” along with some other mandatory functions such as pam_sm_setcred. Thus, we will be implementing the pam_sm_authenticate function, which will be the entry point to our shared object library. The module should return PAM_SUCCESS (0) as the return code for successful authentication.

Application Architecture

The project architecture has four main applications. The backend is hosted on an AWS cloud with minimal and low-cost infrastructure resources.

1. PAM Module: Provides QR-Code auth prompt to client SSH Login

2. Android Mobile App: Authenticates SSH login by scanning a QR code

3. QR Auth Server API: Backend application to which our Android App connects and communicates and shares authentication payload along with some other meta information

4. WebSocket Server (API Gateway WebSocket and NodeJS) App: The PAM module and the server-side app share the auth message payload through it in real time

When a user connects to the remote server via SSH, the PAM module is triggered, offering a QR code for authentication. Information is exchanged with the API Gateway WebSocket, which in turn saves temporary auth data in DynamoDB. The user then uses an Android mobile app (written in react-native) to scan the QR code.

Upon scanning, the app connects to the API Gateway. The API call is first authenticated by AWS Cognito to avoid any intrusion. The request is then proxied to the Lambda function, which authenticates the input payload by comparing it with the information available in DynamoDB. Upon successful authentication, the Lambda function makes a call to the API Gateway WebSocket to inform the PAM module to authenticate the user.

Framework and Toolchains

PAM modules are shared object libraries that are normally written in C (other languages can be used if they compile and link to a shared object, or through bridges such as python pam or pam_exec). Below are the frameworks and tools I am using for this project:

1. gcc, make, automake, autoreconf, libpam (GNU dev tools on Ubuntu OS)

2. libqrencode, libwebsockets, libpam, libssl, libcrypto (C libraries)

3. NodeJS, express (for server-side app)

4. API Gateway and API Gateway WebSocket, AWS Lambda (AWS cloud services for hosting the serverless server-side app)

5. Serverless framework (for easily deploying infrastructure)

6. react-native, react-native-qrcode-scanner (for Android mobile app)

7. AWS Cognito (for authentication)

8. AWS Amplify Library

This guide assumes you have a basic understanding of the Linux OS, C programming language, pointers, and gcc code compilation. For the backend APIs, I prefer to use NodeJS as a primary programming language, but you may opt for the language of your choice for designing HTTP APIs.

Authentication with QR Code PAM Module

When the module initializes, we first generate a random string with the help of the “/dev/urandom” character device. The byte string obtained from this device contains non-printable characters, so we encode it with Base64. Let’s call this string the auth verification string.

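/* Fill random_str with `length` random bytes read from /dev/urandom. */
void get_random_string(char *random_str,int length)
{
   FILE *fp = fopen("/dev/urandom","r");
   if(!fp){
       perror("Unable to open urandom device");
       exit(EXIT_FAILURE);
   }
   /* fread returns the number of items read; we ask for one item of `length` bytes */
   if(fread(random_str,length,1,fp) != 1){
       fprintf(stderr,"Unable to read from urandom device\n");
       fclose(fp);
       exit(EXIT_FAILURE);
   }
   fclose(fp);
}
 
/* Inside the module's initialization code: */
char random_string[11];
  
  //get random string
   get_random_string(random_string,10);
  //convert random string to base64 because the bytes from /dev/urandom may not be printable
   const int encoded_length = Base64encode_len(10);
   base64_string=(char *)malloc(encoded_length+1);
   Base64encode(base64_string,random_string,10);
   base64_string[encoded_length]='\0';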

We then initiate a WebSocket connection with the help of the libwebsockets library and connect to our API Gateway WebSocket endpoint. Once the connection is established, we inform the server that a user may try to authenticate with the auth verification string. The API Gateway WebSocket returns a unique connection ID to our PAM module.

/* Scheduled callback: open the client connection and retry in 10 seconds on failure. */
static void connect_client(struct lws_sorted_usec_list *sul)
{
   struct vhd_minimal_client_echo *vhd =
       lws_container_of(sul, struct vhd_minimal_client_echo, sul);
   struct lws_client_connect_info i;
   char host[128];
   /* "host:port" is used for both the Host and Origin headers */
   lws_snprintf(host, sizeof(host), "%s:%u", *vhd->ads, *vhd->port);
   memset(&i, 0, sizeof(i));
   i.context = vhd->context;
   i.port = *vhd->port;
   i.address = *vhd->ads;
   i.path = *vhd->url;
   i.host = host;
   i.origin = host;
   /* Note: these flags accept self-signed certificates and skip the hostname check */
   i.ssl_connection = LCCSCF_USE_SSL | LCCSCF_ALLOW_SELFSIGNED | LCCSCF_SKIP_SERVER_CERT_HOSTNAME_CHECK | LCCSCF_PIPELINE;
   if ((*vhd->options) & 2)
       i.ssl_connection |= LCCSCF_USE_SSL;
   i.vhost = vhd->vhost;
   i.iface = *vhd->iface;
   i.pwsi = &vhd->client_wsi;
   log_message(LOG_INFO,ws_applogic.pamh,"About to create connection %s",host);
   /* lws_client_connect_via_info returns NULL on failure, so schedule a retry */
   if (!lws_client_connect_via_info(&i))
       lws_sul_schedule(vhd->context, 0, &vhd->sul,
                connect_client, 10 * LWS_US_PER_SEC);
}

Upon receiving the connection ID from the server, the PAM module computes the SHA1 hash of the connection ID and composes a unique string for generating the QR code. This string consists of three parts separated by colons (:):

“qrauth:BASE64(AUTH_VERIFY_STRING):SHA1(CONNECTION_ID).” For example, let’s say the random Base64-encoded string is “UX6t4PcS5doEeA==” and the connection ID is “KZlfidYvBcwCFFw=”.

Then the final encoded string is “qrauth:UX6t4PcS5doEeA==:2fc58b0cc3b13c3f2db49a5b4660ad47c873b81a.”

This string is then rendered as a UTF-8 QR code with the help of the libqrencode library, and the PAM module displays the authentication prompt.

/* Locate the connection ID that follows the expected prefix in the server message */
char *con_id=strstr(msg,ws_com_strings[READ_WS_CONNECTION_ID]);
int length = strlen(ws_com_strings[READ_WS_CONNECTION_ID]);

if(!con_id){
    pam_login_status=PAM_AUTH_ERR;
    interrupted=1;
    return;
}
con_id+=length;
log_message(LOG_DEBUG,ws_applogic.pamh,"strstr is %s",con_id);
/* Hash the connection ID and compose the "qrauth:<auth key>:<hash>" string */
string_crypt(ws_applogic.sha_code_hex, con_id);
sprintf(temp_text,"qrauth:%s:%s",ws_applogic.authkey,ws_applogic.sha_code_hex);
/* Render the string as a QR code and show it to the SSH client */
char *qr_encoded_text=get_qrcode_string(temp_text);
ws_applogic.qr_encoded_text=qr_encoded_text;
conv_info(ws_applogic.pamh,"\nSSH Auth via QR Code\n\n");
conv_info(ws_applogic.pamh, ws_applogic.qr_encoded_text);
log_message(LOG_INFO,ws_applogic.pamh,"Use Mobile App to Scan \n %s",ws_applogic.qr_encoded_text);
log_message(LOG_INFO,ws_applogic.pamh,"%s",temp_text);
/* Tell the server to expect an authentication attempt, then wait for the user */
ws_applogic.current_action=READ_WS_AUTH_VERIFIED;
sprintf(temp_text,ws_com_strings[SEND_WS_EXPECT_AUTH],ws_applogic.authkey,ws_applogic.username);
websocket_write_back(wsi,temp_text,-1);
conv_read(ws_applogic.pamh,"\n\nUse Mobile SSH QR Auth App to Authenticate SSH Login and Press Enter\n\n",PAM_PROMPT_ECHO_ON);

API Gateway WebSocket App

We used the Serverless Framework to create and deploy our infrastructure resources. With the serverless CLI, we use the aws-nodejs template (serverless create --template aws-nodejs). You can find a detailed guide on Serverless, API Gateway WebSocket, and DynamoDB here. Below is the template YAML definition. Note that the DynamoDB resource has its TTL attribute set to the expires_at property. This field holds a UNIX epoch timestamp.

This means that any record we store is automatically deleted once its epoch time has passed. We plan to keep each record for only 5 minutes, so the user must authenticate within 5 minutes of the authentication request to the remote SSH server.
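
As a rough sketch of the 5-minute window, the timestamp arithmetic can be isolated into two small functions. The names computeExpiresAt and isExpired are hypothetical and are not part of the project. The second helper is an addition of mine rather than something the handler does: DynamoDB removes expired items in the background, so a record can remain readable for some time after its expires_at value, and an explicit check closes that gap.

```javascript
const TTL_SECONDS = 300; // five minutes, the window described above

// Hypothetical helper: epoch seconds at which a record created at nowMs should expire.
function computeExpiresAt(nowMs, ttlSeconds = TTL_SECONDS) {
  return Math.floor(nowMs / 1000) + ttlSeconds;
}

// Hypothetical helper: treat a record as expired even if it has not been deleted yet.
function isExpired(item, nowMs) {
  return Math.floor(nowMs / 1000) >= item.expires_at;
}

const record = { authkey: 'UX6t4PcS5doEeA==', expires_at: computeExpiresAt(Date.now()) };
console.log(isExpired(record, Date.now())); // a freshly created record is still valid
```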

service: ssh-qrapp-websocket
frameworkVersion: '2'
useDotenv: true
provider:
 name: aws
 runtime: nodejs12.x
 lambdaHashingVersion: 20201221
 websocketsApiName: ssh-qrapp-websocket
 websocketsApiRouteSelectionExpression: $request.body.action
 region: ap-south-1
 iam:
   role:
     statements:
       - Effect: Allow
         Action:
           - "dynamodb:query"
           - "dynamodb:GetItem"
           - "dynamodb:PutItem"
         Resource:
           - Fn::GetAtt: [ SSHAuthDB, Arn ]
 environment:
   REGION: ${env:REGION}
   DYNAMODB_TABLE: SSHAuthDB
   WEBSOCKET_ENDPOINT: ${env:WEBSOCKET_ENDPOINT}
   NODE_ENV: ${env:NODE_ENV}
package:
 patterns:
   - '!node_modules/**'
   - handler.js
   - '!package.json'
   - '!package-lock.json'
plugins:
 - serverless-dotenv-plugin
layers:
 sshQRAPPLibs:
   path: layer
   compatibleRuntimes:
     - nodejs12.x
functions:
 connectionHandler:
   handler: handler.connectHandler
   timeout: 60
   memorySize: 256
   layers:
     - {Ref: SshQRAPPLibsLambdaLayer}
   events:
     - websocket:
        route: $connect
        routeResponseSelectionExpression: $default
 disconnectHandler:
   handler: handler.disconnectHandler
   memorySize: 256
   timeout: 60
   layers:
     - {Ref: SshQRAPPLibsLambdaLayer}
   events:
     - websocket: $disconnect
 defaultHandler:
   handler: handler.defaultHandler
   memorySize: 256
   timeout: 60
   layers:
     - {Ref: SshQRAPPLibsLambdaLayer}
   events:
     - websocket: $default
 customQueryHandler:
   handler: handler.queryHandler
   memorySize: 256
   timeout: 60
   layers:
     - {Ref: SshQRAPPLibsLambdaLayer}
   events:
     - websocket:
        route: expectauth
        routeResponseSelectionExpression: $default
     - websocket:
        route: getconid
        routeResponseSelectionExpression: $default
     - websocket:
        route: verifyauth
        routeResponseSelectionExpression: $default
resources:
 Resources:
   SSHAuthDB:
     Type: AWS::DynamoDB::Table
     Properties:
       TableName: ${env:DYNAMODB_TABLE}
       AttributeDefinitions:
         - AttributeName: authkey
           AttributeType: S
       KeySchema:
         - AttributeName: authkey
           KeyType: HASH
       TimeToLiveSpecification:
         AttributeName: expires_at
         Enabled: true
       ProvisionedThroughput:
         ReadCapacityUnits: 2
         WriteCapacityUnits: 2

The API Gateway WebSocket has three custom events. The event name arrives in the “action” field of the message body, which the Lambda function reads after parsing “event.body.” API Gateway uses the route selection expression ($request.body.action) to pick the matching route. These custom events are:

The “expectauth” event is sent by the PAM module to the WebSocket server to announce that a client has asked for authentication and that the mobile application may try to authenticate by scanning the QR code. During this event, the WebSocket handler stores the connection ID along with the auth verification string. The auth verification string acts as the primary key of our DynamoDB table.
The “getconid” event is sent to retrieve the current connection ID so that the PAM module can generate a SHA1 sum and provide a QR code prompt.
The “verifyauth” event is sent by the PAM module to confirm and verify authentication. During this event, the WebSocket server also expects a random challenge response text. The WebSocket server retrieves the data payload from DynamoDB using the auth verification string as the primary key and checks whether the key “authVerified” is marked as “true” (more on this later).
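
The message shapes behind these three events can be sketched as follows. The field names (action, authkey, username, challengeText) are the ones the handler in this section reads. The builder functions and routeKeyFor are hypothetical helpers written for illustration, and routeKeyFor only imitates how the route selection expression picks a route; it is not an AWS API.

```javascript
const KNOWN_ROUTES = ['expectauth', 'getconid', 'verifyauth'];

// Hypothetical builders for the JSON messages the PAM module sends.
function expectAuthMessage(authkey, username) {
  return JSON.stringify({ action: 'expectauth', authkey, username });
}

function getConIdMessage() {
  return JSON.stringify({ action: 'getconid' });
}

function verifyAuthMessage(authkey, challengeText) {
  return JSON.stringify({ action: 'verifyauth', authkey, challengeText });
}

// Imitates $request.body.action: messages that are not JSON or that carry an
// unknown action fall through to the $default route.
function routeKeyFor(rawMessage) {
  try {
    const action = JSON.parse(rawMessage).action;
    return KNOWN_ROUTES.includes(action) ? action : '$default';
  } catch (err) {
    return '$default';
  }
}

console.log(routeKeyFor(expectAuthMessage('UX6t4PcS5doEeA==', 'alice')));
```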

// Handles the three custom routes: expectauth, getconid, and verifyauth
queryHandler: async (event,context) => {
   const payload = JSON.parse(event.body);
   const documentClient = new DynamoDB.DocumentClient({
     region : process.env.REGION
   });
   try {
     switch(payload.action){
       case 'expectauth': {
         // The record expires 300 seconds (5 minutes) from now; TTL uses epoch seconds
         const expires_at = Math.floor(Date.now() / 1000) + 300;
  
         await documentClient.put({
           TableName : process.env.DYNAMODB_TABLE,
           Item: {
             authkey : payload.authkey,
             connectionId : event.requestContext.connectionId,
             username : payload.username,
             expires_at : expires_at,
             authVerified: false
           }
         }).promise();
         return {
           statusCode: 200,
           body : "OK"
         };
       }
       case 'getconid':
         return {
           statusCode: 200,
           body: `connectionid:${event.requestContext.connectionId}`
         };
       case 'verifyauth': {
         const data = await documentClient.get({
           TableName : process.env.DYNAMODB_TABLE,
           Key : {
             authkey : payload.authkey
           }
         }).promise();
         if(!("Item" in data)){
           throw "Failed to query data";
         }
         // Echo the challenge text back only if the mobile app has verified this auth key
         if(data.Item.authVerified === true){
           return {
             statusCode: 200,
             body: `authverified:${payload.challengeText}`
           }
         }
         throw "auth verification failed";
       }
     }
   } catch (error) {
     console.log(error);
   }
   // Fallback for failures and unknown actions: a generic reply without the "authverified" prefix
   return {
     statusCode:  200,
     body : "ok"
    };
  
 }

Android App: SSH QR Code Auth

The Android app consists of two parts: app login and scanning the QR code for authentication. AWS Cognito and the Amplify library simplify the process of building a secure login. By wrapping your react-native app with the “withAuthenticator” component, you get a ready-to-use login screen. We then use the react-native-qrcode-scanner component to scan the QR code.

On a successful scan, this component returns the decoded string. The application logic then splits the string and checks whether it is valid. If the decoded string is a valid application string, an API call is made to the server with the appropriate payload.
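
The validity check described above can be sketched as a small pure function that is easy to unit test outside the app. The name parseQrAuthString is hypothetical, and the checks are stricter than the ones in the app code: requiring exactly three parts, a Base64-looking auth code, and a 40-character hex hash are my assumptions about what a well-formed code looks like.

```javascript
// Hypothetical helper: split and validate a decoded "qrauth:<auth code>:<sha1>" string.
function parseQrAuthString(decoded) {
  const parts = String(decoded).split(':');
  if (parts.length !== 3) {
    throw new Error('invalid qr code');
  }
  const [appstring, authcode, shacode] = parts;
  if (appstring !== 'qrauth') {
    throw new Error('Not a valid app qr code');
  }
  // Assumption: the auth code is standard Base64 with optional padding.
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(authcode)) {
    throw new Error('auth code is not Base64');
  }
  // Assumption: the hash is a hex-encoded SHA1 digest (40 characters).
  if (!/^[0-9a-f]{40}$/i.test(shacode)) {
    throw new Error('hash is not a SHA1 hex digest');
  }
  return { authcode, shacode };
}

console.log(parseQrAuthString('qrauth:UX6t4PcS5doEeA==:2fc58b0cc3b13c3f2db49a5b4660ad47c873b81a'));
```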

render(){
   return (
     <View style={styles.container}>
       {this.state.authQRCode ?
       <AuthQRCode
        hideAuthQRCode = {this.hideAuthQRCode}
        qrScanData = {this.qrScanData}
       />
       :
       <View style={{marginVertical: 10}}>
       <Button title="Auth SSH Login" onPress={this.showAuthQRCode} />
       <View style={{margin:10}} />
       <Button title="Sign Out" onPress={this.signout} />
       </View>
      
       }
     </View>
   );
 }

     // Inside the QR scan handler (an async function that receives the scan event e):
     const scanCode = e.data.split(':');
     if(scanCode.length <3){
       throw "invalid qr code";
     }
     const [appstring,authcode,shacode] = scanCode;
     if(appstring !== "qrauth"){
       throw "Not a valid app qr code";
     }
     // Send the Cognito ID token so that API Gateway can authorize the request
     const authsession = await Auth.currentSession();
     const jwtToken = authsession.getIdToken().jwtToken;
     const response = await axios({
       url : "https://API_GATEWAY_URL/v1/app/sshqrauth/qrauth",
       method : "post",
       headers : {
         Authorization : jwtToken,
         'Content-Type' : 'application/json'
       },
       responseType: "json",
       data : {
         authcode,
         shacode
       }
     });
     if(response.data.status === 200){
       rescanQRCode=false;
       setTimeout(this.hideAuthQRCode, 1000);
     }

This guide does not cover how to deploy react-native Android applications. You may refer to the official react-native guide to deploy your application to the Android mobile device.

QR Auth API

The QR Auth API is built using the Serverless Framework with the aws-nodejs template. It uses API Gateway as an HTTP API and AWS Cognito to authorize incoming requests. The serverless YAML definition is shown below.

service: ssh-qrauth-server
frameworkVersion: '2 || 3'
useDotenv: true
provider:
 name: aws
 runtime: nodejs12.x
 lambdaHashingVersion: 20201221
 deploymentBucket:
   name: ${env:DEPLOYMENT_BUCKET_NAME}
 httpApi:
   authorizers:
     cognitoJWTAuth:
       identitySource: $request.header.Authorization
       issuerUrl: ${env:COGNITO_ISSUER}
       audience:
         - ${env:COGNITO_AUDIENCE}
 region: ap-south-1
 iam:
   role:
     statements:
     - Effect: "Allow"
       Action:
         - "dynamodb:Query"
         - "dynamodb:PutItem"
         - "dynamodb:GetItem"
       Resource:
         - ${env:DYNAMO_DB_ARN}
     - Effect: "Allow"
       Action:
         - "execute-api:Invoke"
         - "execute-api:ManageConnections"
       Resource:
         - ${env:API_GATEWAY_WEBSOCKET_API_ARN}/*
 environment:
   REGION: ${env:REGION}
   COGNITO_ISSUER: ${env:COGNITO_ISSUER}
   DYNAMODB_TABLE: ${env:DYNAMODB_TABLE}
   COGNITO_AUDIENCE: ${env:COGNITO_AUDIENCE}
   POOLID: ${env:POOLID}
   COGNITOIDP: ${env:COGNITOIDP}
   WEBSOCKET_ENDPOINT: ${env:WEBSOCKET_ENDPOINT}
package:
 patterns:
   - '!node_modules/**'
   - handler.js
   - '!package.json'
   - '!package-lock.json'
   - '!.env'
   - '!test.http'
plugins:
 - serverless-deployment-bucket
 - serverless-dotenv-plugin
layers:
 qrauthLibs:
   path: layer
   compatibleRuntimes:
     - nodejs12.x
functions:
 sshauthqrcode:
   handler: handler.authqrcode
   memorySize: 256
   timeout: 30
   layers:
     - {Ref: QrauthLibsLambdaLayer}
   events:
     - httpApi:
         path: /v1/app/sshqrauth/qrauth
         method: post
         authorizer:
           name: cognitoJWTAuth

Once the API Gateway authenticates the incoming request, control is handed over to the serverless-express router. At this stage, we check the payload for the auth verification string scanned by the Android mobile app. This string must exist in the DynamoDB table. After retrieving the record identified by the auth verification string, we read its connection ID property and compute its SHA1 hash. If the hash matches the hash in the request payload, we set the record’s “authVerified” property to “true” and inform the PAM module via the API Gateway WebSocket API. The PAM module then takes care of further validation via the challenge response text.

The entire authentication flow is depicted in a flow diagram, and the architecture is depicted in the cover post of this blog.

Compiling and Installing PAM module

Unlike ordinary C programs, PAM modules are shared libraries, so the compiled code may be loaded at an arbitrary address in memory. The module must therefore be compiled as position-independent code. When compiling with gcc, we must pass the -fPIC option. When linking and generating the shared object binary, we use the -shared flag.

gcc -I$PWD -fPIC -c $(ls *.c)
gcc -shared -o pam_qrapp_auth.so $(ls *.o) -lpam -lqrencode -lssl -lcrypto -lpthread -lwebsockets

To ease this process of compiling and validating libraries, I prefer to use the autoconf tool. The entire project is checked out at my GitHub repository along with autoconf scripts.

Once the shared object file (pam_qrapp_auth.so) is generated, copy it to the “/usr/lib64/security/” directory and run the ldconfig command to inform the OS that a new shared library is available. In /etc/pam.d/sshd, remove the common-auth include (if present) and any other line, including those in files it includes, that uses the “auth” realm with the pam_unix.so module. The pam_unix.so module enforces password authentication. We then need to add our module to the auth realm (“auth required pam_qrapp_auth.so”). Depending on your Linux flavor, your /etc/pam.d/sshd file may look similar to the one below:

auth       required     pam_qrapp_auth.so
account    required     pam_nologin.so
@include common-account
session [success=ok ignore=ignore module_unknown=ignore default=bad]        pam_selinux.so close
session    required     pam_loginuid.so
session    optional     pam_keyinit.so force revoke
@include common-session
session    optional     pam_motd.so  motd=/run/motd.dynamic
session    optional     pam_motd.so noupdate
session    optional     pam_mail.so standard noenv # [1]
session    required     pam_limits.so
session    required     pam_env.so # [1]
session    required     pam_env.so user_readenv=1 envfile=/etc/default/locale
session [success=ok ignore=ignore module_unknown=ignore default=bad]        pam_selinux.so open
@include common-password

Finally, we need to configure the sshd daemon to allow challenge-response authentication. Open the file /etc/ssh/sshd_config and add “ChallengeResponseAuthentication yes” if the option is missing, commented out, or set to “no.” Reload the sshd service by issuing the command “systemctl reload sshd.” Voila, we are done.

Conclusion

This guide is a barebones tutorial and is not meant for production use. The PAM module has certain flaws. For example, it should prompt for a password change if the password has expired, deny login if the account is locked, and offer similar security features. Also, the Android mobile app should be bound to the SSH username so that only the AWS Cognito user bound to that username can authenticate.

One known limitation of this PAM module is that we always have to hit Enter after scanning the QR code with the Android mobile app. This limitation comes from how OpenSSH itself is implemented. The OpenSSH server holds back all informational text until user input is required. In our case, the informational text is the UTF-8 QR code itself.

However, no such input is required from the interactive device, as the authentication event comes from the WebSocket to the PAM module. If we do not explicitly ask the user to press Enter after scanning the QR code, the QR code will never be displayed. The input here is therefore a dummy. This is a known issue with OpenSSH and PAM_TEXT_INFO. Find more about the issue here.

References

– Pluggable authentication module

– An introduction to Pluggable Authentication Modules (PAM) in Linux

– Custom PAM for SSHD in C

– google-authenticator-libpam

– PAM_TEXT_INFO and PAM_ERROR_MSG conversation not honoured during PAM authentication

December 12, 2022

Building Dynamic Forms in React Using Formik

Every day we see a huge number of web applications that allow customization. This involves drag-and-drop or metadata-driven UIs that support multiple layouts on top of a single backend. A feedback collection system is one of the simplest examples of such a product: on the admin side, one can manage the layout, and on the consumer side, users are shown that layout to capture data. This post focuses on building a microframework to support such use cases with the help of React and Formik.

Building big forms in React can be extremely time-consuming and tedious when structural changes are requested. Handling their validations also takes up a lot of time in the development life cycle. If we use Redux-based solutions such as Redux-form to simplify this, we can run into performance bottlenecks. So here comes Formik!

Why Formik?

“Why” is one of the most important questions while solving any problem. There are quite a few reasons to lean towards Formik for the implementation of such systems, such as:

Simplicity
Advanced validation support with Yup
Good community support with a lot of people helping on GitHub

That said, it’s one of the easiest frameworks for quickly building forms. Formik’s clean API lets us use it without worrying much about state management.

Yup is probably the best library out there for validation, and Formik provides out-of-the-box support for Yup validations, which makes it even more programmer-friendly!

API Responses:

We need to follow certain API structures to let our React code understand which component to render where.

Let’s assume we will be getting responses from the backend API in the following fashion.

[{
   "type": "text",
   "field": "name",
   "name": "User's name",
   "style": {
      "width": "50%"
   }
}]

We can have any number of fields, but each one must have two mandatory properties: type and field, where field is unique across the form. We will use these properties to build the UI as well as the submitted response.
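Since everything that follows relies on type and field being present, it can help to check the metadata before rendering. Here is a minimal sketch of such a check; findConfigProblems and the sample array are hypothetical names used only for illustration, not part of Formik or of the original code.

```javascript
// Hypothetical helper: reports items that break the "type + unique field" rule.
function findConfigProblems(config) {
  const problems = [];
  const seen = new Set();
  config.forEach((item, index) => {
    if (!item.type) problems.push(`item ${index}: missing "type"`);
    if (!item.field) {
      problems.push(`item ${index}: missing "field"`);
    } else if (seen.has(item.field)) {
      problems.push(`item ${index}: duplicate field "${item.field}"`);
    } else {
      seen.add(item.field);
    }
  });
  return problems;
}

// Sample metadata with two deliberate mistakes.
const sample = [
  { type: 'text', field: 'name', name: "User's name" },
  { type: 'text', field: 'name' }, // duplicate field
  { field: 'age' },                // missing type
];

console.log(findConfigProblems(sample));
```

An empty array means the metadata follows the rule and is safe to hand to the rendering code.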

So let’s start with building the simplest form with React and Formik.

import React from 'react';
import { useFormik } from 'formik';

const SignupForm = () => {
  const formik = useFormik({
    initialValues: {
      email: '',
    },
    onSubmit: values => {
      alert(JSON.stringify(values, null, 2));
    },
  });
  return (
    <form onSubmit={formik.handleSubmit}>
      <label htmlFor="email">Email Address</label>
      <input
        id="email"
        name="email"
        type="email"
        onChange={formik.handleChange}
        value={formik.values.email}
      />
      <button type="submit">Submit</button>
    </form>
  );
};

export default SignupForm;

import React from 'react';

export default ({ name }) => <h1>Hello {name}!</h1>;

<div id="root"></div>

import React, { Component } from 'react';
import { render } from 'react-dom';
import Basic from './Basic';
import './style.css';

class App extends Component {
  constructor() {
    super();
    this.state = {
      name: 'React'
    };
  }

  render() {
    return (
      <div>
        <Basic />
      </div>
    );
  }
}

render(<App />, document.getElementById('root'));

{
  "name": "react",
  "version": "0.0.0",
  "private": true,
  "dependencies": {
    "react": "^16.12.0",
    "react-dom": "^16.12.0",
    "formik": "latest"
  },
  "scripts": {
    "start": "react-scripts start",
    "build": "react-scripts build",
    "test": "react-scripts test --env=jsdom",
    "eject": "react-scripts eject"
  },
  "devDependencies": {
    "react-scripts": "latest"
  }
}

h1, p {
  font-family: Lato;
}

You can view the fiddle of the above code here to see the live demo.

We use a function component with the useFormik hook to build this form. You can find more information in the useFormik hook documentation.

The hook simply exposes Formik’s state and helpers directly; it’s nothing more than a wrapper for Formik functionality.

Adding dynamic nature

So let’s first create and import the mocked API response to build the UI dynamically.

import React from 'react';
import { useFormik } from 'formik';
import response from "./apiresponse"

const SignupForm = () => {
  const formik = useFormik({
    initialValues: {
      email: '',
    },
    onSubmit: values => {
      alert(JSON.stringify(values, null, 2));
    },
  });
  return (
    <form onSubmit={formik.handleSubmit}>
      <label htmlFor="email">Email Address</label>
      <input
        id="email"
        name="email"
        type="email"
        onChange={formik.handleChange}
        value={formik.values.email}
      />
      <button type="submit">Submit</button>
    </form>
  );
};

export default SignupForm;

You can view the fiddle here.

We simply imported the file and made it available for processing. Now we need to write the logic that builds components dynamically.
So let’s visualize the possible hierarchy of components:

<Container>
	<TextField />
	<NumberField />
	<Container>
		<TextField />
		<BooleanField />
	</Container>
</Container>

A container can be nested inside another container, so let’s address this by adding a children attribute to the API response.

export default [
  {
    "type": "text",
    "field": "name",
    "label": "User's name"
  },
  {
    "type": "number",
    "field": "number",
    "label": "User's age",
  },
  {
    "type": "array",
    "field": "none",
    "children": [
      {
        "type": "text",
        "field": "user.hobbies",
        "label": "User's hobbies"
      }
    ]
  }
]

You can see the fiddle with response processing here with live demo.

To process the recursive nature, we will create a separate component.

import React from 'react';

// Renders the fields described by `config`; calls itself for nested containers.
const RecursiveContainer = ({ config, formik }) => {
  const builder = (individualConfig) => {
    switch (individualConfig.type) {
      case 'text':
      case 'number':
        // Text and number inputs differ only in the input type.
        return (
          <div>
            <label htmlFor={individualConfig.field}>{individualConfig.label}</label>
            <input
              type={individualConfig.type}
              id={individualConfig.field}
              name={individualConfig.field}
              onChange={formik.handleChange}
              style={{ ...individualConfig.style }}
            />
          </div>
        );
      case 'array':
        // A container: render its children with the same component.
        return (
          <RecursiveContainer config={individualConfig.children || []} formik={formik} />
        );
      default:
        return <div>Unsupported field</div>;
    }
  };

  return (
    <>
      {config.map((c, index) => (
        // The metadata is static, so field plus index is a stable key.
        <React.Fragment key={`${c.field}-${index}`}>{builder(c)}</React.Fragment>
      ))}
    </>
  );
};

export default RecursiveContainer;

You can view the complete fiddle of the recursive component here.

What we do here is pretty simple. We pass in config, the array of field descriptions retrieved from the API response. We iterate through config and build a component for each entry based on its type. When the type is array, we render RecursiveContainer again with that entry’s children, which is basic recursion.

We can harden it by passing the current depth and refusing to render beyond a maximum depth, which avoids stack overflow errors at runtime when the metadata is malformed or unexpectedly deep. There is no standard limit; it varies from use case to use case. If you are planning to build a system that is based on a compliance questionnaire, it can go to a max depth of 5 to 7, while for a basic signup form, it’s often only 2.
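One way to apply such a limit is to prune the metadata before it ever reaches the component. The sketch below assumes containers use the type array and a children property, as in the mocked response; limitDepth is a hypothetical helper, and a depth prop on the component itself would work just as well.

```javascript
// Hypothetical helper: drops any container nested deeper than maxDepth.
// The top-level list counts as depth 1; the input is not mutated.
function limitDepth(config, maxDepth, depth = 1) {
  return config
    .filter((item) => item.type !== 'array' || depth < maxDepth)
    .map((item) =>
      item.type === 'array'
        ? { ...item, children: limitDepth(item.children || [], maxDepth, depth + 1) }
        : item
    );
}

const nested = [
  { type: 'text', field: 'name' },
  {
    type: 'array',
    field: 'group',
    children: [
      { type: 'text', field: 'user.hobbies' },
      { type: 'array', field: 'tooDeep', children: [{ type: 'text', field: 'x' }] },
    ],
  },
];

// With a limit of 2, the "tooDeep" container is removed.
const pruned = limitDepth(nested, 2);
console.log(JSON.stringify(pruned, null, 2));
```

Pruning up front keeps the rendering component simple, at the cost of silently hiding fields, so logging what was dropped may be worthwhile in a real system.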

So we generated the forms, but how do we validate them? How do we enforce required, min, and max checks on the form?

For this, Yup is very helpful. Yup is an object schema validation library that validates an object against a schema and reports the result. Its chainable syntax makes it easy to build validation rules incrementally.

Yup ships with a wide variety of built-in validations. We can combine them, specify the error messages to be shown, and much more.

You can find more information on Yup in the official Yup documentation.

To validate a form, we pass a Yup schema to Formik.

Here is a simple example:

import React from 'react';
import { useFormik } from 'formik';
import response from "./apiresponse"
import RecursiveContainer from './RecursiveContainer';
import * as yup from 'yup';

const SignupForm = () => {
  const signupSchema = yup.object().shape({
      name: yup.string().required()
  });

  const formik = useFormik({
    initialValues: {
    },
    onSubmit: values => {
      alert(JSON.stringify(values, null, 2));
    },
    validationSchema: signupSchema
  });
  console.log(formik, response)
  return (
    <form onSubmit={formik.handleSubmit}>
      <RecursiveContainer config={response} formik={formik} />
      <button type="submit">Submit</button>
    </form>
  );
};

export default SignupForm;

You can see the schema usage example here.

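In this example, we simply created a schema and passed it to the useFormik hook. Notice that the form can no longer be submitted until the user fills in the name field.

Here is a simple hack to disable the button until all required fields are valid. Note that Formik may report isValid as true before the user has touched the form; if that matters, also check formik.dirty or set validateOnMount to true.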

import React from 'react';
import { useFormik } from 'formik';
import response from "./apiresponse"
import RecursiveContainer from './RecursiveContainer';
import * as yup from 'yup';

const SignupForm = () => {
  const signupSchema = yup.object().shape({
      name: yup.string().required()
  });

  const formik = useFormik({
    initialValues: {
    },
    onSubmit: values => {
      alert(JSON.stringify(values, null, 2));
    },
    validationSchema: signupSchema
  });
  console.log(formik, response)
  return (
    <form onSubmit={formik.handleSubmit}>
      <RecursiveContainer config={response} formik={formik} />
      <button type="submit" disabled={!formik.isValid}>Submit</button>
    </form>
  );
};

export default SignupForm;

You can see how to use submit validation in the live fiddle here.

Formik exposes a rich set of state and helpers while the form is rendered, and we can use them however it suits us. You can find the full API in the official Formik documentation.

The built-in validations are fine, but we often run into cases where we need our own. How do we write them and integrate them with Yup validations?

There are two ways to do this with Formik and Yup: either extend Yup to support the additional validation, or pass a validation function to Formik. The validation function approach is much simpler: you write a function that returns an errors object to Formik. As simple as it sounds, it does get messy at times.
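To make the validation function approach concrete, here is a minimal sketch of such a function. It takes the form values and returns an object keyed by field name, which is the shape Formik’s validate option expects. The field names and messages are assumptions borrowed from the mocked response, not code from the original fiddle.

```javascript
// Hypothetical validation function: receives the current values and
// returns an errors object keyed by field name.
function validate(values) {
  const errors = {};
  if (!values.name || values.name.trim() === '') {
    errors.name = 'Name is required';
  }
  if (values.number !== undefined && values.number !== '' && Number(values.number) < 0) {
    errors.number = 'Age cannot be negative';
  }
  return errors; // an empty object means the form is valid
}

console.log(validate({ name: '', number: -3 }));
console.log(validate({ name: 'Asha', number: 30 }));
```

It would be wired in as useFormik({ initialValues, validate, onSubmit }). It gets messy once every dynamic field needs its own branch, which is why the rest of this post leans on Yup instead.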

So we will see an example of adding custom validation to Yup. Yup provides an addMethod interface for adding our own user-defined validations.

Let’s say we want to create an alias for an existing validation to tolerate differences in casing, because that’s the most common mistake we see: the backend sends Url instead of url, or Trim instead of trim. Method names are case-sensitive, so yup.string().Url is undefined, while yup.string().url is a function. These are just some examples; you can also alias a method under a more readable name, such as NotEmpty for required.

The usage is simple and straightforward:

yup.addMethod(yup.string, "URL", function (...args) {
  return this.url(...args);
});

This will create an alias for url as URL.

Here is an example of a custom validation method that accepts only Y and N as boolean-like values.

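import * as yup from 'yup';
import isEmpty from 'lodash/isEmpty'; // assumption: the original does not show where isEmpty comes from

const validator = function (message) {
  return this.test('is-string-boolean', message, function (value) {
    // Empty values pass here; combine with required() to reject them.
    if (isEmpty(value)) {
      return true;
    }
    return ['Y', 'N'].includes(value);
  });
};

// Register the method under both casings.
yup.addMethod(yup.string, 'stringBoolean', validator);
yup.addMethod(yup.string, 'StringBoolean', validator);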

Once this validator is registered with addMethod under both names, we can call yup.string().stringBoolean() and yup.string().StringBoolean().

It’s a pretty handy syntax that lets users create their own validations. You can create many more validations in your project and reuse them wherever required.

Writing a schema by hand is also cumbersome, and a hardcoded schema is useless when the form is dynamic: when the form is dynamic, the validations need to be dynamic too. Yup’s chainable syntax makes this easy to achieve.

Let’s assume that the backend sends the following additional validation metadata for each field.

[{
   "type": "text",
   "field": "name",
   "name": "User's name",
   "style": {
      "width": "50%"
   },
   "validationType": "string",
   "validations": [{
      "type": "required",
      "params": ["Name is required"]
   }]
}]

validationType holds one of Yup’s schema types, such as string, number, or date, and validations lists the checks to apply to that field, each with the name of a Yup method and its parameters.

Let’s have a look at the following snippet, which uses the above structure to generate validation dynamically.

import * as yup from 'yup';
import isEmpty from 'lodash/isEmpty'; // assumption: the original does not show where isEmpty comes from

/** Additional methods registered on Yup's string schema */

yup.addMethod(yup.string, "URL", function (...args) {
  return this.url(...args);
});

const stringBooleanValidator = function (message) {
  return this.test('is-string-boolean', message, function (value) {
    // Empty values pass; anything else must be 'Y' or 'N'.
    if (isEmpty(value)) {
      return true;
    }
    return ['Y', 'N'].includes(value);
  });
};

yup.addMethod(yup.string, "stringBoolean", stringBooleanValidator);
yup.addMethod(yup.string, "StringBoolean", stringBooleanValidator);

// Reducer: adds the validator for one field description to the schema map.
export function createYupSchema(schema, config) {
  const { field, validationType, validations = [] } = config;
  // Skip fields whose validationType is not a Yup schema factory.
  if (typeof yup[validationType] !== 'function') {
    return schema;
  }
  let validator = yup[validationType]();
  validations.forEach((validation) => {
    const { params = [], type } = validation;
    // Ignore validation names that this schema type does not support.
    if (typeof validator[type] !== 'function') {
      return;
    }
    validator = validator[type](...params);
  });
  if (field.indexOf('.') !== -1) {
    // Nested fields are not covered in this example, but they are easy to handle.
  } else {
    schema[field] = validator;
  }

  return schema;
}

// Builds a Yup object schema from the metadata array returned by the API.
export const getYupSchemaFromMetaData = (
  metadata,
  additionalValidations = {},
  forceRemove = []
) => {
  const yupSchema = metadata.reduce(createYupSchema, {});
  const mergedSchema = {
    ...yupSchema,
    ...additionalValidations,
  };

  // Drop any fields the caller explicitly does not want validated.
  forceRemove.forEach((field) => {
    delete mergedSchema[field];
  });

  return yup.object().shape(mergedSchema);
};

You can see the complete live fiddle with dynamic validations with Formik here.

The snippet above registers the new Yup methods and adds two functions, createYupSchema and getYupSchemaFromMetaData, which drive the logic for building the schema dynamically. The validations arrive in the API response, and we build the schema from them.

createYupSchema builds the Yup validator for a single field from its validationType and validations array. getYupSchemaFromMetaData iterates over the response array, builds the validator for each field, and finally wraps the result in an object schema. In this way, we can generate dynamic validations. One can even go further and create nested validations with recursion.
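As a small step in that direction, note that reducing over the top-level array alone never visits fields that sit inside a container. A minimal sketch of one way around this is to flatten the metadata first; flattenFields is a hypothetical helper, and it assumes containers carry their fields in a children array as in the mocked response.

```javascript
// Hypothetical helper: collects every leaf field from a nested config so that
// a flat reducer such as createYupSchema can also see fields inside containers.
function flattenFields(config) {
  return config.reduce((fields, item) => {
    if (Array.isArray(item.children)) {
      return fields.concat(flattenFields(item.children));
    }
    return fields.concat(item);
  }, []);
}

const metadata = [
  { type: 'text', field: 'name', validationType: 'string' },
  {
    type: 'array',
    field: 'none',
    children: [{ type: 'text', field: 'hobbies', validationType: 'string' }],
  },
];

console.log(flattenFields(metadata).map((f) => f.field));
```

The flattened array could then be passed to getYupSchemaFromMetaData in place of the raw response.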

Conclusion

In the traditional approach, with its large boilerplate for forms, adding even one more field is often time-consuming. The approach described here removes the need to hardcode fields and lets the backend drive them.

Formik manages form state locally, which avoids the performance issues we generally see when a Redux store is updated on every keystroke.

As we saw above, it’s easy to build dynamic forms with Formik. We can save the templates and even create template libraries, which are very common in question-and-answer systems. Used well, the templates can be stored in a NoSQL database such as MongoDB, and a large number of forms, along with their validations, can be generated quickly.

To learn more and build optimized solutions, you can also refer to the <FastField> and <Field> APIs in the official documentation. Thanks for reading!

December 12, 2022

Building a Collaborative Editor Using Quill and Yjs
“Hope this email finds you well” is how 2020-2021 has been in a nutshell. Since we’ve all been working remotely since last year, actively collaborating with teammates became one notch harder, from activities like brainstorming a topic on a whiteboard to building documentation.

Tools powered by collaborative systems have become a necessity. To explore them, following the principle of build fast, fail fast, I started building a collaborative editor from existing open-source tools that can eventually be extended for the needs of different projects.

Conflicts, as they say, are inevitable when multiple users are constantly modifying the same document, especially the same block of content. Ultimately, the end-user experience is defined by how such conflicts are resolved.

There are various conflict resolution mechanisms, but two of the most commonly discussed ones are Operational Transformation (OT) and Conflict-Free Replicated Data Type (CRDT). So, let’s briefly talk about those first.

Operational Transformation

The order of operations matters in OT. Each user has their own local copy of the document, and mutations are atomic, such as insert V at index 4 or delete X at index 2. If the order of these operations changes, the end result will be different, which is why all operations are synchronized through a central server. The central server can alter the indices and operations before forwarding them to the clients. For example, in the image below, User2 makes a delete(0) operation, but because the OT server knows that User1 has made an insert operation, User2’s operation has to be changed to delete(1) before it is applied to User1.

OT with a central server is typically easier to implement. Plain-text OT in its basic form has only three defined operations: insert, delete, and apply.

Source: Conclave
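The delete(0) to delete(1) adjustment described above can be sketched in a few lines. This is a toy model under simplifying assumptions (single-character operations, and only a delete being transformed against an insert); the function names and the sample text are hypothetical, and a real OT server handles many more cases.

```javascript
// Hypothetical helper: rewrites a delete so it can be applied after `applied`,
// an insert that the receiving client has already applied.
function transformDeleteAgainstInsert(deleteOp, applied) {
  // An insert at or before the delete's index shifts the target one place right.
  if (applied.type === 'insert' && applied.index <= deleteOp.index) {
    return { ...deleteOp, index: deleteOp.index + 1 };
  }
  return deleteOp;
}

// Applies a single-character insert or delete to a string.
function apply(text, op) {
  if (op.type === 'insert') {
    return text.slice(0, op.index) + op.char + text.slice(op.index);
  }
  return text.slice(0, op.index) + text.slice(op.index + 1); // delete
}

// Both users start from "AT". User1 inserts "C" at 0; User2 deletes index 0 ("A").
const insertOp = { type: 'insert', index: 0, char: 'C' };
const deleteOp = { type: 'delete', index: 0 };

const user1Text = apply('AT', insertOp);
const transformed = transformDeleteAgainstInsert(deleteOp, insertOp);
console.log(transformed.index, apply(user1Text, transformed));
```

Without the transformation, User1 would delete the freshly inserted "C" instead of the "A" that User2 meant to remove.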

“Fully distributed OT and adding rich text operations are very hard, and that’s why there’s a million papers.”

CRDT

Instead of performing operations directly on characters like in OT, CRDT uses a complex data structure to which it can then add/update/remove properties to signify transformation, enabling scope for commutativity and idempotency. CRDTs guarantee eventual consistency.

There are different algorithms, but in general, a CRDT has two requirements: globally unique characters and globally ordered characters. This involves a global reference for each object instead of a positional index, with the ordering based on neighboring objects. Fractional indices are one way to assign such a position to an object.

Source: Conclave

Because every object has its own unique reference, the delete operation becomes idempotent. Fractional indices are one way to assign those references during insertion and updates.
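To make this concrete, here is a toy sketch of fractional positions and delete-by-id. It is only an illustration under simplifying assumptions: the ids are made-up strings, positions are plain floating-point numbers between 0 and 1, and none of the function names come from a real CRDT library.

```javascript
// Hypothetical toy model: each character carries a unique id and a fractional
// position. Real CRDTs use more robust identifiers than floating-point numbers.
function insertBetween(doc, left, right, id, char) {
  const lo = left ? left.pos : 0;
  const hi = right ? right.pos : 1;
  return [...doc, { id, char, pos: (lo + hi) / 2 }];
}

// Deleting by id is idempotent: repeating it changes nothing.
function deleteById(doc, id) {
  return doc.filter((item) => item.id !== id);
}

// The visible text is the characters sorted by position.
function render(doc) {
  return [...doc].sort((a, b) => a.pos - b.pos).map((item) => item.char).join('');
}

let doc = [];
doc = insertBetween(doc, null, null, 'u1-1', 'C');     // pos 0.5
doc = insertBetween(doc, doc[0], null, 'u1-2', 'T');   // pos 0.75
doc = insertBetween(doc, doc[0], doc[1], 'u2-1', 'A'); // pos 0.625, between C and T

const once = deleteById(doc, 'u1-2');
const twice = deleteById(once, 'u1-2');
console.log(render(doc), render(once), render(twice));
```

Because the "A" is placed by position rather than by array index, it lands between its neighbors no matter in which order the replicas receive the three inserts.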

There are two types of CRDT. One is state-based, where the whole state (or a delta) is shared between instances and merged continuously. The other is operation-based, where only individual operations are sent between replicas. If you want to dive deep into CRDTs, here’s a nice resource.
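For a feel of the state-based flavor, the simplest example is probably a grow-only set, where merging two replicas is just a set union. This sketch is not how Yjs works internally; it only illustrates why a well-chosen merge function makes the order and repetition of syncs irrelevant.

```javascript
// Hypothetical state-based CRDT: a grow-only set whose merge is a set union.
function merge(a, b) {
  return new Set([...a, ...b]);
}

// Helper for printing and comparing sets.
const sorted = (set) => [...set].sort().join(',');

const replicaA = new Set(['x', 'y']);
const replicaB = new Set(['y', 'z']);

const ab = merge(replicaA, replicaB); // A receives B's state
const ba = merge(replicaB, replicaA); // B receives A's state
const again = merge(ab, replicaB);    // the same state arrives a second time

console.log(sorted(ab), sorted(ba), sorted(again));
```

Both replicas converge to the same set, and receiving the same state twice changes nothing, which is the eventual consistency guarantee mentioned earlier.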

For our purposes, we chose a CRDT since it can also support peer-to-peer networks. If you want to jump directly to the code, you can visit the repo here.

Tools used for this project:

As our goal was for a quick implementation, we targeted off-the-shelf tools for editor and backend to manage collaborative operations.
- Quill.js is an API-driven WYSIWYG rich text editor built for compatibility and extensibility. We chose Quill as our editor because it is easy to plug into an application and has many extensions available.
- Yjs is a framework that provides shared editing capabilities by exposing shared data types (Array, Map, Text, etc.) that are synced automatically. It’s also network agnostic, so changes are synced whenever a client is online. We used it because it’s a CRDT implementation and, conveniently, already has bindings for Quill.
Prerequisites:

To keep it simple, we’ll set up a client and server both in the same code base. Initialize a project with npm init and install the below dependencies:
```
npm i quill quill-cursors webpack webpack-cli webpack-dev-server y-quill y-websocket yjs
```
- quill is the WYSIWYG rich text editor we will use as our editor.
- quill-cursors is an extension that displays the cursors of other clients connected to the same editor room.
- webpack, webpack-cli, and webpack-dev-server are developer utilities; webpack is the bundler that creates a deployable bundle for your application.
- y-quill provides bindings between Yjs and Quill using the shared type Y.Text. For more information, you can check out the module’s source on GitHub.
- y-websocket provides a WebsocketProvider that communicates with a Yjs server in a client-server manner to exchange awareness information and data.
- yjs is the CRDT framework that orchestrates conflict resolution between multiple clients.
Code to use
```
const path = require('path');

module.exports = {
  mode: 'development',
  devtool: 'source-map',
  entry: {
    index: './index.js'
  },
  output: {
    globalObject: 'self',
    path: path.resolve(__dirname, './dist/'),
    filename: '[name].bundle.js',
    publicPath: '/quill/dist'
  },
  devServer: {
    contentBase: path.join(__dirname),
    compress: true,
    publicPath: '/dist/'
  }
}
```
This is a basic webpack config that tells webpack which file is the entry point of our frontend project, i.e., the index.js file. Webpack uses that file to build the internal dependency graph of the project. The output property defines where and how the generated bundles are saved, and the devServer config defines the parameters for the local dev server, which runs when you execute “npm start”.
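One caveat: the devServer options above use the webpack-dev-server 3 option names. If the install pulls in version 4 or later, contentBase and the devServer-level publicPath are, to the best of my knowledge, no longer accepted. The sketch below shows what the same config might look like under that assumption; only the devServer section differs.

```javascript
const path = require('path');

// Assumes webpack-dev-server 4 or later; everything outside devServer
// is unchanged from the config above.
module.exports = {
  mode: 'development',
  devtool: 'source-map',
  entry: {
    index: './index.js'
  },
  output: {
    globalObject: 'self',
    path: path.resolve(__dirname, './dist/'),
    filename: '[name].bundle.js',
    publicPath: '/quill/dist'
  },
  devServer: {
    static: { directory: path.join(__dirname) }, // takes the place of contentBase
    compress: true,
    devMiddleware: { publicPath: '/dist/' }      // takes the place of publicPath
  }
};
```

If you pin webpack-dev-server to version 3 in package.json instead, the original config should keep working as written.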

We’ll first create an index.html file to define the basic skeleton:
```
<!DOCTYPE html>
<html>
  <head>
    <title>Yjs Quill Example</title>
    <script src="./dist/index.bundle.js" async defer></script>
    <link rel=stylesheet href="//cdn.quilljs.com/1.3.6/quill.snow.css" async defer>
  </head>
  <body>
    <button type="button" id="connect-btn">Disconnect</button>
    <div id="editor" style="height: 500px;"></div>
  </body>
</html>
```
The index.html has a pretty basic structure. In <head>, we’ve provided the path of the bundled JS file that webpack will create and the CSS theme for the Quill editor. In <body>, we’ve just created a button to connect to or disconnect from the backend and a placeholder div where the Quill editor will be plugged in.
- Here, we’ve just made the imports, registered quill-cursors extension, and added an event listener for window load:
```
import Quill from "quill";
import * as Y from 'yjs';
import { QuillBinding } from 'y-quill';
import { WebsocketProvider } from 'y-websocket';
import QuillCursors from "quill-cursors";

// Register QuillCursors module to add the ability to show multiple cursors on the editor.
Quill.register('modules/cursors', QuillCursors);

window.addEventListener('load', () => {
  // We'll add more blocks as we continue
});
```
- Let’s initialize the Yjs document, socket provider, and load the document:
```
window.addEventListener('load', () => {
  const ydoc = new Y.Doc();
  const provider = new WebsocketProvider('ws://localhost:3312', 'velotio-demo', ydoc);
  const type = ydoc.getText('Velotio-Blog');
});
```
- We’ll now initialize and plug the Quill editor with its bindings:
```
window.addEventListener('load', () => {
  // ### ABOVE CODE HERE ###

  const editorContainer = document.getElementById('editor');
  const toolbarOptions = [
    ['bold', 'italic', 'underline', 'strike'],  // toggled buttons
    ['blockquote', 'code-block'],
    [{ 'header': 1 }, { 'header': 2 }],               // custom button values
    [{ 'list': 'ordered' }, { 'list': 'bullet' }],
    [{ 'script': 'sub' }, { 'script': 'super' }],      // superscript/subscript
    [{ 'indent': '-1' }, { 'indent': '+1' }],          // outdent/indent
    [{ 'direction': 'rtl' }],                         // text direction
    // array for drop-downs, empty array = defaults
    [{ 'size': [] }],
    [{ 'header': [1, 2, 3, 4, 5, 6, false] }],
    [{ 'color': [] }, { 'background': [] }],          // dropdown with defaults from theme
    [{ 'font': [] }],
    [{ 'align': [] }],
    ['image', 'video'],
    ['clean']                                         // remove formatting button
  ];

  const editor = new Quill(editorContainer, {
    modules: {
      cursors: true,
      toolbar: toolbarOptions,
      history: {
        userOnly: true  // only user changes will be undone or redone.
      }
    },
    placeholder: "collab-edit-test",
    theme: "snow"
  });

  const binding = new QuillBinding(type, editor, provider.awareness);
});
```
- Finally, let’s implement the Connect/Disconnect button and complete the callback:
```
window.addEventListener('load', () => {
  // ### ABOVE CODE HERE ###

  const connectBtn = document.getElementById('connect-btn');
  connectBtn.addEventListener('click', () => {
    if (provider.shouldConnect) {
      provider.disconnect();
      connectBtn.textContent = 'Connect';
    } else {
      provider.connect();
      connectBtn.textContent = 'Disconnect';
    }
  });

  window.example = { provider, ydoc, type, binding, Y }
});
```
Steps to run:
- Server:
For simplicity, we’ll directly use the y-websocket-server out of the box.

NOTE: You can either let it run and open a new terminal for the next commands, or let it run in the background using `&` at the end of the command.
- Client:
Start the client with `npm start`. On successful compilation, it should open in your default browser; if it doesn't, go to http://localhost:8080.

Show me the repo

You can find the repository here.

Conclusion:

Conflict resolution approaches are not new, but with the rise of remote work culture, it is important to have good collaborative systems in place to enhance productivity.

Although this example was just on rich text editing capabilities, we can extend existing resources to build more features and structures like tabular data, graphs, charts, etc. Yjs shared types can be used to define your own data format based on how your custom editor represents data internally.
December 12, 2022

A Comprehensive Tutorial to Implementing OpenTracing With Jaeger

Introduction

Recently, there has been a lot of discussion around OpenTracing. We’ll start this blog by introducing OpenTracing, explaining what it is and why it is gaining attention. Next, we will discuss the distributed tracing system Jaeger and how it helps in troubleshooting microservices-based distributed systems. We will also set up Jaeger and learn to use it for monitoring and troubleshooting.

Drift to Microservice Architecture

Microservice Architecture has now become the obvious choice for application developers. In the Microservice Architecture, a monolithic application is broken down into a group of independently deployed services. In simple words, an application is more like a collection of microservices. When we have millions of such intertwined microservices working together, it’s almost impossible to map the inter-dependencies of these services and understand the execution of a request.

If a monolithic application fails then it is more feasible to do the root cause analysis and understand the path of a transaction using some logging frameworks. But in a microservice architecture, logging alone fails to deliver the complete picture.

Is this service called first in the chain? How do I span all these services to get insight into the application? With questions like these, it becomes a significantly larger problem to debug a set of interdependent distributed services in comparison to a single monolithic application, making OpenTracing more and more popular.

OpenTracing

What is Distributed Tracing?

Distributed tracing is a method used to monitor applications, mostly those built using the microservices architecture. Distributed tracing helps to highlight what causes poor performance and where failures occur.

How OpenTracing Fits Into This?

The OpenTracing API provides a standard, vendor-neutral framework for instrumentation. This means that if a developer wants to try out a different distributed tracing system, there is no need to repeat the whole instrumentation process for the new system; the developer can simply change the configuration of the Tracer.

OpenTracing uses basic terminologies, such as Span and Trace. You can read about them in detail here.

OpenTracing is a way for services to “describe and propagate distributed traces without knowledge of the underlying OpenTracing implementation.”

Let us take the example of renting a movie on a rental service like iTunes. A service like this requires many other microservices to check that the movie is available, that proper payment credentials are received, and that enough space exists on the viewer’s device for the download. If any one of those microservices fails, the entire transaction fails. In such a case, having logs just for the main rental service wouldn’t be very useful for debugging. However, if you were able to analyze each service, you wouldn’t have to scratch your head to work out which microservice failed and what made it fail.

In real life, applications are even more complex, and as their complexity increases, monitoring them has become a tedious task. OpenTracing helps us to easily monitor:

Spans of services
Time taken by each service
Latency between the services
Hierarchy of services
Errors or exceptions during execution of each service.
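
To make the list above concrete, here is a minimal, library-free Python sketch of what a trace records. The `Span` class, the `depth` helper, and the service names are hypothetical and made up for illustration; this is not the OpenTracing API, only a toy model of the duration, hierarchy, and error information that a tracer collects.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """A toy span: one timed operation inside a trace (illustrative only)."""
    name: str
    parent: Optional["Span"] = None
    tags: Dict[str, object] = field(default_factory=dict)
    start: float = 0.0
    end: float = 0.0

    def __enter__(self) -> "Span":
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.end = time.perf_counter()
        if exc_type is not None:
            self.tags["error"] = True  # record failures on the span
        return False  # never swallow the exception

    @property
    def duration(self) -> float:
        return self.end - self.start


def depth(span: Span) -> int:
    """Number of ancestors, i.e. the span's level in the hierarchy."""
    level = 0
    while span.parent is not None:
        span = span.parent
        level += 1
    return level


finished: List[Span] = []

with Span("rent-movie") as root:
    with Span("check-availability", parent=root) as availability:
        time.sleep(0.01)  # stand-in for real work
    with Span("verify-payment", parent=root) as payment:
        time.sleep(0.02)
        payment.tags["error"] = True  # pretend the payment check failed
    finished.extend([availability, payment])
finished.insert(0, root)

# Print the hierarchy with per-span time and error status.
for span in finished:
    status = "ERROR" if span.tags.get("error") else "ok"
    print(f"{'  ' * depth(span)}{span.name}: {span.duration * 1000:.1f} ms [{status}]")
```

A real tracer does the same bookkeeping for you and ships the finished spans to a backend instead of printing them.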

Jaeger: A Distributed Tracing System by Uber

Jaeger was released as an open-source distributed tracing system by Uber Technologies. It is used for monitoring and troubleshooting microservices-based distributed systems, including:

Distributed transaction monitoring
Performance and latency optimization
Root cause analysis
Service dependency analysis
Distributed context propagation

Major Components of Jaeger

Jaeger Client Libraries
Agent
Collector
Query
Ingester

Running Jaeger in a Docker Container

1. First, install Jaeger Client on your machine:

$ pip install jaeger-client

2. Now, let’s run Jaeger backend as an all-in-one Docker image. The image launches the Jaeger UI, collector, query, and agent:

$ docker run -d -p6831:6831/udp -p16686:16686 jaegertracing/all-in-one:latest

TIP: To check if the Docker container is running, use `docker ps`.

Once the container starts, open http://localhost:16686/ to access the Jaeger UI. The container runs the Jaeger backend with an in-memory store, which is initially empty, so there is not much we can do with the UI right now since the store has no traces.

Creating Traces on Jaeger UI

1. Create a Python program to create Traces:

Let’s generate some traces using a simple Python program. You can clone the Jaeger-Opentracing repository given below for the sample program used in this blog.

import sys
import time
import logging
import random
from jaeger_client import Config
from opentracing_instrumentation.request_context import get_current_span, span_in_context

def init_tracer(service):
    logging.getLogger('').handlers = []
    logging.basicConfig(format='%(message)s', level=logging.DEBUG)    
    config = Config(
        config={
            'sampler': {
                'type': 'const',
                'param': 1,
            },
            'logging': True,
        },
        service_name=service,
    )
    return config.initialize_tracer()

def booking_mgr(movie):
    with tracer.start_span('booking') as span:
        span.set_tag('Movie', movie)
        with span_in_context(span):
            cinema_details = check_cinema(movie)
            showtime_details = check_showtime(cinema_details)
            book_show(showtime_details)

def check_cinema(movie):
    with tracer.start_span('CheckCinema', child_of=get_current_span()) as span:
        with span_in_context(span):
            num = random.randint(1,30)
            time.sleep(num)
            cinema_details = "Cinema Details"
            flags = ['false', 'true', 'false']
            random_flag = random.choice(flags)
            span.set_tag('error', random_flag)
            span.log_kv({'event': 'CheckCinema' , 'value': cinema_details })
            return cinema_details

def check_showtime( cinema_details ):
    with tracer.start_span('CheckShowtime', child_of=get_current_span()) as span:
        with span_in_context(span):
            num = random.randint(1,30)
            time.sleep(num)
            showtime_details = "Showtime Details"
            flags = ['false', 'true', 'false']
            random_flag = random.choice(flags)
            span.set_tag('error', random_flag)
            span.log_kv({'event': 'CheckShowtime' , 'value': showtime_details })
            return showtime_details

def book_show(showtime_details):
    with tracer.start_span('BookShow',  child_of=get_current_span()) as span:
        with span_in_context(span):
            num = random.randint(1,30)
            time.sleep(num)
            Ticket_details = "Ticket Details"
            flags = ['false', 'true', 'false']
            random_flag = random.choice(flags)
            span.set_tag('error', random_flag)
            span.log_kv({'event': 'BookShow' , 'value': Ticket_details })
            print(Ticket_details)

assert len(sys.argv) == 2
tracer = init_tracer('booking')
movie = sys.argv[1]
booking_mgr(movie)
# yield to IOLoop to flush the spans
time.sleep(2)
tracer.close()

The Python program takes a movie name as an argument and calls three functions that get the cinema details, movie showtime details, and finally book a movie ticket.

It adds a random delay to each function to make things more interesting, since in reality these functions would take some time to fetch the details. Each function also randomly tags its span with an error to give us a feel for what the traces of a real-life application may look like in case of failures.

Here is a brief description of how OpenTracing has been used in the program:

Initializing a tracer:

def init_tracer(service):
   logging.getLogger('').handlers = []
   logging.basicConfig(format='%(message)s', level=logging.DEBUG)   
   config = Config(
       config={
           'sampler': {
               'type': 'const',
               'param': 1,
           },
           'logging': True,
       },
       service_name=service,
   )
   return config.initialize_tracer()

Using the tracer instance:

tracer = init_tracer('booking')

Starting new child spans using start_span:

with tracer.start_span('CheckCinema', child_of=get_current_span()) as span:

Using Tags:

span.set_tag('Movie', movie)

Using Logs:

span.log_kv({'event': 'CheckCinema' , 'value': cinema_details })

2. Run the Python program:

$ python booking-mgr.py <movie-name>

Initializing Jaeger Tracer with UDP reporter
Using sampler ConstSampler(True)
opentracing.tracer initialized to <jaeger_client.tracer.Tracer object at 0x7f72ffa25b50>[app_name=booking]
Reporting span cfe1cc4b355aacd9:8d6da6e9161f32ac:cfe1cc4b355aacd9:1 booking.CheckCinema
Reporting span cfe1cc4b355aacd9:88d294b85345ac7b:cfe1cc4b355aacd9:1 booking.CheckShowtime
Ticket Details
Reporting span cfe1cc4b355aacd9:98cbfafca3aa0fe2:cfe1cc4b355aacd9:1 booking.BookShow
Reporting span cfe1cc4b355aacd9:cfe1cc4b355aacd9:0:1 booking.booking

Now check your Jaeger UI: you will see that a new service, “booking”, has been added. Select the service and click on “Find Traces” to see its traces. Every time you run the program, a new trace is created.
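
The identifiers in the console output can be read as `trace-id:span-id:parent-span-id:flags`, which is how the Jaeger client prints a span context. As a rough sketch, assuming that layout, the following Python snippet rebuilds the parent/child relationship from the lines shown above. The `ReportedSpan` and `parse_reporting_line` names are ours for illustration and are not part of any library.

```python
from typing import List, NamedTuple


class ReportedSpan(NamedTuple):
    trace_id: str
    span_id: str
    parent_id: str
    flags: str
    operation: str


def parse_reporting_line(line: str) -> ReportedSpan:
    """Parse a 'Reporting span <ids> <service>.<operation>' console line."""
    _, _, context, operation = line.split(maxsplit=3)
    trace_id, span_id, parent_id, flags = context.split(":")
    return ReportedSpan(trace_id, span_id, parent_id, flags, operation)


# Lines copied from the sample run above.
log_lines: List[str] = [
    "Reporting span cfe1cc4b355aacd9:8d6da6e9161f32ac:cfe1cc4b355aacd9:1 booking.CheckCinema",
    "Reporting span cfe1cc4b355aacd9:88d294b85345ac7b:cfe1cc4b355aacd9:1 booking.CheckShowtime",
    "Reporting span cfe1cc4b355aacd9:98cbfafca3aa0fe2:cfe1cc4b355aacd9:1 booking.BookShow",
    "Reporting span cfe1cc4b355aacd9:cfe1cc4b355aacd9:0:1 booking.booking",
]

spans = [parse_reporting_line(line) for line in log_lines]

# The root span is the one whose parent id is 0.
root = next(span for span in spans if span.parent_id == "0")
children = [span.operation for span in spans if span.parent_id == root.span_id]

print("root:", root.operation)
print("children:", children)
```

All four spans share one trace id, and the three function spans point at the root span as their parent, which matches the hierarchy the UI displays.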

You can now compare the duration of traces using the graph in the search results. You can also filter traces using the “Tags” section under “Find Traces”. For example, setting the “error=true” tag will show only the traces that contain errors:

To view the detailed trace, you can select a specific trace instance and check details like the time taken by each service, errors during execution and logs.

The above trace instance has four spans, the first representing the root span “booking”, the second is the “CheckCinema”, the third is the “CheckShowtime” and last is the “BookShow”. In this particular trace instance, both the “CheckCinema” and “CheckShowtime” have reported errors, marked by the error=true tag.

Conclusion

In this blog, we’ve described the importance and benefits of OpenTracing, one of the core pillars of modern applications. We also explored how the distributed tracer Jaeger collects and stores traces while revealing inefficient portions of our applications. Jaeger is fully compatible with the OpenTracing API and has clients for a number of programming languages, including Java, Go, Node.js, Python, PHP, and more.

References

https://www.jaegertracing.io/docs/1.9/
https://opentracing.io/docs/

December 12, 2022

BigQuery 101: All the Basics You Need to Know
Google BigQuery is an enterprise data warehouse on Google Cloud Platform. It’s serverless and fully managed. BigQuery works well with data of all sizes, from a 100-row Excel spreadsheet to several petabytes. Most importantly, it can execute a complex query on that data within a few seconds.

Before we proceed, note that BigQuery is not a transactional database. It takes around 2 seconds to run a simple query like ‘SELECT * FROM bigquery-public-data.object LIMIT 10’ on a 100 KB table with 500 rows. Hence, it shouldn’t be thought of as an OLTP (Online Transaction Processing) database. BigQuery is for Big Data!

BigQuery supports SQL-like queries, which makes it user-friendly and beginner-friendly. It’s accessible via its web UI, command-line tool, or client libraries (available for C#, Go, Java, Node.js, PHP, Python, and Ruby). You can also take advantage of its REST APIs and get your job done by sending a JSON request.

Now, let’s dive deeper to understand it better. Suppose you are a data scientist (or a startup which analyzes data) and you need to analyze terabytes of data. If you choose a tool like MySQL, the first step before even thinking about any query is to have an infrastructure in place, that can store this magnitude of data.

Designing this setup is itself a difficult task because you have to decide on RAM sizes, whether to use DC/OS or Kubernetes, and other factors. And if you have streaming data coming in, you will need to set up and maintain a Kafka cluster. In BigQuery, all you have to do is bulk upload your CSV/JSON file, and you are done. BigQuery handles all the backend for you. If you need streaming data ingestion, you can use Fluentd. Another advantage is that you can connect Google Analytics with BigQuery seamlessly.

BigQuery is a serverless, highly available, petabyte-scale service that allows you to execute complex SQL queries quickly. It lets you focus on analysis rather than on handling infrastructure. The hardware is completely abstracted away and not visible to us, not even as virtual machines.

Architecture of Google BigQuery

You don’t need to know too much about the underlying architecture of BigQuery. That’s actually the whole idea of it – you don’t need to worry about architecture and operation.

However, understanding the BigQuery architecture helps us control costs, optimize query performance, and optimize storage. BigQuery is built on the technology described in the Google Dremel paper.

Quoting an Abstract from the Google Dremel Paper –

“Dremel is a scalable, interactive ad-hoc query system for analysis of read-only nested data. By combining multi-level execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. The system scales to thousands of CPUs and petabytes of data and has thousands of users at Google. In this paper, we describe the architecture and implementation of Dremel and explain how it complements MapReduce-based computing. We present a novel columnar storage representation for nested records and discuss experiments on few-thousand node instances of the system.”

Dremel has been in production at Google since 2006. Google used it for the following tasks –
- Analysis of crawled web documents.
- Tracking install data for applications on Android Market.
- Crash reporting for Google products.
- OCR results from Google Books.
- Spam analysis.
- Debugging of map tiles on Google Maps.
- Tablet migrations in managed Bigtable instances.
- Results of tests run on Google’s distributed build system.
- Disk I/O statistics for hundreds of thousands of disks.
- Resource monitoring for jobs run in Google’s data centers.
- Symbols and dependencies in Google’s codebase.
BigQuery is much more than Dremel. Dremel is just a query execution engine, whereas BigQuery also builds on interesting technologies like Borg (the predecessor of Kubernetes) and Colossus. Colossus is the successor to the Google File System (GFS), as mentioned in the Google Spanner paper.

How BigQuery Stores Data?

BigQuery stores data in a columnar format called Capacitor (the successor of ColumnarIO), which achieves a very high compression ratio and scan throughput. Unlike ColumnarIO, Capacitor lets BigQuery operate directly on compressed data without decompressing it.

Columnar storage has the following advantages:
- Traffic minimization – When you submit a query, the required column values on each query are scanned and only those are transferred on query execution. E.g., a query `SELECT title FROM Collection` would access the title column values only.
- Higher compression ratio – Columnar storage can achieve a compression ratio of 1:10, whereas ordinary row-based storage can compress at roughly 1:3.
(Image source: Google Dremel Paper)

Columnar storage has the disadvantage of not working efficiently when updating existing records. That is why Dremel doesn’t support any update queries.
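
The two advantages listed above can be illustrated with a small, self-contained Python sketch. This is only a toy model with made-up records and hypothetical helper names; it is not how Capacitor is implemented. It compares how many values a row layout and a column layout touch to answer `SELECT title`, and shows why a column of similar values compresses well, using simple run-length encoding as a stand-in for a real codec.

```python
from typing import Dict, List, Tuple

# Made-up sample records; in a row store each record is kept together.
rows: List[Dict[str, object]] = [
    {"title": "Dune", "year": 1965, "language": "en"},
    {"title": "Neuromancer", "year": 1984, "language": "en"},
    {"title": "Foundation", "year": 1951, "language": "en"},
    {"title": "Solaris", "year": 1961, "language": "pl"},
]

# Columnar layout: one contiguous list of values per column.
columns: Dict[str, list] = {name: [row[name] for row in rows] for name in rows[0]}


def titles_from_rows(rows: List[Dict[str, object]]) -> Tuple[List[object], int]:
    """Row store: every field of every record is touched to read one column."""
    titles, values_read = [], 0
    for row in rows:
        values_read += len(row)
        titles.append(row["title"])
    return titles, values_read


def titles_from_columns(columns: Dict[str, list]) -> Tuple[List[object], int]:
    """Column store: only the requested column is read."""
    titles = list(columns["title"])
    return titles, len(titles)


def run_length_encode(values: list) -> List[Tuple[object, int]]:
    """Collapse consecutive repeats into (value, count) pairs."""
    encoded: List[Tuple[object, int]] = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1] = (value, encoded[-1][1] + 1)
        else:
            encoded.append((value, 1))
    return encoded


print(titles_from_rows(rows)[1], "values read from the row layout")
print(titles_from_columns(columns)[1], "values read from the column layout")
print(run_length_encode(columns["language"]))
```

The same sketch hints at the disadvantage as well: updating one record means touching every column list rather than one contiguous row.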

How the Query Gets Executed?

BigQuery depends on Borg for data processing. Borg simultaneously instantiates hundreds of Dremel jobs across required clusters made up of thousands of machines. In addition to assigning compute capacity for Dremel jobs, Borg handles fault-tolerance as well.

Now, how do you design/execute a query which can run on thousands of nodes and fetches the result? This challenge was overcome by using the Tree Architecture. This architecture forms a gigantically parallel distributed tree for pushing down a query to the tree and aggregating the results from the leaves at a blazingly fast speed.
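
As a rough illustration of the tree idea, the Python sketch below pushes a simple count down to leaf "nodes" (one per data shard) and then combines the partial results level by level until a single answer reaches the root. The shard contents, the `fanout` parameter, and the function names are assumptions made for this example; the real system distributes this work across machines rather than running it in one process.

```python
from typing import List, Sequence


def leaf_count(shard: Sequence[str], keyword: str) -> int:
    """Leaf node: scan one shard and return a partial result."""
    return sum(1 for value in shard if keyword in value)


def tree_aggregate(partials: List[int], fanout: int = 2) -> int:
    """Intermediate levels: sum `fanout` partial results at a time until one remains."""
    level = list(partials)
    while len(level) > 1:
        level = [sum(level[i:i + fanout]) for i in range(0, len(level), fanout)]
    return level[0] if level else 0


# Made-up shards, each standing in for data held by a different leaf node.
shards = [
    ["big data", "small talk"],
    ["data lake", "data mart", "warehouse"],
    ["query engine"],
    ["metadata", "columnar"],
    ["streaming data"],
]

# Push the query down to the leaves, then aggregate the results up the tree.
partials = [leaf_count(shard, "data") for shard in shards]
total = tree_aggregate(partials, fanout=2)

print("partial counts:", partials)
print("total:", total)
```

Because each level only combines a handful of small partial results, no single node ever has to see all of the data.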

(Image source: Google Dremel Paper)

BigQuery vs. MapReduce

The key differences between BigQuery and MapReduce are –
- Dremel is designed as an interactive data analysis tool for large datasets
- MapReduce is designed as a programming framework to batch process large datasets
Moreover, Dremel finishes most queries within seconds or tens of seconds and can even be used by non-programmers, whereas MapReduce takes much longer (sometimes even hours or days) to process a query.

Following is a comparison on running MapReduce on a row and columnar DB:

(Image source: Google Dremel Paper)

Another important thing to note is that BigQuery is meant to analyze structured data (SQL) but in MapReduce, you can write logic for unstructured data as well.

Comparing BigQuery and Redshift

In Redshift, you need to allocate different instance types and create your own clusters. The benefit of this is that it lets you tune the compute/storage to meet your needs. However, you have to be aware of (virtualized) hardware limits and scale up/out based on that. Note that you are charged by the hour for each instance you spin up.

In BigQuery, you just upload the data and query it. It is a truly managed service. You are charged by storage, streaming inserts, and queries.

The two data warehouses have more similarities than differences.

A smart user will definitely take advantage of the hybrid cloud (GCE+AWS) and leverage different services offered by both the ecosystems. Check out your quintessential guide to AWS Athena here.

Getting Started With Google BigQuery

Following is a quick example to show how you can quickly get started with BigQuery:
1. There are many public datasets available on BigQuery. Here, you are going to play with the ‘bigquery-public-data:stackoverflow’ dataset. You can click on the “Add Data” button on the left panel and select datasets.
2. Next, find the language that has the best community, based on the response time. You can write the following query to do that.
```
WITH question_answers_join AS (
  SELECT *
    , GREATEST(1, TIMESTAMP_DIFF(answers.first, creation_date, minute)) minutes_2_answer
  FROM (
    SELECT id, creation_date, title
      , (SELECT AS STRUCT MIN(creation_date) first, COUNT(*) c
         FROM `bigquery-public-data.stackoverflow.posts_answers` 
         WHERE a.id=parent_id
      ) answers
      , SPLIT(tags, '|') tags
    FROM `bigquery-public-data.stackoverflow.posts_questions` a
    WHERE EXTRACT(year FROM creation_date) > 2014
  )
)
SELECT COUNT(*) questions, tag
  , ROUND(EXP(AVG(LOG(minutes_2_answer))), 2) mean_geo_minutes
  , APPROX_QUANTILES(minutes_2_answer, 100)[SAFE_OFFSET(50)] median
FROM question_answers_join, UNNEST(tags) tag
WHERE tag IN ('javascript', 'python', 'rust', 'java', 'scala', 'ruby', 'go', 'react', 'c', 'c++')
AND answers.c > 0
GROUP BY tag
ORDER BY mean_geo_minutes
```
3. Now you can execute the query and get results –

You can see that C has the best community followed by JavaScript!
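
The query ranks tags by `EXP(AVG(LOG(minutes_2_answer)))`, which is the geometric mean of the time to first answer, and also reports a median. The short Python sketch below, using made-up numbers rather than Stack Overflow data, shows why these are preferred over a plain average when a few questions wait a very long time for an answer. The `max(1, m)` step mirrors the `GREATEST(1, ...)` clamp in the query, and `statistics.median` is an exact stand-in for the approximate `APPROX_QUANTILES` call.

```python
import math
import statistics
from typing import List, Sequence


def clamp_minutes(minutes: Sequence[float]) -> List[float]:
    """Mirror GREATEST(1, ...): treat anything under a minute as one minute."""
    return [max(1, m) for m in minutes]


def geometric_mean(minutes: Sequence[float]) -> float:
    """EXP(AVG(LOG(x))) over the clamped values, rounded to 2 decimals."""
    clamped = clamp_minutes(minutes)
    return round(math.exp(sum(math.log(m) for m in clamped) / len(clamped)), 2)


# Hypothetical minutes-to-first-answer for five questions; one is an outlier.
minutes_to_answer = [0, 4, 16, 64, 1024]

clamped = clamp_minutes(minutes_to_answer)
arithmetic_mean = sum(clamped) / len(clamped)
geo_mean = geometric_mean(minutes_to_answer)
median = statistics.median(clamped)

print("arithmetic mean:", arithmetic_mean)
print("geometric mean:", geo_mean)
print("median:", median)
```

The single slow question drags the arithmetic mean far above what a typical asker experiences, while the geometric mean and the median stay close to the typical wait.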

How to do Machine Learning on BigQuery?

Now that you have a sound understanding of BigQuery, it’s time for some real action.

As discussed above, you can connect Google Analytics with BigQuery. Go to the Google Analytics Admin panel, click All Products in the PROPERTY column, then click Link BigQuery. After that, enter your BigQuery ID (or project number), and BigQuery will be linked to Google Analytics. Note: right now, BigQuery integration is only available to Google Analytics 360.

Assuming that you already have uploaded your google analytics data, here is how you can create a logistic regression model. Here, you are predicting whether a website visitor will make a transaction or not.
```
CREATE MODEL `velotio_tutorial.sample_model`
OPTIONS(model_type='logistic_reg') AS
SELECT
  IF(totals.transactions IS NULL, 0, 1) AS label,
  IFNULL(device.operatingSystem, "") AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, "") AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'
```
The query creates a model named ‘velotio_tutorial.sample_model’ and sets ‘model_type’ to ‘logistic_reg’ because you want to train a logistic regression model. A logistic regression model splits input data into two classes and gives the probability that the data is in one of the classes. Logistic regression is commonly used for “spam or not spam” types of problems. Here, the problem is similar: will a transaction be made or not?

The above query gets the total number of page views, the country where the session originated, the operating system of the visitor’s device, the total number of e-commerce transactions within the session, etc.

Now just click Run query to execute it.
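
To give a feel for what a trained logistic regression model does with a session, here is a minimal Python sketch. The weights and bias below are invented for illustration and are not the values BigQuery would learn from the data, and the feature set is reduced to two of the numeric columns from the query. The point is only to show how a weighted sum of features is squashed into a probability and then turned into a 0/1 label.

```python
import math
from typing import Dict


def sigmoid(z: float) -> float:
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features: Dict[str, float],
                        weights: Dict[str, float],
                        bias: float) -> float:
    """Probability of the positive class (here: the visitor makes a transaction)."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return sigmoid(z)


# Hypothetical parameters; a real model learns these during training.
weights = {"pageviews": 0.3, "is_mobile": -0.5}
bias = -4.0

casual_visitor = {"pageviews": 2, "is_mobile": 1}
engaged_visitor = {"pageviews": 25, "is_mobile": 0}

for name, session in [("casual", casual_visitor), ("engaged", engaged_visitor)]:
    probability = predict_probability(session, weights, bias)
    label = 1 if probability >= 0.5 else 0  # threshold the probability into a class
    print(f"{name}: p(transaction)={probability:.3f} -> label {label}")
```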

Conclusion

BigQuery is a query service that allows us to run SQL-like queries against multiple terabytes of data in a matter of seconds. If you have structured data, BigQuery is the best option to go for. It can help even a non-programmer to get the analytics right!

December 12, 2022
How to setup iOS app with Apple developer account and TestFlight from scratch
In this article, we will discuss how to set up the Apple developer account, build an app (create IPA files), configure TestFlight, and deploy it to TestFlight for the very first time.

There are tons of articles explaining how to configure and build an app, how to set up TestFlight, or how to set up an application for ad hoc distribution. However, most of them are either outdated or missing steps and can be misleading for someone who is doing it for the very first time.

If you haven’t done this before, don’t worry, just traverse through the minute details of this article, follow every step correctly, and you will be able to set up your iOS application end-to-end, ready for TestFlight or ad hoc distribution within an hour.

Prerequisites

Before we start, please make sure, you have:
- A React Native project created and opened in Xcode
- Xcode set up on your Mac
- An Apple developer account with access to create Identifiers and Certificates, i.e., you have at least Developer or Admin access – https://developer.apple.com/account/
- Access to App Store Connect with your Apple developer account – https://appstoreconnect.apple.com/
- If you don’t have an Apple developer account yet, please get one created first.
The Setup contains 4 major steps:
- Creating Certificates, Identifiers, and Profiles from your Apple Developer account
- Configuring the iOS app using these Identifiers, Certificates, and Profiles in Xcode
- Setting up TestFlight and Internal Testers group on App Store Connect
- Generating iOS builds, signing them, and uploading them to TestFlight on App Store Connect
Certificates, Identifiers, and Profiles

Before we do anything, we need to create:
- Bundle Identifier, which is an app bundle ID and a unique app identifier used by the App Store
- A Certificate – to sign the iOS app before submitting it to the App Store
- Provisioning Profile – for linking bundle ID and certificates together
Bundle Identifiers

For the App Store to recognize your app uniquely, we need to create a unique Bundle Identifier.

Go to https://developer.apple.com/account: you will see the Certificates, Identifiers & Profiles tab. Click on Identifiers.

Click the Plus icon next to Identifiers:

Select the App IDs option from the list of options and click Continue:

Select App from app types and click Continue

On the next page, enter the app ID and, if needed, select the services your application will use. This is optional; you can enable them in the future when you actually implement them.

Keep those unselected for now as we don’t need them for this setup.

Once you have filled in all the information, click Continue and register your Bundle Identifier.
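Before typing the identifier into the form, it can help to sanity-check its shape. The following is a minimal sketch, assuming the usual reverse-DNS convention and a character set of letters, digits, hyphens, and periods; the function name and the sample IDs are hypothetical, and the check is only a local pre-check, not Apple's own validation.

```python
import re

# Assumption: at least two dot-separated segments, each made only of
# letters, digits, and hyphens (reverse-DNS style, e.g. com.company.app).
BUNDLE_ID_PATTERN = re.compile(r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def looks_like_bundle_id(candidate: str) -> bool:
    """Return True if the whole string matches the assumed bundle ID shape."""
    return BUNDLE_ID_PATTERN.fullmatch(candidate) is not None


# Hypothetical identifiers, used only for illustration.
print(looks_like_bundle_id("com.example.myapp"))   # well-formed
print(looks_like_bundle_id("com.example.my app"))  # contains a space
```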

Generating Certificate

Certificates can be generated in two ways:
- By automatically managing certificates from Xcode
- By manually generating them
We will generate them manually.

To create a certificate, we need a Certificate Signing Request (CSR), which is generated with the Keychain Access application on your Mac.

Creating Certificate Signing Request:

Open the Keychain Access application and click on the Keychain Access menu item at the top left of the screen.

Select Certificate Assistant -> Request a Certificate From a Certificate Authority

Enter the required information like email address and name, then select the Save to Disk option.

Click Continue and save this form to a place so you can easily upload it to your Apple developer account

Now head back to the Apple developer account, click on Certificates. Again click on the + icon next to Certificates title and you will be taken to the new certificate form.

Select the iOS Distribution (App Store and ad hoc) option. Here, you can select the required services this certificate will need from a list of options (for example, Apple Push Notification service).

As we don’t need any services, ignore it for now and click continue.

On the next screen, upload the certificate signing request form we generated in the last step and click Continue.

At this step, your certificate will be generated and will be available to download.

NOTE: The certificate can be downloaded only once, so please download it and keep it in a secure location to use it in the future.

Download your certificate and install it by clicking on the downloaded certificate file. The certificate will be installed on your mac and can be used for generating builds in the next steps.

You can verify this by going back to the KeyChain Access app and seeing the newly installed certificate in the certificates list.

Generating a Provisioning Profile

Now link your identifier and certificate together by creating a provisioning profile.

Let’s go back to the Apple developer account, select the profiles option, and select the + icon next to the Profiles title.

You will be redirected to the new Profiles form page.

Select Distribution Profile and click continue:

Select the App ID we created in the first step and click Continue:

Now, select the certificate we created in the previous step:

Enter a Provisioning Profile name and click Generate:

Once the profile is generated, it will be available to download. Please download it and keep it in the same location where you kept the certificate, for future use.

Configure App in XCode

Now, we need to configure our iOS application using the bundle ID and the Apple developer account we used for generating the certificate and profiles.

Open the <appname>.xcworkspace file in Xcode and click on the app name in the left pane. It will open the app configuration page.

Select the app from Targets, go to Signing & Capabilities, and enter the bundle identifier.

Now, to automatically manage the provisioning profile, we need to download the provisioning profile we generated recently.

For this, we need to sign in to Xcode using your Apple ID.

Select Preferences from the Xcode menu at the top left, go to Accounts, and click on the + icon at the bottom.

Select Apple ID from the list of account types you can add, click Continue, and enter the Apple ID.

It will prompt you to enter the password as well.

Once successfully logged in, XCode will fetch all the provisioning profiles associated with this account. Verify that you see your project in the Teams section of this account page.

Now, go back to the Signing & Capabilities page in Xcode, select Automatically Manage Signing, and then select the required team from the Team dropdown.

At this point, your application will be able to generate archives that you can either upload to TestFlight or sign ad hoc and distribute through other channels (Diawi, etc.).

Setup TestFlight

TestFlight and App Store management are both handled through the App Store Connect portal.

Open the App Store Connect portal and log in to the application.

After you log in, please make sure you have selected the correct team from the top right corner (you can check the team name just below the user name).

Select My Apps from the list of options.

If this is the first time you are setting up an application on this team, you will see the + (Add app) option at the center of the page, but if your team has already set up applications, you will see the + icon right next to Apps Header.

Click on the + icon and select the New App option:

Enter the complete app details, like platform (iOS, macOS, or tvOS), app name, bundle ID (the one we created), SKU, and access type, and click the Create button.

You should now be able to see your newly created application on the Apps menu. Select the app and go to TestFlight. You will see no builds there as we did not push any yet.

Generate and upload the build to TestFlight

At this point, we are fully ready to generate a build from XCode and push it to TestFlight. To do this, head back to XCode.

In the top middle section, you will see your app name and a right arrow. There might be an iPhone or other simulator selected. Please click on the options list and select Any iOS Device.

Select the Product menu from the Menu list and click on the Archive option.

Once the archive succeeds, Xcode will open the Organizer window (you can also open it from the Window menu).

Here, we sign our application archive (build) using the certificate we created and upload it to the App Store Connect TestFlight.

In the Organizer window, you will see the recently generated build. Please select the build and click on the Distribute button in the right panel of the Organizer window.

On the next page, select App Store Connect from the “Select a method of distribution” window and click Continue.

NOTE: We are selecting the App Store Connect option as we want to upload a build to TestFlight, but if you want to distribute it privately using other channels, please select the Ad Hoc option.

Select Upload from the “Select a Destination” options and click continue. This will prepare your build to submit it to App Store Connect TestFlight.

The first time, it will ask you whether you want to sign the build automatically or manually.

Please select Automatically and click the Next button.

XCode may ask you to authenticate your certificate using your system password. Please authenticate it and wait until XCode uploads the build to TestFlight.

Once the build is uploaded successfully, XCode will prompt you with the Success modal.

Now, your app is uploaded to TestFlight and is being processed. This processing takes 5 to 15 minutes, at which point TestFlight makes it available for testing.
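The same archive-and-upload flow can also be driven from the command line with xcodebuild, which is useful once the manual steps work. The sketch below only writes an ExportOptions.plist and assembles the two xcodebuild command lines; it does not run them. The workspace, scheme, team ID, and paths are hypothetical placeholders, and the option values mirror the choices made in the Organizer above (App Store Connect, Upload, automatic signing).

```python
import plistlib
import tempfile
from pathlib import Path


def write_export_options(path: Path, team_id: str) -> dict:
    """Write an ExportOptions.plist that mirrors the Organizer choices."""
    options = {
        "method": "app-store",        # distribution via App Store Connect
        "destination": "upload",      # upload instead of exporting an IPA locally
        "signingStyle": "automatic",  # same as choosing "Automatically" in Xcode
        "teamID": team_id,            # hypothetical team identifier
    }
    with path.open("wb") as handle:
        plistlib.dump(options, handle)
    return options


def build_commands(workspace: str, scheme: str, archive_path: str, export_options: str):
    """Return the archive and export command lines as argument lists."""
    archive_cmd = [
        "xcodebuild", "-workspace", workspace, "-scheme", scheme,
        "-configuration", "Release",
        "-destination", "generic/platform=iOS",  # equivalent of "Any iOS Device"
        "-archivePath", archive_path, "archive",
    ]
    export_cmd = [
        "xcodebuild", "-exportArchive",
        "-archivePath", archive_path,
        "-exportOptionsPlist", export_options,
        "-exportPath", "build/export",
    ]
    return archive_cmd, export_cmd


# Demo with placeholder names; nothing is executed against Xcode here.
with tempfile.TemporaryDirectory() as tmp:
    plist_path = Path(tmp) / "ExportOptions.plist"
    write_export_options(plist_path, team_id="ABCDE12345")
    archive_cmd, export_cmd = build_commands(
        "MyApp.xcworkspace", "MyApp", "build/MyApp.xcarchive", str(plist_path)
    )
    print(" ".join(archive_cmd))
    print(" ".join(export_cmd))
```

Passing each list to subprocess.run on a Mac with Xcode installed would perform the actual archive and upload; the exact option values accepted can differ between Xcode versions, so treat them as a starting point.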

Add Internal Testers and other teammates to TestFlight

Once we are done with all the setup and have uploaded the build to TestFlight, we need to add internal testers to TestFlight.

This is a 2-step process. First, you need to add a user to App Store Connect and then add a user to TestFlight.

Go to Users and Access

Add a new user; App Store Connect sends an invitation to that user.

Once the user accepts the invitation, go to TestFlight -> Internal Testing

In the Internal Testing section, create a new testing group if one does not exist already, and add the user to that TestFlight testing group.

Now, you should be able to configure the app, upload it to TestFlight, and add users to the TestFlight testing group.

Hopefully, you enjoyed this article, and it helped you set up your iOS application end-to-end quickly and without too much confusion.

Thanks.
December 12, 2022
A Beginner’s Guide to Edge Computing
In the world of data centers with wings and wheels, there is an opportunity to offload some work from centralized cloud computing by moving less compute-intensive tasks to other components of the architecture. In this blog, we will explore the upcoming frontier of the web – Edge Computing.

What is the “Edge”?

The ‘Edge’ refers to having computing infrastructure closer to the source of data. It is the distributed framework where data is processed as close to the originating data source as possible. This infrastructure requires effective use of resources that may not be continuously connected to a network, such as laptops, smartphones, tablets, and sensors. Edge Computing covers a wide range of technologies including wireless sensor networks, cooperative distributed peer-to-peer ad-hoc networking and processing, also classifiable as local cloud/fog computing, mobile edge computing, distributed data storage and retrieval, autonomic self-healing networks, remote cloud services, augmented reality, and more.

Cloud Computing is expected to go through a phase of decentralization. Edge Computing is coming up with an ideology of bringing compute, storage and networking closer to the consumer.

But Why?

Legit question! Why do we even need Edge Computing? What are the advantages of having this new infrastructure?

Imagine a self-driving car that continuously sends a live stream to the central servers. Now, the car has to make a crucial decision. The consequences can be disastrous if the car waits for the central servers to process the data and respond. Although algorithms like YOLOv2 have sped up object detection, the latency lies in the part of the system where the car has to send terabytes of data to the central server, receive the response, and then act. Hence, basic processing, such as deciding when to stop or decelerate, needs to be done in the car itself.

The goal of Edge Computing is to minimize the latency by bringing the public cloud capabilities to the edge. This can be achieved in two forms – custom software stack emulating the cloud services running on existing hardware, and the public cloud seamlessly extended to multiple point-of-presence (PoP) locations.

Following are some promising reasons to use Edge Computing:
1. Privacy: Avoid sending all raw data to be stored and processed on cloud servers.
2. Real-time responsiveness: Sometimes the reaction time can be a critical factor.
3. Reliability: The system is capable of working even when disconnected from cloud servers, which removes a single point of failure.
To understand the points mentioned above, let’s take the example of a device that responds to a hot keyword, for example Jarvis from Iron Man. Imagine if your personal Jarvis sent all of your private conversations to a remote server for analysis. Instead, it is intelligent enough to respond locally when it is called. At the same time, it is real-time and reliable.

Intel CEO Brian Krzanich said at an event that autonomous cars will generate 40 terabytes of data for every eight hours of driving. With that flood of data, transmission time will go up substantially. For self-driving cars, real-time or quick decisions are essential, and this is where edge computing infrastructure comes to the rescue. These cars need to decide in a split second whether to stop or not; otherwise the consequences can be disastrous.

Another example is drones or quadcopters. If we are using them to identify people or deliver relief packages, the machines should be intelligent enough to make basic decisions locally, such as changing their path to avoid obstacles.
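To get a feel for the scale of that figure, here is a small back-of-the-envelope calculation. It assumes decimal terabytes (10^12 bytes) and a constant data rate over the eight hours, both of which are simplifications.

```python
TERABYTE = 10**12  # assumption: decimal terabyte, in bytes

data_bytes = 40 * TERABYTE   # figure quoted above
drive_seconds = 8 * 60 * 60  # eight hours of driving

bytes_per_second = data_bytes / drive_seconds
gigabits_per_second = bytes_per_second * 8 / 10**9

print(f"{bytes_per_second / 10**9:.2f} GB/s")  # 1.39 GB/s
print(f"{gigabits_per_second:.1f} Gbit/s")     # 11.1 Gbit/s
```

Sustaining an uplink of roughly 11 Gbit/s from a moving car is far beyond typical mobile connections, which is why the raw stream cannot simply be shipped to a central server.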

Forms of Edge Computing

Device Edge:

In this model, Edge Computing is taken to the customers in their existing environments. Examples include AWS Greengrass and Microsoft Azure IoT Edge.

Cloud Edge:

This model of Edge Computing is basically an extension of the public cloud. Content Delivery Networks are classic examples of this topology, in which static content is cached and delivered through geographically spread edge locations.

Vapor IO is an emerging player in this category and is attempting to build infrastructure for the cloud edge. Vapor IO has various products, such as the Vapor Chamber. These are self-monitored: they have embedded sensors through which they are continuously monitored and evaluated by Vapor software, the VEC (Vapor Edge Controller). Vapor IO has also built OpenDCRE, which we will see later in this blog.

The fundamental difference between device edge and cloud edge lies in the deployment and pricing models. The deployment of these models – device edge and cloud edge – is specific to different use cases. Sometimes, it may be an advantage to deploy both models.

Edges around you

Edge Computing examples can be increasingly found around us:
1. Smart street lights
2. Automated Industrial Machines
3. Mobile devices
4. Smart Homes
5. Automated Vehicles (cars, drones etc)
Data transmission is expensive. Bringing compute closer to the origin of data reduces latency and gives end users a better experience. Some of the evolving use cases of Edge Computing are Augmented Reality (AR), Virtual Reality (VR), and the Internet of Things. For example, the rush people got while playing an Augmented Reality based Pokémon game would not have been possible if “real-timeliness” was not present in the game. It was made possible because the smartphone itself was doing the AR, not the central servers. Even Machine Learning (ML) can benefit greatly from Edge Computing. All the heavy-duty training of ML algorithms can be done in the cloud, and the trained model can be deployed on the edge for near real-time or even real-time predictions. In today’s data-driven world, edge computing is becoming a necessary component.

There is a lot of confusion between Edge Computing and IoT. Stated simply, Edge Computing is, in a way, an intelligent Internet of Things (IoT). Edge Computing actually complements traditional IoT. In the traditional IoT model, all the devices, such as sensors, mobiles, and laptops, are connected to a central server. Now imagine that you give your lamp the command to switch off. For such a simple task, data needs to be transmitted to the cloud and analyzed there, and only then does the lamp receive the command to switch off. Edge Computing brings computing closer to your home: either the fog layer between the lamp and the cloud servers is smart enough to process the data, or the lamp itself is.

The image below shows a standard IoT implementation where everything is centralized, whereas the Edge Computing philosophy is about decentralizing the architecture.
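The lamp scenario can be sketched in a few lines. This is an illustrative toy, not a real device API: the Lamp class, the command names, and the set of commands considered simple enough for the edge are all hypothetical, and the cloud path is only a placeholder string rather than a network call.

```python
from dataclasses import dataclass

# Hypothetical set of commands simple enough to resolve at the edge.
LOCAL_COMMANDS = {"switch_on", "switch_off"}


@dataclass
class Lamp:
    is_on: bool = False


def handle_command(lamp: Lamp, command: str) -> str:
    """Apply simple commands locally; hand everything else to the cloud."""
    if command in LOCAL_COMMANDS:
        lamp.is_on = command == "switch_on"
        return "handled at edge"
    # Placeholder: a real system would send the request to a cloud service.
    return "forwarded to cloud"


lamp = Lamp(is_on=True)
print(handle_command(lamp, "switch_off"), lamp.is_on)
print(handle_command(lamp, "analyze_usage_history"), lamp.is_on)
```

The point of the design is that the round trip to the cloud is paid only for requests that genuinely need it, while on/off decisions never leave the home.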

The Fog

Sandwiched between the edge layer and the cloud layer is the Fog Layer. It bridges the connection between the other two layers.

The difference between fog and edge computing is described in this article –
- Fog Computing – Fog computing pushes intelligence down to the local area network level of network architecture, processing data in a fog node or IoT gateway.
- Edge Computing – Edge computing pushes the intelligence, processing power and communication capabilities of an edge gateway or appliance directly into devices like programmable automation controllers (PACs).
How do we manage Edge Computing?

Device Relationship Management (DRM) refers to managing and monitoring interconnected components over the internet. Several offerings target this space: AWS has AWS IoT Core and AWS Greengrass, Nebbiolo Technologies has developed Fog Node and Fog OS, and Vapor IO has OpenDCRE, with which one can control and monitor data centers.

The following image (source – AWS) shows how to manage ML on Edge Computing using AWS infrastructure.

AWS Greengrass makes it possible for users to use Lambda functions to build IoT devices and application logic. Specifically, AWS Greengrass provides cloud-based management of applications that can be deployed for local execution. Locally deployed Lambda functions are triggered by local events, messages from the cloud, or other sources.
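As a rough illustration of what such a locally deployed function might look like, here is a Lambda-style handler for a traffic light. It is a simplified sketch and not the code from any particular repository: the event shape, the state names, and the transition table are assumptions, and the Greengrass SDK call that would publish the result to other devices is deliberately left out.

```python
# Assumed order in which a hypothetical traffic light cycles through colours.
NEXT_STATE = {"green": "yellow", "yellow": "red", "red": "green"}


def function_handler(event, context):
    """Lambda-style entry point: compute the next light from a local event.

    `event` is assumed to be a dict such as {"current": "green"}; unknown or
    missing states fall back to red as the safe default.
    """
    current = event.get("current", "red")
    return {"light": NEXT_STATE.get(current, "red")}


print(function_handler({"current": "green"}, None))  # {'light': 'yellow'}
```

Because the decision is computed on the device, the light keeps cycling even if the connection to the cloud is lost.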

This GitHub repo demonstrates a traffic light example using two Greengrass devices, a light controller, and a traffic light.

Conclusion

We believe that next-gen computing will be influenced a lot by Edge Computing and will continue to explore new use-cases that will be made possible by the Edge.

December 12, 2022
Setting Up A Single Sign On (SSO) Environment For Your App
Single Sign On (SSO) makes it simple for users to begin using an application. Support for SSO is crucial for enterprise apps, as many corporate security policies mandate that all applications use certified SSO mechanisms. While the SSO experience is straightforward, the SSO standard is anything but straightforward. It’s easy to get confused when you’re surrounded by complex jargon, including SAML, OAuth 1.0, 1.0a, 2.0, OpenID, OpenID Connect, JWT, and tokens like refresh tokens, access tokens, bearer tokens, and authorization tokens. Standards documentation is too precise to allow generalization, and vendor literature can make you believe it’s too difficult to do it yourself.

I’ve created SSO for a lot of applications in the past. Knowing your target market, norms, and platform are all crucial.

Single Sign On

Single Sign On is an authentication method that allows apps to securely authenticate users into numerous applications by using just one set of login credentials.

This allows applications to avoid the hassle of storing and managing user information like passwords and also cuts down on troubleshooting login-related issues. With SSO configured, applications check with the SSO provider (Okta, Google, Salesforce, Microsoft) if the user’s identity can be verified.

Types of SSO
- Security Assertion Markup Language (SAML)
- OpenID Connect (OIDC)
- OAuth (specifically OAuth 2.0 nowadays)
- Federated Identity Management (FIM)
Security Assertion Markup Language – SAML

SAML (Security Assertion Markup Language) is an open standard that enables identity providers (IdP) to send authorization credentials to service providers (SP). This means you can use one set of credentials to log in to many different websites. It’s considerably easier to manage a single login per user than to handle several logins to email, CRM software, Active Directory, and other systems.

For standardized interactions between the identity provider and service providers, SAML transactions employ Extensible Markup Language (XML). SAML is the link between a user’s identity authentication and authorization to use a service.

In our example implementation, we will be using SAML 2.0 as the standard for the authentication flow.

Technical details
- A Service Provider (SP) is the entity that provides the service, which is in the form of an application. Examples: Google – GDrive, Meet, Gmail.
- An Identity Provider (IdP) is the entity that provides identities, including the ability to authenticate a user. The user profile is normally stored in the Identity Provider and also includes additional information about the user such as first name, last name, job code, phone number, address, and so on. Depending on the application, some service providers might require a very simple profile (username, email), while others may need a richer set of user data (department, job code, address, location, and so on). Examples: Active Directory, Okta Inbuilt IdP, Salesforce IdP, Google Suite.
- The SAML sign-in flow initiated by the Identity Provider is referred to as an Identity Provider Initiated (IdP-initiated) sign-in. In this flow, the Identity Provider begins a SAML response that is routed to the Service Provider to assert the user’s identity, rather than the SAML flow being triggered by redirection from the Service Provider. When a Service Provider initiates the SAML sign-in process, it is referred to as an SP-initiated sign-in. This is often triggered when end-users try to access a protected resource, such as when the browser tries to load a page from a protected network share.
Configuration details
- Certificate – To validate the signature, the SP must receive the IdP’s public certificate. On the SP side, the certificate is kept and used anytime a SAML response is received.
- Assertion Consumer Service (ACS) Endpoint – Sometimes referred to simply as the SP sign-in URL. This is the endpoint supplied by the SP for posting SAML responses. This information must be sent by the SP to the IdP.
- IdP Sign-in URL – This is the endpoint where SAML requests are posted on the IdP side. This information must be obtained by the SP from the IdP.
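To show how the ACS endpoint and the IdP sign-in URL fit together in an SP-initiated flow, here is a minimal sketch that builds an unsigned SAML 2.0 AuthnRequest and encodes it for the HTTP-Redirect binding (raw DEFLATE, then Base64, then URL-encoding). Note that this uses the redirect binding rather than an HTTP POST, and that all URLs, the issuer, and the request ID are hypothetical placeholders. A production setup would normally rely on a maintained SAML library and sign its requests.

```python
import base64
import zlib
from urllib.parse import parse_qs, urlencode, urlparse


def build_redirect_url(idp_sign_in_url, acs_url, issuer, request_id, issue_instant, relay_state):
    """Return the IdP URL carrying a deflated, Base64-encoded AuthnRequest."""
    authn_request = (
        '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
        'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" '
        f'ID="{request_id}" Version="2.0" IssueInstant="{issue_instant}" '
        f'AssertionConsumerServiceURL="{acs_url}">'
        f"<saml:Issuer>{issuer}</saml:Issuer>"
        "</samlp:AuthnRequest>"
    )
    # wbits=-15 produces a raw DEFLATE stream without zlib header or checksum.
    compressor = zlib.compressobj(wbits=-15)
    deflated = compressor.compress(authn_request.encode("utf-8")) + compressor.flush()
    encoded = base64.b64encode(deflated).decode("ascii")
    query = urlencode({"SAMLRequest": encoded, "RelayState": relay_state})
    return f"{idp_sign_in_url}?{query}"


def decode_saml_request(url):
    """Reverse the encoding, e.g. to inspect a request while debugging."""
    value = parse_qs(urlparse(url).query)["SAMLRequest"][0]
    return zlib.decompress(base64.b64decode(value), -15).decode("utf-8")


# Hypothetical endpoints and identifiers, for illustration only.
url = build_redirect_url(
    idp_sign_in_url="https://idp.example.com/sso/saml",
    acs_url="https://app.example.com/saml/acs",
    issuer="https://app.example.com/saml/metadata",
    request_id="_request-123",
    issue_instant="2022-12-12T10:00:00Z",
    relay_state="/dashboard",
)
print(url)
print(decode_saml_request(url))
```

The IdP answers by posting a signed SAML response to the ACS endpoint, where the SP validates it against the IdP certificate it stored during configuration.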
OpenID Connect – OIDC

OIDC protocol is based on the OAuth 2.0 framework. OIDC authenticates the identity of a specific user, while OAuth 2.0 allows two applications to trust each other and exchange data.

So, while the main flow appears to be the same, the labels are different.

How are SAML and OIDC similar?

The basic login flow for both is the same.

1. A user tries to log into the application directly.

2. The program sends the user’s login request to the IdP via the browser.

3. The user logs in to the IdP or confirms that they are already logged in.

4. The IdP verifies that the user has permission to use the program that initiated the request.

5. Information about the user is sent from the IdP to the user’s browser.

6. Their data is subsequently forwarded to the application.

7. The application verifies that they have permission to use the resources.

8. The user has been granted access to the program.
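In OIDC terms, step 2 of this flow is a browser redirect to the IdP's authorization endpoint. The sketch below builds such a URL for the authorization code flow; the endpoint, client ID, and redirect URI are hypothetical placeholders, and the `state` and `nonce` values would need to be stored in the user's session so they can be checked when the IdP redirects back.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorization_endpoint, client_id, redirect_uri):
    """Return (url, state, nonce) for an OIDC authorization code request."""
    state = secrets.token_urlsafe(16)  # guards the callback against CSRF
    nonce = secrets.token_urlsafe(16)  # ties the returned ID token to this request
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile email",  # "openid" marks this as an OIDC request
        "state": state,
        "nonce": nonce,
    }
    return f"{authorization_endpoint}?{urlencode(params)}", state, nonce


# Hypothetical values, for illustration only.
auth_url, state, nonce = build_authorization_url(
    "https://idp.example.com/oauth2/authorize",
    client_id="my-web-app",
    redirect_uri="https://app.example.com/callback",
)
print(auth_url)
```

After the user logs in, the IdP redirects to the redirect URI with a short-lived code, which the application exchanges for tokens at the IdP's token endpoint.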

Difference between SAML and OIDC

1. SAML transmits user data in XML, while OpenID Connect transmits data in JSON.

2. SAML calls the data it sends an assertion. OIDC calls the data it sends claims.

3. In SAML, the application or system the user is trying to get into is referred to as the Service Provider. In OIDC, it’s called the Relying Party.
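The first two differences are easy to see side by side. The sketch below reads the user identifier from a heavily trimmed SAML assertion (XML) and from a set of OIDC-style claims (JSON). Both payloads are made-up minimal examples; real assertions and ID tokens carry many more fields and must be signature-checked before use.

```python
import json
import xml.etree.ElementTree as ET

SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}

# Trimmed-down, unsigned sample payloads for illustration only.
saml_assertion = """
<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml:Subject>
    <saml:NameID>jane@example.com</saml:NameID>
  </saml:Subject>
</saml:Assertion>
""".strip()

oidc_claims = '{"sub": "jane@example.com", "iss": "https://idp.example.com"}'

# SAML: navigate the XML tree with a namespace-aware path.
name_id = ET.fromstring(saml_assertion).find("saml:Subject/saml:NameID", SAML_NS).text

# OIDC: parse JSON and read the "sub" (subject) claim.
subject = json.loads(oidc_claims)["sub"]

print(name_id, subject)
```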

SAML vs. OIDC

1. OpenID Connect is becoming increasingly popular. Because it interacts with RESTful API endpoints, it is easier to build than SAML and is easily available through APIs. This also implies that it is considerably more compatible with mobile apps.

2. You won’t often have a choice between SAML and OIDC when configuring Single Sign On (SSO) for an application through an identity provider like OneLogin. If you do have a choice, it is important to understand not only the differences between the two, but also which one is more likely to be sustained over time. OIDC appears to be the clear winner at this time because developers find it much easier to work with as it is more versatile.

Use Cases

1. SAML with OIDC:

– Log in with Salesforce: SAML Authentication where Salesforce was used as IdP and the web application as an SP.

Key Reason:

All users are centrally managed in Salesforce, so SAML was the preferred choice for authentication.

– Log in with Okta: OIDC Authentication where Okta was used as IdP and the web application as an SP.

Key Reason:

Okta Active Directory (AD) is already used for user provisioning and de-provisioning of all internal users and employees. Okta AD enables them to integrate Okta with any on-premise AD.

In both implementations, user provisioning and de-provisioning take place on the IdP side.

SP-initiated (From web application)

IdP-initiated (From Okta Active Directory)

2. Only OIDC login flow:
- OIDC Authentication where Google, Salesforce, Office365, and Okta are used as IdPs and the web application as the SP.

Why not use OAuth for SSO

1. OAuth 2.0 is not a protocol for authentication. It explicitly states this in its documentation.

2. With authentication, you’re basically attempting to figure out who the user is, when they authenticated, and how they authenticated. These questions are usually answered with SAML assertions rather than access tokens and permission grants.

OIDC vs. OAuth 2.0
- OAuth 2.0 is a framework that allows a user of a service to grant third-party application access to the service’s data without revealing the user’s credentials (ID and password).
- OpenID Connect is a framework on top of OAuth 2.0 where a third-party application can obtain a user’s identity information which is managed by a service. OpenID Connect can be used for SSO.
- In the OAuth flow, the Authorization Server gives back an Access Token only. In the OpenID Connect flow, the Authorization Server returns an Access Token and an ID Token. The ID Token is a JSON Web Token, or JWT, which is a specially formatted string of characters. The Client can extract information from the JWT, such as your ID, name, when you logged in, and the expiration of the ID Token, and can verify its signature to detect whether the JWT has been tampered with.
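To make the structure of an ID Token concrete, the following sketch splits a JWT into its three dot-separated parts and decodes the payload. It deliberately does not verify the signature, so it is only suitable for inspection and debugging; the demo token and its claims are fabricated for illustration, and a real client should validate tokens with a maintained JWT library and the IdP's published keys.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode base64url text, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(data: bytes) -> str:
    """Encode bytes as unpadded base64url text, as used inside JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def read_id_token_claims(id_token: str) -> dict:
    """Return the payload claims. The signature is NOT verified here."""
    _header, payload, _signature = id_token.split(".")
    return json.loads(b64url_decode(payload))


# Fabricated token: header, payload, and a placeholder instead of a signature.
demo_claims = {
    "sub": "user-123",
    "name": "Jane Doe",
    "iat": 1670839200,  # issued-at time (seconds since epoch)
    "exp": 1670842800,  # expiry time, one hour later
}
demo_token = ".".join([
    b64url_encode(json.dumps({"alg": "none", "typ": "JWT"}).encode("utf-8")),
    b64url_encode(json.dumps(demo_claims).encode("utf-8")),
    "signature-placeholder",
])

claims = read_id_token_claims(demo_token)
print(claims["sub"], claims["name"], claims["exp"] - claims["iat"])
```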
Federated Identity Management (FIM)

Identity Federation, also known as federated identity management, is a system that allows users from different companies to utilize the same verification method for access to apps and other resources.

In short, it’s what allows you to sign in to Spotify with your Facebook account.
- Single Sign On (SSO) is a subset of identity federation.
- SSO generally enables users to use a single set of credentials to access multiple systems within a single organization, while FIM enables users to access systems across different organizations.

How does FIM work?
- To log in to their home network, users use the security domain to authenticate.
- Users attempt to connect to a distant application that employs identity federation after authenticating to their home domain.
- Instead of the remote application authenticating the user itself, the user is prompted to authenticate from their home authentication server.
- The user’s home authentication server vouches for the user to the remote application, and the user is permitted to access the application.
A user logs in once, to their home domain; remote apps in other domains can then grant access to the user without an additional login process.

Applications:
- Auth0: Auth0 uses OpenID Connect and OAuth 2.0 to authenticate users and get their permission to access protected resources. Auth0 allows developers to design and deploy applications and APIs that handle authentication and authorization concerns, such as the OIDC/OAuth 2.0 protocols, with ease.
- AWS Cognito
  - User pools – In Amazon Cognito, a user pool is a user directory. Using a user pool, your users can sign in to your online or mobile app through Amazon Cognito or federate through a third-party identity provider (IdP). All members of the user pool have a directory profile that you may access using an SDK, whether they sign in directly or through a third party.
  - Identity pools – An identity pool allows your users to get temporary AWS credentials for services like Amazon S3 and DynamoDB.
Conclusion:

I hope you found the summary of my SSO research beneficial. The optimum implementation approach is determined by your unique situation, technological architecture, and business requirements.
December 12, 2022