The world continues to go through digital transformation at an accelerating pace. Modern applications and infrastructure continue to expand, and operational complexity keeps growing. According to a recent ManageEngine Application Performance Monitoring survey:
28 percent use ad-hoc scripts to detect issues in over 50 percent of their applications.
32 percent learn about application performance issues from end users.
59 percent trust monitoring tools to identify most performance deviations.
Most enterprises and web-scale companies already have instrumentation and monitoring capabilities in place, often backed by an Elasticsearch cluster. They collect large amounts of data but struggle to use it effectively. That data can be put to work to improve performance and uptime, support root cause analysis, and predict incidents.
IT Operations & Machine Learning
Here is the main question: how do you make sense of the huge piles of collected data? The first step is to understand the correlations between the time series data. But understanding alone is not enough, since correlation does not imply causation. We need a practical and scalable approach to understand the cause-effect relationships between data sources and events across a complex infrastructure of VMs, containers, networks, microservices, regions, etc.
It's very likely that a fault in one component causes something to go wrong in another. In such cases, historical operational data can be used to identify the root cause by working through a series of intermediate causes and effects. Machine learning is particularly useful for such problems, where we need to identify "what changed", since machine learning algorithms can analyze existing data to understand its patterns, making it easier to recognize the cause. This is known as unsupervised learning, where the algorithm learns from experience and identifies similar patterns when they come along again.
Let's see how you can set up Elastic + X-Pack to enable anomaly detection for your infrastructure and applications.
Anomaly Detection using Elastic’s machine learning with X-Pack
Step I: Setup
1. Setup Elasticsearch:
The Elastic documentation recommends Oracle JDK version 1.8.0_131. Check which Java version is installed on your system; it should be at least Java 8. Install or upgrade if required.
Download the Elasticsearch tarball and untar it:
$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.tar.gz
$ tar -xzvf elasticsearch-5.5.1.tar.gz
It will then create a folder named elasticsearch-5.5.1. Go into the folder.
$ cd elasticsearch-5.5.1
Install X-Pack into Elasticsearch
$ ./bin/elasticsearch-plugin install x-pack
Start elasticsearch
$ bin/elasticsearch
2. Setup Kibana
Kibana is an open source analytics and visualization platform designed to work with Elasticsearch.
Download the Kibana tarball and untar it:
$ wget https://artifacts.elastic.co/downloads/kibana/kibana-5.5.1-linux-x86_64.tar.gz
$ tar -xzf kibana-5.5.1-linux-x86_64.tar.gz
It will then create a folder named kibana-5.5.1-linux-x86_64. Go into the directory.
$ cd kibana-5.5.1-linux-x86_64
Install X-Pack into Kibana
$ ./bin/kibana-plugin install x-pack
Running kibana
$ ./bin/kibana
Navigate to Kibana at http://localhost:5601/
Log in as the built-in user elastic with the password changeme.
You will see the below screen:
Kibana: X-Pack Welcome Page
3. Metricbeat:
Metricbeat helps in monitoring servers and the services they host by collecting metrics from the operating system and services. We will use it to get CPU utilization metrics of our local system in this blog.
Download the Metricbeat tarball and untar it:
$ wget https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-5.5.1-linux-x86_64.tar.gz
$ tar -xvzf metricbeat-5.5.1-linux-x86_64.tar.gz
It will create a folder metricbeat-5.5.1-linux-x86_64. Go to the folder
$ cd metricbeat-5.5.1-linux-x86_64
By default, Metricbeat is configured to send collected data to an Elasticsearch instance running on localhost. If your Elasticsearch is hosted on a remote server, change the host and authentication credentials in the metricbeat.yml file.
Metricbeat Config
Metricbeat provides the following stats:
System load
CPU stats
IO stats
Per filesystem stats
Per CPU core stats
File system summary stats
Memory stats
Network stats
Per process stats
Start Metricbeat as daemon process
$ sudo ./metricbeat -e -c metricbeat.yml &
Now all the setup is done. Next, let's prepare the time series data and then create the machine learning jobs.
Step II: Time Series data
Real-time data: Metricbeat is providing us the real-time series data that will be used for unsupervised learning. Follow the steps below to define the index pattern metricbeat-* in Kibana so you can search against this pattern in Elasticsearch:
– Go to Management -> Index Patterns
– Provide the index name or pattern as metricbeat-*
– Select @timestamp as the Time filter field name
– Click Create
You will not be able to create the index pattern if Elasticsearch does not contain any Metricbeat data. Make sure Metricbeat is running and its output is configured to Elasticsearch.
Saved historical data: To quickly see how machine learning detects anomalies, you can also use sample data provided by Elastic. Download the sample data by clicking here.
Unzip the files in a folder: tar -zxvf server_metrics.tar.gz
Download this script. It will be used to upload sample data to elastic.
Provide execute permissions to the file: chmod +x upload_server-metrics.sh
Run the script.
Just as we created an index pattern for the Metricbeat data, create an index pattern server-metrics* in the same way.
Step III: Creating Machine Learning jobs
There are two scenarios in which data is considered anomalous. First, when the behavior of a key indicator changes over time relative to its previous behavior. Second, when, within a population, the behavior of an entity deviates from that of the other entities over a single key indicator.
To detect these anomalies, there are three types of jobs we can create:
Single metric job: This job detects Scenario 1 anomalies over only one key performance indicator.
Multimetric job: A multimetric job also detects Scenario 1 anomalies, but it can track more than one performance indicator, such as CPU utilization along with memory utilization.
Advanced job: This kind of job is created to detect Scenario 2 anomalies.
For simplicity, we are creating the following single metric jobs:
Tracking CPU utilization: using the Metricbeat data
Tracking total requests made on a server: using the sample server data
Follow below steps to create single metric jobs:
Job1: Tracking CPU Utilization
Job2: Tracking total requests made on server
Go to http://localhost:5601/
Go to Machine learning tab on the left panel of Kibana.
Click on Create new job
Click Create single metric job
Select the index pattern we created in Step II, i.e., metricbeat-* and server-metrics* respectively
Configure jobs by providing following values:
Aggregation: Select the aggregation function that will be applied to the field of data we are analyzing.
Field: A drop-down listing all the fields available for the selected index pattern.
Bucket span: The interval for analysis. The aggregation function is applied to the selected field once per interval specified here.
If your data contains many empty buckets, i.e., the data is sparse, and you don't want that to be considered anomalous, check the sparse data checkbox (if it appears).
Click on Use full data to use all the available data for analysis.
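To make the Aggregation and Bucket span settings concrete, here is a small standalone sketch (not Elastic's actual implementation) of what it means to split a time series into fixed-width buckets and apply an aggregation function to each one; the sample values are illustrative:

```javascript
// Illustrative only: group timestamped samples into fixed-width buckets and
// apply an aggregation function per bucket, the way a single metric job
// aggregates a field once per bucket span.
function bucketize(samples, bucketSpanMs, aggregate) {
  const buckets = new Map();
  for (const { ts, value } of samples) {
    const key = Math.floor(ts / bucketSpanMs) * bucketSpanMs;
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(value);
  }
  // Collapse each bucket's raw values into a single data point.
  return [...buckets.entries()].map(([ts, values]) => ({ ts, value: aggregate(values) }));
}

const mean = (values) => values.reduce((a, b) => a + b, 0) / values.length;

// Two 5-minute buckets of CPU readings sampled every minute.
const series = [
  { ts: 0, value: 0.2 }, { ts: 60000, value: 0.4 },
  { ts: 300000, value: 0.9 }, { ts: 360000, value: 0.7 },
];
console.log(bucketize(series, 300000, mean));
```

The model then learns the typical aggregated value per bucket, which is why a very short bucket span produces noisy results and a very long one hides short-lived anomalies.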
Metricbeat Description | Server Description
Click on play symbol
Provide job name and description
Click on Create Job
After you create the job, the available data is analyzed. Click on View Results and you will see a chart showing the actual value along with the upper and lower bounds of the predicted value. If the actual value lies outside that range, it is considered anomalous. The color of each circle represents the severity level.
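Conceptually, the check the chart visualizes reduces to a bounds comparison plus a severity band for the anomaly score. This is a simplified sketch, not Elastic's actual model; the score thresholds follow X-Pack's commonly documented warning/minor/major/critical bands:

```javascript
// Illustrative sketch: a record is anomalous when the actual value falls
// outside the predicted [lower, upper] bounds.
function isAnomalous(actual, lowerBound, upperBound) {
  return actual < lowerBound || actual > upperBound;
}

// Map an anomaly score (0-100) to the severity band shown by the circle color.
function severity(score) {
  if (score >= 75) return 'critical';
  if (score >= 50) return 'major';
  if (score >= 25) return 'minor';
  return 'warning';
}

console.log(isAnomalous(0.95, 0.10, 0.60), severity(92)); // true critical
```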
Here the prediction range is wide because the job has just started learning; as more data arrives, the predictions will improve. For the sample server data, the predictions are already quite good, since there is plenty of data to learn the pattern from.
Click on machine learning tab in the left panel. The jobs we created will be listed here.
You will see the list of actions for every job you have created.
Since Metricbeat stores data every minute for Job 1, we can feed the job in real time. Click the play button to start the datafeed. As more and more data arrives, the predictions will improve.
You can see the details of the anomalies by clicking Anomaly Viewer.
Anomaly in Metricbeat data | Server metrics anomalies
We have seen how machine learning can be used to find patterns across different statistics and to detect anomalies. After identifying an anomaly, you still need its context, for example, which other factors are contributing to the problem. For that kind of troubleshooting, you can create multimetric jobs.
In the last few years, we have seen an exponential increase in the development and use of APIs. We are in the era of API-first companies like Stripe, Twilio, and Mailgun, where the entire product or service is exposed via REST APIs. Today's web applications are also powered by REST-based web services. APIs now encapsulate critical business logic with high SLAs, so it is important to test them as part of the continuous integration process to reduce errors, improve predictability, and catch nasty bugs.
In the context of API development, Postman is a great REST client for testing APIs. Postman is more than a REST client, though: it contains a full-featured testing sandbox that lets you write and execute JavaScript-based tests for your API.
Postman comes with a nifty CLI tool, Newman. Newman is Postman's collection runner engine: it sends API requests, receives the responses, and then runs your tests against them. Newman lets developers easily integrate Postman into continuous integration systems like Jenkins. Some of the important features of Postman and Newman include:
Ability to test any API and see the response instantly.
Ability to create test suites or collections using a collection of API endpoints.
Ability to collaborate with team members on these collections.
Ability to easily export/import collections as JSON files.
We are going to look at all these features, some are intuitive and some not so much unless you’ve been using Postman for a while.
After installing Postman, you can look it up in your installed apps and open it. You can choose to sign up and create an account if you want; this is important especially for saving your API collections and accessing them anytime on any machine. For this article, however, we can skip this. There's a button for that towards the bottom when you first launch the app.
Postman Collections
A Postman collection is, in simple words, a collection of tests: essentially a test suite of related tests. These can be scenario-based tests or sequence/workflow-based tests.
There’s a Collections tab on the top left of Postman, with an example Postman Echo collection. You can open and go through it.
Just like in the screenshot above, select an API request and click on the Tests tab. Check the first line:
tests["response code is 200"] = responseCode.code === 200;
The above line is a simple test that checks whether the response code for the API is 200. This is the pattern for writing assertions/tests in Postman (using JavaScript), and it is how you will write tests for the APIs that need to be tested. You can open the other API requests in the Postman Echo collection to get a sense of how requests are made.
Adding a COLLECTION
To make your own collection, click on the 'Add Collection' button on the top left of Postman.
You will be prompted for details about the collection; I've added the name Github API and given it a description.
Clicking on Create should add the collection to the left pane, above, or below the example “POSTMAN Echo” collection.
If you need a hierarchy to keep related APIs together inside a collection, you can add them to folders within the collection. Folders are a great way of separating different parts of your API workflow. You can add folders through the "3 dot" button beside the collection name:
E.g., name the folder "Get Calls" and give it a description once again.
Now that we have the folder, the next task is to add a related API call to it: a call to https://api.github.com/.
If you still have one of the collections open, you can close it the same way you close tabs in a browser, or just click on the plus button to open a new tab in the right pane where requests are made.
Type in or paste in https://api.github.com/ and press Send to see the response.
Once you get the response, you can click on the arrow next to the Save button on the far right, and select Save As, a pop up will be displayed asking where to save the API call.
Give it a name, which can be the request URL or something like "GET Github Basic", and a description; then choose the collection and folder, in this case TEST_API_COLLECTION > GET CALLS, and click on Save. The API call will be added to the folder on the left pane.
Whenever you click on this request from the collection, it will open in the center pane.
Write the Tests
We've seen that the GET Github Basic request has a JSON response, which is usually the case for most APIs. This response has properties such as current_user_url, emails_url, followers_url and following_url, to pick a few. The current_user_url has a value of https://api.github.com/user. Let's add a test for this URL. Click on 'GET Github Basic' and then on the Tests tab in the section just below where the URL goes.
You will notice on the right pane some snippets that Postman provides so that you don't have to write a lot of code. Let's add Response Body: JSON value check. Clicking on it produces the following snippet:
var jsonData = JSON.parse(responseBody);
tests["Your test name"] = jsonData.value === 100;
From these two lines, it is apparent that Postman stores the response in a global object called responseBody, and we can use this to access response and assert values in tests as required.
Postman also has another global variable object called tests, which is an object you can use to name your tests, and equate it to a boolean expression. If the boolean expression returns true, then the test passes.
tests['some random test'] = x === y
If you click on Send to make the request, you will see one of the tests failing.
Let's create a test that is relevant to our use case.
var jsonData = JSON.parse(responseBody);
var usersURL = "https://api.github.com/user";
tests["Gets the correct users url"] = jsonData.current_user_url === usersURL;
Clicking on ‘Send‘, you’ll see the test passing.
Let's modify the tests further to cover the other properties we want to check.
Ideally the things to be tested in an API Response Body should be:
Response code (assert the correct response code for any request)
Response time (check that the API responds within an acceptable time range and is not delayed)
Response body is not empty/null
tests["Status code is 200"] = responseCode.code === 200;
tests["Response time is less than 200ms"] = responseTime < 200;
tests["Response time is acceptable"] = _.inRange(responseTime, 0, 500);
tests["Body is not empty"] = (responseBody !== null && responseBody.length !== 0);
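Outside the Postman sandbox, you can mimic how these assertions evaluate in plain Node to see the pattern at work. Here responseCode, responseTime and responseBody are stand-ins for the globals that Postman injects (and the lodash _.inRange check is replaced with a plain comparison):

```javascript
// Stand-ins for the globals the Postman sandbox injects for each response.
const responseCode = { code: 200 };
const responseTime = 143; // ms
const responseBody = JSON.stringify({ current_user_url: 'https://api.github.com/user' });

// The `tests` object maps a test name to a boolean: true passes, false fails.
const tests = {};
tests['Status code is 200'] = responseCode.code === 200;
tests['Response time is less than 200ms'] = responseTime < 200;
tests['Response time is acceptable'] = responseTime >= 0 && responseTime < 500;
tests['Body is not empty'] = responseBody !== null && responseBody.length !== 0;

// Postman reports each entry by name; here we just print pass/fail.
for (const [name, passed] of Object.entries(tests)) {
  console.log(`${passed ? 'PASS' : 'FAIL'}: ${name}`);
}
```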
Newman CLI
Once you’ve set up all your collections and written tests for them, it may be tedious to go through them one by one and clicking send to see if a given collection test passes. This is where Newman comes in. Newman is a command-line collection runner for Postman.
All you need to do is export your collection and the environment variables, then use Newman to run the tests from your terminal.
NOTE: Make sure you’ve clicked on ‘Save’ to save your collection first before exporting.
USING NEWMAN
So the first step is to export your collection and environment variables. Click on the Menu icon for Github API collection, and select export.
Select version 2, and click on “Export”
Save the JSON file in a location you can access with your terminal. I created a local directory/folder called “postman” and saved it there.
Install the Newman CLI globally, then navigate to where you saved the collection.
npm install -g newman
cd postman
Using Newman is quite straight-forward, and the documentation is extensive. You can even require it as a Node.js module and run the tests there. However, we will use the CLI.
Once you are in the directory, run newman run <collection_name.json>, replacing collection_name with the name you used to save the collection.
newman run TEST_API_COLLECTION.postman_collection.json
NEWMAN CLI Options
Newman provides a rich set of options to customize a run. A list of options can be retrieved by running it with the -h flag.
$ newman run -h

Options - Additional args:

Utility:
-h, --help                      output usage information
-v, --version                   output the version number

Basic setup:
--folder [folderName]           Specify a single folder to run from a collection.
-e, --environment [file|URL]    Specify a Postman environment as a JSON [file]
-d, --data [file]               Specify a data file to use either json or csv
-g, --global [file]             Specify a Postman globals file as JSON [file]
-n, --iteration-count [number]  Define the number of iterations to run

Request options:
--delay-request [number]        Specify a delay (in ms) between requests
--timeout-request [number]      Specify a request timeout (in ms) for a request

Misc.:
--bail                          Stops the runner when a test case fails
--silent                        Disable terminal output
--no-color                      Disable colored output
-k, --insecure                  Disable strict ssl
-x, --suppress-exit-code        Continue running tests even after a failure, but exit with code=0
--ignore-redirects              Disable automatic following of 3XX responses
Let's try out some of the options.
Iterations
Let's use the -n option to set the number of iterations for the run.
$ newman run mycollection.json -n 10 # runs the collection 10 times
To provide a different set of data, i.e., variables for each iteration, you can use -d to specify a JSON or CSV file. For example, a data file such as the one shown below will run 2 iterations, with each iteration using its own set of variables.
Each environment is a set of key-value pairs, with the key as the variable name. These Environment configurations can be used to differentiate between configurations specific to your execution environments eg. Dev, Test & Production.
To provide a different execution environment, you can use -e to specify a JSON file. For example, an environment file such as the one shown below will provide its variables globally to all tests during execution.
Newman, by default, exits with a status code of 0 if everything runs well, i.e., without any exceptions. Continuous integration tools respond to these exit codes and correspondingly pass or fail a build. You can use the --bail flag to tell Newman to halt on a test case error with a status code of 1, which can then be picked up by a CI tool or build system.
$ newman run PostmanCollection.json -e environment.json --bail
Conclusion
Postman and Newman can be used for a number of testing needs, including creating usage scenarios, suites, and packs for your API test cases. Further, Newman and Postman integrate very well with CI/CD tools such as Jenkins, Travis, etc.
Autoscaling, a key feature of Kubernetes, lets you improve the resource utilization of your cluster by automatically adjusting the application’s resources or replicas depending on the load at that time.
This blog talks about Pod Autoscaling in Kubernetes and how to set up and configure autoscalers to optimize the resource utilization of your application.
Horizontal Pod Autoscaling
What is the Horizontal Pod Autoscaler?
The Horizontal Pod Autoscaler (HPA) scales the number of pods of a ReplicaSet, Deployment, or StatefulSet based on per-pod metrics received from the resource metrics API (metrics.k8s.io) provided by metrics-server, the custom metrics API (custom.metrics.k8s.io), or the external metrics API (external.metrics.k8s.io).
Fig:- Horizontal Pod Autoscaling
Prerequisite
Verify that the metrics-server is already deployed and running using the command below, or deploy it using instructions here.
kubectl get deployment metrics-server -n kube-system
HPA using Multiple Resource Metrics
HPA fetches per-pod resource metrics (like CPU, memory) from the resource metrics API and calculates the current metric value based on the mean values of all targeted pods. It compares the current metric value with the target metric value specified in the HPA spec and produces a ratio used to scale the number of desired replicas.
A. Setup: Create a Deployment and HPA resource
In this blog post, I have used the config below to create a deployment of 3 replicas, with some memory load defined by "--vm-bytes", "850M".
Let's create an HPA resource for this deployment with multiple metric blocks defined. The HPA will consider each metric one by one, calculate the desired replica count for each of them, and then select the one with the highest replica count.
We have defined the minimum number of replicas HPA can scale down to as 1 and the maximum number that it can scale up to as 10.
Target Average Utilization and Target Average Value imply that the HPA should scale the replicas up or down to keep the current metric value equal or closest to the target metric value.
B. Understanding the HPA Algorithm
kubectl describe hpa autoscale-tester
Name:         autoscale-tester
Namespace:    autoscale-tester
...
Metrics:      ( current / target )
  resource memory on pods:                             894188202666m / 500Mi
  resource cpu on pods (as a percentage of request):   36% (361m) / 50%
Min replicas: 1
Max replicas: 10
Deployment pods: 3 current / 6 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 6
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age  From                       Message
  ----    ------             ---  ----                       -------
  Normal  SuccessfulRescale  7s   horizontal-pod-autoscaler  New size: 6; reason: memory resource above target
HPA calculates pod utilization as the total usage of all containers in the pod divided by the total requested. It looks at each container individually; if a container does not have resource requests defined, the pod's utilization is undefined and the metric is not used for scaling.
The calculated current metric value for memory, i.e., 894188202666m, is higher than the Target Average Value of 500Mi, so the replicas need to be scaled up.
The calculated current metric value for CPU, i.e., 36%, is lower than the Target Average Utilization of 50%, so the replicas could be scaled down.
Replicas are calculated based on both metrics, and the highest replica count is selected. So, the replicas are scaled up to 6 in this case.
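The arithmetic behind those numbers can be reproduced directly. This is a simplified sketch of the HPA formula desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), ignoring the tolerance band and stabilization window the real controller also applies:

```javascript
// Simplified HPA math: desired = ceil(current * currentMetric / targetMetric).
function desiredReplicas(currentReplicas, currentMetric, targetMetric) {
  return Math.ceil(currentReplicas * (currentMetric / targetMetric));
}

const MiB = 1024 * 1024;
// Memory: 894188202666m (milli-bytes, i.e. ~852Mi per pod) vs a 500Mi target.
const memReplicas = desiredReplicas(3, 894188202666 / 1000, 500 * MiB);
// CPU: 36% of request vs a 50% target utilization.
const cpuReplicas = desiredReplicas(3, 36, 50);

// The HPA evaluates every metric block and keeps the highest result.
console.log(memReplicas, cpuReplicas, Math.max(memReplicas, cpuReplicas)); // 6 3 6
```

The memory metric drives the decision here, which matches the "3 current / 6 desired" line in the describe output above.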
HPA using Custom metrics
We will use the prometheus-adapter resource to expose custom application metrics to custom.metrics.k8s.io/v1beta1, which are retrieved by HPA. By defining our own metrics through the adapter’s configuration, we can let HPA perform scaling based on our custom metrics.
A. Setup: Install Prometheus Adapter
Create prometheus-adapter.yaml with the content below:
kubectl describe hpa autoscale-tester
Name:         autoscale-tester
Namespace:    autoscale-tester
...
Metrics:      ( current / target )
  "packets_in" on pods:  18666m / 50
Min replicas: 1
Max replicas: 10
Deployment pods: 3 current / 3 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from pods metric packets_in
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age    From                       Message
  ----    ------             ---    ----                       -------
  Normal  SuccessfulRescale  2s     horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
  Normal  SuccessfulRescale  2m51s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
Here, the current calculated metric value is 18666m. The m represents milli-units, so 18666m means 18.666, the average of the three pods' packets_in values. Since it's less than the target average value (i.e., 50), the HPA scales down the replicas to bring the ratio of current metric value to target metric value closest to 1. Hence, replicas are scaled down to 2 and later to 1.
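The milli-unit notation trips people up, so here is a tiny illustrative helper (not part of any Kubernetes client library) showing the conversion:

```javascript
// A Kubernetes quantity with an "m" suffix means milli-units:
// "18666m" == 18.666 packets, "361m" CPU == 0.361 cores.
function parseMilli(quantity) {
  return quantity.endsWith('m')
    ? Number(quantity.slice(0, -1)) / 1000
    : Number(quantity);
}

console.log(parseMilli('18666m')); // 18.666
console.log(parseMilli('50'));     // 50
```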
Fig:- container_network_receive_packets_total
Fig:- Ratio to Target value
Vertical Pod Autoscaling
What is Vertical Pod Autoscaler?
Vertical Pod Autoscaling (VPA) ensures that a container's resources are neither under- nor over-provisioned. It recommends optimized CPU and memory request/limit values and can also apply them automatically for you so that cluster resources are used efficiently.
Fig:- Vertical Pod Autoscaling
Architecture
VPA consists of 3 components:
VPA admission controller: Once you deploy and enable the Vertical Pod Autoscaler in your cluster, every pod submitted to the cluster goes through this webhook, which checks whether a VPA object references it.
VPA recommender The recommender pulls the current and past resource consumption (CPU and memory) data for each container from metrics-server running in the cluster and provides optimal resource recommendations based on it, so that a container uses only what it needs.
VPA updater: The updater checks at regular intervals whether each pod is running within the recommended range. If not, it accepts the pod for update, and the pod is evicted by the VPA updater so that the resource recommendation can be applied.
Installation
If you are on Google Cloud Platform, you can simply enable vertical pod autoscaling on the cluster.
Verify that the Vertical Pod Autoscaler pods are up and running:
kubectl get po -n kube-system
NAME                                        READY   STATUS    RESTARTS   AGE
vpa-admission-controller-68c748777d-ppspd   1/1     Running   0          7s
vpa-recommender-6fc8c67d85-gljpl            1/1     Running   0          8s
vpa-updater-786b96955c-bgp9d                1/1     Running   0          8s

kubectl get crd
verticalpodautoscalers.autoscaling.k8s.io
VPA using Resource Metrics
A. Setup: Create a Deployment and VPA resource
Use the same deployment config to create a new deployment with "--vm-bytes", "850M". Then create a VPA resource in recommendation mode with updateMode: Off.
minAllowed is an optional parameter that specifies the minimum CPU request and memory request allowed for the container.
maxAllowed is an optional parameter that specifies the maximum CPU request and memory request allowed for the container.
B. Check the Pod’s Resource Utilization
Check the resource utilization of the pods. Below, you can see only ~50Mi of memory is being used out of 1000Mi, and only ~30m of CPU out of 1000m. This clearly indicates that the pod's resources are underutilized.
Target: The recommended CPU request and memory request for the container that will be applied to the pod by VPA.
Uncapped Target: The recommended CPU request and memory request for the container if you didn’t configure upper/lower limits in the VPA definition. These values will not be applied to the pod. They’re used only as a status indication.
Lower Bound: The minimum recommended CPU request and memory request for the container. There is a --pod-recommendation-min-memory-mb flag that determines the minimum amount of memory the recommender will set; it defaults to 250MiB.
Upper Bound: The maximum recommended CPU request and memory request for the container. It helps the VPA updater avoid eviction of pods that are close to the recommended target values. Eventually, the Upper Bound is expected to reach close to target recommendation.
Now, if you check the logs of vpa-updater, you can see it's not processing VPA objects, as the update mode is set to Off.
kubectl logs -f vpa-updater-675d47464b-k7xbx
1 updater.go:135] skipping VPA object autoscale-tester-recommender because its mode is not "Recreate" or "Auto"
1 updater.go:151] no VPA objects to process
Let’s change the VPA updateMode to “Auto” to see the processing.
As soon as you do that, you can see that vpa-updater starts processing objects and terminates all 3 pods.
kubectl logs -f vpa-updater-675d47464b-k7xbx
1 update_priority_calculator.go:147] pod accepted for update autoscale-tester/autoscale-tester-5d6b48d64f-8zgb9 with priority 1
1 update_priority_calculator.go:147] pod accepted for update autoscale-tester/autoscale-tester-5d6b48d64f-npts4 with priority 1
1 update_priority_calculator.go:147] pod accepted for update autoscale-tester/autoscale-tester-5d6b48d64f-vctx5 with priority 1
1 updater.go:193] evicting pod autoscale-tester-5d6b48d64f-8zgb9
1 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"autoscale-tester", Name:"autoscale-tester-5d6b48d64f-8zgb9", UID:"ed8c54c7-a87a-4c39-a000-0e74245f18c6", APIVersion:"v1", ResourceVersion:"378376", FieldPath:""}): type: 'Normal' reason: 'EvictedByVPA' Pod was evicted by VPA Updater to apply resource recommendation.
You can also check the logs of vpa-admission-controller:
kubectl logs -f vpa-admission-controller-bbf4f4cc7-cb6pb
Sending patches: [{add /metadata/annotations map[]} {add /spec/containers/0/resources/requests/cpu 500m} {add /spec/containers/0/resources/requests/memory 500Mi} {add /spec/containers/0/resources/limits/cpu 500m} {add /spec/containers/0/resources/limits/memory 500Mi} {add /metadata/annotations/vpaUpdates Pod resources updated by autoscale-tester-recommender: container 0: cpu request, memory request, cpu limit, memory limit} {add /metadata/annotations/vpaObservedContainers autoscale-tester}]
NOTE: Ensure that you have more than 1 running replica. Otherwise, the pods won't be restarted, and vpa-updater will give you this warning:
1 pods_eviction_restriction.go:209] too few replicas for ReplicaSet autoscale-tester/autoscale-tester1-7698974f6. Found 1 live pods
Now, describe the new pods created and check that the resources match the Target recommendations:
The target recommendation cannot go below the minAllowed defined in the VPA spec.
Fig:- Prometheus: Memory Usage Ratio
E. Stress Loading Pods
Let's recreate the deployment with the memory request and limit set to 2000Mi and "--vm-bytes", "500M".
Gradually stress load one of the pods to increase its memory utilization. You can log in to the pod and run stress --vm 1 --vm-bytes 1400M --timeout 120000s.
Limits vs. requests: VPA always works with the requests defined for a container, not the limits. So the VPA recommendations are applied to the container requests, and it preserves the limit-to-request ratio specified for each container.
For example, if the initial container configuration defines a 100Mi memory request and a 300Mi memory limit, then when the VPA target recommendation is 150Mi of memory, the container memory request will be updated to 150Mi and the memory limit to 450Mi.
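That proportional scaling is easy to sketch. This illustrative helper (not VPA's actual code) applies a recommended request while preserving the original limit-to-request ratio:

```javascript
// VPA keeps the container's original limit-to-request ratio: when a new
// request is applied, the limit is scaled by the same factor.
function scaledLimit(originalRequest, originalLimit, recommendedRequest) {
  return recommendedRequest * (originalLimit / originalRequest);
}

// Values in Mi: a 100Mi request with a 300Mi limit (ratio 3), recommended 150Mi.
console.log(scaledLimit(100, 300, 150)); // 450
```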
Selective Container Scaling
If you have a pod with multiple containers and you want to opt-out some of them, you can use the “Off” mode to turn off recommendations for a container.
You can also set containerName: “*” to include all containers.
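A minimal sketch of such a resourcePolicy (the container names here are hypothetical; the field names follow the upstream autoscaler API):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: autoscale-tester
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: autoscale-tester
  resourcePolicy:
    containerPolicies:
      - containerName: sidecar   # hypothetical container to opt out
        mode: "Off"
      - containerName: "*"       # all remaining containers stay managed
        mode: "Auto"
```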
Both the Horizontal Pod Autoscaler and the Vertical Pod Autoscaler serve different purposes, and one can be more useful than the other depending on your application’s requirements.
The HPA can be useful when, for example, your application serves a large number of lightweight (low resource-consuming) requests; scaling the number of replicas then distributes the workload across the pods. The VPA, on the other hand, can be useful when your application serves heavyweight requests that require more resources per pod.
Nightwatch.js is a test automation framework for web applications, developed in Node.js, which uses the W3C WebDriver API (formerly Selenium WebDriver). It is a complete end-to-end testing solution that aims to simplify writing automated tests and setting up continuous integration. Nightwatch works by communicating over a RESTful HTTP API with a WebDriver server (such as ChromeDriver or Selenium Server). At the time of writing, the latest available version is 1.0.
Why Use Nightwatch JS Over Any Other Automation Tool?
Selenium is in demand for building automation frameworks since it supports various programming languages, provides cross-browser testing, and is used for both web application and mobile application testing.
But Nightwatch, built on Node.js, exclusively uses JavaScript as the programming language for end-to-end testing which has the listed advantages –
Lightweight framework
Robust configuration
Integrates with cloud servers like SauceLabs and Browserstack for web and mobile testing with JavaScript, Appium
Allows configuration with Cucumber to build a strong BDD (Behaviour Driven Development) setup
High performance of the automation execution
Improves test structuring
Minimal code with less maintenance
Installation and Configuration of Nightwatch Framework
To configure the Nightwatch framework, all you need on your system is the following:
Download latest Node.js
npm (bundled with Node.js)
Package.json file for the test settings and dependencies
$ npm init
Install nightwatch and save as dev dependency
$ npm install nightwatch --save-dev
Install chromedriver/geckodriver and save as dev dependency for running the execution on the required browser
Create a nightwatch.conf.js for webdriver and test settings with nightwatch
const chromedriver = require('chromedriver');

module.exports = {
  src_folders: ['tests'], // "tests" is a folder in the workspace which has the step definitions
  test_settings: {
    default: {
      webdriver: {
        start_process: true,
        server_path: chromedriver.path,
        port: 4444,
        cli_args: ['--port=4444']
      },
      desiredCapabilities: {
        browserName: 'chrome'
      }
    }
  }
};
Using Nightwatch – Writing and Running Tests
We create a JavaScript file named demo.js for running a test through nightwatch with the command
$ npm test
// demo.js is a JS file under the tests folder
module.exports = {
  'step one: navigate to google': function (browser) {
    // step one
    browser
      .url('https://www.google.com')
      .waitForElementVisible('body', 1000)
      .setValue('input[type=text]', 'nightwatch')
      .waitForElementVisible('input[name=btnK]', 1000);
  },
  'step two: click input': function (browser) {
    // step two
    browser
      .click('input[name=btnK]')
      .pause(1000)
      .assert.containsText('#main', 'Night Watch')
      .end(); // close the browser session after all the steps
  }
};
When run, this command picks up the value “nightwatch” from the “test” key in the package.json file, which invokes the Nightwatch API to open the URL in ChromeDriver.
There can be one or more steps in demo.js(step definition js) file as per requirement or test cases.
Also, it is a good practice to maintain a separate .js file for page objects which consists of the locator strategy and selectors of the UI web elements.
Cucumber can be configured in the Nightwatch framework to help maintain the test scenarios in its .feature files. We create a file cucumber.conf.js in the root folder which contains the setup for starting, creating, and closing WebDriver sessions.
Then we create a feature file which has the test scenarios in Given, When, Then format.
Feature: Google Search

Scenario: Searching Google
  Given I open Google's search page
  Then the title is "Google"
  And the Google search form exists
For Cucumber to understand and execute the feature file, we need to create matching step definitions for every feature step we use in our feature file. Create a step definition file under the tests folder called google.js. Step definitions that use the Nightwatch client should return the result of the API call, as it returns a Promise. For example,
const { client } = require('nightwatch-api');
const { Given, Then, When } = require('cucumber');

Given(/^I open Google's search page$/, () => {
  return client.url('http://google.com').waitForElementVisible('body', 1000);
});

Then(/^the title is "([^"]*)"$/, (title) => {
  return client.assert.title(title);
});

Then(/^the Google search form exists$/, () => {
  return client.assert.visible('input[name="q"]');
});
$ npm run e2e-test
Executing Individual Feature Files or Scenarios
Single feature file
npm run e2e-test -- features/file1.feature
Multiple feature files
npm run e2e-test -- features/file1.feature features/file2.feature
Scenario by its line number
npm run e2e-test -- features/my_feature.feature:3
Feature directory
npm run e2e-test -- features/dir
Scenario by its name matching a regular expression
npm run e2e-test -- --name "topic 1"
Feature and Scenario Tags
Cucumber allows you to add tags to features or scenarios, and we can selectively run scenarios using those tags. The tags can also be combined with conditional operators, depending on the requirement.
Single tag
# google.feature
@google
Feature: Google Search

  @search
  Scenario: Searching Google
    Given I open Google's search page
    Then the title is "Google"
    And the Google search form exists
npm run e2e-test -- --tags @google
Multiple tags
npm run e2e-test -- --tags "@google or @duckduckgo"
npm run e2e-test -- --tags "(@google or @duckduckgo) and @search"
To skip tags
npm run e2e-test -- --tags "not @google"
npm run e2e-test -- --tags "not (@google or @duckduckgo)"
Custom Reporters in Nightwatch and Cucumber Framework
Reporting is another advantage provided by Cucumber: it generates a report of test results at the end of the execution, giving an immediate visual clue of possible problems and simplifying debugging. HTML reports are best suited and easy to understand thanks to their format. To generate them, we add cucumber-html-reporter as a dependency in our nightwatch.conf.js file.
cucumber-html-reporter (in node_modules) manages the creation of reports and writes them to the output location after the test execution. The screenshot feature can be enabled by adding a small code snippet to nightwatch.conf.js.
The Cucumber configuration file can be extended to handle screenshots and attach them to the report. It also enables generating the HTML test report at the end of the execution; the report is built from Cucumber's JSON report and can be configured with different templates. We use a setTimeout() block in our cucumber.conf.js to run the HTML generation after Cucumber finishes writing the JSON report.
In the package.json file, we have added the JSON formatter to create a JSON report, which cucumber-html-reporter then consumes. We use mkdirp to make sure the report folder exists before running the test.
When the test run completes, the HTML report is displayed in a new browser tab in the format given below
Conclusion
Nightwatch-Cucumber is a great module for linking the accessibility of Cucumber.js with the robust testing framework of Nightwatch.js. Together they provide not only easily readable documentation of the test suite but also highly configurable automated user tests, all while keeping everything in JavaScript.
Almost all the applications that you work on or deal with throughout the day use SMS (short messaging service) as an efficient and effective way to communicate with end users.
Some very common use-cases include:
Receiving an OTP for authenticating your login
Getting deals from the likes of Flipkart and Amazon informing you about the latest sale.
Getting reminder notifications for the doctor’s appointment that you have
Getting details for your debit and credit transactions.
The practical use cases for an SMS can be far-reaching.
Even though SMS integration forms an integral part of many applications, the limitations and complexities involved in automating it via web automation tools like Selenium mean these flows are often left out of automation.
Teams often opt to verify these test cases manually, which, even though it helps catch bugs early, poses some real challenges.
Pitfalls with Manual Testing
With these limitations, you obviously do not want your application sending faulty Text Messages after that major Release.
Automation Testing … #theSaviour
To overcome the limitations of manual testing, delegating your task to a machine comes in handy.
Now that we have talked about the WHY, we will look into HOW the feature can be automated. Technically, you shouldn’t (and realistically can’t) use Selenium to read an SMS on a mobile device. So we looked for a third-party library that is:
Easy to integrate with the existing code base
Supports a range of languages
Does not involve highly complex codes and focuses on the problem at hand
Supports both incoming and outgoing messages
After a lot of research, we settled with Twilio.
In this article, we will look at an example of working with Twilio APIs to Read SMS and eventually using it to automate SMS flows.
Twilio supports a bunch of different languages. For this article, we stuck with Node.js
Account Setup
Registration
To start working with the service, you need to register.
Once that is done, Twilio will prompt you with a bunch of simple questions to understand why you want to use their service.
Twilio Dashboard
A trial balance of $15.50 is credited upon signing up. This can be used for sending and receiving text messages. A unique Account SID and Auth Token are also generated for your account.
Buy a Number
Navigate to the buy a number link under Phone Numbers > Manage and purchase a number that you would eventually be using in your automation scripts for receiving text messages from the application.
Note – for the free trial, Twilio does not support Indian Number (+91)
Code Setup
Install Twilio in your code base
Code snippet
For simplicity, just pass the accountSid and authToken that you receive from the Dashboard Console to the twilio library. This returns a client object through which you can access the list of all the messages in your inbox.
List messages matching filter criteria: If you’d like Twilio to narrow down this list of messages for you, you can do so by specifying a To number, a From number, and a DateSent.
Get a message: If you know the message SID (i.e., the message’s unique identifier), then you can retrieve that specific message directly.
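To illustrate what that narrowing looks like, here is a plain-Python sketch of the filter semantics (the message records and numbers are made up, and this is not the Twilio SDK; the real client performs this filtering server-side):

```python
from datetime import date

# Hypothetical stand-in for the message list Twilio returns.
messages = [
    {"sid": "SM001", "to": "+15550001111", "from": "+15552223333",
     "date_sent": date(2023, 5, 1), "body": "Your OTP is 123456"},
    {"sid": "SM002", "to": "+15550001111", "from": "+15554445555",
     "date_sent": date(2023, 5, 2), "body": "Sale starts today!"},
    {"sid": "SM003", "to": "+15559998888", "from": "+15552223333",
     "date_sent": date(2023, 5, 1), "body": "Appointment reminder"},
]

def list_messages(to=None, from_=None, date_sent=None):
    """Narrow the message list by To number, From number, and DateSent."""
    result = messages
    if to:
        result = [m for m in result if m["to"] == to]
    if from_:
        result = [m for m in result if m["from"] == from_]
    if date_sent:
        result = [m for m in result if m["date_sent"] == date_sent]
    return result

def get_message(sid):
    """Fetch one specific message by its unique SID."""
    return next(m for m in messages if m["sid"] == sid)

print([m["sid"] for m in list_messages(to="+15550001111", date_sent=date(2023, 5, 1))])  # -> ['SM001']
print(get_message("SM003")["body"])  # -> Appointment reminder
```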
The trial version does not support Indian numbers (+91).
The trial version provides only an initial balance of $15.50. This is sufficient for use cases that involve only receiving messages on your Twilio number, but if the use case requires sending messages back from the Twilio number, a paid version will serve the purpose.
Messages sent via a short code (557766) are not received on the Twilio number. Only long codes are accepted in the trial version.
You can buy only a single number with the trial version. If purchasing multiple numbers is required, the user may have to switch to a paid version.
Conclusion
In a nutshell, we saw how important it is to thoroughly verify the SMS functionality of our application since it serves as one of the primary ways of communicating with the end users. We also saw what the limitations are with following the traditional manual testing approach and how automating SMS scenarios would help us deliver high-quality products. Finally, we demonstrated a feasible, efficient and easy-to-use way to Automate SMS test scenarios using Twilio APIs.
Hope this was a useful read and that you will now be able to easily automate SMS scenarios. Happy testing… Do like and share …
Containerized applications are becoming more popular with each passing year. Enterprise applications are adopting container technology as they modernize their IT systems. Migrating your applications from VMs or physical machines to containers comes with multiple advantages like optimal resource utilization, faster deployment times, replication, quick cloning, less lock-in, and so on. Various container orchestration platforms like Kubernetes, Google Container Engine (GKE), and Amazon EC2 Container Service (Amazon ECS) help in quick deployment and easy management of your containerized applications. But in order to use these platforms, you need to migrate your legacy applications to containers or rewrite/redeploy your applications from scratch with a containerization approach. Rearchitecting your applications using a containerization approach is preferable, but is that possible for complex legacy applications? Is your deployment team capable of listing every detail of your application's deployment process? Do you have the patience to author a Dockerfile for each component of your complex application stack?
Automated migrations!
Velotio has been helping customers with automated migration of VMs and bare-metal servers to various container platforms. We have developed automation to convert these migrated applications as containers on various container deployment platforms like GKE, Amazon ECS and Kubernetes. In this blog post, we will cover one such migration tool developed at Velotio which will migrate your application running on a VM or physical machine to Google Container Engine (GKE) by running a single command.
Migration tool details
We have named our migration tool A2C (Anything to Container). It can migrate applications running on any Unix or Windows operating system.
The migration tool requires the following information about the server to be migrated:
IP of the server
SSH User, SSH Key/Password of the application server
Configuration file containing data paths for application/database/components (more details below)
Required name of your docker image (The docker image that will get created for your application)
GKE Container Cluster details
In order to store persistent data, volumes can be defined in the container definition. Data changes made on a volume path remain persistent even if the container is killed or crashes. Volumes are basically filesystem paths from the host machine on which your container is running, from NFS, or from cloud storage. The container mounts the filesystem path from the host machine, so data changes are written to the host machine's filesystem instead of the container's filesystem. Our migration tool supports data volumes, which can be defined in the configuration file. It will automatically create disks for the defined volumes and copy data from your application server to these disks in a consistent way.
The configuration file we have been talking about is basically a YAML file containing filesystem level information about your application server. A sample of this file can be found below:
The configuration file contains 3 sections: includes, volumes and excludes:
Includes contains filesystem paths on your application server which you want to add to your container image.
Volumes contain filesystem paths on your application server which stores your application data. Generally, filesystem paths containing database files, application code files, configuration files, log files are good candidates for volumes.
The excludes section contains filesystem paths which you don’t want to make part of the container. This may include temporary filesystem paths like /proc, /tmp and also NFS mounted paths. Ideally, you would include everything by giving “/” in includes section and exclude specifics in exclude section.
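Since the original sample is not reproduced here, a hypothetical configuration following the three sections described above might look like this (the paths and keys are illustrative, not the tool's exact schema):

```yaml
# Hypothetical A2C configuration file
includes:
  - /                  # start from the whole filesystem...
excludes:
  - /proc              # ...and carve out virtual, temporary, and NFS paths
  - /tmp
  - /mnt/nfs-share
volumes:
  - /var/www/html      # application code
  - /var/lib/mysql     # database files
  - /var/log/httpd     # log files
```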
The Docker image name given as input to the migration tool is the Docker registry path in which the image will be stored, followed by the name and tag of the image. A Docker registry is like a GitHub for Docker images, where you can store all your images. Different versions of the same image can be stored by giving a version-specific tag to the image. GKE also provides a Docker registry; since in this demo we are migrating to GKE, we will store our image in the GKE registry.
GKE container cluster details to be given as input to the migration tool, contains GKE specific details like GKE project name, GKE container cluster name and GKE region name. A container cluster can be created in GKE to host the container applications. We have a separate set of scripts to perform cluster creation operation. Container cluster creation can also be done easily through GKE UI. For now, we will assume that we have a 3 node cluster created in GKE, which we will use to host our application.
Tasks performed under migration
Our migration tool (A2C), performs the following set of activities for migrating the application running on a VM or physical machine to GKE Container Cluster:
1. Install the A2C migration tool with all its dependencies on the target application server
2. Create a docker image of the application server, based on the filesystem level information given in the configuration file
3. Capture metadata from the application server like configured services information, port usage information, network configuration, external services, etc.
4. Push the docker image to GKE container registry
5. Create disk in Google Cloud for each volume path defined in configuration file and prepopulate disks with data from application server
6. Create a deployment spec for the container application in the GKE container cluster, which will open the required ports, configure required services, add multi-container dependencies, attach the prepopulated disks to containers, etc.
7. Deploy the application, after which your application runs as containers in GKE with the application software in a running state. The new application URLs will be given as output.
8. Load balancing, HA will be configured for your application.
Demo
For demonstration purposes, we will deploy a LAMP stack (Apache + PHP + MySQL) on a CentOS 7 VM and run the migration utility against the VM, which will migrate the application to our GKE cluster. After the migration, we will show our application running on GKE, preconfigured with the same data as on our VM.
Step 1
We set up the LAMP stack using Apache, PHP, and MySQL on a CentOS 7 VM in GCP. The PHP application can be used to list, add, delete, or edit user data. The data is stored in a MySQL database. We added some data to the database using the application, and the UI shows the following:
Step 2
Now we run the A2C migration tool, which will migrate this application stack running on a VM into a container and auto-deploy it to GKE.
Pushing converter binary to target machine
Pushing data config to target machine
Pushing installer script to target machine
Running converter binary on target machine
[130.211.231.58] out: creating docker image
[130.211.231.58] out: image created with id 6dad12ba171eaa8615a9c353e2983f0f9130f3a25128708762228f293e82198d
[130.211.231.58] out: Collecting metadata for image
[130.211.231.58] out: Generating metadata for cent7
[130.211.231.58] out: Building image from metadata
Pushing the docker image to GCP container registry
Initiate remote data copy
Activated service account credentials for: [glassy-chaliceXXXXX@appspot.gserviceaccount.com]
for volume var/log/httpd
Creating disk migrate-lamp-0
Disk Created Successfully
transferring data from source
for volume var/log/mariadb
Creating disk migrate-lamp-1
Disk Created Successfully
transferring data from source
for volume var/www/html
Creating disk migrate-lamp-2
Disk Created Successfully
transferring data from source
for volume var/lib/mysql
Creating disk migrate-lamp-3
Disk Created Successfully
transferring data from source
Connecting to GCP cluster for deployment
Created service file /tmp/gcp-service.yaml
Created deployment file /tmp/gcp-deployment.yaml
Deploying to GKE
$ kubectl get pod
NAME                            READY   STATUS              RESTARTS   AGE
migrate-lamp-3707510312-6dr5g   0/1     ContainerCreating   0          58s
$ kubectl get deployment
NAME           DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
migrate-lamp   1         1         1            0           1m
$ kubectl get service
NAME           CLUSTER-IP     EXTERNAL-IP     PORT(S)                                    AGE
kubernetes     10.59.240.1    <none>          443/TCP                                    23h
migrate-lamp   10.59.248.44   35.184.53.100   3306:31494/TCP,80:30909/TCP,22:31448/TCP   53s
You can access your application using the above connection details!
Step 3
Access LAMP stack on GKE using the IP 35.184.53.100 on default 80 port as was done on the source machine.
Here is the Docker image being created in GKE Container Registry:
We can also see that disks were created with migrate-lamp-x, as part of this automated migration.
Load Balancer also got provisioned in GCP as part of the migration process
Following service files and deployment files were created by our migration tool to deploy the application on GKE:
Migrations are always hard for IT and development teams. At Velotio, we have been helping customers migrate to cloud and container platforms using streamlined processes and automation. Feel free to reach out to us at contact@rsystems.com to learn more about our cloud and container adoption/migration offerings.
These days, we see that most software development is moving towards serverless architecture, and that’s no surprise. Almost all top cloud service providers have serverless services that follow a pay-as-you-go model. This way, consumers don’t have to pay for any unused resources. Also, there’s no need to worry about procuring dedicated servers, network/hardware management, operating system security updates, etc.
Unfortunately for cloud developers, serverless tools don't auto-deploy changes made in the local environment. This is still a headache: the developer must deploy and test changes manually. Web app projects using Node or Django, by contrast, run a watcher on the development server during app bundling; when changes happen in the code directory, the server automatically restarts with the new code, and the developer can check whether the changes work as expected.
In this blog, we will talk about automating serverless application deployment by changing the local codebase. We are using AWS as a cloud provider and primarily focusing on lambda to demonstrate the functionality.
Prerequisites:
This article uses AWS, so command and programming access are necessary.
This article is written with deployment to AWS in mind, so AWS credentials are needed to make changes in the Stack. In the case of other cloud providers, we would require that provider’s command-line access.
We are using the Serverless Framework for deployment (this example will also work with other tools like Zappa), so some familiarity with serverless tooling is helpful.
Before development, let’s divide the problem statement into sub-tasks and build them one step at a time.
Problem Statement
Create a codebase watcher service that would trigger either a stack update on AWS or run a local test. By doing this, developers would not have to worry about manual deployment on the cloud provider. This service needs to keep an eye on the code and generate events when an update/modify/copy/delete occurs in the given codebase.
Solution
First, to watch the codebase, we need logic that acts as a trigger and notifies us when the underlying files change. Packages for this already exist in different programming languages; in this example, we are using Python's watchdog.
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

CODE_PATH = "<codebase path>"

class WatchMyCodebase:
    # Set the directory on watch
    def __init__(self):
        self.observer = Observer()

    def run(self):
        event_handler = EventHandler()
        # The recursive flag decides whether the watcher should collect
        # changes in the whole CODE_PATH directory tree.
        self.observer.schedule(event_handler, CODE_PATH, recursive=True)
        self.observer.start()
        self.observer.join()

class EventHandler(FileSystemEventHandler):
    """Handle events generated by the Watchdog observer."""

    @classmethod
    def on_any_event(cls, event):
        if event.is_directory:
            # Ignore directory-level events, like creating a new empty directory.
            return None
        elif event.event_type == 'modified':
            print("file under codebase directory is modified...")

if __name__ == '__main__':
    watch = WatchMyCodebase()
    watch.run()
Here, the on_any_event() class method gets called on any updates in the mentioned directory, and we need to add deployment logic here. However, we can’t just deploy once it receives a notification from the watcher because modern IDEs save files as soon as the user changes them. And if we add logic that deploys on every change, then most of the time, it will deploy half-complete services.
To handle this, we must add some timeout before deploying the service.
Here, the program will wait for some time after the file is changed. And if it finds that, for some time, there have been no new changes in the codebase, it will deploy the service.
import time
import subprocess
import threading

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

valid_events = ['created', 'modified', 'deleted', 'moved']
DEPLOY_AFTER_CHANGE_THRESHOLD = 300
STAGE_NAME = ""
CODE_PATH = "<codebase path>"

def deploy_env():
    process = subprocess.Popen(
        ['sls', 'deploy', '--stage', STAGE_NAME, '-v'],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    print(stdout, stderr)

def deploy_service_on_change():
    while True:
        if EventHandler.last_update_time and \
                (int(time.time() - EventHandler.last_update_time) > DEPLOY_AFTER_CHANGE_THRESHOLD):
            EventHandler.last_update_time = None
            deploy_env()
        time.sleep(5)

def start_interval_watcher_thread():
    interval_watcher_thread = threading.Thread(target=deploy_service_on_change)
    interval_watcher_thread.start()

class WatchMyCodebase:
    # Set the directory on watch
    def __init__(self):
        self.observer = Observer()

    def run(self):
        event_handler = EventHandler()
        self.observer.schedule(event_handler, CODE_PATH, recursive=True)
        self.observer.start()
        self.observer.join()

class EventHandler(FileSystemEventHandler):
    """Handle events generated by the Watchdog observer."""

    last_update_time = None

    @classmethod
    def on_any_event(cls, event):
        if event.is_directory:
            # Ignore directory-level events, like creating a new empty directory.
            return None
        elif event.event_type in valid_events and '.serverless' not in event.src_path:
            # Ignore events for the .serverless directory; serverless creates
            # a few temp files while deploying.
            cls.last_update_time = time.time()

if __name__ == '__main__':
    start_interval_watcher_thread()
    watch = WatchMyCodebase()
    watch.run()
The specified valid_events acts as a filter to deploy, and we are only considering these events and acting upon them.
Moreover, to add a delay after file changes and ensure that there are no new changes, we added interval_watcher_thread. This checks the difference between current and last directory update time, and if it’s greater than the specified threshold, we deploy serverless resources.
Here, the sleep time in deploy_service_on_change is important. It prevents the program from burning CPU cycles just checking whether the condition to deploy is satisfied. On the other hand, too long a sleep would delay the deployment beyond the specified DEPLOY_AFTER_CHANGE_THRESHOLD.
Note: With programming languages like Go and features like goroutines and channels, we can build an even more efficient application; the same is possible in Python with the help of thread signals.
Let’s build one lambda function that automatically deploys on a change. Let’s also be a little lazy and develop a basic Python lambda that takes a number as input and returns its factorial.
import math

def lambda_handler(event, context):
    """Handler for get factorial."""
    number = event['number']
    return math.factorial(number)
We are using the Serverless Framework, so to deploy this lambda, we need a serverless.yml file that specifies stack details like the execution environment, cloud provider, environment variables, etc. More parameters are listed in this guide.
We need to keep both handler.py and serverless.yml in the same folder, or we need to update the function handler in serverless.yml.
We can deploy it manually using this serverless command:
sls deploy --stage production -v
Note: Before deploying, export AWS credentials.
The above command deploys a stack using CloudFormation:
--stage specifies the environment where the stack should be deployed. Like any other software project, it can have stage names such as production, dev, test, etc.
-v enables verbose output.
To auto-deploy changes from now on, we can use the watcher.
Start the watcher with this command:
python3 auto_deploy_sls.py
This will run continuously and keep an eye on the codebase directory, and if any changes are detected, it will deploy them. We can customize this to some extent, like post-deploy, so it can run test cases against a new stack.
If you are worried about network traffic when the stack has lots of dependencies, using an actual cloud provider for testing might increase billing. However, we can easily fix this by using serverless local development.
Here is a serverless blog that specifies local development and testing of a cloudformation stack. It emulates cloud behavior on the local setup, so there’s no need to worry about cloud service billing.
One useful upgrade is support for complex directory structures.
In the above example, we are assuming that only one single directory is present, so it’s fine to deploy using the command:
sls deploy --stage production -v
But in some projects, multiple stacks may be present in the codebase at different hierarchies. Consider the example below: we have three different lambdas, so a change in the `check-prime` directory should update only that lambda and not the others.
The above can be achieved in on_any_event(). By using the variable event.src_path, we can learn the file path that received the event.
Now, deployment command changes to:
cd <updated_directory> && sls deploy --stage <your-stage> -v
This will deploy only an updated stack.
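One way to sketch this per-stack deployment logic (the paths, stage name, and helper names here are hypothetical):

```python
import os

CODE_PATH = "/workspace/lambdas"  # hypothetical codebase root
STAGE_NAME = "dev"

def stack_dir_for(src_path):
    """Return the top-level stack directory containing the changed file."""
    rel = os.path.relpath(src_path, CODE_PATH)
    top_level = rel.split(os.sep)[0]
    return os.path.join(CODE_PATH, top_level)

def deploy_command_for(src_path):
    """Build the deploy command that updates only the affected stack."""
    return 'cd {} && sls deploy --stage {} -v'.format(
        stack_dir_for(src_path), STAGE_NAME)

print(deploy_command_for('/workspace/lambdas/check-prime/handler.py'))
```

In on_any_event(), event.src_path would be fed to deploy_command_for() so that only the stack containing the changed file gets redeployed.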
Conclusion
We learned that even though serverless deployment is normally a manual task, it can be automated with the help of Watchdog for a better developer workflow.
With the help of serverless local development, we can test changes as we make them, without manually deploying to the cloud environment each time.
We hope this helps you improve your serverless development experience and close the loop faster.
Asynchronous programming is a type of parallel programming in which a unit of work is allowed to run separately from the primary application thread. When the work is complete, it notifies the main thread about completion or failure of the worker thread. There are numerous benefits to using it, such as improved application performance and enhanced responsiveness.
Asynchronous programming has been gaining a lot of attention in the past few years, and for good reason. Although it can be more difficult than the traditional linear style, it is also much more efficient.
For example, instead of waiting for an HTTP request to finish before continuing execution, with Python async coroutines you can submit the request and do other work that’s waiting in a queue while waiting for the HTTP request to finish.
Asynchronicity seems to be a big reason why Node.js is so popular for server-side programming. Much of the code we write, especially in I/O-heavy applications like websites, depends on external resources. This could be anything from a remote database call to POSTing to a REST service. As soon as you ask for any of these resources, your code sits waiting with nothing to do. With asynchronous programming, you allow your code to handle other tasks while waiting for these resources to respond.
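As a minimal illustration of that idea in Python (asyncio.sleep stands in for a slow HTTP request; the names are made up):

```python
import asyncio

async def fetch(name, delay):
    # Stand-in for an HTTP request: awaiting yields control to other work.
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # Both "requests" wait concurrently, so the total time is roughly the
    # longest single delay, not the sum of the two.
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1))

print(asyncio.run(main()))  # -> ['a done', 'b done']
```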
How Does Python Do Multiple Things At Once?
1. Multiple Processes
The most obvious way is to use multiple processes. From the terminal, you can start your script two, three, four…ten times, and then all the scripts will run independently and at the same time. The operating system underneath will take care of sharing your CPU resources among all those instances. Alternatively, you can use the multiprocessing library, which supports spawning processes, as shown in the example below.
from multiprocessing import Process

def print_func(continent='Asia'):
    print('The name of continent is : ', continent)

if __name__ == "__main__":  # confirms that the code is under main function
    names = ['America', 'Europe', 'Africa']
    procs = []

    # instantiating without any argument
    proc = Process(target=print_func)
    procs.append(proc)
    proc.start()

    # instantiating process with arguments
    for name in names:
        proc = Process(target=print_func, args=(name,))
        procs.append(proc)
        proc.start()

    # complete the processes
    for proc in procs:
        proc.join()
Output:
The name of continent is :  Asia
The name of continent is :  America
The name of continent is :  Europe
The name of continent is :  Africa
2. Multiple Threads
The next way to run multiple things at once is to use threads. A thread is a line of execution, pretty much like a process, but you can have multiple threads in the context of one process, and they all share access to common resources. Because of this, it's difficult to write threaded code correctly. And again, the operating system is doing all the heavy lifting on sharing the CPU, but the global interpreter lock (GIL) allows only one thread to run Python code at a given time, even when you have multiple threads running code. So, in CPython, the GIL prevents multi-core concurrency. Basically, you're running on a single core even though you may have two, four, or more.
import threading

def print_cube(num):
    """function to print cube of given num"""
    print("Cube: {}".format(num * num * num))

def print_square(num):
    """function to print square of given num"""
    print("Square: {}".format(num * num))

if __name__ == "__main__":
    # creating threads
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))

    # starting thread 1
    t1.start()
    # starting thread 2
    t2.start()

    # wait until thread 1 is completely executed
    t1.join()
    # wait until thread 2 is completely executed
    t2.join()

    # both threads completely executed
    print("Done!")
Output:
Square: 100
Cube: 1000
Done!
3. Coroutines using yield:
Coroutines are generalizations of subroutines. They are used for cooperative multitasking, where a process voluntarily yields (gives away) control periodically, or when idle, in order to enable multiple applications to run simultaneously. Coroutines are similar to generators, but with a few extra methods and a slight change in how we use the yield statement. Generators produce data for iteration, while coroutines can also consume data.
def print_name(prefix):
    print("Searching prefix:{}".format(prefix))
    try:
        while True:
            # yield used to create coroutine
            name = (yield)
            if prefix in name:
                print(name)
    except GeneratorExit:
        print("Closing coroutine!!")

corou = print_name("Dear")
corou.__next__()
corou.send("James")
corou.send("Dear James")
corou.close()
4. Asynchronous Programming

The fourth way is asynchronous programming, where the OS does not participate. As far as the OS is concerned, you will have one process containing a single thread, but you'll still be able to do multiple things at once. So, what's the trick?
The answer is asyncio
asyncio is the concurrency module introduced in Python 3.4. It is designed to use coroutines and futures to simplify asynchronous code and make it almost as readable as synchronous code, since there are no callbacks.
asyncio uses different constructs: event loops, coroutines and futures.
An event loop manages and distributes the execution of different tasks. It registers them and handles distributing the flow of control between them.
Coroutines (covered above) are special functions that work similarly to Python generators; on await, they release the flow of control back to the event loop. A coroutine needs to be scheduled to run on the event loop; once scheduled, coroutines are wrapped in Tasks, which is a type of Future.
Futures represent the result of a task that may or may not have been executed. This result may be an exception.
Using asyncio, you can structure your code so subtasks are defined as coroutines and schedule them as you please, including simultaneously. Coroutines contain yield points where a context switch can happen if other tasks are pending; if no other task is pending, it will not.
A context switch in asyncio represents the event loop yielding the flow of control from one coroutine to the next.
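This switching can be observed directly. The sketch below (an illustration written for this point, not code from the original post) runs two coroutines whose yield points interleave their execution:

```python
import asyncio

order = []

async def worker(name):
    order.append(name + " start")
    await asyncio.sleep(0)  # yield point: control goes back to the event loop
    order.append(name + " end")

async def main():
    # both coroutines are scheduled; at each await the loop switches between them
    await asyncio.gather(worker("A"), worker("B"))

asyncio.run(main())
print(order)  # ['A start', 'B start', 'A end', 'B end']
```

Neither worker runs to completion in one go: at the await, the event loop hands control to the other pending coroutine.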
In the example, we run three async tasks that query Reddit separately and extract and print the JSON. We leverage aiohttp, an HTTP client library, which ensures that even the HTTP requests run asynchronously.
import signal
import sys
import asyncio
import aiohttp
import json

loop = asyncio.get_event_loop()
client = aiohttp.ClientSession(loop=loop)

async def get_json(client, url):
    async with client.get(url) as response:
        assert response.status == 200
        return await response.read()

async def get_reddit_top(subreddit, client):
    data1 = await get_json(client, 'https://www.reddit.com/r/' + subreddit + '/top.json?sort=top&t=day&limit=5')
    j = json.loads(data1.decode('utf-8'))
    for i in j['data']['children']:
        score = i['data']['score']
        title = i['data']['title']
        link = i['data']['url']
        print(str(score) + ': ' + title + ' (' + link + ')')
    print('DONE:', subreddit + '\n')

def signal_handler(signal, frame):
    loop.stop()
    client.close()
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)

asyncio.ensure_future(get_reddit_top('python', client))
asyncio.ensure_future(get_reddit_top('programming', client))
asyncio.ensure_future(get_reddit_top('compsci', client))
loop.run_forever()
Output:
50: Undershoot: Parsing theory in 1965 (http://jeffreykegler.github.io/Ocean-of-Awareness-blog/individual/2018/07/knuth_1965_2.html)
12: Question about best-prefix/failure function/primal match table in kmp algorithm (https://www.reddit.com/r/compsci/comments/8xd3m2/question_about_bestprefixfailure_functionprimal/)
1: Question regarding calculating the probability of failure of a RAID system (https://www.reddit.com/r/compsci/comments/8xbkk2/question_regarding_calculating_the_probability_of/)
DONE: compsci

336: /r/thanosdidnothingwrong -- banning people with python (https://clips.twitch.tv/AstutePluckyCocoaLitty)
175: PythonRobotics: Python sample codes for robotics algorithms (https://atsushisakai.github.io/PythonRobotics/)
23: Python and Flask Tutorial in VS Code (https://code.visualstudio.com/docs/python/tutorial-flask)
17: Started a new blog on Celery - what would you like to read about? (https://www.python-celery.com)
14: A Simple Anomaly Detection Algorithm in Python (https://medium.com/@mathmare_/pyng-a-simple-anomaly-detection-algorithm-2f355d7dc054)
DONE: python

1360: git bundle (https://dev.to/gabeguz/git-bundle-2l5o)
1191: Which hashing algorithm is best for uniqueness and speed? Ian Boyd's answer (top voted) is one of the best comments I've seen on Stackexchange. (https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed)
430: ARM launches “Facts” campaign against RISC-V (https://riscv-basics.com/)
244: Choice of search engine on Android nuked by “Anonymous Coward” (2009) (https://android.googlesource.com/platform/packages/apps/GlobalSearch/+/592150ac00086400415afe936d96f04d3be3ba0c)
209: Exploiting freely accessible WhatsApp data or “Why does WhatsApp web know my phone’s battery level?” (https://medium.com/@juan_cortes/exploiting-freely-accessible-whatsapp-data-or-why-does-whatsapp-know-my-battery-level-ddac224041b4)
DONE: programming
Using Redis and Redis Queue (RQ):
Using asyncio and aiohttp may not always be an option, especially if you are using older versions of Python. Also, there will be scenarios when you want to distribute your tasks across different servers. In that case, we can leverage RQ (Redis Queue). It is a simple Python library for queueing jobs and processing them in the background with workers. It is backed by Redis, a key/value data store.
In the example below, we have queued a simple function count_words_at_url using Redis.
from mymodule import count_words_at_url
from redis import Redis
from rq import Queue

q = Queue(connection=Redis())
job = q.enqueue(count_words_at_url, 'http://nvie.com')

****** mymodule.py ******

import requests

def count_words_at_url(url):
    """Just an example function that's called async."""
    resp = requests.get(url)
    print(len(resp.text.split()))
    return len(resp.text.split())
Output:
15:10:45 RQ worker 'rq:worker:EMPID18030.9865' started, version 0.11.0
15:10:45 *** Listening on default...
15:10:45 Cleaning registries for queue: default
15:10:50 default: mymodule.count_words_at_url('http://nvie.com') (a2b7451e-731f-4f31-9232-2b7e3549051f)
322
15:10:51 default: Job OK (a2b7451e-731f-4f31-9232-2b7e3549051f)
15:10:51 Result is kept for 500 seconds
Conclusion:
Let’s take the classical example of a chess exhibition where one of the best chess players competes against many people. If there are 24 games with 24 opponents and the chess master plays them all sequentially, it will take at least 12 hours (assuming the average game takes 30 moves, the chess master thinks for 5 seconds per move, and the opponent for approximately 55 seconds). But the asynchronous mode gives the chess master the opportunity to make a move and leave the opponent thinking while moving on to the next game and making a move there. This way, a move can be made in all 24 games in 2 minutes, and all of them can be won in just one hour.
So, this is what's meant when people talk about asynchronous being really fast: it's this kind of fast. The chess master doesn't play chess faster; the time is simply better optimized and not wasted on waiting around. This is how it works.
In this analogy, the chess master is our CPU, and the idea is that we want to make sure the CPU waits the least amount of time possible. It's about always finding something for it to do.
A practical definition of async is that it's a style of concurrent programming in which tasks release the CPU during waiting periods so that other tasks can use it. In Python, there are several ways to achieve concurrency; based on our requirements, code flow, data manipulation, architecture design, and use cases, we can select any of these methods.
In this blog, I will compare various methods to avoid the dreaded callback hells that are common in Node.js. What exactly am I talking about? Have a look at the piece of code below. Every child function executes only when the result of its parent function is available. Callbacks are the very essence of the non-blocking (and hence performant) nature of Node.js.
Convinced yet? Even though there is some seemingly unnecessary error handling here, I assume you get the drift! The problem with such code is more than just indentation: our program's entire flow is based on side effects, with one function only incidentally calling the inner function.
There are multiple ways in which we can avoid writing such deeply nested code. Let’s have a look at our options:
Promises
According to the official specification, a promise represents the eventual result of an asynchronous operation. Basically, it represents an operation that has not completed yet but is expected to in the future. The then method is a major component of a promise. It is used to get the return value (fulfilled or rejected) of a promise; only one of these two values will ever be set. Let's have a look at a simple file read example without using promises:
Now, if readFile function returned a promise, the same logic could be written like so:
var fileReadPromise = fs.readFile(filePath);
fileReadPromise.then(console.log, console.error);
The fileReadPromise can then be passed around in code wherever you need to read the file. This helps in writing robust unit tests for your code, since you now only have to write a single test for a promise. And it makes for more readable code!
Chaining using promises
The then function itself returns a promise, which can again be used for the next operation. Changing the first code snippet to use promises results in this:
As is evident, it makes the code more composed, readable, and easier to maintain. Also, instead of chaining, we could have used Promise.all. Promise.all takes an array of promises as input and returns a single promise that resolves when all the promises supplied in the array are resolved. Other useful information on promises can be found here.
The async utility module
Async is a utility module which provides a set of over 70 functions that can be used to elegantly solve the problem of callback hells. All these functions follow the Node.js convention of error-first callbacks, which means that the first callback argument is assumed to be an error (null in case of success). Let's try to solve the same foo-bar-baz problem using the async module. Here is the code snippet:
Here, I have used the async.waterfall function as an example. There are multiple functions available according to the nature of the problem you are trying to solve, like async.each for parallel execution, async.eachSeries for serial execution, etc.
Async/Await
Now, this is one of the most exciting features coming to JavaScript in the near future. It internally uses promises but handles them in a more intuitive manner. Even though it seems like promises and/or third-party modules like async would solve most of the problems, a further simplification is always welcome! For those of you who have worked with C# async/await, this concept is borrowed directly from there and brought into ES7.
Async/await enables us to write asynchronous promise-based code as if it were synchronous, but without blocking the main thread. An async function always returns a promise whether await is used or not. But whenever an await is encountered, the function is paused until the promise either resolves or rejects. The following code snippet should make it clearer:
Here, asyncFun is an async function which captures the promised result using await. This makes the code readable and is a major convenience for developers who are more comfortable with linearly executed languages, without blocking the main thread.
Now, like before, let's solve the foo-bar-baz problem using async/await. Note that foo, bar, and baz individually return promises just like before. But instead of chaining, we have written the code linearly.
How long should you (a)wait for async to come to fore?
Well, it's already here in the Chrome 55 release and the latest update of the V8 engine. The native support in the language means that we should see much more widespread use of this feature. The only catch is that if you want to use async/await on a codebase that isn't promise-aware and is based completely on callbacks, it will probably require a lot of wrapping of existing functions to make them usable.
To wrap up, async/await definitely makes coding numerous async operations an easier job. Although promises and callbacks would do the job for most cases, async/await looks like the way to make some architectural problems go away and improve code quality.
Asynchronous programming is a characteristic of modern programming languages that allows an application to perform various operations without waiting for any of them. Asynchronicity is one of the big reasons for the popularity of Node.js.
We have discussed Python’s asynchronous features as part of our previous post: an introduction to asynchronous programming in Python. This blog is a natural progression on the same topic. We are going to discuss async features in Python in detail and look at some hands-on examples.
Consider a traditional web scraping application that needs to open thousands of network connections. We could open one network connection, fetch the result, and then move to the next ones iteratively. This approach increases the latency of the program. It spends a lot of time opening a connection and waiting for others to finish their bit of work.
On the other hand, async provides a way to open thousands of connections at once and swap among them as each finishes and returns its result. Basically, it sends a request on one connection and moves to the next one instead of waiting for the previous one's response. It continues like this until all the connections have returned their outputs.
From the above chart, we can see that using synchronous programming on four tasks took 45 seconds to complete, while in asynchronous programming, those four tasks took only 20 seconds.
Where Does Asynchronous Programming Fit in the Real-world?
Asynchronous programming is best suited for popular scenarios such as:
1. The program takes too much time to execute.
2. The reason for the delay is waiting for input or output operations, not computation.
3. For the tasks that have multiple input or output operations to be executed at once.
And application-wise, these are the example use cases:
Web Scraping
Network Services
Difference Between Parallelism, Concurrency, Threading, and Async IO
Because we discussed this comparison in detail in our previous post, we will just quickly go through the concept as it will help us with our hands-on example later.
Parallelism involves performing multiple operations at a time. Multiprocessing is an example of it. It is well suited for CPU bound tasks.
Concurrency is slightly broader than Parallelism. It involves multiple tasks running in an overlapping manner.
Threading – a thread is a separate flow of execution. One process can contain multiple threads and each thread runs independently. It is ideal for IO bound tasks.
Async IO is a single-threaded, single-process design that uses cooperative multitasking. In simple words, async IO gives a feeling of concurrency despite using a single thread in a single process.
Fig: A comparison of concurrency and parallelism
Components of Async IO Programming
Let’s explore the various components of Async IO in depth. We will also look at an example code to help us understand the implementation.
1. Coroutines
Coroutines are essentially generalizations of subroutines. They are generally used for cooperative multitasking and behave like Python generators.
A function defined with async def is a coroutine, and it uses the await keyword. At each await, the coroutine releases the flow of control back to the event loop.
To run a coroutine, we need to schedule it on the event loop. After scheduling, coroutines are wrapped in Tasks as a Future object.
Example:
In the below snippet, we call async_func from the main function. We have to add the await keyword when calling the async function. As you can see, async_func does nothing unless the await keyword accompanies it.
import asyncio

async def async_func():
    print('Velotio ...')
    await asyncio.sleep(1)
    print('... Technologies!')

async def main():
    async_func()  # this will do nothing because the coroutine object is created but not awaited
    await async_func()

asyncio.run(main())
Output
RuntimeWarning: coroutine 'async_func' was never awaited
  async_func()  # this will do nothing because the coroutine object is created but not awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Velotio ...
... Technologies!
2. Tasks
Tasks are used to schedule coroutines concurrently.
When submitting a coroutine to an event loop for processing, you can get a Task object, which provides a way to control the coroutine’s behavior from outside the event loop.
Example:
In the snippet below, we are creating a task using create_task (an inbuilt function of asyncio library), and then we are running it.
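The snippet this refers to isn't reproduced here; a minimal sketch of scheduling a coroutine with create_task might look like the following (the say coroutine and its return value are illustrative assumptions):

```python
import asyncio

async def say(text):
    await asyncio.sleep(0.1)
    return text

async def main():
    # create_task schedules the coroutine on the running event loop
    task = asyncio.create_task(say("Velotio"))
    print(task.done())   # False: the task is scheduled but not finished yet
    result = await task  # the Task handle lets us wait on (or cancel) it
    return result

print(asyncio.run(main()))
```

The Task object returned by create_task is what gives you outside control: you can await it, cancel it, or check task.done().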
3. Event Loop

The event loop runs coroutines until they complete. You can imagine it as a while(True) loop that monitors coroutines, taking feedback on what's idle, and looking around for things that can be executed in the meantime.
It can wake up an idle coroutine when whatever that coroutine is waiting on becomes available.
Only one event loop can run at a time in Python.
Example:
In the snippet below, we create three tasks, append them to a list, and execute them all asynchronously using get_event_loop, create_task, and await from the asyncio library.
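The original snippet isn't shown; a minimal sketch of the same idea might look like this (the work coroutine and its squaring are illustrative assumptions):

```python
import asyncio

async def work(n):
    await asyncio.sleep(0.1)
    return n * n

async def main():
    # create three tasks and append them to a list
    tasks = [asyncio.create_task(work(n)) for n in (1, 2, 3)]
    # the three sleeps overlap, so this takes ~0.1s rather than ~0.3s
    return await asyncio.gather(*tasks)

# older code spells this as asyncio.get_event_loop().run_until_complete(main());
# asyncio.run is the modern equivalent
print(asyncio.run(main()))  # [1, 4, 9]
```

Because all three tasks are scheduled before any is awaited, their waits overlap instead of running back to back.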
4. Futures

A future is a special, low-level awaitable object that represents an eventual result of an asynchronous operation.
When a Future object is awaited, the coroutine will wait until the Future is resolved somewhere else.
We will look into the sample code for Future objects in the next section.
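As a quick sketch of that behavior (written for illustration, not taken from the original post): the awaiting coroutine suspends until some other coroutine calls set_result on the Future.

```python
import asyncio

async def resolve_later(future):
    await asyncio.sleep(0.1)
    future.set_result("resolved elsewhere")  # resolve the Future from another coroutine

async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()            # a bare, low-level Future
    asyncio.create_task(resolve_later(future))
    return await future                      # suspends until set_result is called

print(asyncio.run(main()))
```

In everyday code you rarely create Futures by hand; Tasks (which are Futures) cover most use cases.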
A Comparison Between Multithreading and Async IO
Before we get to Async IO, let’s use multithreading as a benchmark and then compare them to see which is more efficient.
For this benchmark, we will be fetching data from a sample URL (the Velotio Career webpage) with different frequencies, like once, ten times, 50 times, 100 times, 500 times, respectively.
We will then compare the time taken by both of these approaches to fetch the required data.
Implementation
Code of Multithreading:
import requests
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_url_data(pg_url):
    try:
        resp = requests.get(pg_url)
    except Exception as e:
        print(f"Error occurred while fetching data from url {pg_url}")
    else:
        return resp.content

def get_all_url_data(url_list):
    with ThreadPoolExecutor() as executor:
        resp = executor.map(fetch_url_data, url_list)
    return resp

if __name__ == '__main__':
    url = "https://www.velotio.com/careers"
    for ntimes in [1, 10, 50, 100, 500]:
        start_time = time.time()
        responses = get_all_url_data([url] * ntimes)
        print(f'Fetch total {ntimes} urls and process takes {time.time() - start_time} seconds')

Output

Fetch total 1 urls and process takes 1.8822264671325684 seconds
Fetch total 10 urls and process takes 2.3358211517333984 seconds
Fetch total 50 urls and process takes 8.05638575553894 seconds
Fetch total 100 urls and process takes 14.43302869796753 seconds
Fetch total 500 urls and process takes 65.25404500961304 seconds

ThreadPoolExecutor, from Python's concurrent.futures module, implements the Executor interface. The fetch_url_data function fetches the data from the given URL using the requests package, and the get_all_url_data function maps fetch_url_data over the list of URLs.
Async IO Programming Example:
import asyncio
import time
from aiohttp import ClientSession, ClientResponseError

async def fetch_url_data(session, url):
    try:
        async with session.get(url, timeout=60) as response:
            resp = await response.read()
    except Exception as e:
        print(e)
    else:
        return resp
    return

async def fetch_async(loop, r):
    url = "https://www.velotio.com/careers"
    tasks = []
    async with ClientSession() as session:
        for i in range(r):
            task = asyncio.ensure_future(fetch_url_data(session, url))
            tasks.append(task)
        responses = await asyncio.gather(*tasks)
    return responses

if __name__ == '__main__':
    for ntimes in [1, 10, 50, 100, 500]:
        start_time = time.time()
        loop = asyncio.get_event_loop()
        future = asyncio.ensure_future(fetch_async(loop, ntimes))
        # will run until it finishes or hits an error
        loop.run_until_complete(future)
        responses = future.result()
        print(f'Fetch total {ntimes} urls and process takes {time.time() - start_time} seconds')
Output
Fetch total 1 urls and process takes 1.3974951362609863 seconds
Fetch total 10 urls and process takes 1.4191942596435547 seconds
Fetch total 50 urls and process takes 2.6497368812561035 seconds
Fetch total 100 urls and process takes 4.391665458679199 seconds
Fetch total 500 urls and process takes 4.960426330566406 seconds
We need to use the get_event_loop function to create the loop and add the tasks. To run more than one URL, we have to use the ensure_future and gather functions.
The fetch_async function adds the tasks to the event loop object, and the fetch_url_data function reads the data from the URL using the session object. The future.result() method returns the responses of all the tasks.
Results:
As you can see from the plot, async programming is much more efficient than multi-threading for the program above.
The graph of the multithreading program looks linear, while the asyncio program graph is similar to logarithmic.
Conclusion
As we saw in our experiment above, Async IO showed better performance than multithreading, with more efficient use of concurrency.
Async IO can be beneficial in applications that can exploit concurrency. However, whether it is pragmatic to choose Async IO over other implementations depends on the kind of application we are dealing with.
We hope this article helped further your understanding of the async feature in Python and gave you some quick hands-on experience using the code examples shared above.