The world continues to go through digital transformation at an accelerating pace. Modern applications and infrastructure continue to expand, and operational complexity keeps growing. According to a recent ManageEngine Application Performance Monitoring survey:
28 percent use ad-hoc scripts to detect issues in over 50 percent of their applications.
32 percent learn about application performance issues from end users.
59 percent trust monitoring tools to identify most performance deviations.
Most enterprises and web-scale companies already have instrumentation and monitoring capabilities in place, often backed by an Elasticsearch cluster. They collect large amounts of data but struggle to use it effectively. That data can be put to work to improve performance and uptime, support root cause analysis, and predict incidents.
IT Operations & Machine Learning
Here is the main question: how do you make sense of the huge piles of collected data? The first step is to understand the correlations between the time series data. But understanding alone is not enough, since correlation does not imply causation. We need a practical and scalable approach to understand the cause-effect relationships between data sources and events across a complex infrastructure of VMs, containers, networks, microservices, regions, etc.
It's very likely that a fault in one component causes something to go wrong in another. In such cases, historical operational data can be used to identify the root cause by working through a series of intermediate causes and effects. Machine learning is particularly useful for such problems, where we need to identify "what changed", since machine learning algorithms can analyze existing data to understand its patterns, making it easier to recognize the cause. This is known as unsupervised learning, where the algorithm learns from experience and identifies similar patterns when they come along again.
Let's see how you can set up Elastic + X-Pack to enable anomaly detection for your infrastructure and applications.
Anomaly Detection using Elastic’s machine learning with X-Pack
Step I: Setup
1. Setup Elasticsearch:
The Elastic documentation recommends Oracle JDK version 1.8.0_131. Check which Java version is installed on your system; it should be at least Java 8. Install or upgrade if required.
Download the Elasticsearch tarball and untar it:
$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.tar.gz
$ tar -xzvf elasticsearch-5.5.1.tar.gz
It will then create a folder named elasticsearch-5.5.1. Go into the folder.
$ cd elasticsearch-5.5.1
Install X-Pack into Elasticsearch
$ ./bin/elasticsearch-plugin install x-pack
Start elasticsearch
$ bin/elasticsearch
2. Setup Kibana
Kibana is an open source analytics and visualization platform designed to work with Elasticsearch.
Download the Kibana tarball and untar it:
$ wget https://artifacts.elastic.co/downloads/kibana/kibana-5.5.1-linux-x86_64.tar.gz
$ tar -xzf kibana-5.5.1-linux-x86_64.tar.gz
It will then create a folder named kibana-5.5.1-linux-x86_64. Go into the directory.
$ cd kibana-5.5.1-linux-x86_64
Install X-Pack into Kibana
$ ./bin/kibana-plugin install x-pack
Running kibana
$ ./bin/kibana
Navigate to Kibana at http://localhost:5601/
Log in as the built-in user elastic with the password changeme.
You will see the below screen:
Kibana: X-Pack Welcome Page
3. Metricbeat:
Metricbeat helps in monitoring servers and the services they host by collecting metrics from the operating system and services. We will use it to get CPU utilization metrics of our local system in this blog.
Download the Metricbeat tarball and untar it:
$ wget https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-5.5.1-linux-x86_64.tar.gz
$ tar -xvzf metricbeat-5.5.1-linux-x86_64.tar.gz
It will create a folder metricbeat-5.5.1-linux-x86_64. Go to the folder
$ cd metricbeat-5.5.1-linux-x86_64
By default, Metricbeat is configured to send collected data to an Elasticsearch instance running on localhost. If your Elasticsearch is hosted on a remote server, change the host and authentication credentials in the metricbeat.yml file.
Metricbeat Config
Metricbeat provides the following stats:
System load
CPU stats
IO stats
Per filesystem stats
Per CPU core stats
File system summary stats
Memory stats
Network stats
Per process stats
Start Metricbeat as daemon process
$ sudo ./metricbeat -e -c metricbeat.yml &
Now all the setup is done. Next, let's prepare the time series data and then create the machine learning jobs.
Step II: Time Series data
Real-time data: Metricbeat is providing us the real-time series data that will be used for unsupervised learning. Follow the steps below to define the index pattern metricbeat-* in Kibana so you can search against this pattern in Elasticsearch:
– Go to Management -> Index Patterns
– Provide the index name or pattern as metricbeat-*
– Select @timestamp as the Time filter field name
– Click Create
You will not be able to create the index pattern if Elasticsearch does not contain any Metricbeat data. Make sure Metricbeat is running and its output is configured to Elasticsearch.
Saved historical data: To quickly see how machine learning detects anomalies, you can also use sample data provided by Elastic. Download the sample data by clicking here.
Unzip the files in a folder: tar -zxvf server_metrics.tar.gz
Download this script. It will be used to upload sample data to elastic.
Provide execute permissions to the file: chmod +x upload_server-metrics.sh
Run the script.
Just as we created an index pattern for the Metricbeat data, create an index pattern server-metrics* in the same way.
Step III: Creating Machine Learning jobs
There are two scenarios in which data is considered anomalous. First, when the behavior of a key indicator changes over time relative to its previous behavior. Second, when, within a population, the behavior of an entity deviates from that of the other entities over a single key indicator.
To detect these anomalies, there are three types of jobs we can create:
Single metric job: This job detects Scenario 1 anomalies over only one key performance indicator.
Multimetric job: A multimetric job also detects Scenario 1 anomalies, but it can track more than one performance indicator, such as CPU utilization along with memory utilization.
Advanced job: This kind of job is created to detect Scenario 2 anomalies.
For simplicity, we are creating the following single metric jobs:
Tracking CPU utilization: using the Metricbeat data
Tracking total requests made on a server: using the sample server data
Follow below steps to create single metric jobs:
Job1: Tracking CPU Utilization
Job2: Tracking total requests made on server
Go to http://localhost:5601/
Go to Machine learning tab on the left panel of Kibana.
Click on Create new job
Click Create single metric job
Select the index pattern we created in Step II, i.e., metricbeat-* and server-metrics* respectively
Configure jobs by providing following values:
Aggregation: Select the aggregation function that will be applied to the field of data we are analyzing.
Field: A drop-down listing all the fields available for the selected index pattern.
Bucket span: The interval for analysis. The aggregation function is applied to the selected field once per interval specified here.
If your data contains many empty buckets, i.e., the data is sparse, and you don't want that to be considered anomalous, check the sparse data checkbox (if it appears).
Click on Use full data to use all the available data for analysis.
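To make the Aggregation and Bucket span settings concrete, here is a small standalone sketch (not Elastic's actual implementation) of what it means to split a time series into fixed-width buckets and apply an aggregation function to each one; the sample values are illustrative:

```javascript
// Illustrative only: group timestamped samples into fixed-width buckets and
// apply an aggregation function per bucket, the way a single metric job
// aggregates a field once per bucket span.
function bucketize(samples, bucketSpanMs, aggregate) {
  const buckets = new Map();
  for (const { ts, value } of samples) {
    const key = Math.floor(ts / bucketSpanMs) * bucketSpanMs;
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(value);
  }
  // Collapse each bucket's raw values into a single data point.
  return [...buckets.entries()].map(([ts, values]) => ({ ts, value: aggregate(values) }));
}

const mean = (values) => values.reduce((a, b) => a + b, 0) / values.length;

// Two 5-minute buckets of CPU readings sampled every minute.
const series = [
  { ts: 0, value: 0.2 }, { ts: 60000, value: 0.4 },
  { ts: 300000, value: 0.9 }, { ts: 360000, value: 0.7 },
];
console.log(bucketize(series, 300000, mean));
```

The model then learns the typical aggregated value per bucket, which is why a very short bucket span produces noisy results and a very long one hides short-lived anomalies.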
Metricbeat Description | Server Description
Click on play symbol
Provide job name and description
Click on Create Job
After you create the job, the available data is analyzed. Click on View Results and you will see a chart showing the actual value along with the upper and lower bounds of the predicted value. If the actual value lies outside that range, it is considered anomalous. The color of each circle represents the severity level.
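Conceptually, the check the chart visualizes reduces to a bounds comparison plus a severity band for the anomaly score. This is a simplified sketch, not Elastic's actual model; the score thresholds follow X-Pack's commonly documented warning/minor/major/critical bands:

```javascript
// Illustrative sketch: a record is anomalous when the actual value falls
// outside the predicted [lower, upper] bounds.
function isAnomalous(actual, lowerBound, upperBound) {
  return actual < lowerBound || actual > upperBound;
}

// Map an anomaly score (0-100) to the severity band shown by the circle color.
function severity(score) {
  if (score >= 75) return 'critical';
  if (score >= 50) return 'major';
  if (score >= 25) return 'minor';
  return 'warning';
}

console.log(isAnomalous(0.95, 0.10, 0.60), severity(92)); // true critical
```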
Here the prediction range is wide because the job has just started learning; as more data arrives, the predictions will improve. For the sample server data, the predictions are already quite good, since there is plenty of data to learn the pattern from.
Click on machine learning tab in the left panel. The jobs we created will be listed here.
You will see the list of actions for every job you have created.
Since Metricbeat stores data every minute for Job 1, we can feed the job in real time. Click the play button to start the datafeed. As more and more data arrives, the predictions will improve.
You can see the details of the anomalies by clicking Anomaly Viewer.
Anomaly in Metricbeat data | Server metrics anomalies
We have seen how machine learning can be used to find patterns across different statistics and to detect anomalies. After identifying an anomaly, you still need its context, for example, which other factors are contributing to the problem. For that kind of troubleshooting, you can create multimetric jobs.
In the last few years, we have seen an exponential increase in the development and use of APIs. We are in the era of API-first companies like Stripe, Twilio, and Mailgun, where the entire product or service is exposed via REST APIs. Today's web applications are also powered by REST-based web services. APIs now encapsulate critical business logic with high SLAs, so it is important to test them as part of the continuous integration process to reduce errors, improve predictability, and catch nasty bugs.
In the context of API development, Postman is a great REST client for testing APIs. Postman is more than a REST client, though: it contains a full-featured testing sandbox that lets you write and execute JavaScript-based tests for your API.
Postman comes with a nifty CLI tool, Newman. Newman is Postman's collection runner engine: it sends API requests, receives the responses, and then runs your tests against them. Newman lets developers easily integrate Postman into continuous integration systems like Jenkins. Some of the important features of Postman and Newman include:
Ability to test any API and see the response instantly.
Ability to create test suites or collections using a collection of API endpoints.
Ability to collaborate with team members on these collections.
Ability to easily export/import collections as JSON files.
We are going to look at all these features, some are intuitive and some not so much unless you’ve been using Postman for a while.
After installing Postman, you can look it up in your installed apps and open it. You can choose to sign up and create an account if you want; this is important especially for saving your API collections and accessing them anytime on any machine. For this article, however, we can skip this. There's a button for that towards the bottom when you first launch the app.
Postman Collections
A Postman collection is, in simple words, a collection of tests: essentially a test suite of related tests. These can be scenario-based tests or sequence/workflow-based tests.
There’s a Collections tab on the top left of Postman, with an example Postman Echo collection. You can open and go through it.
Just like in the screenshot above, select an API request and click on the Tests tab. Check the first line:
tests["response code is 200"] = responseCode.code === 200;
The above line is a simple test that checks whether the response code for the API is 200. This is the pattern for writing assertions/tests in Postman (using JavaScript), and it is how you will write tests for the APIs that need to be tested. You can open the other API requests in the Postman Echo collection to get a sense of how requests are made.
Adding a COLLECTION
To make your own collection, click on the 'Add Collection' button on the top left of Postman.
You will be prompted for details about the collection; I've added the name Github API and given it a description.
Clicking on Create should add the collection to the left pane, above, or below the example “POSTMAN Echo” collection.
If you need a hierarchy to keep related APIs together inside a collection, you can add them to folders within the collection. Folders are a great way of separating different parts of your API workflow. You can add folders through the "3 dot" button beside the collection name:
E.g., name the folder "Get Calls" and give it a description once again.
Now that we have the folder, the next task is to add a related API call to it: a call to https://api.github.com/.
If you still have one of the collections open, you can close it the same way you close tabs in a browser, or just click on the plus button to open a new tab in the right pane where requests are made.
Type in or paste in https://api.github.com/ and press Send to see the response.
Once you get the response, you can click on the arrow next to the Save button on the far right, and select Save As, a pop up will be displayed asking where to save the API call.
Give it a name, which can be the request URL or something like "GET Github Basic", and a description; then choose the collection and folder, in this case TEST_API_COLLECTION > GET CALLS, and click on Save. The API call will be added to the folder on the left pane.
Whenever you click on this request from the collection, it will open in the center pane.
Write the Tests
We've seen that the GET Github Basic request has a JSON response, which is usually the case for most APIs. This response has properties such as current_user_url, emails_url, followers_url and following_url, to pick a few. The current_user_url has a value of https://api.github.com/user. Let's add a test for this URL. Click on 'GET Github Basic' and then on the Tests tab in the section just below where the URL goes.
You will notice on the right pane some snippets that Postman provides so that you don't have to write a lot of code. Let's add Response Body: JSON value check. Clicking on it produces the following snippet:
var jsonData = JSON.parse(responseBody);
tests["Your test name"] = jsonData.value === 100;
From these two lines, it is apparent that Postman stores the response in a global object called responseBody, and we can use this to access response and assert values in tests as required.
Postman also has another global variable object called tests, which is an object you can use to name your tests, and equate it to a boolean expression. If the boolean expression returns true, then the test passes.
tests['some random test'] = x === y
If you click on Send to make the request, you will see one of the tests failing.
Let's create a test that is relevant to our use case.
var jsonData = JSON.parse(responseBody);
var usersURL = "https://api.github.com/user";
tests["Gets the correct users url"] = jsonData.current_user_url === usersURL;
Clicking on ‘Send‘, you’ll see the test passing.
Let's modify the tests further to cover the other properties we want to check.
Ideally the things to be tested in an API Response Body should be:
Response code (assert the correct response code for any request)
Response time (check that the API responds within an acceptable time range and is not delayed)
Response body is not empty/null
tests["Status code is 200"] = responseCode.code === 200;
tests["Response time is less than 200ms"] = responseTime < 200;
tests["Response time is acceptable"] = _.inRange(responseTime, 0, 500);
tests["Body is not empty"] = (responseBody !== null && responseBody.length !== 0);
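Outside the Postman sandbox, you can mimic how these assertions evaluate in plain Node to see the pattern at work. Here responseCode, responseTime and responseBody are stand-ins for the globals that Postman injects (and the lodash _.inRange check is replaced with a plain comparison):

```javascript
// Stand-ins for the globals the Postman sandbox injects for each response.
const responseCode = { code: 200 };
const responseTime = 143; // ms
const responseBody = JSON.stringify({ current_user_url: 'https://api.github.com/user' });

// The `tests` object maps a test name to a boolean: true passes, false fails.
const tests = {};
tests['Status code is 200'] = responseCode.code === 200;
tests['Response time is less than 200ms'] = responseTime < 200;
tests['Response time is acceptable'] = responseTime >= 0 && responseTime < 500;
tests['Body is not empty'] = responseBody !== null && responseBody.length !== 0;

// Postman reports each entry by name; here we just print pass/fail.
for (const [name, passed] of Object.entries(tests)) {
  console.log(`${passed ? 'PASS' : 'FAIL'}: ${name}`);
}
```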
Newman CLI
Once you’ve set up all your collections and written tests for them, it may be tedious to go through them one by one and clicking send to see if a given collection test passes. This is where Newman comes in. Newman is a command-line collection runner for Postman.
All you need to do is export your collection and the environment variables, then use Newman to run the tests from your terminal.
NOTE: Make sure you’ve clicked on ‘Save’ to save your collection first before exporting.
USING NEWMAN
So the first step is to export your collection and environment variables. Click on the Menu icon for Github API collection, and select export.
Select version 2, and click on “Export”
Save the JSON file in a location you can access with your terminal. I created a local directory/folder called “postman” and saved it there.
Install the Newman CLI globally, then navigate to where you saved the collection.
npm install -g newman
cd postman
Using Newman is quite straight-forward, and the documentation is extensive. You can even require it as a Node.js module and run the tests there. However, we will use the CLI.
Once you are in the directory, run newman run <collection_name.json>, replacing collection_name with the name you used to save the collection.
newman run TEST_API_COLLECTION.postman_collection.json
NEWMAN CLI Options
Newman provides a rich set of options to customize a run. A list of options can be retrieved by running it with the -h flag.
$ newman run -h

Options - Additional args:

Utility:
-h, --help                      output usage information
-v, --version                   output the version number

Basic setup:
--folder [folderName]           Specify a single folder to run from a collection.
-e, --environment [file|URL]    Specify a Postman environment as a JSON [file]
-d, --data [file]               Specify a data file to use either json or csv
-g, --global [file]             Specify a Postman globals file as JSON [file]
-n, --iteration-count [number]  Define the number of iterations to run

Request options:
--delay-request [number]        Specify a delay (in ms) between requests
--timeout-request [number]      Specify a request timeout (in ms) for a request

Misc.:
--bail                          Stops the runner when a test case fails
--silent                        Disable terminal output
--no-color                      Disable colored output
-k, --insecure                  Disable strict ssl
-x, --suppress-exit-code        Continue running tests even after a failure, but exit with code=0
--ignore-redirects              Disable automatic following of 3XX responses
Let's try out some of the options.
Iterations
Let's use the -n option to set the number of iterations for the run.
$ newman run mycollection.json -n 10 # runs the collection 10 times
To provide a different set of data, i.e., variables for each iteration, you can use -d to specify a JSON or CSV file. For example, a data file such as the one shown below will run 2 iterations, with each iteration using its own set of variables.
Each environment is a set of key-value pairs, with the key as the variable name. These Environment configurations can be used to differentiate between configurations specific to your execution environments eg. Dev, Test & Production.
To provide a different execution environment, you can use -e to specify a JSON file. For example, an environment file such as the one shown below will provide its variables globally to all tests during execution.
Newman, by default, exits with a status code of 0 if everything runs well, i.e., without any exceptions. Continuous integration tools respond to these exit codes and correspondingly pass or fail a build. You can use the --bail flag to tell Newman to halt on a test case error with a status code of 1, which can then be picked up by a CI tool or build system.
$ newman run PostmanCollection.json -e environment.json --bail
Conclusion
Postman and Newman can be used for a number of testing needs, including creating usage scenarios, suites, and packs for your API test cases. Further, Newman and Postman integrate very well with CI/CD tools such as Jenkins, Travis, etc.
Autoscaling, a key feature of Kubernetes, lets you improve the resource utilization of your cluster by automatically adjusting the application’s resources or replicas depending on the load at that time.
This blog talks about Pod Autoscaling in Kubernetes and how to set up and configure autoscalers to optimize the resource utilization of your application.
Horizontal Pod Autoscaling
What is the Horizontal Pod Autoscaler?
The Horizontal Pod Autoscaler (HPA) scales the number of pods of a ReplicaSet, Deployment, or StatefulSet based on per-pod metrics received from the resource metrics API (metrics.k8s.io) provided by metrics-server, the custom metrics API (custom.metrics.k8s.io), or the external metrics API (external.metrics.k8s.io).
Fig:- Horizontal Pod Autoscaling
Prerequisite
Verify that the metrics-server is already deployed and running using the command below, or deploy it using instructions here.
kubectl get deployment metrics-server -n kube-system
HPA using Multiple Resource Metrics
HPA fetches per-pod resource metrics (like CPU, memory) from the resource metrics API and calculates the current metric value based on the mean values of all targeted pods. It compares the current metric value with the target metric value specified in the HPA spec and produces a ratio used to scale the number of desired replicas.
A. Setup: Create a Deployment and HPA resource
In this blog post, I have used the config below to create a deployment of 3 replicas, with some memory load defined by "--vm-bytes", "850M".
Let's create an HPA resource for this deployment with multiple metric blocks defined. The HPA will consider each metric one by one, calculate the desired replica count for each of them, and then select the one with the highest replica count.
We have defined the minimum number of replicas HPA can scale down to as 1 and the maximum number that it can scale up to as 10.
Target Average Utilization and Target Average Value imply that the HPA should scale the replicas up or down to keep the current metric value equal or closest to the target metric value.
B. Understanding the HPA Algorithm
kubectl describe hpa autoscale-tester
Name:         autoscale-tester
Namespace:    autoscale-tester
...
Metrics:      ( current / target )
  resource memory on pods:                             894188202666m / 500Mi
  resource cpu on pods (as a percentage of request):   36% (361m) / 50%
Min replicas: 1
Max replicas: 10
Deployment pods: 3 current / 6 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 6
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age  From                       Message
  ----    ------             ---  ----                       -------
  Normal  SuccessfulRescale  7s   horizontal-pod-autoscaler  New size: 6; reason: memory resource above target
HPA calculates pod utilization as the total usage of all containers in the pod divided by the total requested. It looks at each container individually; if a container does not have resource requests defined, the pod's utilization is undefined and the metric is not used for scaling.
The calculated current metric value for memory, i.e., 894188202666m, is higher than the Target Average Value of 500Mi, so the replicas need to be scaled up.
The calculated current metric value for CPU, i.e., 36%, is lower than the Target Average Utilization of 50%, so the replicas could be scaled down.
Replicas are calculated based on both metrics, and the highest replica count is selected. So, the replicas are scaled up to 6 in this case.
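The arithmetic behind those numbers can be reproduced directly. This is a simplified sketch of the HPA formula desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), ignoring the tolerance band and stabilization window the real controller also applies:

```javascript
// Simplified HPA math: desired = ceil(current * currentMetric / targetMetric).
function desiredReplicas(currentReplicas, currentMetric, targetMetric) {
  return Math.ceil(currentReplicas * (currentMetric / targetMetric));
}

const MiB = 1024 * 1024;
// Memory: 894188202666m (milli-bytes, i.e. ~852Mi per pod) vs a 500Mi target.
const memReplicas = desiredReplicas(3, 894188202666 / 1000, 500 * MiB);
// CPU: 36% of request vs a 50% target utilization.
const cpuReplicas = desiredReplicas(3, 36, 50);

// The HPA evaluates every metric block and keeps the highest result.
console.log(memReplicas, cpuReplicas, Math.max(memReplicas, cpuReplicas)); // 6 3 6
```

The memory metric drives the decision here, which matches the "3 current / 6 desired" line in the describe output above.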
HPA using Custom metrics
We will use the prometheus-adapter resource to expose custom application metrics to custom.metrics.k8s.io/v1beta1, which are retrieved by HPA. By defining our own metrics through the adapter’s configuration, we can let HPA perform scaling based on our custom metrics.
A. Setup: Install Prometheus Adapter
Create prometheus-adapter.yaml with the content below:
kubectl describe hpa autoscale-tester
Name:         autoscale-tester
Namespace:    autoscale-tester
...
Metrics:      ( current / target )
  "packets_in" on pods:  18666m / 50
Min replicas: 1
Max replicas: 10
Deployment pods: 3 current / 3 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from pods metric packets_in
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age    From                       Message
  ----    ------             ---    ----                       -------
  Normal  SuccessfulRescale  2s     horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
  Normal  SuccessfulRescale  2m51s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
Here, the current calculated metric value is 18666m. The m represents milli-units, so 18666m means 18.666, the average of the three pods' packets_in values. Since it's less than the target average value (i.e., 50), the HPA scales down the replicas to bring the ratio of current metric value to target metric value closest to 1. Hence, replicas are scaled down to 2 and later to 1.
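The milli-unit notation trips people up, so here is a tiny illustrative helper (not part of any Kubernetes client library) showing the conversion:

```javascript
// A Kubernetes quantity with an "m" suffix means milli-units:
// "18666m" == 18.666 packets, "361m" CPU == 0.361 cores.
function parseMilli(quantity) {
  return quantity.endsWith('m')
    ? Number(quantity.slice(0, -1)) / 1000
    : Number(quantity);
}

console.log(parseMilli('18666m')); // 18.666
console.log(parseMilli('50'));     // 50
```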
Fig:- container_network_receive_packets_total
Fig:- Ratio to Target value
Vertical Pod Autoscaling
What is Vertical Pod Autoscaler?
Vertical Pod Autoscaling (VPA) ensures that a container's resources are neither under- nor over-provisioned. It recommends optimized CPU and memory request/limit values and can also apply them automatically for you so that cluster resources are used efficiently.
Fig:- Vertical Pod Autoscaling
Architecture
VPA consists of 3 components:
VPA admission controller: Once you deploy and enable the Vertical Pod Autoscaler in your cluster, every pod submitted to the cluster goes through this webhook, which checks whether a VPA object references it.
VPA recommender The recommender pulls the current and past resource consumption (CPU and memory) data for each container from metrics-server running in the cluster and provides optimal resource recommendations based on it, so that a container uses only what it needs.
VPA updater: The updater checks at regular intervals whether each pod is running within the recommended range. If not, it accepts the pod for update, and the pod is evicted by the VPA updater so that the resource recommendation can be applied.
Installation
If you are on Google Cloud Platform, you can simply enable vertical pod autoscaling on the cluster.
Verify that the Vertical Pod Autoscaler pods are up and running:
kubectl get po -n kube-system
NAME                                        READY   STATUS    RESTARTS   AGE
vpa-admission-controller-68c748777d-ppspd   1/1     Running   0          7s
vpa-recommender-6fc8c67d85-gljpl            1/1     Running   0          8s
vpa-updater-786b96955c-bgp9d                1/1     Running   0          8s

kubectl get crd
verticalpodautoscalers.autoscaling.k8s.io
VPA using Resource Metrics
A. Setup: Create a Deployment and VPA resource
Use the same deployment config to create a new deployment with "--vm-bytes", "850M". Then create a VPA resource in recommendation mode with updateMode: Off.
minAllowed is an optional parameter that specifies the minimum CPU request and memory request allowed for the container.
maxAllowed is an optional parameter that specifies the maximum CPU request and memory request allowed for the container.
B. Check the Pod’s Resource Utilization
Check the resource utilization of the pods. Below, you can see only ~50Mi of memory is being used out of 1000Mi, and only ~30m of CPU out of 1000m. This clearly indicates that the pod's resources are underutilized.
Target: The recommended CPU request and memory request for the container that will be applied to the pod by VPA.
Uncapped Target: The recommended CPU request and memory request for the container if you didn’t configure upper/lower limits in the VPA definition. These values will not be applied to the pod. They’re used only as a status indication.
Lower Bound: The minimum recommended CPU request and memory request for the container. There is a --pod-recommendation-min-memory-mb flag that determines the minimum amount of memory the recommender will set; it defaults to 250MiB.
Upper Bound: The maximum recommended CPU request and memory request for the container. It helps the VPA updater avoid eviction of pods that are close to the recommended target values. Eventually, the Upper Bound is expected to reach close to target recommendation.
Now, if you check the logs of vpa-updater, you can see it's not processing VPA objects, as the update mode is set to Off.
kubectl logs -f vpa-updater-675d47464b-k7xbx
1 updater.go:135] skipping VPA object autoscale-tester-recommender because its mode is not "Recreate" or "Auto"
1 updater.go:151] no VPA objects to process
Let’s change the VPA updateMode to “Auto” to see the processing.
As soon as you do that, you can see that vpa-updater starts processing objects and terminates all 3 pods.
kubectl logs -f vpa-updater-675d47464b-k7xbx
1 update_priority_calculator.go:147] pod accepted for update autoscale-tester/autoscale-tester-5d6b48d64f-8zgb9 with priority 1
1 update_priority_calculator.go:147] pod accepted for update autoscale-tester/autoscale-tester-5d6b48d64f-npts4 with priority 1
1 update_priority_calculator.go:147] pod accepted for update autoscale-tester/autoscale-tester-5d6b48d64f-vctx5 with priority 1
1 updater.go:193] evicting pod autoscale-tester-5d6b48d64f-8zgb9
1 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"autoscale-tester", Name:"autoscale-tester-5d6b48d64f-8zgb9", UID:"ed8c54c7-a87a-4c39-a000-0e74245f18c6", APIVersion:"v1", ResourceVersion:"378376", FieldPath:""}): type: 'Normal' reason: 'EvictedByVPA' Pod was evicted by VPA Updater to apply resource recommendation.
You can also check the logs of vpa-admission-controller:
kubectl logs -f vpa-admission-controller-bbf4f4cc7-cb6pb
Sending patches: [{add /metadata/annotations map[]} {add /spec/containers/0/resources/requests/cpu 500m} {add /spec/containers/0/resources/requests/memory 500Mi} {add /spec/containers/0/resources/limits/cpu 500m} {add /spec/containers/0/resources/limits/memory 500Mi} {add /metadata/annotations/vpaUpdates Pod resources updated by autoscale-tester-recommender: container 0: cpu request, memory request, cpu limit, memory limit} {add /metadata/annotations/vpaObservedContainers autoscale-tester}]
NOTE: Ensure that you have more than 1 running replica. Otherwise, the pods won't be restarted, and vpa-updater will give you this warning:
1 pods_eviction_restriction.go:209] too few replicas for ReplicaSet autoscale-tester/autoscale-tester1-7698974f6. Found 1 live pods
Now, describe the new pods created and check that the resources match the Target recommendations:
The target recommendation cannot go below the minAllowed defined in the VPA spec.
Fig:- Prometheus: Memory Usage Ratio
E. Stress Loading Pods
Let's recreate the deployment with the memory request and limit set to 2000Mi and "--vm-bytes", "500M".
Gradually stress load one of the pods to increase its memory utilization. You can log in to the pod and run stress --vm 1 --vm-bytes 1400M --timeout 120000s.
Limits vs. requests: VPA always works with the requests defined for a container, not the limits. So the VPA recommendations are applied to the container requests, and it preserves the limit-to-request ratio specified for each container.
For example, if the initial container configuration defines a 100Mi memory request and a 300Mi memory limit, then when the VPA target recommendation is 150Mi of memory, the container memory request will be updated to 150Mi and the memory limit to 450Mi.
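That proportional scaling is easy to sketch. This illustrative helper (not VPA's actual code) applies a recommended request while preserving the original limit-to-request ratio:

```javascript
// VPA keeps the container's original limit-to-request ratio: when a new
// request is applied, the limit is scaled by the same factor.
function scaledLimit(originalRequest, originalLimit, recommendedRequest) {
  return recommendedRequest * (originalLimit / originalRequest);
}

// Values in Mi: a 100Mi request with a 300Mi limit (ratio 3), recommended 150Mi.
console.log(scaledLimit(100, 300, 150)); // 450
```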
Selective Container Scaling
If you have a pod with multiple containers and you want to opt-out some of them, you can use the “Off” mode to turn off recommendations for a container.
You can also set containerName: “*” to include all containers.
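A minimal sketch of such a resourcePolicy (the container names here are hypothetical; the field names follow the upstream autoscaler API):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: autoscale-tester
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: autoscale-tester
  resourcePolicy:
    containerPolicies:
      - containerName: sidecar   # hypothetical container to opt out
        mode: "Off"
      - containerName: "*"       # all remaining containers stay managed
        mode: "Auto"
```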
Both the Horizontal Pod Autoscaler and the Vertical Pod Autoscaler serve different purposes, and one can be more useful than the other depending on your application’s requirements.
The HPA can be useful when, for example, your application serves a large number of lightweight (low resource-consuming) requests; scaling the number of replicas then distributes the workload across the pods. The VPA, on the other hand, can be useful when your application serves heavyweight requests that require more resources per pod.
Nightwatch.js is a test automation framework for web applications, developed in Node.js, which uses the W3C WebDriver API (formerly Selenium WebDriver). It is a complete end-to-end testing solution that aims to simplify writing automated tests and setting up continuous integration. Nightwatch works by communicating over a RESTful HTTP API with a WebDriver server (such as ChromeDriver or Selenium Server). At the time of writing, the latest available version is 1.0.
Why Use Nightwatch JS Over Any Other Automation Tool?
Selenium is in demand for building automation frameworks since it supports various programming languages, provides cross-browser testing, and is used for both web application and mobile application testing.
But Nightwatch, built on Node.js, exclusively uses JavaScript as the programming language for end-to-end testing which has the listed advantages –
Lightweight framework
Robust configuration
Integrates with cloud servers like SauceLabs and Browserstack for web and mobile testing with JavaScript, Appium
Allows configuration with Cucumber to build a strong BDD (Behaviour Driven Development) setup
High performance of the automation execution
Improves test structuring
Minimal code with less maintenance
Installation and Configuration of Nightwatch Framework
To configure the Nightwatch framework, all you need on your system is the following:
Download latest Node.js
npm (bundled with Node.js)
Package.json file for the test settings and dependencies
$ npm init
Install nightwatch and save as dev dependency
$ npm install nightwatch --save-dev
Install chromedriver/geckodriver and save as dev dependency for running the execution on the required browser
Create a nightwatch.conf.js for webdriver and test settings with nightwatch
const chromedriver = require('chromedriver');

module.exports = {
  src_folders: ['tests'], // "tests" is a folder in the workspace which has the step definitions
  test_settings: {
    default: {
      webdriver: {
        start_process: true,
        server_path: chromedriver.path,
        port: 4444,
        cli_args: ['--port=4444']
      },
      desiredCapabilities: {
        browserName: 'chrome'
      }
    }
  }
};
Using Nightwatch – Writing and Running Tests
We create a JavaScript file named demo.js for running a test through nightwatch with the command
$ npm test
// demo.js is a JS file under the tests folder
module.exports = {
  'step one: navigate to google': function (browser) {
    // step one
    browser
      .url('https://www.google.com')
      .waitForElementVisible('body', 1000)
      .setValue('input[type=text]', 'nightwatch')
      .waitForElementVisible('input[name=btnK]', 1000);
  },
  'step two: click input': function (browser) {
    // step two
    browser
      .click('input[name=btnK]')
      .pause(1000)
      .assert.containsText('#main', 'Night Watch')
      .end(); // close the browser session after all the steps
  }
};
When run, this command picks up the value “nightwatch” from the “test” key in the package.json file, which invokes the Nightwatch API to open the URL in ChromeDriver.
There can be one or more steps in demo.js(step definition js) file as per requirement or test cases.
Also, it is a good practice to maintain a separate .js file for page objects which consists of the locator strategy and selectors of the UI web elements.
Cucumber can be configured in the Nightwatch framework to help maintain the test scenarios in its .feature files. We create a file cucumber.conf.js in the root folder which contains the setup for starting, creating, and closing WebDriver sessions.
Then we create a feature file which has the test scenarios in Given, When, Then format.
Feature: Google Search

Scenario: Searching Google
  Given I open Google's search page
  Then the title is "Google"
  And the Google search form exists
For Cucumber to understand and execute the feature file, we need to create matching step definitions for every feature step we use in our feature file. Create a step definition file under the tests folder called google.js. Step definitions that use the Nightwatch client should return the result of the API call, as it returns a Promise. For example,
const { client } = require('nightwatch-api');
const { Given, Then, When } = require('cucumber');

Given(/^I open Google's search page$/, () => {
  return client.url('http://google.com').waitForElementVisible('body', 1000);
});

Then(/^the title is "([^"]*)"$/, (title) => {
  return client.assert.title(title);
});

Then(/^the Google search form exists$/, () => {
  return client.assert.visible('input[name="q"]');
});
$ npm run e2e-test
Executing Individual Feature Files or Scenarios
Single feature file
npm run e2e-test -- features/file1.feature
Multiple feature files
npm run e2e-test -- features/file1.feature features/file2.feature
Scenario by its line number
npm run e2e-test -- features/my_feature.feature:3
Feature directory
npm run e2e-test -- features/dir
Scenario by its name matching a regular expression
npm run e2e-test -- --name "topic 1"
Feature and Scenario Tags
Cucumber allows you to add tags to features or scenarios, and we can selectively run scenarios using those tags. The tags can also be combined with conditional operators, depending on the requirement.
Single tag
# google.feature
@google
Feature: Google Search

  @search
  Scenario: Searching Google
    Given I open Google's search page
    Then the title is "Google"
    And the Google search form exists
npm run e2e-test -- --tags @google
Multiple tags
npm run e2e-test -- --tags "@google or @duckduckgo"
npm run e2e-test -- --tags "(@google or @duckduckgo) and @search"
To skip tags
npm run e2e-test -- --tags "not @google"
npm run e2e-test -- --tags "not (@google or @duckduckgo)"
Custom Reporters in Nightwatch and Cucumber Framework
Reporting is another advantage provided by Cucumber: it generates a report of test results at the end of the execution, giving an immediate visual clue of possible problems and simplifying debugging. HTML reports are best suited and easy to understand thanks to their format. To generate them, we add cucumber-html-reporter as a dependency in our nightwatch.conf.js file.
cucumber-html-reporter (in node_modules) manages the creation of reports and writes them to the output location after the test execution. The screenshot feature can be enabled by adding a small code snippet to nightwatch.conf.js.
The Cucumber configuration file can be extended to handle screenshots and attach them to the report. It also enables generating the HTML test report at the end of the execution; the report is built from Cucumber's JSON report and can be configured with different templates. We use a setTimeout() block in our cucumber.conf.js to run the HTML generation after Cucumber finishes writing the JSON report.
In the package.json file, we have added the JSON formatter to create a JSON report, which cucumber-html-reporter then consumes. We use mkdirp to make sure the report folder exists before running the test.
When the test run completes, the HTML report is displayed in a new browser tab in the format given below
Conclusion
Nightwatch-Cucumber is a great module for linking the accessibility of Cucumber.js with the robust testing framework of Nightwatch.js. Together they provide not only easily readable documentation of the test suite but also highly configurable automated user tests, all while keeping everything in JavaScript.
Almost all the applications that you work on or deal with throughout the day use SMS (short messaging service) as an efficient and effective way to communicate with end users.
Some very common use-cases include:
Receiving an OTP for authenticating your login
Getting deals from the likes of Flipkart and Amazon informing you about the latest sale.
Getting reminder notifications for the doctor’s appointment that you have
Getting details for your debit and credit transactions.
The practical use cases for an SMS can be far-reaching.
Even though SMS integration forms an integral part of many applications, the limitations and complexities involved in automating it via web automation tools like Selenium mean these flows are often left out of automation.
Teams often opt to verify these test cases manually, which, even though it helps catch bugs early, poses some real challenges.
Pitfalls with Manual Testing
With these limitations, you obviously do not want your application sending faulty Text Messages after that major Release.
Automation Testing … #theSaviour
To overcome the limitations of manual testing, delegating your task to a machine comes in handy.
Now that we have talked about the WHY, we will look into HOW the feature can be automated. Technically, you shouldn’t (and realistically can’t) use Selenium to read an SMS on a mobile device. So we looked for a third-party library that is:
Easy to integrate with the existing code base
Supports a range of languages
Does not involve highly complex codes and focuses on the problem at hand
Supports both incoming and outgoing messages
After a lot of research, we settled with Twilio.
In this article, we will look at an example of working with Twilio APIs to Read SMS and eventually using it to automate SMS flows.
Twilio supports a bunch of different languages. For this article, we stuck with Node.js
Account Setup
Registration
To start working with the service, you need to register.
Once that is done, Twilio will prompt you with a bunch of simple questions to understand why you want to use their service.
Twilio Dashboard
A trial balance of $15.50 is credited upon signing up. This can be used for sending and receiving text messages. A unique Account SID and Auth Token are also generated for your account.
Buy a Number
Navigate to the buy a number link under Phone Numbers > Manage and purchase a number that you would eventually be using in your automation scripts for receiving text messages from the application.
Note – for the free trial, Twilio does not support Indian Number (+91)
Code Setup
Install Twilio in your code base
Code snippet
For simplicity, just pass the accountSid and authToken that you receive from the Dashboard Console to the twilio library. This returns a client object through which you can access the list of all the messages in your inbox.
List messages matching filter criteria: If you’d like Twilio to narrow down this list of messages for you, you can do so by specifying a To number, a From number, and a DateSent.
Get a message: If you know the message SID (i.e., the message’s unique identifier), then you can retrieve that specific message directly.
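To illustrate what that narrowing looks like, here is a plain-Python sketch of the filter semantics (the message records and numbers are made up, and this is not the Twilio SDK; the real client performs this filtering server-side):

```python
from datetime import date

# Hypothetical stand-in for the message list Twilio returns.
messages = [
    {"sid": "SM001", "to": "+15550001111", "from": "+15552223333",
     "date_sent": date(2023, 5, 1), "body": "Your OTP is 123456"},
    {"sid": "SM002", "to": "+15550001111", "from": "+15554445555",
     "date_sent": date(2023, 5, 2), "body": "Sale starts today!"},
    {"sid": "SM003", "to": "+15559998888", "from": "+15552223333",
     "date_sent": date(2023, 5, 1), "body": "Appointment reminder"},
]

def list_messages(to=None, from_=None, date_sent=None):
    """Narrow the message list by To number, From number, and DateSent."""
    result = messages
    if to:
        result = [m for m in result if m["to"] == to]
    if from_:
        result = [m for m in result if m["from"] == from_]
    if date_sent:
        result = [m for m in result if m["date_sent"] == date_sent]
    return result

def get_message(sid):
    """Fetch one specific message by its unique SID."""
    return next(m for m in messages if m["sid"] == sid)

print([m["sid"] for m in list_messages(to="+15550001111", date_sent=date(2023, 5, 1))])  # -> ['SM001']
print(get_message("SM003")["body"])  # -> Appointment reminder
```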
The trial version does not support Indian numbers (+91).
The trial version provides only an initial balance of $15.50. This is sufficient for use cases that involve only receiving messages on your Twilio number, but if the use case requires sending messages back from the Twilio number, a paid version will serve the purpose.
Messages sent via a short code (557766) are not received on the Twilio number. Only long codes are accepted in the trial version.
You can buy only a single number with the trial version. If purchasing multiple numbers is required, the user may have to switch to a paid version.
Conclusion
In a nutshell, we saw how important it is to thoroughly verify the SMS functionality of our application since it serves as one of the primary ways of communicating with the end users. We also saw what the limitations are with following the traditional manual testing approach and how automating SMS scenarios would help us deliver high-quality products. Finally, we demonstrated a feasible, efficient and easy-to-use way to Automate SMS test scenarios using Twilio APIs.
Hope this was a useful read and that you will now be able to easily automate SMS scenarios. Happy testing… Do like and share …
Containerized applications are becoming more popular with each passing year. Enterprise applications are adopting container technology as they modernize their IT systems. Migrating your applications from VMs or physical machines to containers comes with multiple advantages like optimal resource utilization, faster deployment times, replication, quick cloning, less lock-in, and so on. Various container orchestration platforms like Kubernetes, Google Container Engine (GKE), and Amazon EC2 Container Service (Amazon ECS) help in quick deployment and easy management of your containerized applications. But in order to use these platforms, you need to migrate your legacy applications to containers or rewrite/redeploy your applications from scratch with a containerization approach. Rearchitecting your applications using a containerization approach is preferable, but is that possible for complex legacy applications? Is your deployment team capable of listing every detail of your application's deployment process? Do you have the patience to author a Dockerfile for each component of your complex application stack?
Automated migrations!
Velotio has been helping customers with automated migration of VMs and bare-metal servers to various container platforms. We have developed automation to convert these migrated applications as containers on various container deployment platforms like GKE, Amazon ECS and Kubernetes. In this blog post, we will cover one such migration tool developed at Velotio which will migrate your application running on a VM or physical machine to Google Container Engine (GKE) by running a single command.
Migration tool details
We have named our migration tool A2C (Anything to Container). It can migrate applications running on any Unix or Windows operating system.
The migration tool requires the following information about the server to be migrated:
IP of the server
SSH User, SSH Key/Password of the application server
Configuration file containing data paths for application/database/components (more details below)
Required name of your docker image (The docker image that will get created for your application)
GKE Container Cluster details
In order to store persistent data, volumes can be defined in the container definition. Data changes made on a volume path remain persistent even if the container is killed or crashes. Volumes are basically filesystem paths from the host machine on which your container is running, from NFS, or from cloud storage. The container mounts the filesystem path from the host machine, so data changes are written to the host machine's filesystem instead of the container's filesystem. Our migration tool supports data volumes, which can be defined in the configuration file. It will automatically create disks for the defined volumes and copy data from your application server to these disks in a consistent way.
The configuration file we have been talking about is basically a YAML file containing filesystem level information about your application server. A sample of this file can be found below:
The configuration file contains 3 sections: includes, volumes and excludes:
Includes contains filesystem paths on your application server which you want to add to your container image.
Volumes contain filesystem paths on your application server which stores your application data. Generally, filesystem paths containing database files, application code files, configuration files, log files are good candidates for volumes.
The excludes section contains filesystem paths which you don’t want to make part of the container. This may include temporary filesystem paths like /proc, /tmp and also NFS mounted paths. Ideally, you would include everything by giving “/” in includes section and exclude specifics in exclude section.
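Since the original sample is not reproduced here, a hypothetical configuration following the three sections described above might look like this (the paths and keys are illustrative, not the tool's exact schema):

```yaml
# Hypothetical A2C configuration file
includes:
  - /                  # start from the whole filesystem...
excludes:
  - /proc              # ...and carve out virtual, temporary, and NFS paths
  - /tmp
  - /mnt/nfs-share
volumes:
  - /var/www/html      # application code
  - /var/lib/mysql     # database files
  - /var/log/httpd     # log files
```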
The Docker image name given as input to the migration tool is the Docker registry path in which the image will be stored, followed by the name and tag of the image. A Docker registry is like a GitHub for Docker images, where you can store all your images. Different versions of the same image can be stored by giving a version-specific tag to the image. GKE also provides a Docker registry; since in this demo we are migrating to GKE, we will store our image in the GKE registry.
GKE container cluster details to be given as input to the migration tool, contains GKE specific details like GKE project name, GKE container cluster name and GKE region name. A container cluster can be created in GKE to host the container applications. We have a separate set of scripts to perform cluster creation operation. Container cluster creation can also be done easily through GKE UI. For now, we will assume that we have a 3 node cluster created in GKE, which we will use to host our application.
Tasks performed under migration
Our migration tool (A2C), performs the following set of activities for migrating the application running on a VM or physical machine to GKE Container Cluster:
1. Install the A2C migration tool with all its dependencies on the target application server
2. Create a docker image of the application server, based on the filesystem level information given in the configuration file
3. Capture metadata from the application server like configured services information, port usage information, network configuration, external services, etc.
4. Push the docker image to GKE container registry
5. Create disk in Google Cloud for each volume path defined in configuration file and prepopulate disks with data from application server
6. Create a deployment spec for the container application in the GKE container cluster, which will open the required ports, configure required services, add multi-container dependencies, attach the prepopulated disks to containers, etc.
7. Deploy the application, after which your application runs as containers in GKE with the application software in a running state. The new application URLs will be given as output.
8. Load balancing, HA will be configured for your application.
Demo
For demonstration purposes, we will deploy a LAMP stack (Apache + PHP + MySQL) on a CentOS 7 VM and run the migration utility against the VM, which will migrate the application to our GKE cluster. After the migration, we will show our application running on GKE, preconfigured with the same data as on our VM.
Step 1
We set up the LAMP stack using Apache, PHP, and MySQL on a CentOS 7 VM in GCP. The PHP application can be used to list, add, delete, or edit user data. The data is stored in a MySQL database. We added some data to the database using the application, and the UI shows the following:
Step 2
Now we run the A2C migration tool, which will migrate this application stack running on a VM into a container and auto-deploy it to GKE.
Pushing converter binary to target machine
Pushing data config to target machine
Pushing installer script to target machine
Running converter binary on target machine
[130.211.231.58] out: creating docker image
[130.211.231.58] out: image created with id 6dad12ba171eaa8615a9c353e2983f0f9130f3a25128708762228f293e82198d
[130.211.231.58] out: Collecting metadata for image
[130.211.231.58] out: Generating metadata for cent7
[130.211.231.58] out: Building image from metadata
Pushing the docker image to GCP container registry
Initiate remote data copy
Activated service account credentials for: [glassy-chaliceXXXXX@appspot.gserviceaccount.com]
for volume var/log/httpd
Creating disk migrate-lamp-0
Disk Created Successfully
transferring data from source
for volume var/log/mariadb
Creating disk migrate-lamp-1
Disk Created Successfully
transferring data from source
for volume var/www/html
Creating disk migrate-lamp-2
Disk Created Successfully
transferring data from source
for volume var/lib/mysql
Creating disk migrate-lamp-3
Disk Created Successfully
transferring data from source
Connecting to GCP cluster for deployment
Created service file /tmp/gcp-service.yaml
Created deployment file /tmp/gcp-deployment.yaml
Deploying to GKE
$ kubectl get pod
NAME                            READY   STATUS              RESTARTS   AGE
migrate-lamp-3707510312-6dr5g   0/1     ContainerCreating   0          58s
$ kubectl get deployment
NAME           DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
migrate-lamp   1         1         1            0           1m
$ kubectl get service
NAME           CLUSTER-IP     EXTERNAL-IP     PORT(S)                                    AGE
kubernetes     10.59.240.1    <none>          443/TCP                                    23h
migrate-lamp   10.59.248.44   35.184.53.100   3306:31494/TCP,80:30909/TCP,22:31448/TCP   53s
You can access your application using the above connection details!
Step 3
Access LAMP stack on GKE using the IP 35.184.53.100 on default 80 port as was done on the source machine.
Here is the Docker image being created in GKE Container Registry:
We can also see that disks were created with migrate-lamp-x, as part of this automated migration.
Load Balancer also got provisioned in GCP as part of the migration process
Following service files and deployment files were created by our migration tool to deploy the application on GKE:
Migrations are always hard for IT and development teams. At Velotio, we have been helping customers migrate to cloud and container platforms using streamlined processes and automation. Feel free to reach out to us at contact@rsystems.com to learn more about our cloud and container adoption/migration offerings.
These days, we see that most software development is moving towards serverless architecture, and that’s no surprise. Almost all top cloud service providers have serverless services that follow a pay-as-you-go model. This way, consumers don’t have to pay for any unused resources. Also, there’s no need to worry about procuring dedicated servers, network/hardware management, operating system security updates, etc.
Unfortunately for cloud developers, serverless tools don't auto-deploy changes made in the local environment. This is still a headache: the developer must deploy and test changes manually. Web app projects using Node or Django, by contrast, run a watcher on the development server during app bundling; when changes happen in the code directory, the server automatically restarts with the new code, and the developer can check whether the changes work as expected.
In this blog, we will talk about automating serverless application deployment by changing the local codebase. We are using AWS as a cloud provider and primarily focusing on lambda to demonstrate the functionality.
Prerequisites:
This article uses AWS, so command and programming access are necessary.
This article is written with deployment to AWS in mind, so AWS credentials are needed to make changes in the Stack. In the case of other cloud providers, we would require that provider’s command-line access.
We are using the Serverless Framework for deployment (this example will also work with other tools like Zappa), so some familiarity with serverless tooling is helpful.
Before development, let’s divide the problem statement into sub-tasks and build them one step at a time.
Problem Statement
Create a codebase watcher service that would trigger either a stack update on AWS or run a local test. By doing this, developers would not have to worry about manual deployment on the cloud provider. This service needs to keep an eye on the code and generate events when an update/modify/copy/delete occurs in the given codebase.
Solution
First, to watch the codebase, we need logic that acts as a trigger and notifies us when the underlying files change. Packages for this already exist in different programming languages; in this example, we are using Python's watchdog.
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

CODE_PATH = "<codebase path>"

class WatchMyCodebase:
    # Set the directory on watch
    def __init__(self):
        self.observer = Observer()

    def run(self):
        event_handler = EventHandler()
        # The recursive flag decides whether the watcher should collect
        # changes in the whole CODE_PATH directory tree.
        self.observer.schedule(event_handler, CODE_PATH, recursive=True)
        self.observer.start()
        self.observer.join()

class EventHandler(FileSystemEventHandler):
    """Handle events generated by the Watchdog observer."""

    @classmethod
    def on_any_event(cls, event):
        if event.is_directory:
            # Ignore directory-level events, like creating a new empty directory.
            return None
        elif event.event_type == 'modified':
            print("file under codebase directory is modified...")

if __name__ == '__main__':
    watch = WatchMyCodebase()
    watch.run()
Here, the on_any_event() class method gets called on any updates in the mentioned directory, and we need to add deployment logic here. However, we can’t just deploy once it receives a notification from the watcher because modern IDEs save files as soon as the user changes them. And if we add logic that deploys on every change, then most of the time, it will deploy half-complete services.
To handle this, we must add some timeout before deploying the service.
Here, the program will wait for some time after the file is changed. And if it finds that, for some time, there have been no new changes in the codebase, it will deploy the service.
import time
import subprocess
import threading

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

valid_events = ['created', 'modified', 'deleted', 'moved']
DEPLOY_AFTER_CHANGE_THRESHOLD = 300
STAGE_NAME = ""
CODE_PATH = "<codebase path>"

def deploy_env():
    process = subprocess.Popen(
        ['sls', 'deploy', '--stage', STAGE_NAME, '-v'],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    print(stdout, stderr)

def deploy_service_on_change():
    while True:
        if EventHandler.last_update_time and \
                (int(time.time() - EventHandler.last_update_time) > DEPLOY_AFTER_CHANGE_THRESHOLD):
            EventHandler.last_update_time = None
            deploy_env()
        time.sleep(5)

def start_interval_watcher_thread():
    interval_watcher_thread = threading.Thread(target=deploy_service_on_change)
    interval_watcher_thread.start()

class WatchMyCodebase:
    # Set the directory on watch
    def __init__(self):
        self.observer = Observer()

    def run(self):
        event_handler = EventHandler()
        self.observer.schedule(event_handler, CODE_PATH, recursive=True)
        self.observer.start()
        self.observer.join()

class EventHandler(FileSystemEventHandler):
    """Handle events generated by the Watchdog observer."""

    last_update_time = None

    @classmethod
    def on_any_event(cls, event):
        if event.is_directory:
            # Ignore directory-level events, like creating a new empty directory.
            return None
        elif event.event_type in valid_events and '.serverless' not in event.src_path:
            # Ignore events for the .serverless directory; serverless creates
            # a few temp files while deploying.
            cls.last_update_time = time.time()

if __name__ == '__main__':
    start_interval_watcher_thread()
    watch = WatchMyCodebase()
    watch.run()
The specified valid_events acts as a filter to deploy, and we are only considering these events and acting upon them.
Moreover, to add a delay after file changes and ensure that there are no new changes, we added interval_watcher_thread. This checks the difference between current and last directory update time, and if it’s greater than the specified threshold, we deploy serverless resources.
Here, the sleep time in deploy_service_on_change is important. It prevents the program from burning CPU cycles just checking whether the condition to deploy is satisfied. On the other hand, too long a sleep would delay the deployment beyond the specified DEPLOY_AFTER_CHANGE_THRESHOLD.
Note: With programming languages like Go and features like goroutines and channels, we can build an even more efficient application; the same is possible in Python with the help of thread signals.
Let’s build one lambda function that automatically deploys on a change. Let’s also be a little lazy and develop a basic Python lambda that takes a number as input and returns its factorial.
import math

def lambda_handler(event, context):
    """Handler for get factorial."""
    number = event['number']
    return math.factorial(number)
We are using the Serverless Framework, so to deploy this lambda, we need a serverless.yml file that specifies stack details like the execution environment, cloud provider, environment variables, etc. More parameters are listed in this guide.
We need to keep both handler.py and serverless.yml in the same folder, or we need to update the function handler in serverless.yml.
We can deploy it manually using this serverless command:
sls deploy --stage production -v
Note: Before deploying, export AWS credentials.
The above command deploys a stack using CloudFormation:
--stage specifies the environment where the stack should be deployed. Like any other software project, it can have stage names such as production, dev, test, etc.
-v enables verbose output.
To auto-deploy changes from now on, we can use the watcher.
Start the watcher with this command:
python3 auto_deploy_sls.py
This will run continuously and keep an eye on the codebase directory, and if any changes are detected, it will deploy them. We can customize this to some extent, like post-deploy, so it can run test cases against a new stack.
If you are worried about network traffic when the stack has lots of dependencies, using an actual cloud provider for testing might increase billing. However, we can easily fix this by using serverless local development.
Here is a serverless blog that specifies local development and testing of a cloudformation stack. It emulates cloud behavior on the local setup, so there’s no need to worry about cloud service billing.
One useful upgrade is support for complex directory structures.
In the above example, we are assuming that only one single directory is present, so it’s fine to deploy using the command:
sls deploy --stage production -v
But in some projects, multiple stacks may be present in the codebase at different hierarchies. Consider the example below: we have three different lambdas, so a change in the `check-prime` directory should update only that lambda and not the others.
The above can be achieved in on_any_event(). By using the variable event.src_path, we can learn the file path that received the event.
Now, deployment command changes to:
cd <updated_directory> && sls deploy --stage <your-stage> -v
This will deploy only an updated stack.
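One way to sketch this per-stack deployment logic (the paths, stage name, and helper names here are hypothetical):

```python
import os

CODE_PATH = "/workspace/lambdas"  # hypothetical codebase root
STAGE_NAME = "dev"

def stack_dir_for(src_path):
    """Return the top-level stack directory containing the changed file."""
    rel = os.path.relpath(src_path, CODE_PATH)
    top_level = rel.split(os.sep)[0]
    return os.path.join(CODE_PATH, top_level)

def deploy_command_for(src_path):
    """Build the deploy command that updates only the affected stack."""
    return 'cd {} && sls deploy --stage {} -v'.format(
        stack_dir_for(src_path), STAGE_NAME)

print(deploy_command_for('/workspace/lambdas/check-prime/handler.py'))
```

In on_any_event(), event.src_path would be fed to deploy_command_for() so that only the stack containing the changed file gets redeployed.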
Conclusion
We learned that even though serverless deployment is normally a manual task, it can be automated with the help of Watchdog for a better developer workflow.
With the help of serverless local development, we can test changes as we make them, without manually deploying to the cloud environment each time.
We hope this helps you improve your serverless development experience and close the loop faster.
Asynchronous programming is a type of parallel programming in which a unit of work is allowed to run separately from the primary application thread. When the work is complete, it notifies the main thread about completion or failure of the worker thread. There are numerous benefits to using it, such as improved application performance and enhanced responsiveness.
Asynchronous programming has been gaining a lot of attention in the past few years, and for good reason. Although it can be more difficult than the traditional linear style, it is also much more efficient.
For example, instead of waiting for an HTTP request to finish before continuing execution, with Python async coroutines you can submit the request and do other work that’s waiting in a queue while waiting for the HTTP request to finish.
Asynchronicity seems to be a big reason why Node.js is so popular for server-side programming. Much of the code we write, especially in I/O-heavy applications like websites, depends on external resources. This could be anything from a remote database call to POSTing to a REST service. As soon as you ask for any of these resources, your code sits waiting with nothing to do. With asynchronous programming, you allow your code to handle other tasks while waiting for these resources to respond.
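As a minimal illustration of that idea in Python (asyncio.sleep stands in for a slow HTTP request; the names are made up):

```python
import asyncio

async def fetch(name, delay):
    # Stand-in for an HTTP request: awaiting yields control to other work.
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # Both "requests" wait concurrently, so the total time is roughly the
    # longest single delay, not the sum of the two.
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1))

print(asyncio.run(main()))  # -> ['a done', 'b done']
```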
How Does Python Do Multiple Things At Once?
1. Multiple Processes
The most obvious way is to use multiple processes. From the terminal, you can start your script two, three, four…ten times, and then all the scripts will run independently and at the same time. The operating system underneath will take care of sharing your CPU resources among all those instances. Alternatively, you can use the multiprocessing library, which supports spawning processes, as shown in the example below.
from multiprocessing import Process

def print_func(continent='Asia'):
    print('The name of continent is : ', continent)

if __name__ == "__main__":  # confirms that the code is under main function
    names = ['America', 'Europe', 'Africa']
    procs = []

    # instantiating without any argument
    proc = Process(target=print_func)
    procs.append(proc)
    proc.start()

    # instantiating process with arguments
    for name in names:
        proc = Process(target=print_func, args=(name,))
        procs.append(proc)
        proc.start()

    # complete the processes
    for proc in procs:
        proc.join()
Output:
The name of continent is :  Asia
The name of continent is :  America
The name of continent is :  Europe
The name of continent is :  Africa
2. Multiple Threads
The next way to run multiple things at once is to use threads. A thread is a line of execution, pretty much like a process, but you can have multiple threads in the context of one process, and they all share access to common resources. Because of this, it's difficult to write threaded code correctly. And again, the operating system is doing all the heavy lifting on sharing the CPU, but the global interpreter lock (GIL) allows only one thread to run Python code at a given time, even when you have multiple threads running code. So, in CPython, the GIL prevents multi-core concurrency. Basically, you're running on a single core even though you may have two, four, or more.
import threading

def print_cube(num):
    """function to print cube of given num"""
    print("Cube: {}".format(num * num * num))

def print_square(num):
    """function to print square of given num"""
    print("Square: {}".format(num * num))

if __name__ == "__main__":
    # creating threads
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))

    # starting thread 1
    t1.start()
    # starting thread 2
    t2.start()

    # wait until thread 1 is completely executed
    t1.join()
    # wait until thread 2 is completely executed
    t2.join()

    # both threads completely executed
    print("Done!")
Output:
Square: 100
Cube: 1000
Done!
3. Coroutines using yield:
Coroutines are generalizations of subroutines. They are used for cooperative multitasking, where a process voluntarily yields (gives away) control periodically, or when idle, in order to enable multiple applications to run simultaneously. Coroutines are similar to generators, but with a few extra methods and a slight change in how we use the yield statement. Generators produce data for iteration, while coroutines can also consume data.
def print_name(prefix):
    print("Searching prefix:{}".format(prefix))
    try:
        while True:
            # yield used to create coroutine
            name = (yield)
            if prefix in name:
                print(name)
    except GeneratorExit:
        print("Closing coroutine!!")

corou = print_name("Dear")
corou.__next__()
corou.send("James")
corou.send("Dear James")
corou.close()
4. Asynchronous Programming

The fourth way is asynchronous programming, where the OS does not participate. As far as the OS is concerned, you will have one process containing a single thread, but you'll still be able to do multiple things at once. So, what's the trick?
The answer is asyncio
asyncio is the concurrency module introduced in Python 3.4. It is designed to use coroutines and futures to simplify asynchronous code and make it almost as readable as synchronous code, since there are no callbacks.
asyncio uses different constructs: event loops, coroutines and futures.
An event loop manages and distributes the execution of different tasks. It registers them and handles distributing the flow of control between them.
Coroutines (covered above) are special functions that work similarly to Python generators; on await, they release the flow of control back to the event loop. A coroutine needs to be scheduled to run on the event loop; once scheduled, coroutines are wrapped in Tasks, which is a type of Future.
Futures represent the result of a task that may or may not have been executed. This result may be an exception.
Using asyncio, you can structure your code so subtasks are defined as coroutines and schedule them as you please, including simultaneously. Coroutines contain yield points where a context switch can happen if other tasks are pending; if no other task is pending, it will not.
A context switch in asyncio represents the event loop yielding the flow of control from one coroutine to the next.
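This switching can be observed directly. The sketch below (an illustration written for this point, not code from the original post) runs two coroutines whose yield points interleave their execution:

```python
import asyncio

order = []

async def worker(name):
    order.append(name + " start")
    await asyncio.sleep(0)  # yield point: control goes back to the event loop
    order.append(name + " end")

async def main():
    # both coroutines are scheduled; at each await the loop switches between them
    await asyncio.gather(worker("A"), worker("B"))

asyncio.run(main())
print(order)  # ['A start', 'B start', 'A end', 'B end']
```

Neither worker runs to completion in one go: at the await, the event loop hands control to the other pending coroutine.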
In the example, we run three async tasks that query Reddit separately and extract and print the JSON. We leverage aiohttp, an HTTP client library, which ensures that even the HTTP requests run asynchronously.
import signal
import sys
import asyncio
import aiohttp
import json

loop = asyncio.get_event_loop()
client = aiohttp.ClientSession(loop=loop)

async def get_json(client, url):
    async with client.get(url) as response:
        assert response.status == 200
        return await response.read()

async def get_reddit_top(subreddit, client):
    data1 = await get_json(client, 'https://www.reddit.com/r/' + subreddit + '/top.json?sort=top&t=day&limit=5')
    j = json.loads(data1.decode('utf-8'))
    for i in j['data']['children']:
        score = i['data']['score']
        title = i['data']['title']
        link = i['data']['url']
        print(str(score) + ': ' + title + ' (' + link + ')')
    print('DONE:', subreddit + '\n')

def signal_handler(signal, frame):
    loop.stop()
    client.close()
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)

asyncio.ensure_future(get_reddit_top('python', client))
asyncio.ensure_future(get_reddit_top('programming', client))
asyncio.ensure_future(get_reddit_top('compsci', client))
loop.run_forever()
Output:
50: Undershoot: Parsing theory in 1965 (http://jeffreykegler.github.io/Ocean-of-Awareness-blog/individual/2018/07/knuth_1965_2.html)
12: Question about best-prefix/failure function/primal match table in kmp algorithm (https://www.reddit.com/r/compsci/comments/8xd3m2/question_about_bestprefixfailure_functionprimal/)
1: Question regarding calculating the probability of failure of a RAID system (https://www.reddit.com/r/compsci/comments/8xbkk2/question_regarding_calculating_the_probability_of/)
DONE: compsci

336: /r/thanosdidnothingwrong -- banning people with python (https://clips.twitch.tv/AstutePluckyCocoaLitty)
175: PythonRobotics: Python sample codes for robotics algorithms (https://atsushisakai.github.io/PythonRobotics/)
23: Python and Flask Tutorial in VS Code (https://code.visualstudio.com/docs/python/tutorial-flask)
17: Started a new blog on Celery - what would you like to read about? (https://www.python-celery.com)
14: A Simple Anomaly Detection Algorithm in Python (https://medium.com/@mathmare_/pyng-a-simple-anomaly-detection-algorithm-2f355d7dc054)
DONE: python

1360: git bundle (https://dev.to/gabeguz/git-bundle-2l5o)
1191: Which hashing algorithm is best for uniqueness and speed? Ian Boyd's answer (top voted) is one of the best comments I've seen on Stackexchange. (https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed)
430: ARM launches “Facts” campaign against RISC-V (https://riscv-basics.com/)
244: Choice of search engine on Android nuked by “Anonymous Coward” (2009) (https://android.googlesource.com/platform/packages/apps/GlobalSearch/+/592150ac00086400415afe936d96f04d3be3ba0c)
209: Exploiting freely accessible WhatsApp data or “Why does WhatsApp web know my phone’s battery level?” (https://medium.com/@juan_cortes/exploiting-freely-accessible-whatsapp-data-or-why-does-whatsapp-know-my-battery-level-ddac224041b4)
DONE: programming
Using Redis and Redis Queue (RQ):
Using asyncio and aiohttp may not always be an option, especially if you are using older versions of Python. Also, there will be scenarios when you want to distribute your tasks across different servers. In that case, we can leverage RQ (Redis Queue). It is a simple Python library for queueing jobs and processing them in the background with workers. It is backed by Redis, a key/value data store.
In the example below, we have queued a simple function count_words_at_url using Redis.
from mymodule import count_words_at_url
from redis import Redis
from rq import Queue

q = Queue(connection=Redis())
job = q.enqueue(count_words_at_url, 'http://nvie.com')

****** mymodule.py ******

import requests

def count_words_at_url(url):
    """Just an example function that's called async."""
    resp = requests.get(url)
    print(len(resp.text.split()))
    return len(resp.text.split())
Output:
15:10:45 RQ worker 'rq:worker:EMPID18030.9865' started, version 0.11.0
15:10:45 *** Listening on default...
15:10:45 Cleaning registries for queue: default
15:10:50 default: mymodule.count_words_at_url('http://nvie.com') (a2b7451e-731f-4f31-9232-2b7e3549051f)
322
15:10:51 default: Job OK (a2b7451e-731f-4f31-9232-2b7e3549051f)
15:10:51 Result is kept for 500 seconds
Conclusion:
Let’s take the classical example of a chess exhibition where one of the best chess players competes against many people. If there are 24 games with 24 opponents and the chess master plays them all sequentially, it will take at least 12 hours (assuming the average game takes 30 moves, the chess master thinks for 5 seconds per move, and the opponent for approximately 55 seconds). But the asynchronous mode gives the chess master the opportunity to make a move and leave the opponent thinking while moving on to the next game and making a move there. This way, a move can be made in all 24 games in 2 minutes, and all of them can be won in just one hour.
So, this is what's meant when people talk about asynchronous being really fast: it's this kind of fast. The chess master doesn't play chess faster; the time is simply better optimized and not wasted on waiting around. This is how it works.
In this analogy, the chess master is our CPU, and the idea is that we want to make sure the CPU waits the least amount of time possible. It's about always finding something for it to do.
A practical definition of async is that it's a style of concurrent programming in which tasks release the CPU during waiting periods so that other tasks can use it. In Python, there are several ways to achieve concurrency; based on our requirements, code flow, data manipulation, architecture design, and use cases, we can select any of these methods.
In this blog, I will compare various methods to avoid the dreaded callback hells that are common in Node.js. What exactly am I talking about? Have a look at the piece of code below. Every child function executes only when the result of its parent function is available. Callbacks are the very essence of the non-blocking (and hence performant) nature of Node.js.
Convinced yet? Even though there is some seemingly unnecessary error handling here, I assume you get the drift! The problem with such code is more than just indentation: our program's entire flow is based on side effects, with one function only incidentally calling the inner function.
There are multiple ways in which we can avoid writing such deeply nested code. Let’s have a look at our options:
Promises
According to the official specification, a promise represents the eventual result of an asynchronous operation. Basically, it represents an operation that has not completed yet but is expected to in the future. The then method is a major component of a promise. It is used to get the return value (fulfilled or rejected) of a promise; only one of these two values will ever be set. Let's have a look at a simple file read example without using promises:
Now, if readFile function returned a promise, the same logic could be written like so:
var fileReadPromise = fs.readFile(filePath);
fileReadPromise.then(console.log, console.error);
The fileReadPromise can then be passed around in code wherever you need to read the file. This helps in writing robust unit tests for your code, since you now only have to write a single test for a promise. And it makes for more readable code!
Chaining using promises
The then function itself returns a promise, which can again be used for the next operation. Changing the first code snippet to use promises results in this:
As is evident, it makes the code more composed, readable, and easier to maintain. Also, instead of chaining, we could have used Promise.all. Promise.all takes an array of promises as input and returns a single promise that resolves when all the promises supplied in the array are resolved. Other useful information on promises can be found here.
The async utility module
Async is a utility module which provides a set of over 70 functions that can be used to elegantly solve the problem of callback hells. All these functions follow the Node.js convention of error-first callbacks, which means that the first callback argument is assumed to be an error (null in case of success). Let's try to solve the same foo-bar-baz problem using the async module. Here is the code snippet:
Here, I have used the async.waterfall function as an example. There are multiple functions available according to the nature of the problem you are trying to solve, like async.each for parallel execution, async.eachSeries for serial execution, etc.
Async/Await
Now, this is one of the most exciting features coming to JavaScript in the near future. It internally uses promises but handles them in a more intuitive manner. Even though it seems like promises and/or third-party modules like async would solve most of the problems, a further simplification is always welcome! For those of you who have worked with C# async/await, this concept is borrowed directly from there and brought into ES7.
Async/await enables us to write asynchronous promise-based code as if it were synchronous, but without blocking the main thread. An async function always returns a promise whether await is used or not. But whenever an await is encountered, the function is paused until the promise either resolves or rejects. The following code snippet should make it clearer:
Here, asyncFun is an async function which captures the promised result using await. This makes the code readable and is a major convenience for developers who are more comfortable with linearly executed languages, without blocking the main thread.
Now, like before, let's solve the foo-bar-baz problem using async/await. Note that foo, bar, and baz individually return promises just like before. But instead of chaining, we have written the code linearly.
How long should you (a)wait for async to come to fore?
Well, it's already here in the Chrome 55 release and the latest update of the V8 engine. The native support in the language means that we should see much more widespread use of this feature. The only catch is that if you want to use async/await on a codebase that isn't promise-aware and is based completely on callbacks, it will probably require a lot of wrapping of existing functions to make them usable.
To wrap up, async/await definitely makes coding numerous async operations an easier job. Although promises and callbacks would do the job for most cases, async/await looks like the way to make some architectural problems go away and improve code quality.
Asynchronous programming is a characteristic of modern programming languages that allows an application to perform various operations without waiting for any of them. Asynchronicity is one of the big reasons for the popularity of Node.js.
We have discussed Python’s asynchronous features as part of our previous post: an introduction to asynchronous programming in Python. This blog is a natural progression on the same topic. We are going to discuss async features in Python in detail and look at some hands-on examples.
Consider a traditional web scraping application that needs to open thousands of network connections. We could open one network connection, fetch the result, and then move to the next ones iteratively. This approach increases the latency of the program. It spends a lot of time opening a connection and waiting for others to finish their bit of work.
On the other hand, async provides a way to open thousands of connections at once and swap among them as each finishes and returns its result. Basically, it sends a request on one connection and moves to the next one instead of waiting for the previous one's response. It continues like this until all the connections have returned their outputs.
From the above chart, we can see that using synchronous programming on four tasks took 45 seconds to complete, while in asynchronous programming, those four tasks took only 20 seconds.
Where Does Asynchronous Programming Fit in the Real-world?
Asynchronous programming is best suited for popular scenarios such as:
1. The program takes too much time to execute.
2. The reason for the delay is waiting for input or output operations, not computation.
3. For the tasks that have multiple input or output operations to be executed at once.
And application-wise, these are the example use cases:
Web Scraping
Network Services
Difference Between Parallelism, Concurrency, Threading, and Async IO
Because we discussed this comparison in detail in our previous post, we will just quickly go through the concept as it will help us with our hands-on example later.
Parallelism involves performing multiple operations at a time. Multiprocessing is an example of it. It is well suited for CPU bound tasks.
Concurrency is slightly broader than Parallelism. It involves multiple tasks running in an overlapping manner.
Threading – a thread is a separate flow of execution. One process can contain multiple threads and each thread runs independently. It is ideal for IO bound tasks.
Async IO is a single-threaded, single-process design that uses cooperative multitasking. In simple words, async IO gives a feeling of concurrency despite using a single thread in a single process.
Fig: A comparison of concurrency and parallelism
Components of Async IO Programming
Let’s explore the various components of Async IO in depth. We will also look at an example code to help us understand the implementation.
1. Coroutines
Coroutines are essentially generalizations of subroutines. They are generally used for cooperative multitasking and behave like Python generators.
A function defined with async def is a coroutine, and it uses the await keyword. At each await, the coroutine releases the flow of control back to the event loop.
To run a coroutine, we need to schedule it on the event loop. After scheduling, coroutines are wrapped in Tasks as a Future object.
Example:
In the below snippet, we call async_func from the main function. We have to add the await keyword when calling the async function. As you can see, async_func does nothing unless the await keyword accompanies it.
import asyncio

async def async_func():
    print('Velotio ...')
    await asyncio.sleep(1)
    print('... Technologies!')

async def main():
    async_func()  # this will do nothing because the coroutine object is created but not awaited
    await async_func()

asyncio.run(main())
Output
RuntimeWarning: coroutine 'async_func' was never awaited
  async_func()  # this will do nothing because the coroutine object is created but not awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Velotio ...
... Technologies!
2. Tasks
Tasks are used to schedule coroutines concurrently.
When submitting a coroutine to an event loop for processing, you can get a Task object, which provides a way to control the coroutine’s behavior from outside the event loop.
Example:
In the snippet below, we are creating a task using create_task (an inbuilt function of asyncio library), and then we are running it.
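The snippet this refers to isn't reproduced here; a minimal sketch of scheduling a coroutine with create_task might look like the following (the say coroutine and its return value are illustrative assumptions):

```python
import asyncio

async def say(text):
    await asyncio.sleep(0.1)
    return text

async def main():
    # create_task schedules the coroutine on the running event loop
    task = asyncio.create_task(say("Velotio"))
    print(task.done())   # False: the task is scheduled but not finished yet
    result = await task  # the Task handle lets us wait on (or cancel) it
    return result

print(asyncio.run(main()))
```

The Task object returned by create_task is what gives you outside control: you can await it, cancel it, or check task.done().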
3. Event Loop

The event loop runs coroutines until they complete. You can imagine it as a while(True) loop that monitors coroutines, taking feedback on what's idle, and looking around for things that can be executed in the meantime.
It can wake up an idle coroutine when whatever that coroutine is waiting on becomes available.
Only one event loop can run at a time in Python.
Example:
In the snippet below, we create three tasks, append them to a list, and execute them all asynchronously using get_event_loop, create_task, and await from the asyncio library.
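The original snippet isn't shown; a minimal sketch of the same idea might look like this (the work coroutine and its squaring are illustrative assumptions):

```python
import asyncio

async def work(n):
    await asyncio.sleep(0.1)
    return n * n

async def main():
    # create three tasks and append them to a list
    tasks = [asyncio.create_task(work(n)) for n in (1, 2, 3)]
    # the three sleeps overlap, so this takes ~0.1s rather than ~0.3s
    return await asyncio.gather(*tasks)

# older code spells this as asyncio.get_event_loop().run_until_complete(main());
# asyncio.run is the modern equivalent
print(asyncio.run(main()))  # [1, 4, 9]
```

Because all three tasks are scheduled before any is awaited, their waits overlap instead of running back to back.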
4. Futures

A future is a special, low-level awaitable object that represents an eventual result of an asynchronous operation.
When a Future object is awaited, the coroutine will wait until the Future is resolved somewhere else.
We will look into the sample code for Future objects in the next section.
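As a quick sketch of that behavior (written for illustration, not taken from the original post): the awaiting coroutine suspends until some other coroutine calls set_result on the Future.

```python
import asyncio

async def resolve_later(future):
    await asyncio.sleep(0.1)
    future.set_result("resolved elsewhere")  # resolve the Future from another coroutine

async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()            # a bare, low-level Future
    asyncio.create_task(resolve_later(future))
    return await future                      # suspends until set_result is called

print(asyncio.run(main()))
```

In everyday code you rarely create Futures by hand; Tasks (which are Futures) cover most use cases.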
A Comparison Between Multithreading and Async IO
Before we get to Async IO, let’s use multithreading as a benchmark and then compare them to see which is more efficient.
For this benchmark, we will be fetching data from a sample URL (the Velotio Career webpage) with different frequencies, like once, ten times, 50 times, 100 times, 500 times, respectively.
We will then compare the time taken by both of these approaches to fetch the required data.
Implementation
Code of Multithreading:
import requests
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_url_data(pg_url):
    try:
        resp = requests.get(pg_url)
    except Exception as e:
        print(f"Error occurred while fetching data from url {pg_url}")
    else:
        return resp.content

def get_all_url_data(url_list):
    with ThreadPoolExecutor() as executor:
        resp = executor.map(fetch_url_data, url_list)
    return resp

if __name__ == '__main__':
    url = "https://www.velotio.com/careers"
    for ntimes in [1, 10, 50, 100, 500]:
        start_time = time.time()
        responses = get_all_url_data([url] * ntimes)
        print(f'Fetch total {ntimes} urls and process takes {time.time() - start_time} seconds')

Output

Fetch total 1 urls and process takes 1.8822264671325684 seconds
Fetch total 10 urls and process takes 2.3358211517333984 seconds
Fetch total 50 urls and process takes 8.05638575553894 seconds
Fetch total 100 urls and process takes 14.43302869796753 seconds
Fetch total 500 urls and process takes 65.25404500961304 seconds

ThreadPoolExecutor, from Python's concurrent.futures module, implements the Executor interface. The fetch_url_data function fetches the data from the given URL using the requests package, and the get_all_url_data function maps fetch_url_data over the list of URLs.
Async IO Programming Example:
import asyncio
import time
from aiohttp import ClientSession, ClientResponseError

async def fetch_url_data(session, url):
    try:
        async with session.get(url, timeout=60) as response:
            resp = await response.read()
    except Exception as e:
        print(e)
    else:
        return resp
    return

async def fetch_async(loop, r):
    url = "https://www.velotio.com/careers"
    tasks = []
    async with ClientSession() as session:
        for i in range(r):
            task = asyncio.ensure_future(fetch_url_data(session, url))
            tasks.append(task)
        responses = await asyncio.gather(*tasks)
    return responses

if __name__ == '__main__':
    for ntimes in [1, 10, 50, 100, 500]:
        start_time = time.time()
        loop = asyncio.get_event_loop()
        future = asyncio.ensure_future(fetch_async(loop, ntimes))
        # will run until it finishes or hits an error
        loop.run_until_complete(future)
        responses = future.result()
        print(f'Fetch total {ntimes} urls and process takes {time.time() - start_time} seconds')
Output
Fetch total 1 urls and process takes 1.3974951362609863 seconds
Fetch total 10 urls and process takes 1.4191942596435547 seconds
Fetch total 50 urls and process takes 2.6497368812561035 seconds
Fetch total 100 urls and process takes 4.391665458679199 seconds
Fetch total 500 urls and process takes 4.960426330566406 seconds
We need to use the get_event_loop function to create the loop and add the tasks. To run more than one URL, we have to use the ensure_future and gather functions.
The fetch_async function adds the tasks to the event loop object, and the fetch_url_data function reads the data from the URL using the session object. The future.result() method returns the responses of all the tasks.
Results:
As you can see from the plot, async programming is much more efficient than multi-threading for the program above.
The graph of the multithreading program looks linear, while the asyncio program graph is similar to logarithmic.
Conclusion
As we saw in our experiment above, Async IO showed better performance than multithreading, with more efficient use of concurrency.
Async IO can be beneficial in applications that can exploit concurrency. However, whether it is pragmatic to choose Async IO over other implementations depends on the kind of application we are dealing with.
We hope this article helped further your understanding of the async feature in Python and gave you some quick hands-on experience using the code examples shared above.