According to the OpenAI Gym GitHub repository, “OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. This is the gym open-source library, which gives you access to a standardized set of environments.”
OpenAI Gym has an environment-agent arrangement. It simply means Gym gives you access to an “agent” which can perform specific actions in an “environment”. In return, it gets the observation and reward as a consequence of performing a particular action in the environment.
There are four values that are returned by the environment for every “step” taken by the agent.
Observation (object): an environment-specific object representing your observation of the environment, for example, the board state in a board game.
Reward (float): the amount of reward/score achieved by the previous action. The scale varies between environments, but the goal is always to increase your total reward/score.
Done (boolean): whether it’s time to reset the environment again, e.g., you lost your last life in the game.
Info (dict): diagnostic information useful for debugging. However, official evaluations of your agent are not allowed to use this for learning.
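To make this four-value contract concrete, here is a minimal, self-contained sketch of the same step interface using a made-up CountdownEnv (not a real Gym environment; all names here are illustrative):

```python
# A toy environment that mimics Gym's (observation, reward, done, info)
# step contract. "CountdownEnv" is invented for illustration only.
class CountdownEnv:
    def reset(self):
        self.steps_left = 3
        return self.steps_left  # initial observation

    def step(self, action):
        self.steps_left -= 1
        observation = self.steps_left            # environment-specific object
        reward = 1.0                             # score achieved by this action
        done = self.steps_left == 0              # time to reset the environment
        info = {"steps_left": self.steps_left}   # diagnostics only
        return observation, reward, done, info

env = CountdownEnv()
obs = env.reset()
total_reward = 0.0
done = False
while not done:
    obs, reward, done, info = env.step(0)
    total_reward += reward
print(total_reward)  # → 3.0
```

The loop shape (act, receive the four values, accumulate reward, stop on `done`) is exactly how an agent interacts with a real Gym environment.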
Following are the available Environments in the Gym:
Classic control and toy text
Algorithmic
Atari
2D and 3D robots
Here you can find a full list of environments.
Cart-Pole Problem
Here we will try to solve a classic control problem from the reinforcement learning literature, “The Cart-Pole Problem”.
The Cart-pole problem is defined as follows: “A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force of +1 or -1 to the cart. The pendulum starts upright, and the goal is to prevent it from falling over. A reward of +1 is provided for every timestep that the pole remains upright. The episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center.”
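The termination rule quoted above is easy to express directly; the sketch below uses the thresholds from the quote (the actual CartPole-v0 constants may differ slightly):

```python
def episode_done(pole_angle_deg, cart_position):
    """Episode ends when the pole is more than 15 degrees from vertical,
    or the cart moves more than 2.4 units from the center (per the quote)."""
    return abs(pole_angle_deg) > 15 or abs(cart_position) > 2.4

print(episode_done(5, 0.0))    # pole nearly upright, cart centered → False
print(episode_done(20, 0.0))   # pole has fallen too far → True
print(episode_done(0, -3.0))   # cart ran off the track → True
```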
The following code will quickly let you see what the problem looks like on your computer.
import gym

env = gym.make('CartPole-v0')
env.reset()
for _ in range(1000):
    env.render()
    env.step(env.action_space.sample())  # take a random action
This is what the output will look like:
Coding the neural network
#We first import the necessary libraries and define hyperparameters -
import gym
import random
import numpy as np
import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
from statistics import median, mean
from collections import Counter

LR = 2.33e-4
env = gym.make("CartPole-v0")
observation = env.reset()
goal_steps = 500
score_requirement = 50
initial_games = 10000

#Now we will define a function to generate training data -
def initial_population():
    # [OBS, MOVES]
    training_data = []
    # all scores:
    scores = []
    # scores above our threshold:
    accepted_scores = []
    # number of episodes
    for _ in range(initial_games):
        score = 0
        # moves specifically from this episode:
        episode_memory = []
        # previous observation that we saw
        prev_observation = []
        for _ in range(goal_steps):
            # choose random action left or right i.e. (0 or 1)
            action = random.randrange(0, 2)
            observation, reward, done, info = env.step(action)
            # since the observation is returned FROM the action,
            # we store the previous observation and corresponding action
            if len(prev_observation) > 0:
                episode_memory.append([prev_observation, action])
            prev_observation = observation
            score += reward
            if done:
                break
        # reinforcement methodology here:
        # IF our score is higher than our threshold, we save it.
        # All we're doing is reinforcing the score; we're not trying
        # to influence the machine in any way as to HOW that score is reached.
        if score >= score_requirement:
            accepted_scores.append(score)
            for data in episode_memory:
                # convert to one-hot (this is the output layer for our neural network)
                if data[1] == 1:
                    output = [0, 1]
                elif data[1] == 0:
                    output = [1, 0]
                # saving our training data
                training_data.append([data[0], output])
        # reset env to play again
        env.reset()
        # save overall scores
        scores.append(score)
    return training_data

#Now using tflearn we will define our neural network -
def neural_network_model(input_size):
    network = input_data(shape=[None, input_size, 1], name='input')
    network = fully_connected(network, 128, activation='relu')
    network = dropout(network, 0.8)
    network = fully_connected(network, 256, activation='relu')
    network = dropout(network, 0.8)
    network = fully_connected(network, 512, activation='relu')
    network = dropout(network, 0.8)
    network = fully_connected(network, 256, activation='relu')
    network = dropout(network, 0.8)
    network = fully_connected(network, 128, activation='relu')
    network = dropout(network, 0.8)
    network = fully_connected(network, 2, activation='softmax')
    network = regression(network, optimizer='adam', learning_rate=LR,
                         loss='categorical_crossentropy', name='targets')
    model = tflearn.DNN(network, tensorboard_dir='log')
    return model

#It is time to train the model now -
def train_model(training_data, model=False):
    X = np.array([i[0] for i in training_data]).reshape(-1, len(training_data[0][0]), 1)
    y = [i[1] for i in training_data]
    if not model:
        model = neural_network_model(input_size=len(X[0]))
    model.fit({'input': X}, {'targets': y}, n_epoch=5, snapshot_step=500,
              show_metric=True, run_id='openai_CartPole')
    return model

training_data = initial_population()
model = train_model(training_data)

#Training complete, now we play the game to see what the output looks like
scores = []
choices = []
for each_game in range(10):
    score = 0
    game_memory = []
    prev_obs = []
    env.reset()
    for _ in range(goal_steps):
        env.render()
        if len(prev_obs) == 0:
            action = random.randrange(0, 2)
        else:
            action = np.argmax(model.predict(prev_obs.reshape(-1, len(prev_obs), 1))[0])
        choices.append(action)
        new_observation, reward, done, info = env.step(action)
        prev_obs = new_observation
        game_memory.append([new_observation, action])
        score += reward
        if done:
            break
    scores.append(score)

print('Average Score:', sum(scores) / len(scores))
print('choice 1: {}  choice 0: {}'.format(
    choices.count(1) / len(choices) * 100,
    choices.count(0) / len(choices) * 100))
print(score_requirement)
This is what the result will look like:
Conclusion
Though we haven’t used a reinforcement learning model in this blog, a normal fully connected neural network gave us a satisfactory accuracy of 60%. We used tflearn, a higher-level API on top of TensorFlow, to speed up experimentation. We hope this blog gives you a head start in using OpenAI Gym.
We are waiting to see exciting implementations using Gym and Reinforcement Learning. Happy Coding!
A Custom Resource Definition (CRD) is a powerful feature introduced in Kubernetes 1.7 which enables users to add their own custom objects to the Kubernetes cluster and use them like any other native Kubernetes object. In this blog post, we will see how to add a custom resource to a Kubernetes cluster using the command line as well as the Golang client library, thereby also learning how to programmatically interact with a Kubernetes cluster.
What is a Custom Resource Definition (CRD)?
In the Kubernetes API, every resource is an endpoint that stores API objects of a certain kind. For example, the built-in service resource contains a collection of service objects. The standard Kubernetes distribution ships with many built-in API objects/resources. CRDs come into the picture when we want to introduce our own objects into the Kubernetes cluster to fulfill our requirements. Once we create a CRD in Kubernetes, we can use it like any other native Kubernetes object, thus leveraging all the features of Kubernetes like its CLI, security, API services, RBAC, etc.
The custom resource created is also stored in the etcd cluster with proper replication and lifecycle management. CRD allows us to use all the functionalities provided by a Kubernetes cluster for our custom objects and saves us the overhead of implementing them on our own.
How to register a CRD using command line interface (CLI)
Step-1: Create a CRD definition file sslconfig-crd.yaml
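The contents of sslconfig-crd.yaml are not reproduced in this export; based on the description that follows, the file would look roughly like this (the group, names, and validation fields are inferred from the text):

```yaml
# sslconfig-crd.yaml (illustrative reconstruction)
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  # must be spec.names.plural + "." + spec.group
  name: sslconfigs.blog.velotio.com
spec:
  group: blog.velotio.com
  version: v1alpha1
  scope: Namespaced
  names:
    plural: sslconfigs
    singular: sslconfig
    kind: SslConfig
  validation:
    openAPIV3Schema:
      properties:
        spec:
          required:
            - cert
            - key
            - domain
```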
Here we are creating a custom resource definition for an object of kind SslConfig. This object allows us to store the SSL configuration information for a domain. As we can see under the validation section, specifying the cert, the key, and the domain is mandatory for creating objects of this kind; along with these, we can store other information, like the provider of the certificate. The metadata name that we specify must be spec.names.plural + "." + spec.group.
An API group (blog.velotio.com here) is a collection of API objects which are logically related to each other. We have also specified a version for our custom objects (spec.version); as the definition of the object is expected to change/evolve in the future, it is better to start with alpha so that users of the object know the definition might change later. For the scope, we have specified Namespaced; by default, a custom resource is cluster-scoped.
Along with the mandatory fields cert, key, and domain, we have also stored the information of the provider (certifying authority) of the cert.
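With the CRD registered, an object of this kind could then be created via kubectl create -f with a manifest along these lines (the provider value is illustrative; the other values mirror the Go example later in the post):

```yaml
apiVersion: blog.velotio.com/v1alpha1
kind: SslConfig
metadata:
  name: sslconfigobj
  namespace: default
spec:
  cert: my-cert             # mandatory
  key: my-key               # mandatory
  domain: "*.velotio.com"   # mandatory
  provider: lets-encrypt    # optional: the certifying authority
```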
How to register a CRD programmatically using client-go
The client-go project provides packages with which we can easily create a Go client and access the Kubernetes cluster. To create a client, we first need to establish a connection with the API server. How we connect to the API server depends on whether we access it from within the cluster (our code running in the Kubernetes cluster itself) or from outside the cluster (locally).
If the code is running outside the cluster then we need to provide either the path of the config file or URL of the Kubernetes proxy server running on the cluster.
var (
	// Set during build
	version string

	proxyURL = flag.String("proxy", "", `If specified, it is assumed that a kubectl proxy server is running on the
given url and creates a proxy client. In case it is not given, InCluster
kubernetes setup will be used`)
)

if *proxyURL != "" {
	config, err = clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
		&clientcmd.ClientConfigLoadingRules{},
		&clientcmd.ConfigOverrides{
			ClusterInfo: clientcmdapi.Cluster{
				Server: *proxyURL,
			},
		}).ClientConfig()
	if err != nil {
		glog.Fatalf("error creating client configuration: %v", err)
	}
}
When the code is to be run as part of the cluster, we can simply use rest.InClusterConfig().
Once the connection is established, we can use it to create a clientset. For accessing Kubernetes objects, the clientset from the client-go project is generally used, but for CRD-related operations we need the clientset from the apiextensions-apiserver project.
In the create CRD function, we first create the definition of our custom object and then pass it to the create method which creates it in our cluster. Just like we did while creating our definition using CLI, here also we set the parameters like version, group, kind etc.
Once our definition is ready we can create objects of its type just like we did earlier using the CLI. First we need to define our object.
Kubernetes API conventions suggest that each object must have two nested object fields that govern the object’s configuration: the object spec and the object status. Objects must also have metadata associated with them. The custom objects we define here comply with these standards. It is also recommended to create a list type for every type, so we have also created an SslConfigList struct.
Now we need to write a function which will create a custom client which is aware of the new resource that we have created.
Once we have registered our custom resource definition with the Kubernetes cluster, we can create objects of its type using the Kubernetes CLI as we did earlier. But to build controllers for these objects, or to develop custom functionality around them, we also need a client library through which we can access them from the Go API. For native Kubernetes objects, this type of library is provided for each object.
We can add more methods, like watch, update status, etc.; their implementation will be similar to the methods defined above. To see the methods available for various Kubernetes objects like pod, node, etc., we can refer to the v1 package.
Putting all things together
Now, in our main function, we will put everything together.
package main

import (
	"flag"
	"fmt"
	"time"

	"blog.velotio.com/crd-blog/v1alpha1"
	"github.com/golang/glog"
	apiextension "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
	clientcmdapi "k8s.io/client-go/tools/clientcmd/api"
)

var (
	// Set during build
	version string

	proxyURL = flag.String("proxy", "", `If specified, it is assumed that a kubectl proxy server is running on the
given url and creates a proxy client. In case it is not given, InCluster
kubernetes setup will be used`)
)

func main() {
	flag.Parse()

	var err error
	var config *rest.Config
	if *proxyURL != "" {
		config, err = clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
			&clientcmd.ClientConfigLoadingRules{},
			&clientcmd.ConfigOverrides{
				ClusterInfo: clientcmdapi.Cluster{
					Server: *proxyURL,
				},
			}).ClientConfig()
		if err != nil {
			glog.Fatalf("error creating client configuration: %v", err)
		}
	} else {
		if config, err = rest.InClusterConfig(); err != nil {
			glog.Fatalf("error creating client configuration: %v", err)
		}
	}

	kubeClient, err := apiextension.NewForConfig(config)
	if err != nil {
		glog.Fatalf("Failed to create client: %v", err)
	}

	// Create the CRD
	err = v1alpha1.CreateCRD(kubeClient)
	if err != nil {
		glog.Fatalf("Failed to create crd: %v", err)
	}

	// Wait for the CRD to be created before we use it.
	time.Sleep(5 * time.Second)

	// Create a new clientset which includes our CRD schema
	crdclient, err := v1alpha1.NewClient(config)
	if err != nil {
		panic(err)
	}

	// Create a new SslConfig object
	sslConfig := &v1alpha1.SslConfig{
		ObjectMeta: meta_v1.ObjectMeta{
			Name:   "sslconfigobj",
			Labels: map[string]string{"mylabel": "test"},
		},
		Spec: v1alpha1.SslConfigSpec{
			Cert:   "my-cert",
			Key:    "my-key",
			Domain: "*.velotio.com",
		},
		Status: v1alpha1.SslConfigStatus{
			State:   "created",
			Message: "Created, not processed yet",
		},
	}

	// Create the SslConfig object we created above in the k8s cluster
	resp, err := crdclient.SslConfigs("default").Create(sslConfig)
	if err != nil {
		fmt.Printf("error while creating object: %v\n", err)
	} else {
		fmt.Printf("object created: %v\n", resp)
	}

	obj, err := crdclient.SslConfigs("default").Get(sslConfig.ObjectMeta.Name)
	if err != nil {
		glog.Infof("error while getting the object %v\n", err)
	}
	fmt.Printf("SslConfig Objects Found: \n%v\n", obj)

	select {}
}
Now, if we run our code, our custom resource definition will be created in the Kubernetes cluster, along with an object of its type, just like with the CLI. The Docker image akash125/crdblog is built using the code discussed above; it can be pulled directly from Docker Hub and run in a Kubernetes cluster. After the image runs successfully, we can verify the CRD and the object using the CLI the way we did earlier, or check the logs of the pod running the image. The complete code is available here.
Conclusion and future work
We learned how to create a custom resource definition and objects using the Kubernetes command line interface as well as the Golang client. We also learned how to programmatically access a Kubernetes cluster, which lets us build some really cool stuff on Kubernetes. We can now also create custom controllers for our resources that continuously watch the cluster for various lifecycle events of our objects and take the desired action accordingly. To read more about CRDs, refer to the following links:
Recently, I needed to develop a complex Jenkins plugin for a customer in the containers & DevOps space. In the process, I realized that there is a lack of good documentation on Jenkins plugin development, and good information is very hard to find. That’s why I decided to write this blog to share my knowledge of Jenkins plugin development.
Topics covered in this Blog
Setting up the development environment
Jenkins plugin architecture: Plugin classes and understanding of the source code.
Complex tasks: Tasks like the integration of REST API in the plugin and exposing environment variables through source code.
Plugin debugging and deployment
So let’s start, shall we?
1. Setting up the development environment
I have used Ubuntu 16.04 for this environment, but the steps remain identical for other flavors. The only difference will be in the commands used for each operating system.
Let me give you a brief list of the requirements:
Compatible JDK: Jenkins plugin development is done in Java. Thus a compatible JDK is what you need first. JDK 6 and above are supported as per the Jenkins documentation.
Maven: Installation guide. I know many of us don’t like to use Maven, as it downloads stuff over the Internet at runtime but it’s required. Check this to understand why using Maven is a good idea.
Jenkins: Check this Installation Guide. Obviously, you will need a Jenkins setup – it can be local or hosted on a server/VM.
IDE for development: An IDE like Netbeans, Eclipse or IntelliJ IDEA is preferred. I have used Netbeans 8.1 for this project.
Before going forward, please ensure that you have the above prerequisites installed on your system. Jenkins does have official documentation for setting up the environment – Check this. If you would like to use an IDE besides Netbeans, the above document covers that too.
Let’s start with the creation of your project. I will explain with Maven commands and with use of the IDE as well.
First, let’s start with the approach of using commands.
It may be helpful to add the following to your ~/.m2/settings.xml (Windows users will find it in %USERPROFILE%\.m2\settings.xml):
<settings>
  <pluginGroups>
    <pluginGroup>org.jenkins-ci.tools</pluginGroup>
  </pluginGroups>
  <profiles>
    <!-- Give access to Jenkins plugins -->
    <profile>
      <id>jenkins</id>
      <activation>
        <activeByDefault>true</activeByDefault>
        <!-- change this to false, if you don't like to have it on per default -->
      </activation>
      <repositories>
        <repository>
          <id>repo.jenkins-ci.org</id>
          <url>http://repo.jenkins-ci.org/public/</url>
        </repository>
      </repositories>
      <pluginRepositories>
        <pluginRepository>
          <id>repo.jenkins-ci.org</id>
          <url>http://repo.jenkins-ci.org/public/</url>
        </pluginRepository>
      </pluginRepositories>
    </profile>
  </profiles>
  <mirrors>
    <mirror>
      <id>repo.jenkins-ci.org</id>
      <url>http://repo.jenkins-ci.org/public/</url>
      <mirrorOf>m.g.o-public</mirrorOf>
    </mirror>
  </mirrors>
</settings>
This basically lets you use short names in commands e.g. instead of org.jenkins-ci.tools:maven-hpi-plugin:1.61:create, you can use hpi:create. hpi is the packaging style used to deploy the plugins.
Now run mvn -U hpi:create. This will ask you a few questions, like the groupId (the Maven jargon for the package name) and the artifactId (the Maven jargon for your project name), and then create a skeleton plugin from which you can start. This command should create the sample HelloWorldBuilder plugin.
Command Explanation:
-U: Maven needs to update the relevant Maven plugins (check plugin updates).
hpi: this prefix specifies that the Jenkins HPI Plugin is being invoked, a plugin that supports the development of Jenkins plugins.
create: the goal which creates the directory layout and the POM for your new Jenkins plugin and adds it to the module list.
Source code tree would be like this:
Your Project Name
  pom.xml
  src
    main
      java
        package folder (usually consists of groupId and artifactId)
          HelloWorldBuilder.java
      resources
        package folder/HelloWorldBuilder/ (jelly files)
Run “mvn package” which compiles all sources, runs the tests and creates a package – when used by the HPI plugin it will create an *.hpi file.
Building the Plugin:
Run mvn install in the directory where pom.xml resides. This is similar to the mvn package command, but at the end it creates your plugin’s .hpi file, which you can deploy. Simply copy the created .hpi file to the /plugins folder of your Jenkins setup. Restart Jenkins and you should see the plugin in Jenkins.
In Netbeans, you can now create a plugin project via New Project » Maven » Jenkins Plugin.
This is the same as “mvn -U org.jenkins-ci.tools:maven-hpi-plugin:create” command which should create the simple “HelloWorldBuilder” application.
Netbeans comes with Maven built-in, so even if you don’t have Maven installed on your system, this should work. But you may face an error accessing the Jenkins repo. Remember that we added some configuration in settings.xml in the very first step. If you have added that already, you shouldn’t face any problem; if you haven’t, you can add it to the Netbeans Maven settings.xml, which you can find at: netbeans_installation_path/java/maven/conf/settings.xml
Now you have your “HelloWorldBuilder” application ready. This is shown as the TODO plugin in Netbeans. Simply run it (F6). This creates a Jenkins instance and runs it on port 8080. If you already have a local Jenkins setup, you need to stop it first; otherwise this will give you an exception. Go to localhost:8080/jenkins and create a simple job. Under “Add Build Step” you should see the “Say Hello World” plugin already there.
Now how it got there and the source code explanation is next.
2. Jenkins plugin architecture and understanding
Now that we have our sample HelloWorldBuilder plugin ready, let’s see its components.
As you may know, a Jenkins plugin has two parts: the Build Step and the Post-Build Step. This sample application is designed to work as a Build Step, which is why you see the “Say Hello World” plugin in the Build Step. I am going to cover the Build Step itself.
Do you want to develop a Post-Build plugin? Don’t worry, as the two don’t differ much. The difference is only in the classes we extend: for a Build Step we extend hudson.tasks.Builder, while for Post-Build we extend hudson.tasks.Recorder; likewise, the Descriptor class for a Build Step is BuildStepDescriptor<Builder>, and for Post-Build it is BuildStepDescriptor<Publisher>.
We will go through these classes in detail below:
hudson.tasks.Builder Class:
In brief, this simply tells Jenkins that you are writing a Build Step plugin. A full explanation is here. You will see the “perform” method once you extend this class.
Note that we are not implementing the “SimpleBuildStep” interface, which is there in the HelloWorldBuilder source code. The perform method for that interface is a bit different from the one discussed here. My explanation goes around this perform method.
The perform method is called when you run your build. If you look at the parameters passed, you have full control over the configured build, and you can log to the Jenkins console using the listener object. What you should do here is access the values set by the user on the UI and perform the plugin’s activity. Note that this method returns a boolean: true means the build succeeded, and false means the build failed.
Understanding the Descriptor Class:
You will notice there is a static inner class in your main class named DescriptorImpl. This class is used for handling the configuration of your plugin. When you click the “Configure” link in Jenkins, it basically calls this class and loads the configured data.
You can perform validations here, save the global configuration, and more. We will see these in detail as and when required. There is also an overridden method, getDisplayName(), which returns the label shown for the build step.
That’s why we see “Say Hello World” in the Build Step. You can rename it to describe what your plugin does.
@Override
public boolean configure(StaplerRequest req, JSONObject formData) throws FormException {
    // To persist global configuration information,
    // set that to properties and call save().
    useFrench = formData.getBoolean("useFrench");
    // ^Can also use req.bindJSON(this, formData);
    // (easier when there are many fields; need set* methods for this, like setUseFrench)
    save();
    return super.configure(req, formData);
}
This method basically saves your configuration. You can also get global data here, like the “useFrench” attribute, which can be set from the Jenkins global configuration. If you would like to add any global parameter, you can place it in the global.jelly file.
Understanding Action class and jelly files:
To understand the main Action class and its purpose, let’s first understand the jelly files.
There are two main jelly files: config.jelly and global.jelly. The global.jelly file is used to set global parameters while config.jelly is used for local parameters configuration. Jenkins uses these jelly files to show the parameters or fields on UI. So anything you write in config.jelly will show up on Jobs configuration page as configurable.
This is what is there in our HelloWorldBuilder application. It simply renders a textbox for entering name.
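The config.jelly file itself is not shown in this export; in the standard HelloWorldBuilder archetype it is essentially the following (a single f:entry rendering a textbox bound to the name field):

```xml
<j:jelly xmlns:j="jelly:core" xmlns:st="jelly:stapler" xmlns:f="/lib/form">
  <!-- Renders one textbox; the "field" attribute binds it to the "name"
       parameter of the @DataBoundConstructor -->
  <f:entry title="Name" field="name">
    <f:textbox />
  </f:entry>
</j:jelly>
```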
Jelly has its own syntax and supports HTML and Javascript as well. It has radio buttons, checkboxes, dropdown lists and so on.
How does Jenkins manage to pull the data set by the user? This is where our Action class comes into the picture. If you look at the structure of the sample application, it has a private field named name and a constructor.
The @DataBoundConstructor annotation tells Jenkins to bind the values of the jelly fields. Notice there is a field named “name” in the jelly file, and the same name is used here to receive the data. Note that whatever name you set in the field attribute of the jelly file must be used here as well, as they are tightly coupled.
Also, add getters for these fields so that Jenkins can access the values.
The getDescriptor() method gives you the instance of the Descriptor class. So if you want to access methods or properties of the Descriptor class from your Action class, you can use this.
3. Complex tasks:
We now have a good idea on how the Jenkins plugin structure is and how it works. Now let’s start with some complex stuff.
On the internet, there are examples of how to render a selection box (drop-down) with static data. But what if you want to load it dynamically? I came up with the solution below: we will use Amazon’s publicly available REST API to fetch coupons and load that data into the selection box.
Here, the objective is to load the data in the selection box. I have the response for REST API as below:
I have taken all these offers and created one dictionary and rendered it on UI. Thus the user will see the list of coupon codes and can choose anyone of them.
Let’s understand how to create the selection box and load the data into it.
<f:entry title="select Offer From Amazon" field="getOffer">
  <f:select id="offer-${editorId}" onfocus="getOffers(this.id)"/>
</f:entry>
This is the code which generates the selection box on the configuration page. Notice the “getOffer” field here; it means there is a field with the same name in the Action class.
When you create any selection box, Jenkins needs a doFill{fieldname}Items method in your Descriptor class. As we have seen, the Descriptor class is the configuration class, and Jenkins tries to load the data from this method when you open the job’s configuration. So in this case, a “doFillGetOfferItems” method is required.
After this, the selection box should show up on the configuration page of your plugin.
Since we need dynamic behavior, we will now perform an action and load the data.
As an example, we will click a button and load the data into the selection box.
Above is the code to create a button. In the method attribute, specify the backend method, which should be present in your Descriptor class. So when you click this button, the “getAmazonOffers” method is called on the backend, and it gets the data from the API.
Now, when we click on the selection box, we need to show the contents. As I said earlier, Jelly supports HTML and JavaScript, so for dynamic behavior we simply use JavaScript. In the selection box jelly code, I have used the onfocus attribute, which points to the getOffers() JavaScript function.
Now you need to define this function inside a script tag, like this:
<script>
  function getOffers() {
  }
</script>
Here we need to get the data from the backend and load it into the selection box. To do this, we need to understand some Jenkins objects.
Descriptor: As you now know, the Descriptor is the configuration class this object points to. From jelly, at any point, you can call a method from your Descriptor class.
Instance: This is the object currently being configured on the configuration page (null if it’s a newly added instance). Using it, you can call methods from your Action class, like the getters for the field attributes.
Now, how do we use these objects? First, we need to bind them.
<st:bind var="backend" value="${descriptor}"/>
Here you are binding the descriptor object to the backend variable, which is now ready for use anywhere in config.jelly. Similarly, for the instance: <st:bind var="backend" value="${instance}"/>.
To make calls, use backend.{backend method name}() and it will call your backend method.
But if you are calling this from JavaScript, then you need to use the @JavaScriptMethod annotation on the method being called.
We can now get the REST data from the backend function in JavaScript; to load the data into the element, we can use JavaScript’s document object.
E.g. var selection = document.getElementById("element-id"); This part is plain JavaScript.
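Putting the binding and the loader together, the relevant part of config.jelly might look like this sketch (the getAmazonOffers method name and the shape of its response are assumptions based on the description above):

```xml
<st:bind var="backend" value="${descriptor}"/>
<script>
  function getOffers(id) {
    /* "backend" is the Stapler proxy for the Descriptor; the called method
       must carry the @JavaScriptMethod annotation on the Java side. */
    backend.getAmazonOffers(function (t) {
      var offers = t.responseObject();             // assumed: a dictionary of coupon codes
      var selection = document.getElementById(id); // the f:select element
      for (var code in offers) {
        var opt = document.createElement("option");
        opt.value = code;
        opt.text = offers[code];
        selection.add(opt);
      }
    });
  }
</script>
```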
So after clicking on “Get Amazon Offers” button and clicking on Selection box it should now load the data.
Multiple Plugin Instances: If we are creating a Build Step plugin, the user can add multiple instances of it while configuring a job. If you try what we have done up till now, it will fail to load the data in the second instance. This is because an element with the same id already exists on the UI, and JavaScript gets confused about where to put the data. We need a mechanism to create different ids for the same fields.
I thought of one approach for this: get an index from the backend while configuring the fields and add it as a suffix to the id attribute.
In this manner, we set the ID value in the variable “editorId”, which can then be used while creating the fields.
(Check out the selection box creation code above. I have appended this variable to the id attribute.)
Now create as many instances as you want on the configuration page; it should work fine.
Exposing Environment Variables:
Environment variables are needed quite often in Jenkins. Your plugin may require the support of some environment variables or the use of the built-in environment variables provided by Jenkins.
First, you need to create an EnvVars object.
EnvVars envVars = new EnvVars();
// Assign it to the build environment.
envVars = build.getEnvironment(listener);
// Put the values which you want to expose as environment variables.
envVars.put("offer", getOffer);
If you print this, you will get all the default Jenkins environment variables as well as the variables you have exposed. Using this, you can even use third-party plugins like the “Parameterized Trigger Plugin” to export the current build’s environment variables to different jobs. You can also get the value of any environment variable this way.
4. Plugin Debugging and Deployment:
You now have an idea of how to write a Jenkins plugin, so let’s look at how to debug issues and deploy the plugin. If you are using an IDE, debugging works the same as for any Java program: set up breakpoints and run the project.
If you want to perform any validation on fields, the configuration class needs a doCheck{fieldname} method which returns a FormValidation object. In this example, we validate the “name” field from our sample “HelloWorldBuilder” plugin.
public FormValidation doCheckName(@QueryParameter String value)
        throws IOException, ServletException {
    if (value.length() == 0)
        return FormValidation.error("Please set a name");
    if (value.length() < 4)
        return FormValidation.warning("Isn't the name too short?");
    return FormValidation.ok();
}
Plugin deployment:
We have now created the plugin; how do we deploy it? We created the plugin using the NetBeans IDE and, as mentioned earlier, to deploy it on your local Jenkins setup you need to run mvn install and copy the .hpi file to the /plugins/ folder.
But what if you want to deploy it on Jenkins Marketplace? Well, it’s a pretty long process and thankfully Jenkins has good documentation for it.
In short, you need a jenkins-ci.org account, and your plugin source code must be in a public Git repository. Raise an issue on JIRA to request space in their repository; as part of this process, they will fork your Git repo. Finally, release the plugin using Maven. The documentation above explains exactly what needs to be done.
Conclusion:
We went through the basics of Jenkins plugin development such as classes, configuration, and some complex tasks.
Jenkins plugin development is not difficult, but I feel the poor documentation is what makes the task challenging. I have tried to cover what I learned while developing the plugin; however, it is advisable to create a plugin only if the required functionality does not already exist.
Below are some important links on plugin development:
Jenkins post build plugin development: This is a very good blog which covers things like setting up the environment, plugin classes and developing Post build action.
Basic guide to use jelly: This covers how to use jelly files in Jenkins and attributes of jelly.
You can check the code of the sample application discussed in this blog here. I hope this helps you to build interesting Jenkins plugins. Happy Coding!!
Recently, I was involved in building an ETL (Extract-Transform-Load) pipeline. It involved extracting data from MongoDB collections, performing transformations, and then loading the data into Redshift tables. Many ETL solutions on the market solve parts of this problem, but the key part of an ETL process lies in its ability to transform or process raw data before it is pushed to its destination.
Each ETL pipeline comes with a specific business requirement around processing data which is hard to achieve using off-the-shelf ETL solutions. This is why a majority of ETL solutions are custom built manually, from scratch. In this blog, I am going to talk about my learning from building a custom ETL solution which involved moving data from MongoDB to Redshift using Apache Airflow.
Background:
I began by writing a Python-based command line tool which supported different phases of ETL, like extracting data from MongoDB, processing extracted data locally, uploading the processed data to S3, loading data from S3 to Redshift, post-processing and cleanup. I used the PyMongo library to interact with MongoDB and the Boto library for interacting with Redshift and S3.
I kept each operation atomic so that multiple instances of each operation could run independently of each other, which helps achieve parallelism. One of the major challenges was to achieve parallelism while running the ETL tasks. One option was to develop our own framework based on threads, or to build a distributed task scheduler using a tool like Celery combined with a message broker like RabbitMQ. After doing some research I settled on Apache Airflow. Airflow is a Python-based scheduler in which you define DAGs (Directed Acyclic Graphs) that run on a given schedule and run the tasks in each phase of your ETL in parallel. You define a DAG as Python code, and Airflow also enables you to handle the state of your DAG run using environment variables. Features like task retries on failure are a plus.
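As a sketch of what such a DAG looks like: the DAG id, task names, commands, and schedule below are illustrative assumptions, not the actual pipeline; each atomic CLI operation from the tool becomes one Airflow task.

```python
# Illustrative only: names and commands are assumptions, not the actual pipeline.
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    'owner': 'etl',
    'retries': 2,                        # retry failed tasks automatically
    'retry_delay': timedelta(minutes=5),
}

dag = DAG('mongo_to_redshift',
          default_args=default_args,
          start_date=datetime(2017, 1, 1),
          schedule_interval='@hourly')

extract = BashOperator(task_id='extract_mongo', bash_command='etl extract', dag=dag)
upload = BashOperator(task_id='upload_s3', bash_command='etl upload', dag=dag)
load = BashOperator(task_id='load_redshift', bash_command='etl load', dag=dag)

# Phases run in order; within a phase, Airflow can run many such tasks in parallel.
extract >> upload >> load
```

Because the operations are atomic, several extract tasks for different collections could sit side by side in the first phase and run concurrently.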
We faced several challenges while making the above ETL workflow near real-time and fault tolerant. We discuss the challenges and their solutions below:
Keeping your ETL code changes in sync with Redshift schema
While you are building the ETL tool, you may end up fetching a new field from MongoDB, but at the same time, you have to add that column to the corresponding Redshift table. If you fail to do so the ETL pipeline will start failing. In order to tackle this, I created a database migration tool which would become the first step in my ETL workflow.
The migration tool would:
keep the migration status in a Redshift table and
would track all migration scripts in a code directory.
In each ETL run, it would fetch the most recently run migrations from Redshift and search for any new migration scripts in the code directory. If it found any, it would run them, after which the regular ETL tasks would run. This puts the onus on the developer to add a migration script whenever they make a change such as adding or removing a field fetched from MongoDB.
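The core of that migration step can be sketched in a few lines of Python; the function name, file layout, and `.sql` convention here are my own illustrative assumptions, not the actual tool:

```python
import os

def pending_migrations(applied, scripts_dir):
    """Return migration scripts on disk that are not yet recorded in the
    Redshift migrations table, in lexicographic (i.e. creation) order.
    'applied' is the list of script names already run."""
    available = sorted(f for f in os.listdir(scripts_dir) if f.endswith('.sql'))
    return [f for f in available if f not in set(applied)]
```

In each ETL run, `applied` would be fetched from the Redshift migrations table; each pending script is then executed and recorded before the regular tasks start.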
Maintaining data consistency
While extracting data from MongoDB, one needs to ensure all the collections are extracted at a specific point in time else there can be data inconsistency issues. We need to solve this problem at multiple levels:
While extracting data from MongoDB, fix a parameter like the modified date and extract records from the different collections with a filter of records less than or equal to that date. This ensures you fetch point-in-time data from MongoDB.
While loading data into Redshift tables, don’t load directly into the master table; instead, load into a staging table. Once you are done loading data into staging for all related collections, load it into master from staging within a single transaction. This way data is either updated in all related tables or in none of them.
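The staging-to-master step can be sketched as a single transaction built in Python. The table and key names below are placeholders, and a real merge may need a per-table UPSERT (delete-then-insert on the business key) instead of this simplified form:

```python
def staging_merge_sql(master_tables):
    """Build one transaction that moves rows from each staging table into
    its master table. Table names and the 'id' key are illustrative."""
    stmts = ['BEGIN;']
    for master in master_tables:
        staging = master + '_staging'
        # Replace rows that already exist, then append the staged rows.
        stmts.append('DELETE FROM {0} USING {1} WHERE {0}.id = {1}.id;'.format(master, staging))
        stmts.append('INSERT INTO {0} SELECT * FROM {1};'.format(master, staging))
        stmts.append('TRUNCATE {0};'.format(staging))
    stmts.append('COMMIT;')
    return stmts
```

Every statement runs between one BEGIN/COMMIT pair, so either all related master tables see the new data or none of them do.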
A single bad record can break your ETL
While moving data across the ETL pipeline into Redshift, one needs to take care of field formats. For example, the Date field in the incoming data can be in a different format than the Redshift schema expects. Another example is incoming data exceeding the length of a field in the schema. Redshift’s COPY command, which is used to load data from files into Redshift tables, is very vulnerable to such variations in data. Even a single incorrectly formatted record can lead to all your data getting rejected, effectively breaking the ETL pipeline.
There are multiple ways to solve this problem: either handle it in one of the transform jobs in the pipeline, or put the onus on Redshift to handle these variances. Redshift’s COPY command has many options which can help. Some of the most useful are:
ACCEPTANYDATE: Allows any date format, including invalid formats such as 00/00/00 00:00:00, to be loaded without generating an error.
ACCEPTINVCHARS: Enables loading of data into VARCHAR columns even if the data contains invalid UTF-8 characters.
TRUNCATECOLUMNS: Truncates data in columns to the appropriate number of characters so that it fits the column specification.
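Putting those options together, a tolerant COPY can be assembled like this. The table, S3 path, and credentials string are placeholders, and GZIP assumes compressed input files:

```python
def build_copy_command(table, s3_path, credentials):
    """Assemble a Redshift COPY that tolerates the data issues described
    above. All argument values here are placeholders."""
    return ("COPY {0} FROM '{1}' CREDENTIALS '{2}' "
            "GZIP ACCEPTANYDATE ACCEPTINVCHARS TRUNCATECOLUMNS;"
            ).format(table, s3_path, credentials)
```

The resulting string would be executed against Redshift with any PostgreSQL client library.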
Redshift going out of storage
Redshift is based on PostgreSQL, and one of the common problems is that deleting records from Redshift tables does not actually free up space. So if your ETL process deletes and creates records frequently, you may run out of Redshift storage. The VACUUM operation is the solution to this problem. Instead of making VACUUM a part of your main ETL flow, define a separate workflow on its own schedule to run it. VACUUM reclaims space and re-sorts rows in either a specified table or all tables in the current database, and can run as FULL, SORT ONLY, DELETE ONLY, or REINDEX. More information on VACUUM can be found here.
ETL instance going out of storage
Your ETL will generate a lot of files while extracting data from MongoDB onto your ETL instance. It is very important to delete those files periodically, otherwise you are very likely to run out of storage on your ETL server. If your MongoDB data is huge, you might end up creating large files. Again, I would recommend defining a separate workflow on its own schedule to run the cleanup.
Making ETL Near Real Time
Processing only the delta rather than doing a full load in each ETL run
ETL is faster if you keep track of already-processed data and process only the new data. If you do a full load in each ETL run, the solution will not scale as your data grows. As a solution, we made it mandatory for every collection in our MongoDB to have a created and a modified date. Our ETL checks the maximum value of the modified date for the given collection in the Redshift table. It then generates a filter query to fetch only those records from MongoDB whose modified date is greater than that maximum. It may be difficult to make such changes in your product, but it’s worth the effort!
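The delta filter itself is tiny; a PyMongo-style sketch, where the field name `modified` is an assumption (use whatever your collections call it):

```python
from datetime import datetime

def delta_filter(max_modified):
    """Query filter that fetches only records changed since the last ETL run.
    'max_modified' is the maximum modified date already loaded into Redshift."""
    if max_modified is None:
        return {}                              # first run: full load
    return {'modified': {'$gt': max_modified}}
```

The returned dict would be passed straight to `collection.find(...)` in PyMongo.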
Compressing and splitting files while loading
A good approach is to write files in a compressed format. It saves storage space on the ETL server and also helps when you load data into Redshift: the Redshift COPY documentation suggests providing compressed files as input. Also, instead of a single huge file, split your files into parts and give all the parts to a single COPY command. This enables Redshift to use its computing resources across the cluster to copy in parallel, leading to faster loads.
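A minimal sketch of the split-and-compress step, using only the standard library; the part-naming scheme and part size are my own illustrative choices:

```python
import gzip

def write_split_gzip(lines, prefix, lines_per_part=10000):
    """Write records as several gzipped part files (prefix.0.gz, prefix.1.gz, ...)
    so a single COPY can load them in parallel. Aim for evenly sized,
    multi-MB parts in practice."""
    paths = []
    for part, i in enumerate(range(0, len(lines), lines_per_part)):
        path = '{0}.{1}.gz'.format(prefix, part)
        with gzip.open(path, 'wt') as f:
            f.write('\n'.join(lines[i:i + lines_per_part]) + '\n')
        paths.append(path)
    return paths
```

A COPY pointed at the shared `prefix` on S3 would then pick up all the parts at once.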
Streaming mongo data directly to S3 instead of writing it to ETL server
One of the major overheads in the ETL process is writing data to the ETL server first and then uploading it to S3. To reduce disk I/O, don’t store data on the ETL server; instead, use MongoDB’s handy stream API. For the MongoDB Node driver, both the collection.find() and collection.aggregate() functions return cursors. The stream method also accepts a transform function as a parameter, so all your custom transform logic can go into it. AWS S3’s Node library’s upload() function also accepts readable streams. Take the stream from the MongoDB stream method, pipe it into zlib to gzip it, then feed the readable stream to AWS S3’s Node library. Simple! You will see a large improvement in your ETL process from this simple but important change.
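The same no-local-disk idea can be sketched in Python as well; the function name, bucket, and key below are my assumptions, not code from the post:

```python
import gzip
import io
import json

def gzip_docs_to_buffer(cursor, transform=lambda doc: doc):
    """Compress documents from a cursor straight into an in-memory buffer,
    skipping local disk entirely. 'cursor' is any iterable of dicts (e.g. a
    PyMongo cursor); 'transform' mirrors the stream-transform hook from the
    Node driver."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode='wb') as gz:
        for doc in cursor:
            gz.write((json.dumps(transform(doc)) + '\n').encode('utf-8'))
    buf.seek(0)
    return buf

# The buffer can then go straight to S3 without a temp file, e.g. with boto3:
#   boto3.client('s3').upload_fileobj(buf, 'my-bucket', 'extract/orders.json.gz')
```

Note this buffers in memory rather than true streaming; for very large collections you would chunk the cursor into several such buffers (one S3 object each).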
Optimizing Redshift Queries
Optimizing Redshift queries makes the ETL system highly scalable and efficient, and also reduces cost. Let’s look at some of the approaches:
Add a distribution key
The Redshift database is clustered, meaning your data is stored across cluster nodes. When you query for a certain set of records, Redshift has to search for those records in each node, leading to slow queries. A distribution key is a single column which decides how all data records are distributed across your tables. If you have a single column which is present in all your data, you can specify it as the distribution key. When loading data into Redshift, all data for a given value of the distribution key is placed on a single node of the cluster, so when you query for certain records Redshift knows exactly where to look. This only helps when you also use the distribution key to query the data.
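As a sketch, such a table can be declared with a DISTKEY clause; here the DDL is assembled by a small Python helper, and all the names are illustrative:

```python
def create_table_ddl(table, columns, distkey):
    """DDL sketch; 'columns' is a list of (name, redshift_type) pairs. The
    distribution key should be a column you also use in joins and filters."""
    cols = ', '.join('{0} {1}'.format(name, ctype) for name, ctype in columns)
    return 'CREATE TABLE {0} ({1}) DISTKEY({2});'.format(table, cols, distkey)
```

For example, distributing an events table on `user_id` co-locates each user's rows on one node, which speeds up joins and filters on that column.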
Generating a numeric primary key for string primary key
In MongoDB, you can have any type of field as your primary key. If your Mongo collections have a non-numeric primary key and you use those same keys in Redshift, your joins will end up being on string keys, which are slower. Instead, generate numeric keys for your string keys and join on them, which makes queries run much faster. Redshift supports marking a column with the IDENTITY attribute, which auto-generates a unique numeric value for the column that you can use as your primary key.
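A hypothetical table definition for this pattern (table and column names are mine): the IDENTITY column supplies the numeric surrogate key, while the original Mongo `_id` string is kept for traceability.

```python
# Hypothetical DDL: a Redshift-generated numeric key next to the Mongo _id.
BOOKS_DDL = (
    "CREATE TABLE books ("
    " book_key BIGINT IDENTITY(0, 1),"   # auto-generated numeric key for joins
    " mongo_id VARCHAR(24),"             # original string _id from MongoDB
    " title VARCHAR(256)"
    ");"
)
```

Downstream tables would then reference `book_key` so that joins stay numeric.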
Conclusion:
In this blog, I have covered the best practices around building ETL pipelines for Redshift based on my learning. There are many more recommended practices which can be easily found in Redshift and MongoDB documentation.
Text classification is one of the essential tasks in supervised machine learning (ML). Assigning categories to text, which can be tweets, Facebook posts, web pages, library books, media articles, galleries, etc., has many applications like spam filtering, sentiment analysis, etc. In this blog, we build a text classification engine to classify topics in an incoming Twitter stream using Apache Kafka and scikit-learn, a Python-based machine learning library.
Let’s dive into the details. Here is a diagram to visually explain the components and data flow. The Kafka producer ingests data from Twitter and sends it to the Kafka broker, and the Kafka consumer asks the Kafka broker for the tweets. We convert the binary tweet stream from Kafka into human-readable strings and perform predictions using saved models. We train the models on Twenty Newsgroups, a prebuilt training dataset from scikit-learn and a standard dataset for training classification algorithms.
In this blog we will use the following machine learning models:
Bag-of-Words (BOW) to convert words to vectors: The bag-of-words model is a way of representing text data when modeling text with machine learning algorithms.
Let’s first understand the following key concepts:
Word to Vector Methodology (Word2Vec)
Bag-of-Words
tf-idf
Multinomial Naive Bayes classifier
Word2Vec methodology
One of the key ideas in Natural Language Processing(NLP) is how we can efficiently convert words into numeric vectors which can then be given as an input to machine learning models to perform predictions.
Neural networks, and most other machine learning models, are just mathematical functions which need numbers or vectors to churn out an output; tree-based methods are the exception, as they can work on words directly.
For this we have an approach known as Word2Vec. But first, a very trivial solution would be the “one-hot” method: convert each word into a sparse vector with only one element set to 1 and the rest zero.
For example, “the apple a day the good” would have following representation
Here we have transformed the above sentence into a 6×5 matrix, 5 being the size of the vocabulary since “the” is repeated. But what are we supposed to do when we have a gigantic dictionary to learn from, say more than 100,000 words? Here one-hot encoding fails: the relationship between words is lost. For example, the model has no way to learn that “Lanka” should come after “Sri”.
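A toy one-hot encoder (pure Python, illustrative only) makes both the representation and its weakness concrete: every word gets an identical-looking row, and nothing about order or similarity survives.

```python
def one_hot(sentence):
    """Map each word of the sentence to a one-hot row over the vocabulary."""
    vocab = sorted(set(sentence.split()))
    index = {w: i for i, w in enumerate(vocab)}
    rows = []
    for word in sentence.split():
        row = [0] * len(vocab)
        row[index[word]] = 1       # a single 1 marks the word's vocabulary slot
        rows.append(row)
    return vocab, rows

# 6 words, 5 unique -> a 6x5 matrix; no row carries any notion of context.
vocab, matrix = one_hot("the apple a day the good")
```

With a 100,000-word vocabulary each row would have 100,000 entries, which is why dense Word2Vec embeddings are preferred.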
Here is where Word2Vec comes in. Our goal is to vectorize the words while maintaining the context. Word2vec can utilize either of two model architectures to produce a distributed representation of words: continuous bag-of-words (CBOW) or continuous skip-gram. In the continuous bag-of-words architecture, the model predicts the current word from a window of surrounding context words. The order of context words does not influence prediction (bag-of-words assumption). In the continuous skip-gram architecture, the model uses the current word to predict the surrounding window of context words.
TF-IDF is a statistic which determines how important a word is to a document in a given corpus. Variations of tf-idf are used by search engines, for text summarization, etc. You can read more about tf-idf here.
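The textbook form of the statistic fits in a few lines; note that libraries such as scikit-learn apply smoothing variants, so exact values differ from this sketch:

```python
import math

def tfidf_score(term, doc, corpus):
    """Plain tf-idf for one term in one document. 'doc' is a token list,
    'corpus' a list of such documents; the term is assumed to occur in at
    least one document (otherwise the document frequency would be zero)."""
    tf = doc.count(term) / len(doc)                  # term frequency in doc
    df = sum(1 for d in corpus if term in d)         # document frequency
    return tf * math.log(len(corpus) / df)           # rare terms score higher
```

A term that appears in every document gets an idf of log(1) = 0, which is exactly the "common words are unimportant" intuition.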
Multinomial Naive Bayes classifier
The Naive Bayes classifier comes from a family of probabilistic classifiers based on Bayes’ theorem. We use it to classify spam vs. not spam, sports vs. politics, etc. We are going to use it for classifying the stream of incoming tweets. You can explore it here.
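To see the mechanics without any library, here is a tiny multinomial Naive Bayes with Laplace smoothing, purely illustrative (the real pipeline below uses scikit-learn's MultinomialNB):

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """docs: lists of tokens; labels: one class per doc."""
    vocab = {w for d in docs for w in d}
    counts = defaultdict(Counter)      # class -> word counts
    priors = Counter(labels)           # class -> number of documents
    for d, y in zip(docs, labels):
        counts[y].update(d)
    return vocab, counts, priors

def predict_nb(doc, model):
    """Pick the class maximizing log P(class) + sum log P(word | class)."""
    vocab, counts, priors = model
    total = sum(priors.values())
    best, best_lp = None, float('-inf')
    for y, n in priors.items():
        lp = math.log(n / total)
        denom = sum(counts[y].values()) + len(vocab)   # Laplace smoothing
        for w in doc:
            lp += math.log((counts[y][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = y, lp
    return best
```

Even with four toy documents this separates "spam-like" from "ham-like" messages, which is exactly what the real classifier does at scale over the newsgroup categories.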
Let’s see how they fit together.
The data from the “20 newsgroups datasets” is entirely in text format. We cannot feed it directly to any model for mathematical calculations. We have to extract features from the dataset and convert them into numbers which a model can ingest to produce an output. So we use bag-of-words and tf-idf to extract features from the dataset and then feed them to the multinomial Naive Bayes classifier to get predictions.
1. Train Your Model
We are going to use this dataset. We create another file and import the needed libraries. We use sklearn for ML and pickle to save the trained model. Now we define the model.
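A minimal sketch of that training step; the function, variable, and pickle file names are my assumptions, not the original code. For the real run, the toy corpus would be replaced by the 20 newsgroups text and targets (`sklearn.datasets.fetch_20newsgroups`):

```python
import pickle
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB

def train_text_classifier(docs, targets):
    count_vect = CountVectorizer()
    X_counts = count_vect.fit_transform(docs)     # bag-of-words counts
    tfidf = TfidfTransformer()
    X_tfidf = tfidf.fit_transform(X_counts)       # reweight counts by tf-idf
    model = MultinomialNB().fit(X_tfidf, targets)
    return count_vect, tfidf, model

# Persist the pieces that step 3 reloads (file names assumed):
# pickle.dump(count_vect.vocabulary_, open("vocab.pickle", "wb"))
# pickle.dump(model, open("model.pickle", "wb"))
```

Saving the vectorizer vocabulary alongside the model is what lets the consumer rebuild an identical feature space at prediction time.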
2. Stream Tweets to Kafka
Now we will define the Kafka settings and create the KafkaPusher class. This is necessary because we need to send the data coming from the tweepy stream to the Kafka producer.
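A sketch of what such a class can look like; the class shape and topic name are assumptions, not the original code. The producer is injected so the class works with kafka-python's KafkaProducer in the real script and with a stub in tests; in the actual pipeline this logic would sit inside a tweepy stream listener's on_data callback.

```python
class KafkaPusher(object):
    """Forward each incoming tweet payload to a Kafka topic."""

    def __init__(self, producer, topic='twitter-stream'):
        self.producer = producer      # e.g. KafkaProducer(bootstrap_servers='localhost:9092')
        self.topic = topic

    def on_data(self, data):
        # Kafka message values travel as bytes on the wire.
        self.producer.send(self.topic, data.encode('utf-8'))
        return True                   # tell tweepy to keep the stream open
```

In production, a `kafka.KafkaProducer` pointed at the broker would be passed in; the consumer in step 3 then reads the same `twitter-stream` topic.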
Note: You need to start the Kafka server before running this script.
3. Loading your model for predictions
Now we have the trained model from step 1 and a Twitter stream from step 2. Let’s use the model to do actual predictions. The first step is to load the model:
# Loading model and vocab
print("Loading pre-trained model")
vocabulary_to_load = pickle.load(open("vocab.pickle", 'rb'))
count_vect = CountVectorizer(vocabulary=vocabulary_to_load)
load_model = pickle.load(open("model.pickle", 'rb'))
count_vect._validate_vocabulary()
tfidf_transformer = tf_idf(categories)[0]
Then we start the Kafka consumer and begin predictions:
# Predicting the streaming Kafka messages
consumer = KafkaConsumer('twitter-stream', bootstrap_servers=['localhost:9092'])
print("Starting ML predictions.")
for message in consumer:
    X_new_counts = count_vect.transform([message.value])
    X_new_tfidf = tfidf_transformer.transform(X_new_counts)
    predicted = load_model.predict(X_new_tfidf)
    print(message.value + " => " + fetch_train_dataset(categories).target_names[predicted[0]])
Following are some of the classifications made by our model:
RT @amazingatheist: Making fun of kids who survived a school shooting just days after the event because you disagree with their politics is… => talk.politics.misc
RT @DavidKlion: Apropos of that D’Souza tweet; I think in order to make sense of our politics, you need to understand that there are some t… => talk.politics.misc
RT @BeauWillimon: These students have already cemented a place in history with their activism, and they’re just getting started. No one wil… => talk.politics.misc
In this blog, we built a data pipeline that uses a Naive Bayes model to classify streaming Twitter data. We can classify other sources of data in the same way, like news articles, blog posts, etc. Do let us know if you have any questions, queries or additional thoughts in the comments section below.
Amazon announced “Amazon Lex” in December 2016 and since then we’ve been using it to build bots for our customers. Lex is effectively the technology behind Alexa, Amazon’s voice-activated virtual assistant, which lets people control things with voice commands such as playing music, setting alarms, ordering groceries, etc. It provides deep-learning-powered natural-language understanding along with automatic speech recognition. Amazon now provides this as a service, allowing developers to take advantage of the same features used by Amazon Alexa. So there is no longer any need to spend time setting up and managing infrastructure for your bots.
Now, developers just need to design conversations according to their requirements in Lex console. The phrases provided by the developer are used to build the natural language model. After publishing the bot, Lex will process the text or voice conversations and execute the code to send responses.
I’ve put together this quick-start tutorial using which you can start building Lex chat-bots. To understand the terms correctly, let’s consider an e-commerce bot that supports conversations involving the purchase of books.
Lex-Related Terminologies
Bot: It consists of all the components related to a conversation, which includes:
Intent: An intent represents a goal the bot’s user wants to achieve. In our case, the goal is to purchase books.
Utterances: An utterance is a text phrase that invokes an intent. If we have more than one intent, we need to provide different utterances for each. Amazon Lex builds a language model based on the utterance phrases we provide, which then invoke the required intent. For our demo example, we need a single intent, “OrderBook”. Some sample utterances would be:
I want to order some books
Can you please order a book for me
Slots: Each slot is a piece of data the user must supply in order to fulfill the intent. For instance, purchasing a book requires bookType and bookName as slots for the intent “OrderBook” (I am considering only these two factors to keep the example simple; in reality there are many other factors on which one selects a book). A slot is an input (a string, date, city, location, boolean, number, etc.) needed to reach the goal of the intent. Each slot has a name, a slot type, a prompt, and a flag for whether it is required. The slot type defines the valid values a user can respond with, and can be either custom-defined or one of the Amazon pre-built types.
Prompt: A prompt is a question Lex asks the user in order to obtain correct data for a slot needed to fulfill an intent, e.g. Lex will ask “What type of book do you want to buy?” to fill the slot bookType.
Fulfillment: Fulfillment provides the business logic executed after all required slot values have been gathered, to achieve the goal. Amazon Lex supports the use of Lambda functions for fulfilling business logic and for validations.
Let’s Implement this Bot!
Now that we are aware of the basic terminology used in Amazon Lex, let’s start building our chat-bot.
Creating Lex Bot:
Go to the Amazon Lex console, which is available only in the US East (N. Virginia) region, and click on the Create button.
Create a custom bot by providing following information:
Bot Name: PurchaseBook
Output voice: None, this is only a text-based application
Set Session Timeout: 5 min
Add Amazon Lex basic role to Bot app: Amazon will create it automatically. Find out more about Lex roles & permissions here.
Click on Create button, which will redirect you to the editor page.
Architecting Bot Conversations
Create Slots: We are creating two slots named bookType and bookName. Slot type values can be chosen from 275 pre-built types provided by Amazon or we can create our own customized slot types.
Create custom slot type for bookType as shown here and consider predefined type named Amazon.Book for bookName.
Create Intent: Our bot requires a single custom intent named OrderBook.
Configuring the Intents
Utterances: Provide some utterances to invoke the intent. An utterance can consist only of Unicode characters, spaces, and valid punctuation marks. Valid punctuation marks are periods for abbreviations, underscores, apostrophes, and hyphens. If there is a slot placeholder in your utterance, ensure that it’s in the {slotName} format and has spaces at both ends.
Slots: Map slots to their types and provide the prompt questions to be asked to get a valid value for each slot. Note the sequence: the Lex bot will ask the questions according to priority.
Confirmation prompt: This is optional. If required you can provide a confirmation message e.g. Are you sure you want to purchase book named {bookName}?, where bookName is a slot placeholder.
Fulfillment: Now that the chatbot has gathered all the necessary data, it can either be passed to a Lambda function, or the parameters can be returned to the client application, which then calls a REST endpoint.
Creating Amazon Lambda Functions
Amazon Lex supports Lambda functions to provide code hooks for the bot. These functions can serve multiple purposes, such as improving user interaction with the bot by using prior knowledge, validating the input data the bot receives from the user, and fulfilling the intent.
Go to AWS Lambda console and choose to Create a Lambda function.
Select blueprint as blank function and click next.
To configure your Lambda function, provide its name, runtime, and the code to be executed when the function is invoked. The code can also be uploaded as a zip archive instead of being provided inline. We are using Node.js 4.3 as the runtime.
Click next and choose Create Function.
We can configure our bot to invoke these Lambda functions at two places. We need to do this while configuring the intent, as shown below:
where botCodeHook and fulfillment are the names of the Lambda functions we created.
Lambda initialization and validation
The Lambda function provided here, i.e. botCodeHook, will be invoked on each user input whose intent is understood by Amazon Lex. It validates the bookName against a predefined list of books.
'use strict';

exports.handler = (event, context, callback) => {
    const sessionAttributes = event.sessionAttributes;
    const slots = event.currentIntent.slots;
    const bookName = slots.bookName;

    // Predefined list of available books.
    const validBooks = ['harry potter', 'twilight', 'wings of fire'];

    // Negative check: if a valid slot value is not obtained, inform Lex that
    // the user is expected to respond with a slot value.
    if (bookName && !(bookName === "") && validBooks.indexOf(bookName.toLowerCase()) === -1) {
        let response = {
            sessionAttributes: event.sessionAttributes,
            dialogAction: {
                type: "ElicitSlot",
                message: {
                    contentType: "PlainText",
                    content: `We do not have book: ${bookName}. Please provide another book name, e.g. twilight.`
                },
                intentName: event.currentIntent.name,
                slots: slots,
                slotToElicit: "bookName"
            }
        };
        return callback(null, response);
    }

    // If a valid book name is obtained, delegate the next course of action to Lex.
    let response = {
        sessionAttributes: sessionAttributes,
        dialogAction: {
            type: "Delegate",
            slots: event.currentIntent.slots
        }
    };
    callback(null, response);
};
Fulfillment code hook
This lambda function is invoked after receiving all slot data required to fulfill the intent.
'use strict';

exports.handler = (event, context, callback) => {
    // When the intent gets fulfilled, inform Lex to complete the state.
    let response = {
        sessionAttributes: event.sessionAttributes,
        dialogAction: {
            type: "Close",
            fulfillmentState: "Fulfilled",
            message: {
                contentType: "PlainText",
                content: "Thanks for purchasing book."
            }
        }
    };
    callback(null, response);
};
Error Handling: We can customize the error messages for our bot users. Click on error handling and replace the default values with the required ones. Since the number of retries is two, we can also provide a different message for each retry.
Your Bot is Now Ready To Chat
Click on Build to build the chat-bot. Congratulations! Your Lex chat-bot is ready to test. We can test it in the overlay which appears in the Amazon Lex console.
Sample conversations:
I hope you have understood the basic terminologies of Amazon Lex along with how to create a simple chat-bot using serverless (Amazon Lambda). This is a really powerful platform to build mature and intelligent chatbots.
Bots are the flavor of the season. Every day, we hear about a new bot being launched, catering to domains like travel, social, legal, support, and sales. Facebook Messenger alone had more than 11,000 bots when I last checked, and has probably added thousands more as I write this article.
The first generation of bots were dumb, since they could understand only a limited set of queries based on keywords in the conversation. But the commoditization of NLP (Natural Language Processing) and machine learning by services like Wit.ai, API.ai, Luis.ai, Amazon Lex and IBM Watson has resulted in the growth of intelligent bots like donotpay and chatShopper. I don’t know whether bots are just hype or the real deal, but I can say with certainty that building a bot is fun and challenging at the same time. In this article, I would like to introduce you to some of the tools for building an intelligent chatbot.
The title of the blog clearly tells that we have used Botkit and Rasa (NLU) to build our bot. Before getting into the technicalities, I would like to share the reason for choosing these two platforms and how they fit our use case. Also read – How to build a serverless chatbot with Amazon Lex.
Bot development framework: Howdy's Botkit and the Microsoft (MS) Bot Framework were good contenders for this. Both frameworks:
– are open source
– have integrations with popular messaging platforms like Slack, Facebook Messenger, Twilio, etc.
– have good documentation
– have an active developer community
Due to compliance issues, we had chosen AWS to deploy all our services and we wanted the same with the bot as well.
NLU (Natural Language Understanding): API.ai (acquired by Google) and Wit.ai (acquired by Facebook) are two popular NLU tools in the bot industry which we first considered for this task. Both solutions:
– are hosted as a cloud service
– have Node.js and Python SDKs and a REST interface
– have good documentation
– support stateful or contextual intents, which makes it very easy to build a conversational platform on top
As stated before, we couldn’t use any of these hosted solutions due to compliance and that is where we came across an open source NLU called Rasa which was a perfect replacement for API.ai and Wit.ai and at the same time, we could host and manage it on AWS.
You may now be wondering why I used the term NLU for API.ai and Wit.ai, and not NLP (Natural Language Processing).
– NLP refers to all systems that handle interactions with humans in a way humans find natural. It means we can converse with a system just the way we talk to other human beings.
– NLU is a subfield of NLP which handles a narrow but complex challenge: converting unstructured inputs into a structured form which a machine can understand and act upon. So when you say “Book a hotel for me in San Francisco on 20th April 2017”, the bot uses NLU to extract date=20th April 2017, location=San Francisco and action=book hotel, which the system can understand.
RASA NLU
In this section, I would like to explain Rasa in detail along with some NLP terms you should be familiar with.
– Intent: This tells us what the user would like to do. Ex: raise a complaint, request a refund, etc.
– Entities: These are the attributes which give details about the user's task. Ex: a complaint regarding service disruptions, the refund cost, etc.
– Confidence score: This is a distance metric which indicates how closely the NLU could match the input to one of the intents.
Here is an example to help you understand the terms above.
Input: “My internet isn’t working since morning”.
– intent: “service_interruption”
– entities: “service=internet”, “duration=morning”
– confidence score: 0.84 (this could vary based on your training)
The NLU’s job (Rasa’s, in our case) is to accept a sentence or statement and give us the intent, entities, and a confidence score which can be used by our bot. Rasa basically provides a high-level API over various NLP and ML libraries which do intent classification and entity extraction. These NLP and ML libraries are called backends in Rasa, and they bring the intelligence to Rasa. These are some of the backends used with Rasa:
MITIE — This is an all inclusive library meaning that it has NLP library for entity extraction as well as ML library for intent classification built into it.
spaCy + sklearn — spaCy is an NLP library which only does entity extraction; sklearn is used with spaCy to add ML capabilities for intent classification.
MITIE + sklearn — This uses the best of both worlds: the good entity recognition available in MITIE along with the fast and good intent classification in sklearn.
I have used the MITIE backend to train Rasa. For the demo, I’ve taken a “Live Support ChatBot” which is trained for messages like this:
– My phone isn't working.
– My phone isn't turning on.
– My phone crashed and isn't working anymore.
My training data looks like this:
{
  "rasa_nlu_data": {
    "common_examples": [
      {
        "text": "hi",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "my phone isn't turning on.",
        "intent": "device_failure",
        "entities": [
          {"start": 3, "end": 8, "value": "phone", "entity": "device"}
        ]
      },
      {
        "text": "my phone is not working.",
        "intent": "device_failure",
        "entities": [
          {"start": 3, "end": 8, "value": "phone", "entity": "device"}
        ]
      },
      {
        "text": "My phone crashed and isn't working anymore.",
        "intent": "device_failure",
        "entities": [
          {"start": 3, "end": 8, "value": "phone", "entity": "device"}
        ]
      }
    ]
  }
}
NOTE — We have observed that MITIE gives better accuracy than spaCy + sklearn for a small training set, but as you keep adding more intents, training on MITIE gets slower and slower. For a training set of 200+ examples with about 10–15 intents, MITIE takes about 35–45 minutes for us to train on a c4.4xlarge instance (16 cores, 30 GB RAM) on AWS.
Botkit-Rasa Integration
Botkit is an open source bot development framework designed by the creators of Howdy. It basically provides a set of tools for building bots on Facebook Messenger, Slack, Twilio, Kik and other popular platforms. They have also come up with an IDE for bot development called Botkit Studio. To summarize, Botkit is a tool which allows us to write the bot once and deploy it on multiple messaging platforms.
Botkit also has support for middleware, which can be used to extend Botkit's functionality. Integrations with databases, CRMs, NLUs and statistical tools are provided via middleware, which makes the framework extensible. This design also allows us to easily add integrations with other tools and software by just writing middleware modules for them.
I’ve integrated Slack and Botkit for this demo. You can use this boilerplate template to set up Botkit for Slack. We have extended the Botkit-Rasa middleware, which you can find here.
Botkit-Rasa has two functions, receive and hears, which override the default Botkit behaviour. 1. receive — This function is invoked when Botkit receives a message. It sends the user’s message to Rasa and stores the intent and entities in the Botkit message object.
2. hears — This function overrides the default Botkit hears method, i.e. controller.hears. The default hears method uses regexes to search for the given patterns in the user’s message, while the hears method from the Botkit-Rasa middleware searches for the intent.
let Botkit = require('botkit');
let rasa = require('./Middleware/rasa')({rasa_uri: 'http://localhost:5000'});

let controller = Botkit.slackbot({
  clientId: process.env.clientId,
  clientSecret: process.env.clientSecret,
  scopes: ['bot'],
  json_file_store: __dirname + '/.db/'
});

// Override receive method in botkit
controller.middleware.receive.use(rasa.receive);

// Override hears method in botkit
controller.changeEars(function (patterns, message) {
  return rasa.hears(patterns, message);
});

controller.setupWebserver(3000, function (err, webserver) {
  // Configure a route to receive webhooks from slack
  controller.createWebhookEndpoints(webserver);
});
Let’s try an example — “my phone is not turning on”. Rasa will return the following: 1. Intent — device_failure 2. Entities — device=phone
If you look carefully, the input I gave, i.e. “my phone is not turning on”, is not present in my training file. Rasa has enough intelligence built into it to identify the intent and entities correctly for such combinations.
We need to add a hears method listening for the intent “device_failure” to process this input. Remember that the intent and entities returned by Rasa will be stored in the message object by the Botkit-Rasa middleware.
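A skill for the device_failure intent could then look like the sketch below. The reply text and registration details are my assumptions; what the source does establish is that the middleware stores Rasa's output on message.intent and message.entities:

```javascript
// Handler for the "device_failure" intent. With the Botkit-Rasa middleware
// in place, controller.hears matches on intent names instead of regexes, e.g.:
//   controller.hears(['device_failure'], 'direct_message,direct_mention', deviceFailureSkill);
function deviceFailureSkill(bot, message) {
  // The middleware stores Rasa's entities on the message object.
  var entities = message.entities || [];
  var device = entities.length > 0 ? entities[0].value : 'device';
  bot.reply(message, 'Sorry to hear your ' + device +
    " isn't working. Let me connect you to a support agent.");
}
```

The handler itself stays free of any NLU plumbing: it just reads the fields the middleware has already filled in.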
You should be able to run this bot with Slack and see the output as shown below (support_bot is the name of my bot).
Conclusion
You are now familiar with the process of building chatbots with a bot development framework and an NLU. I hope this helps you get started on your bot very quickly. If you have any suggestions, questions or feedback, tweet me @harjun1601. Keep following our blogs for more articles on bot development, ML and AI.
This blog aims at exploring the Rasa Stack to create a stateless chat-bot. We will look into how the recently released Rasa Core, which provides machine-learning-based dialogue management, helps in maintaining the context of conversations in an efficient way.
If you have developed chatbots, you will know how hopelessly bots fail at maintaining context once complex use-cases need to be developed. There are some home-grown approaches that people currently use to build stateful bots. The most naive approach is a state machine: you create different states and take actions based on some logic. As the number of states increases, either more levels of nested logic are required, or an extra state has to be added to the machine with another set of rules for how to get in and out of that state. Both of these approaches lead to fragile code that is hard to maintain and update. Anyone who’s built and debugged a moderately complex bot knows this pain.
After building many chatbots, we have experienced that flowcharts are useful for doing the initial design of a bot and describing a few of the known conversation paths, but we shouldn’t hard-code a bunch of rules since this approach doesn’t scale beyond simple conversations.
Thanks to the Rasa guys who provided a way to go stateless where scaling is not at all a problem. Let’s build a bot using Rasa Core and learn more about this.
Rasa Core: Getting Rid of State Machines
The main idea behind Rasa Core is that thinking of conversations as a flowchart and implementing them as a state machine doesn’t scale. It’s very hard to reason about all possible conversations explicitly, but it’s very easy to tell, mid-conversation, whether a response is right or wrong. For example, let’s consider a term insurance purchase bot, where you have defined different states to take different actions. The diagram below shows an example state machine:
Let’s consider a sample conversation where a user wants to compare two policies listed by policy_search state.
In the above conversation, the policies can be compared very easily by adding some logic around the intent compare_policies. But real life is not so easy, as a majority of conversations are edge cases. We need to add rules manually to handle such cases, and after testing we realize that these clash with other rules we wrote earlier.
Rasa guys figured out how machine learning can be used to solve this problem. They have released Rasa Core where the logic of the bot is based on a probabilistic model trained on real conversations.
Structure of a Rasa Core App
Let’s understand a few terms we need to know to build a Rasa Core app:
1. Interpreter: An interpreter is responsible for parsing messages. It performs the Natural Language Understanding and transforms the message into structured output, i.e. intent and entities. In this blog, we are using a Rasa NLU model as the interpreter. Rasa NLU comes under the Rasa Stack. The Training section shows in detail how to prepare the training data and create a model.
2. Domain: To define a domain we create a domain.yml file, which defines the universe of your bot. The following things need to be defined in a domain file:
Intents: Things we expect the user to say. This is more related to Rasa NLU.
Entities: Pieces of information extracted from what the user said. This is also related to Rasa NLU.
Templates: Template strings which our bot can say. The format for defining a template string is utter_<intent>. These are considered actions which the bot can take.
Actions: The list of things the bot can do and say. There are two types of actions we define: those which only utter a message (templates), and customised actions where the required logic is defined. Customised actions are defined as Python classes and are referenced in the domain file.
Slots: User-defined variables which need to be tracked during a conversation. For example, to sell a term insurance we need to keep track of which policy the user selects and the details of the user, so all these details come under slots.
3. Stories: In stories, we define what the bot needs to do at each point in time. Based on these stories, a probabilistic model is generated which is used to decide which action to take next. There are two ways in which stories can be created, which are explained in the next section.
Let’s combine all these pieces. When a message arrives in a Rasa Core app, the interpreter first transforms it into structured output, i.e. intents and entities. The tracker is the object which keeps track of conversation state; it receives the information that a new message has come in. Then, based on the dialogue model we generate from the domain and stories, the policy chooses which action to take next. The chosen action is logged by the tracker and the response is sent back to the user.
Training and Running A Sample Bot
We will create a simple Facebook chat-bot named Secure Life which assists you in buying term life insurance. To keep the example simple, we have restricted options such as age-group, term insurance amount, etc.
There are two models we need to train in the Rasa Core app:
A Rasa NLU model, based on which messages will be processed and converted to the structured form of intent and entities. Create the following two files to generate the model:
data.json: Create this training file using the rasa-nlu trainer. Click here to know more about the rasa-nlu trainer.
nlu_model_config.json: The configuration file referenced by the training command below.
$ python -m rasa_nlu.train -c nlu_model_config.json --fixed_model_name current
Dialogue model: This model is trained on the stories we define, based on which the policy takes actions. There are two ways in which stories can be generated:
Supervised learning: We create the stories by hand, writing them directly in a file. This is easy to write, but for complex use-cases it is difficult to cover all scenarios.
Reinforcement learning: The user provides feedback on every decision taken by the policy. This is also known as interactive learning, and it helps in including edge cases which are difficult to create by hand. How does it work? Every time the policy chooses an action, the user is asked whether the chosen action is correct or not. If it is wrong, you can correct the action on the fly and store the stories to train the model again.
Since the example is simple, we have used the supervised learning method to generate the dialogue model. Below is the stories.md file.
## All yes
* greet
  - utter_greet
* affirm
  - utter_very_much_so
* affirm
  - utter_gender
* gender
  - utter_coverage_duration
  - action_gender
* affirm
  - utter_nicotine
* affirm
  - action_nicotine
* age
  - action_thanks

## User not interested
* greet
  - utter_greet
* deny
  - utter_decline

## Coverage duration is not sufficient
* greet
  - utter_greet
* affirm
  - utter_very_much_so
* affirm
  - utter_gender
* gender
  - utter_coverage_duration
  - action_gender
* deny
  - utter_decline
Run the command below to train the dialogue model:
$ python -m rasa_core.train -s <path to stories.md file> -d <path to domain.yml> -o models/dialogue --epochs 300
Define a domain: Create a domain.yml file containing all the required information. Under intents and entities, list all the strings the bot is supposed to see when the user says something, i.e. the intents and entities you defined in the Rasa NLU training file.
intents:
- greet
- goodbye
- affirm
- deny
- age
- gender

slots:
  gender:
    type: text
  nicotine:
    type: text
  agegroup:
    type: text

templates:
  utter_greet:
  - "hey there! welcome to Secure-Life!\nI can help you quickly estimate your rate of coverage.\nWould you like to do that ?"
  utter_very_much_so:
  - "Great! Let's get started.\nWe currently offer term plans of Rs. 1Cr. Does that suit your need?"
  utter_gender:
  - "What gender do you go by ?"
  utter_coverage_duration:
  - "We offer this term plan for a duration of 30Y. Do you think that's enough to cover entire timeframe of your financial obligations ?"
  utter_nicotine:
  - "Do you consume nicotine-containing products?"
  utter_age:
  - "And lastly, how old are you ?"
  utter_thanks:
  - "Thank you for providing all the info. Let me calculate the insurance premium based on your inputs."
  utter_decline:
  - "Sad to see you go. In case you change your plans, you know where to find me :-)"
  utter_goodbye:
  - "goodbye :("

actions:
- utter_greet
- utter_goodbye
- utter_very_much_so
- utter_coverage_duration
- utter_age
- utter_nicotine
- utter_gender
- utter_decline
- utter_thanks
- actions.ActionGender
- actions.ActionNicotine
- actions.ActionThanks
Define actions: The templates defined in domain.yml are also considered actions. A sample customised action is shown below, where we set a slot named gender according to the option selected by the user.
from rasa_core.actions.action import Action
from rasa_core.events import SlotSet

class ActionGender(Action):
    def name(self):
        return 'action_gender'

    def run(self, dispatcher, tracker, domain):
        messageObtained = tracker.latest_message.text.lower()
        # Check "female" first: the string "female" also contains
        # the substring "male", so the order of checks matters.
        if "female" in messageObtained:
            return [SlotSet("gender", "female")]
        elif "male" in messageObtained:
            return [SlotSet("gender", "male")]
        else:
            return [SlotSet("gender", "others")]
Running the Bot
Create a Facebook app and get the app credentials. Create a bot.py file as shown below:
import logging

from rasa_core import utils
from rasa_core.agent import Agent
from rasa_core.interpreter import RasaNLUInterpreter
from rasa_core.channels import HttpInputChannel
from rasa_core.channels.facebook import FacebookInput

logger = logging.getLogger(__name__)

def run(serve_forever=True):
    # create rasa NLU interpreter
    interpreter = RasaNLUInterpreter("models/nlu/current")
    agent = Agent.load("models/dialogue", interpreter=interpreter)

    input_channel = FacebookInput(
        fb_verify="your_fb_verify_token",  # you need to tell facebook this token, to confirm your URL
        fb_secret="your_app_secret",  # your app secret
        fb_tokens={"your_page_id": "your_page_token"},  # page ids + tokens you subscribed to
        debug_mode=True  # enable debug mode for underlying fb library
    )

    if serve_forever:
        agent.handle_channel(HttpInputChannel(5004, "/app", input_channel))
    return agent

if __name__ == '__main__':
    utils.configure_colored_logging(loglevel="DEBUG")
    run()
Run the file and your bot is ready to test. Sample conversations are provided below:
Summary
You have seen how Rasa Core has made it easier to build bots. Just create a few files and boom! Your bot is ready! Isn’t it exciting? I hope this blog gave you some insight into how Rasa Core works. Start exploring, and let us know if you need any help building chatbots using Rasa Core.
In this blog, I will compare various methods to avoid the dreaded callback hell that is common in Node.js. What exactly am I talking about? Have a look at the piece of code below. Every child function executes only when the result of its parent function is available. Callbacks are the very essence of the non-blocking (and hence performant) nature of Node.js.
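The snippet in question looks roughly like this. foo, bar and baz are hypothetical callback-style functions (here they just do arithmetic), but the shape of the nesting is the point:

```javascript
// Hypothetical error-first callback functions: each one needs the
// result of the previous call before it can run.
function foo(x, cb) { setImmediate(() => cb(null, x + 1)); }
function bar(x, cb) { setImmediate(() => cb(null, x * 2)); }
function baz(x, cb) { setImmediate(() => cb(null, x - 3)); }

// The classic "pyramid of doom": each child runs only inside its
// parent's callback, and every level repeats the error handling.
foo(1, function (err, a) {
  if (err) return console.error(err);
  bar(a, function (err, b) {
    if (err) return console.error(err);
    baz(b, function (err, c) {
      if (err) return console.error(err);
      console.log(c); // 1 -> +1 -> *2 -> -3 = 1
    });
  });
});
```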
Convinced yet? Even though there is some seemingly unnecessary error handling done here, I assume you get the drift! The problem with such code is more than just indentation: our program’s entire flow is based on side effects, with one function only incidentally calling the inner function.
There are multiple ways in which we can avoid writing such deeply nested code. Let’s have a look at our options:
Promises
According to the official specification, a promise represents the eventual result of an asynchronous operation. Basically, it represents an operation that has not completed yet but is expected to in the future. The then method is a major component of a promise: it is used to get the value of the promise once it is fulfilled or rejected. Only one of these two outcomes will ever occur. Let’s have a look at a simple file-read example without using promises:
Now, if readFile function returned a promise, the same logic could be written like so:
var fileReadPromise = fs.readFile(filePath);
fileReadPromise.then(console.log, console.error);
The fileReadPromise can then be passed around multiple times in a code where you need to read a file. This helps in writing robust unit tests for your code since you now only have to write a single test for a promise. And more readable code!
Chaining using promises
The then function itself returns a promise which can again be used to do the next operation. Changing the first code snippet to using promises results in this:
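Assuming foo, bar and baz now return promises (hypothetical functions, as before), the nested pyramid flattens into a chain:

```javascript
// Promise-returning versions of the earlier hypothetical functions.
function foo(x) { return Promise.resolve(x + 1); }
function bar(x) { return Promise.resolve(x * 2); }
function baz(x) { return Promise.resolve(x - 3); }

// Each then returns a new promise, so the steps chain linearly,
// and a single catch handles errors from any step.
foo(1)
  .then(bar)
  .then(baz)
  .then(function (result) { console.log(result); }) // logs 1
  .catch(console.error);
```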
As is evident, this makes the code more composed, readable and easier to maintain. Also, instead of chaining we could have used Promise.all. Promise.all takes an array of promises as input and returns a single promise that resolves when all the promises supplied in the array are resolved. Other useful information on promises can be found here.
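For independent operations, Promise.all is often the better fit. A sketch with stand-in promises (the values are made up for illustration):

```javascript
// Three independent promise-producing operations run concurrently;
// Promise.all resolves with an array of all results, in order.
Promise.all([
  Promise.resolve('file contents'),
  Promise.resolve(42),
  Promise.resolve({ ok: true })
]).then(function (results) {
  console.log(results[1]); // 42
}).catch(console.error);
```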
The async utility module
Async is a utility module which provides a set of over 70 functions that can be used to elegantly solve the problem of callback hell. All these functions follow the Node.js convention of error-first callbacks, which means the first callback argument is assumed to be an error (null in case of success). Let’s try to solve the same foo-bar-baz problem using the async module. Here is the code snippet:
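The following sketch shows the shape of async.waterfall usage. To keep it self-contained, a minimal stand-in for waterfall is implemented inline instead of requiring the npm module; with the real module, the call would be async.waterfall([...], callback) with the same semantics:

```javascript
// Minimal stand-in for async.waterfall: runs tasks in series, passing
// each task's results to the next; the first error short-circuits to done.
function waterfall(tasks, done) {
  var i = 0;
  function next(err) {
    var results = Array.prototype.slice.call(arguments, 1);
    if (err || i === tasks.length) {
      return done.apply(null, [err].concat(results));
    }
    tasks[i++].apply(null, results.concat([next]));
  }
  next(null);
}

// The foo-bar-baz pyramid, rewritten as a flat list of steps.
waterfall([
  function (cb) { cb(null, 1 + 1); },
  function (a, cb) { cb(null, a * 2); },
  function (b, cb) { cb(null, b - 3); }
], function (err, result) {
  if (err) return console.error(err);
  console.log(result); // 1
});
```

Note how the error handling collapses into the single final callback instead of being repeated at every level.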
Here, I have used the async.waterfall function as an example. There are multiple functions available depending on the nature of the problem you are trying to solve, like async.each for parallel execution, async.eachSeries for serial execution, etc.
Async/Await
Now, this is one of the most exciting features coming to JavaScript in the near future. It internally uses promises but handles them in a more intuitive manner. Even though it seems like promises and/or third-party modules like async would solve most of the problems, a further simplification is always welcome! For those of you who have worked with C# async/await, this concept is directly cribbed from there and is being brought into ES7.
Async/await enables us to write asynchronous promise-based code as if it were synchronous, but without blocking the main thread. An async function always returns a promise whether await is used or not. But whenever an await is observed, the function is paused until the promise either resolves or rejects. Following code snippet should make it clearer:
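A minimal sketch (the function names are made up, and the promise resolves immediately for simplicity):

```javascript
// A hypothetical promise-returning operation.
function fetchGreeting() {
  return Promise.resolve('hello');
}

// An async function always returns a promise; await pauses this
// function (not the main thread) until the promise settles.
async function asyncFun() {
  var greeting = await fetchGreeting();
  console.log(greeting); // logs "hello" once the promise resolves
  return greeting;
}

asyncFun();
```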
Here, asyncFun is an async function which captures the promised result using await. This makes the code readable, without blocking the main thread, and is a major convenience for developers who are more comfortable with linearly executed languages.
Now, like before, let’s solve the foo-bar-baz problem using async/await. Note that foo, bar and baz individually return promises just like before, but instead of chaining, we have written the code linearly.
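With the same hypothetical promise-returning foo, bar and baz, the async/await version reads top to bottom:

```javascript
// Promise-returning versions of the hypothetical functions.
function foo(x) { return Promise.resolve(x + 1); }
function bar(x) { return Promise.resolve(x * 2); }
function baz(x) { return Promise.resolve(x - 3); }

// The chain is written as straight-line code; a single try/catch
// replaces the per-level error callbacks.
async function run() {
  try {
    const a = await foo(1);
    const b = await bar(a);
    const c = await baz(b);
    console.log(c); // 1
    return c;
  } catch (err) {
    console.error(err);
  }
}

run();
```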
How long should you (a)wait for async to come to fore?
Well, it’s already here in the Chrome 55 release and the latest update of the V8 engine. Native support in the language means that we should see much more widespread use of this feature. The only catch is that if you want to use async/await on a codebase which isn’t promise-aware and is based completely on callbacks, it will probably require wrapping a lot of the existing functions to make them usable.
To wrap up, async/await definitely make coding numerous async operations an easier job. Although promises and callbacks would do the job for most, async/await looks like the way to make some architectural problems go away and improve code quality.
With the introduction of the Elastic Kubernetes Service at AWS re:Invent last year, AWS finally threw their hat into the ever-booming space of managed Kubernetes services. In this blog post, we will learn the basic concepts of EKS, launch an EKS cluster and also deploy a multi-tier application on it.
What is the Elastic Kubernetes Service (EKS)?
Kubernetes works on a master-slave architecture, where the master is also referred to as the control plane. If the master goes down, it brings the entire cluster down, so ensuring high availability of the master is absolutely critical: it can be a single point of failure. Ensuring high availability of the master, and managing all the worker nodes along with it, is a cumbersome task in itself. It is therefore desirable for organizations to have a managed Kubernetes cluster, so that they can focus on the most important task, running their applications, rather than managing the cluster. Other cloud providers like Google Cloud and Azure already had their managed Kubernetes services, named GKE and AKS respectively. Now, with EKS, Amazon has also rolled out its managed Kubernetes offering to provide a seamless way to run Kubernetes workloads.
Key EKS concepts:
EKS takes full advantage of the fact that it is running on AWS: instead of creating Kubernetes-specific features from scratch, it reuses/plugs in existing AWS services to achieve Kubernetes-specific functionality. Here is a brief overview:
IAM integration: Amazon EKS integrates IAM authentication with Kubernetes RBAC (the role-based access control system native to Kubernetes) with the help of the Heptio Authenticator, a tool that uses AWS IAM credentials to authenticate to a Kubernetes cluster. Here we can directly attach an RBAC role to an IAM entity, which saves the pain of managing another set of credentials at the cluster level.
Container Interface: AWS has developed an open source CNI plugin which takes advantage of the fact that multiple network interfaces can be attached to a single EC2 instance, and that these interfaces can have multiple secondary private IPs associated with them. These secondary IPs are used to provide pods running on EKS with real IP addresses from the VPC CIDR pool. This improves latency for inter-pod communication, as the traffic flows without any overlay.
ELB Support: We can use any of the AWS ELB offerings (classic, network, application) to route traffic to our service running on the working nodes.
Auto scaling: The number of worker nodes in the cluster can grow and shrink using the EC2 auto scaling service.
Route 53: With the help of the External DNS project and AWS route53 we can manage the DNS entries for the load balancers which get created when we create an ingress object in our EKS cluster or when we create a service of type LoadBalancer in our cluster. This way the DNS names are always in sync with the load balancers and we don’t have to give separate attention to it.
Shared responsibility for the cluster: Responsibility for an EKS cluster is shared between AWS and the customer. AWS takes care of the most critical part, managing the control plane (API server and etcd database), while customers manage the worker nodes. Amazon EKS automatically runs Kubernetes with three masters across three Availability Zones to protect against a single point of failure. Control plane nodes are also monitored and replaced if they fail, and are patched and updated automatically. This ensures high availability of the cluster and makes it extremely simple to migrate existing workloads to EKS.
Prerequisites for launching an EKS cluster:
1. IAM role to be assumed by the cluster: Create an IAM role that allows EKS to manage a cluster on your behalf. Choose EKS as the service which will assume this role and add AWS managed policies ‘AmazonEKSClusterPolicy’ and ‘AmazonEKSServicePolicy’ to it.
2. VPC for the cluster: We need to create the VPC where our cluster is going to reside. We need a VPC with subnets, internet gateways and other components configured. We can use an existing VPC for this if we wish or create one using the CloudFormation script provided by AWS here or use the Terraform script available here. The scripts take ‘cidr’ block of the VPC and three other subnets as arguments.
Launching an EKS cluster:
1. Using the web console: With the prerequisites in place, we can now go to the EKS console and launch an EKS cluster. When launching, we need to provide the name of the EKS cluster, choose the Kubernetes version to use, provide the IAM role we created in step one and also choose a VPC. Once we choose a VPC, we also need to select the subnets from the VPC where we want our worker nodes to be launched; by default, all the subnets in the VPC are selected. We also need to provide a security group, which is applied to the elastic network interfaces (ENIs) that EKS creates to allow the control plane to communicate with the worker nodes.
NOTE: A couple of things to note here: the subnets must be in at least two different availability zones, and the security group that we provided is later updated when we create the worker node cluster, so it is better not to use this security group with any other entity, or to be completely sure of the changes happening to it.
In the response, we see that the cluster is in creating state. It will take a few minutes before it is available. We can check the status using the below command:
aws eks describe-cluster --name=eks-blog-cluster
Configure kubectl for EKS:
We know that in Kubernetes we interact with the control plane by making requests to the API server. The most common way to interact with the API server is via kubectl command line utility. As our cluster is ready now we need to install kubectl.
As discussed earlier EKS uses AWS IAM Authenticator for Kubernetes to allow IAM authentication for your Kubernetes cluster. So we need to download and install the same.
Replace the values of the server and certificate-authority-data fields with the values for your cluster and certificate, and also update the cluster name in the args section. You can get these values from the web console, or via the command:
aws eks describe-cluster --name=eks-blog-cluster
Save and exit.
Add that file path to your KUBECONFIG environment variable so that kubectl knows where to look for your cluster configuration.
To verify that the kubectl is now properly configured :
kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   172.20.0.1   <none>        443/TCP   50m
Launch and configure worker nodes :
Now we need to launch worker nodes before we can start deploying apps. We can create the worker node cluster by using the CloudFormation script provided by AWS which is available here or use the Terraform script available here.
ClusterName: Name of the Amazon EKS cluster we created earlier.
ClusterControlPlaneSecurityGroup: Id of the security group we used in EKS cluster.
NodeGroupName: Name for the worker node auto scaling group.
NodeAutoScalingGroupMinSize: Minimum number of worker nodes that you always want in your cluster.
NodeAutoScalingGroupMaxSize: Maximum number of worker nodes that you want in your cluster.
NodeInstanceType: Type of worker node you wish to launch.
NodeImageId: AWS provides an Amazon EKS-optimized AMI to be used for worker nodes. Currently EKS is available in only two AWS regions, Oregon and N. Virginia, and the AMI ids are ami-02415125ccd555295 and ami-048486555686d18a0 respectively.
KeyName: Name of the key you will use to ssh into the worker node.
VpcId: Id of the VPC that we created earlier.
Subnets: Subnets from the VPC we created earlier.
To enable worker nodes to join your cluster, we need to download, edit and apply the AWS authenticator config map.
Edit the value of rolearn with the ARN of the role of your worker nodes. This value is available in the output of the scripts that you ran. Save the change and then apply:
kubectl apply -f aws-auth-cm.yaml
Now you can check if the nodes have joined the cluster or not.
kubectl get nodes
NAME                         STATUS   ROLES    AGE   VERSION
ip-10-0-2-171.ec2.internal   Ready    <none>   12s   v1.10.3
ip-10-0-3-58.ec2.internal    Ready    <none>   14s   v1.10.3
Deploying an application:
As our cluster is now completely ready, we can start deploying applications on it. We will deploy a simple books API application which connects to a MongoDB database and allows users to store, list and delete book information.
In the EXTERNAL-IP section of the test-service we see the DNS name of a load balancer; we can now access the application from outside the cluster using this DNS name.
To Store Data :
curl -XPOST -d '{"name":"A Game of Thrones (A Song of Ice and Fire)", "author":"George R.R. Martin","price":343}' http://a7ee4f4c3b0ea11e8b0f912f36098e4d-672471149.us-east-1.elb.amazonaws.com/books

{"id":"5b8fab49fa142b000108d6aa","name":"A Game of Thrones (A Song of Ice and Fire)","author":"George R.R. Martin","price":343}
To Get Data :
curl -XGET http://a7ee4f4c3b0ea11e8b0f912f36098e4d-672471149.us-east-1.elb.amazonaws.com/books

[{"id":"5b8fab49fa142b000108d6aa","name":"A Game of Thrones (A Song of Ice and Fire)","author":"George R.R. Martin","price":343}]
We can also put the URL used in the curl operations above directly into a browser, and we will get the same response.
Now our application is deployed on EKS and can be accessed by the users.
Comparison between GKE, ECS and EKS:
Cluster creation: Creating GKE and ECS cluster is way simpler than creating an EKS cluster. GKE being the simplest of all three.
Cost: In the case of both GKE and ECS, we pay only for the infrastructure that is visible to us, i.e. servers, volumes, ELBs etc., and there is no cost for master nodes or other cluster management services. With EKS, there is a charge of $0.20 per hour for the control plane.
Add-ons: GKE provides the option of using Calico as the network plugin which helps in defining network policies for controlling inter pod communication (by default all pods in k8s can communicate with each other).
Serverless: An ECS cluster can be created using Fargate, which is a Container as a Service (CaaS) offering from AWS. Similarly, EKS is also expected to support Fargate very soon.
In terms of availability and scalability all the services are at par with each other.
Conclusion:
In this blog post we learned the basic concepts of EKS, launched our own EKS cluster and deployed an application as well. EKS is a much-awaited service from AWS, especially for folks who were already running their Kubernetes workloads on AWS, as they can now easily migrate to EKS and have a fully managed Kubernetes control plane. EKS is expected to be adopted by many organisations in the near future.