Category: Type

  • Extending Kubernetes APIs with Custom Resource Definitions (CRDs)

    Introduction:

    Custom resources definition (CRD) is a powerful feature introduced in Kubernetes 1.7 which enables users to add their own/custom objects to the Kubernetes cluster and use it like any other native Kubernetes objects. In this blog post, we will see how we can add a custom resource to a Kubernetes cluster using the command line as well as using the Golang client library thus also learning how to programmatically interact with a Kubernetes cluster.

    What is a Custom Resource Definition (CRD)?

    In the Kubernetes API, every resource is an endpoint to store API objects of certain kind. For example, the built-in service resource contains a collection of service objects. The standard Kubernetes distribution ships with many inbuilt API objects/resources. CRD comes into picture when we want to introduce our own object into the Kubernetes cluster to full fill our requirements. Once we create a CRD in Kubernetes we can use it like any other native Kubernetes object thus leveraging all the features of Kubernetes like its CLI, security, API services, RBAC etc.

    The custom resource created is also stored in the etcd cluster with proper replication and lifecycle management. CRD allows us to use all the functionalities provided by a Kubernetes cluster for our custom objects and saves us the overhead of implementing them on our own.

    How to register a CRD using command line interface (CLI)

    Step-1: Create a CRD definition file sslconfig-crd.yaml

    apiVersion: "apiextensions.k8s.io/v1beta1"
    kind: "CustomResourceDefinition"
    metadata:
      name: "sslconfigs.blog.velotio.com"
    spec:
      group: "blog.velotio.com"
      version: "v1alpha1"
      scope: "Namespaced"
      names:
        plural: "sslconfigs"
        singular: "sslconfig"
        kind: "SslConfig"
      validation:
        openAPIV3Schema:
          required: ["spec"]
          properties:
            spec:
              required: ["cert","key","domain"]
              properties:
                cert:
                  type: "string"
                  minimum: 1
                key:
                  type: "string"
                  minimum: 1
                domain:
                  type: "string"
                  minimum: 1 

    Here we are creating a custom resource definition for an object of kind SslConfig. This object allows us to store the SSL configuration information for a domain. As we can see under the validation section specifying the cert, key and the domain are mandatory for creating objects of this kind, along with this we can store other information like the provider of the certificate etc. The name metadata that we specify must be spec.names.plural+”.”+spec.group.

    An API group (blog.velotio.com here) is a collection of API objects which are logically related to each other. We have also specified version for our custom objects (spec.version), as the definition of the object is expected to change/evolve in future so it’s better to start with alpha so that the users of the object knows that the definition might change later. In the scope, we have specified Namespaced, by default a custom resource name is clustered scoped. 

    # kubectl create -f crd.yaml
    # kubectl get crd NAME AGE sslconfigs.blog.velotio.com 5s

    Step-2:  Create objects using the definition we created above

    apiVersion: "blog.velotio.com/v1alpha1"
    kind: "SslConfig"
    metadata:
      name: "sslconfig-velotio.com"
    spec:
      cert: "my cert file"
      key : "my private  key"
      domain: "*.velotio.com"
      provider: "digicert"

    # kubectl create -f crd-obj.yaml
    # kubectl get sslconfig NAME AGE sslconfig-velotio.com 12s

    Along with the mandatory fields cert, key and domain, we have also stored the information of the provider ( certifying authority ) of the cert.

    How to register a CRD programmatically using client-go

    Client-go project provides us with packages using which we can easily create go client and access the Kubernetes cluster.  For creating a client first we need to create a connection with the API server.
    How we connect to the API server depends on whether we will be accessing it from within the cluster (our code running in the Kubernetes cluster itself) or if our code is running outside the cluster (locally)

    If the code is running outside the cluster then we need to provide either the path of the config file or URL of the Kubernetes proxy server running on the cluster.

    kubeconfig := filepath.Join(
    os.Getenv("HOME"), ".kube", "config",
    )
    config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
    if err != nil {
    log.Fatal(err)
    }

    OR

    var (
    // Set during build
    version string
    
    proxyURL = flag.String("proxy", "",
    `If specified, it is assumed that a kubctl proxy server is running on the
    given url and creates a proxy client. In case it is not given InCluster
    kubernetes setup will be used`)
    )
    if *proxyURL != "" {
    config, err = clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
    &clientcmd.ClientConfigLoadingRules{},
    &clientcmd.ConfigOverrides{
    ClusterInfo: clientcmdapi.Cluster{
    Server: *proxyURL,
    },
    }).ClientConfig()
    if err != nil {
    glog.Fatalf("error creating client configuration: %v", err)
    }

    When the code is to be run as a part of the cluster then we can simply use

    import "k8s.io/client-go/rest"  ...  rest.InClusterConfig() 

    Once the connection is established we can use it to create clientset. For accessing Kubernetes objects, generally the clientset from the client-go project is used, but for CRD related operations we need to use the clientset from apiextensions-apiserver project

    apiextension “k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset”

    kubeClient, err := apiextension.NewForConfig(config)
    if err != nil {
    glog.Fatalf("Failed to create client: %v.", err)
    }

    Now we can use the client to make the API call which will create the CRD for us.

    package v1alpha1
    
    import (
    	"reflect"
    
    	apiextensionv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
    	apiextension "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
    	apierrors "k8s.io/apimachinery/pkg/api/errors"
    	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )
    
    const (
    	CRDPlural   string = "sslconfigs"
    	CRDGroup    string = "blog.velotio.com"
    	CRDVersion  string = "v1alpha1"
    	FullCRDName string = CRDPlural + "." + CRDGroup
    )
    
    func CreateCRD(clientset apiextension.Interface) error {
    	crd := &apiextensionv1beta1.CustomResourceDefinition{
    		ObjectMeta: meta_v1.ObjectMeta{Name: FullCRDName},
    		Spec: apiextensionv1beta1.CustomResourceDefinitionSpec{
    			Group:   CRDGroup,
    			Version: CRDVersion,
    			Scope:   apiextensionv1beta1.NamespaceScoped,
    			Names: apiextensionv1beta1.CustomResourceDefinitionNames{
    				Plural: CRDPlural,
    				Kind:   reflect.TypeOf(SslConfig{}).Name(),
    			},
    		},
    	}
    
    	_, err := clientset.ApiextensionsV1beta1().CustomResourceDefinitions().Create(crd)
    	if err != nil && apierrors.IsAlreadyExists(err) {
    		return nil
    	}
    	return err
    }

    In the create CRD function, we first create the definition of our custom object and then pass it to the create method which creates it in our cluster. Just like we did while creating our definition using CLI, here also we set the parameters like version, group, kind etc.

    Once our definition is ready we can create objects of its type just like we did earlier using the CLI. First we need to define our object.

    package v1alpha1
    
    import meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    
    type SslConfig struct {
    	meta_v1.TypeMeta   `json:",inline"`
    	meta_v1.ObjectMeta `json:"metadata"`
    	Spec               SslConfigSpec   `json:"spec"`
    	Status             SslConfigStatus `json:"status,omitempty"`
    }
    type SslConfigSpec struct {
    	Cert   string `json:"cert"`
    	Key    string `json:"key"`
    	Domain string `json:"domain"`
    }
    
    type SslConfigStatus struct {
    	State   string `json:"state,omitempty"`
    	Message string `json:"message,omitempty"`
    }
    
    type SslConfigList struct {
    	meta_v1.TypeMeta `json:",inline"`
    	meta_v1.ListMeta `json:"metadata"`
    	Items            []SslConfig `json:"items"`
    }

    Kubernetes API conventions suggests that each object must have two nested object fields that govern the object’s configuration: the object spec and the object status. Objects must also have metadata associated with them. The custom objects that we define here comply with these standards. It is also recommended to create a list type for every type thus we have also created a SslConfigList struct.

    Now we need to write a function which will create a custom client which is aware of the new resource that we have created.

    package v1alpha1
    
    import (
    	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/apimachinery/pkg/runtime"
    	"k8s.io/apimachinery/pkg/runtime/schema"
    	"k8s.io/apimachinery/pkg/runtime/serializer"
    	"k8s.io/client-go/rest"
    )
    
    var SchemeGroupVersion = schema.GroupVersion{Group: CRDGroup, Version: CRDVersion}
    
    func addKnownTypes(scheme *runtime.Scheme) error {
    	scheme.AddKnownTypes(SchemeGroupVersion,
    		&SslConfig{},
    		&SslConfigList{},
    	)
    	meta_v1.AddToGroupVersion(scheme, SchemeGroupVersion)
    	return nil
    }
    
    func NewClient(cfg *rest.Config) (*SslConfigV1Alpha1Client, error) {
    	scheme := runtime.NewScheme()
    	SchemeBuilder := runtime.NewSchemeBuilder(addKnownTypes)
    	if err := SchemeBuilder.AddToScheme(scheme); err != nil {
    		return nil, err
    	}
    	config := *cfg
    	config.GroupVersion = &SchemeGroupVersion
    	config.APIPath = "/apis"
    	config.ContentType = runtime.ContentTypeJSON
    	config.NegotiatedSerializer = serializer.DirectCodecFactory{CodecFactory: serializer.NewCodecFactory(scheme)}
    	client, err := rest.RESTClientFor(&config)
    	if err != nil {
    		return nil, err
    	}
    	return &SslConfigV1Alpha1Client{restClient: client}, nil
    }

    Building the custom client library

    Once we have registered our custom resource definition with the Kubernetes cluster we can create objects of its type using the Kubernetes cli as we did earlier but for creating controllers for these objects or for developing some custom functionalities around them we need to build a client library also using which we can access them from go API. For native Kubernetes objects, this type of library is provided for each object.

    package v1alpha1
    
    import (
    	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/client-go/rest"
    )
    
    func (c *SslConfigV1Alpha1Client) SslConfigs(namespace string) SslConfigInterface {
    	return &sslConfigclient{
    		client: c.restClient,
    		ns:     namespace,
    	}
    }
    
    type SslConfigV1Alpha1Client struct {
    	restClient rest.Interface
    }
    
    type SslConfigInterface interface {
    	Create(obj *SslConfig) (*SslConfig, error)
    	Update(obj *SslConfig) (*SslConfig, error)
    	Delete(name string, options *meta_v1.DeleteOptions) error
    	Get(name string) (*SslConfig, error)
    }
    
    type sslConfigclient struct {
    	client rest.Interface
    	ns     string
    }
    
    func (c *sslConfigclient) Create(obj *SslConfig) (*SslConfig, error) {
    	result := &SslConfig{}
    	err := c.client.Post().
    		Namespace(c.ns).Resource("sslconfigs").
    		Body(obj).Do().Into(result)
    	return result, err
    }
    
    func (c *sslConfigclient) Update(obj *SslConfig) (*SslConfig, error) {
    	result := &SslConfig{}
    	err := c.client.Put().
    		Namespace(c.ns).Resource("sslconfigs").
    		Body(obj).Do().Into(result)
    	return result, err
    }
    
    func (c *sslConfigclient) Delete(name string, options *meta_v1.DeleteOptions) error {
    	return c.client.Delete().
    		Namespace(c.ns).Resource("sslconfigs").
    		Name(name).Body(options).Do().
    		Error()
    }
    
    func (c *sslConfigclient) Get(name string) (*SslConfig, error) {
    	result := &SslConfig{}
    	err := c.client.Get().
    		Namespace(c.ns).Resource("sslconfigs").
    		Name(name).Do().Into(result)
    	return result, err
    }

    We can add more methods like watch, update status etc. Their implementation will also be similar to the methods we have defined above. For looking at the methods available for various Kubernetes objects like pod, node etc. we can refer to the v1 package.

    Putting all things together

    Now in our main function we will get all the things together.

    package main
    
    import (
    	"flag"
    	"fmt"
    	"time"
    
    	"blog.velotio.com/crd-blog/v1alpha1"
    	"github.com/golang/glog"
    	apiextension "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
    	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/client-go/rest"
    	"k8s.io/client-go/tools/clientcmd"
    	clientcmdapi "k8s.io/client-go/tools/clientcmd/api"
    )
    
    var (
    	// Set during build
    	version string
    
    	proxyURL = flag.String("proxy", "",
    		`If specified, it is assumed that a kubctl proxy server is running on the
    		given url and creates a proxy client. In case it is not given InCluster
    		kubernetes setup will be used`)
    )
    
    func main() {
    
    	flag.Parse()
    	var err error
    
    	var config *rest.Config
    	if *proxyURL != "" {
    		config, err = clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
    			&clientcmd.ClientConfigLoadingRules{},
    			&clientcmd.ConfigOverrides{
    				ClusterInfo: clientcmdapi.Cluster{
    					Server: *proxyURL,
    				},
    			}).ClientConfig()
    		if err != nil {
    			glog.Fatalf("error creating client configuration: %v", err)
    		}
    	} else {
    		if config, err = rest.InClusterConfig(); err != nil {
    			glog.Fatalf("error creating client configuration: %v", err)
    		}
    	}
    
    	kubeClient, err := apiextension.NewForConfig(config)
    	if err != nil {
    		glog.Fatalf("Failed to create client: %v", err)
    	}
    	// Create the CRD
    	err = v1alpha1.CreateCRD(kubeClient)
    	if err != nil {
    		glog.Fatalf("Failed to create crd: %v", err)
    	}
    
    	// Wait for the CRD to be created before we use it.
    	time.Sleep(5 * time.Second)
    
    	// Create a new clientset which include our CRD schema
    	crdclient, err := v1alpha1.NewClient(config)
    	if err != nil {
    		panic(err)
    	}
    
    	// Create a new SslConfig object
    
    	SslConfig := &v1alpha1.SslConfig{
    		ObjectMeta: meta_v1.ObjectMeta{
    			Name:   "sslconfigobj",
    			Labels: map[string]string{"mylabel": "test"},
    		},
    		Spec: v1alpha1.SslConfigSpec{
    			Cert:   "my-cert",
    			Key:    "my-key",
    			Domain: "*.velotio.com",
    		},
    		Status: v1alpha1.SslConfigStatus{
    			State:   "created",
    			Message: "Created, not processed yet",
    		},
    	}
    	// Create the SslConfig object we create above in the k8s cluster
    	resp, err := crdclient.SslConfigs("default").Create(SslConfig)
    	if err != nil {
    		fmt.Printf("error while creating object: %vn", err)
    	} else {
    		fmt.Printf("object created: %vn", resp)
    	}
    
    	obj, err := crdclient.SslConfigs("default").Get(SslConfig.ObjectMeta.Name)
    	if err != nil {
    		glog.Infof("error while getting the object %vn", err)
    	}
    	fmt.Printf("SslConfig Objects Found: n%vn", obj)
    	select {}
    }

    Now if we run our code then our custom resource definition will get created in the Kubernetes cluster and also an object of its type will be there just like with the cli. The docker image akash125/crdblog is build using the code discussed above it can be directly pulled from docker hub and run in a Kubernetes cluster. After the image is run successfully, the CRD definition that we discussed above will get created in the cluster along with an object of its type. We can verify the same using the CLI the way we did earlier, we can also check the logs of the pod running the docker image to verify it. The complete code is available here.

    Conclusion and future work

    We learned how to create a custom resource definition and objects using Kubernetes command line interface as well as the Golang client. We also learned how to programmatically access a Kubernetes cluster, using which we can build some really cool stuff on Kubernetes, we can now also create custom controllers for our resources which continuously watches the cluster for various life cycle events of our object and takes desired action accordingly. To read more about CRD refer the following links:

  • Tutorial: Developing Complex Plugins for Jenkins

    Introduction

    Recently, I needed to develop a complex Jenkins plug-in for a customer in the containers & DevOps space. In this process, I realized that there is lack of good documentation on Jenkins plugin development and good information is very hard to find. That’s why I decided to write this blog to share my knowledge on Jenkins plugin development.

    Topics covered in this Blog

    1. Setting up the development environment
    2. Jenkins plugin architecture: Plugin classes and understanding of the source code.
    3. Complex tasks: Tasks like the integration of REST API in the plugin and exposing environment variables through source code.
    4. Plugin debugging and deployment

    So let’s start, shall we?

    1. Setting up the development environment

    I have used Ubuntu 16.04 for this environment, but the steps remain identical for other flavors. The only difference will be in the commands used for each operating system.

    Let me give you a brief list of the requirements:

    1. Compatible JDK: Jenkins plugin development is done in Java. Thus a compatible JDK is what you need first. JDK 6 and above are supported as per the Jenkins documentation.
    2. Maven: Installation guide. I know many of us don’t like to use Maven, as it downloads stuff over the Internet at runtime but it’s required. Check this to understand why using Maven is a good idea.
    3. Jenkins: Check this Installation Guide. Obviously, you would need a Jenkins setup – can be local on hosted on a server/VM.
    4. IDE for development: An IDE like Netbeans, Eclipse or IntelliJ IDEA is preferred. I have used Netbeans 8.1 for this project.

    Before going forward, please ensure that you have the above prerequisites installed on your system. Jenkins does have official documentation for setting up the environment – Check this. If you would like to use an IDE besides Netbeans, the above document covers that too.

    Let’s start with the creation of your project. I will explain with Maven commands and with use of the IDE as well.

    First, let’s start with the approach of using commands.

    It may be helpful to add the following to your ~/.m2/settings.xml (Windows users will find them in %USERPROFILE%.m2settings.xml):

    <settings>
     <pluginGroups>
       <pluginGroup>org.jenkins-ci.tools</pluginGroup>
     </pluginGroups>
    
    <profiles>
       <!-- Give access to Jenkins plugins -->
       <profile>
         <id>jenkins</id>
         <activation>
           <activeByDefault>true</activeByDefault> <!-- change this to false, if you don't like to have it on per default -->
         </activation>
    
         <repositories>
           <repository>
             <id>repo.jenkins-ci.org</id>
             <url>http://repo.jenkins-ci.org/public/</url>
           </repository>
         </repositories>
         
         <pluginRepositories>
           <pluginRepository>
             <id>repo.jenkins-ci.org</id>
             <url>http://repo.jenkins-ci.org/public/</url>
           </pluginRepository>
         </pluginRepositories>
       </profile>
     </profiles>
     
     <mirrors>
       <mirror>
         <id>repo.jenkins-ci.org</id>
         <url>http://repo.jenkins-ci.org/public/</url>
         <mirrorOf>m.g.o-public</mirrorOf>
       </mirror>
     </mirrors>
    </settings>

    This basically lets you use short names in commands e.g. instead of org.jenkins-ci.tools:maven-hpi-plugin:1.61:create, you can use hpi:create. hpi is the packaging style used to deploy the plugins.

    Create the plugin

    $ mvn -U org.jenkins-ci.tools:maven-hpi-plugin:create


    This will ask you a few questions, like the groupId (the Maven jargon for the package name) and the artifactId (the Maven jargon for your project name), then create a skeleton plugin from which you can start with. This command should create the sample HelloWorldBuilder plugin.

    Command Explanation:

    • -U: Maven needs to update the relevant Maven plugins (check plugin updates).
    • hpi: this prefix specifies that the Jenkins HPI Plugin is being invoked, a plugin that supports the development of Jenkins plugins.
    • create is the goal which creates the directory layout and the POM for your new Jenkins plugin and it adds it to the module list.

    Source code tree would be like this:

    Your Project Name    
      Pom.xml      
        Src          
          Main              
            Java                  
              package folder(usually consist of groupId and artifactId)                      
                HelloWorldBuilder.java              
          Resources                  
              Package folder/HelloWorldBuilder/jelly files

    Run “mvn package” which compiles all sources, runs the tests and creates a package – when used by the HPI plugin it will create an *.hpi file.

    Building the Plugin:

    Run mvn install in the directory where pom.xml resides. This is similar to mvn package command but at the end, it will create your plugins .hpi file which you can deploy. Simply copy the create .hpi file and paste to /plugins folder of your Jenkins setup. Restart your Jenkins and you should see the plugin on Jenkins.

    Now let’s see how this can be done with IDE.

    With Netbeans IDE:

    I have used Netbeans for development(Download). Check with the JDK version. Latest version 8.2 works with JDK 8. Once you install Netbeans, install NetBeans plugin for Jenkins/Stapler development.

    You can now create plugin via New Project » Maven » Jenkins Plugin.

    This is the same as “mvn -U org.jenkins-ci.tools:maven-hpi-plugin:create” command which should create the simple “HelloWorldBuilder” application.

    Netbeans comes with Maven built-in so even if you don’t have Maven installed on your system this should work. But you may face error accessing the Jenkins repo. Remember we added some configuration settings in settings.xml in the very first step. Yes, if you have added that already then you shouldn’t face any problem but if you haven’t added that you can add that in Netbeans Maven settings.xml which you can find at: netbeans_installation_path/java/maven/conf/settings.xml

    Now you have your “HelloWorldBuilder” application ready.  This is shown as TODO plugin in Netbeans. Simply run it(F6). This creates the Jenkins instance and runs it on 8080 port. Now, if you already have local Jenkins setup then you need to stop it otherwise this will give you an exception. Go to localhost:8080/jenkins and create a simple job. In “Add Build Step” you should see “Say Hello World” plugin already there.

    Now how it got there and the source code explanation is next.

    2. Jenkins plugin architecture and understanding

    Now that we have our sample HelloWorldBuilder plugin ready,  let’s see its components.

    As you may know, Jenkins plugin has two parts: Build Step and Post Build Step. This sample application is designed to work for Build step and that’s why you see “Say Hello world” plugin in Build step. I am going to cover Build Step itself.

    Do you want to develop Post Build plugin? Don’t worry as these two don’t have much difference. The difference is only in the classes which we extend. For Build step, we extend “hudson.tasks.Builder” and for Post Build “hudson.tasks.Recorder” and with Descriptor class for Build step “BuildStepDescriptor<builder></builder>” for Post Build “BuildStepDescriptor<publisher></publisher>”.

    We will go through these classes in detail below:

    hudson.tasks.Builder Class:

    In brief, this simply tells Jenkins that you are writing a Build Step plugin. A full explanation is here. Now you will see “perform” method once you override this class.

    @Override
    public boolean perform(AbstractBuild build, Launcher launcher, BuildListener listener)

    Note that we are not implementing the ”SimpleBuildStep” interface which is there in HelloWorldBuilder source code. Perform method for that Interface is a  bit different from what I have given above. My explanation goes around this perform method.

    The perform method is basically called when you run your Build. If you see the Parameters passed you have full control over the Build configured, you can log to Jenkins console screen using listener object. What you should do here is access the values set by the user on UI and perform the plugin activity. Note that this method is returning a boolean, True means build is Successful and False is Build Failed.

    Understanding the Descriptor Class:  

    You will notice there is a static inner class in your main class named as DescriptorImpl. This class is basically used for handling configuration of your Plugin. When you click on “Configure” link on Jenkins it basically calls this method and loads the configured data.

    You can perform validations here, save the global configuration and many things. We will see these in detail as when required. Now there is an overridden method:

    @Override
    public String getDisplayName() {
    return "Say Hello World";
    }

    That’s why we see “Say Hello World” in the Build Step. You can rename it to what your plugin does.

    @Override
    public boolean configure(StaplerRequest req, JSONObject formData) throws FormException {
    // To persist global configuration information,
    // set that to properties and call save().
    useFrench = formData.getBoolean("useFrench");
    // ^Can also use req.bindJSON(this, formData);
    //(easier when there are many fields; need set* methods for this, like setUseFrench)
    save();
    return super.configure(req,formData);
    }

    This method basically saves your configuration, or you can even get global data like we have taken “useFrench” attribute which can be set from Jenkins global configuration. If you would like to set any global parameter you can place them in the global.jelly file.

    Understanding Action class and jelly files:

    To understand the main Action class and what it’s purpose is, let’s first understand the jelly files.

    There are two main jelly files: config.jelly and global.jelly. The global.jelly file is used to set global parameters while config.jelly is used for local parameters configuration. Jenkins uses these jelly files to show the parameters or fields on UI. So anything you write in config.jelly will show up on Jobs configuration page as configurable.

    <f:entry title="Name" field="name">
    <f:textbox />
    </f:entry>

    This is what is there in our HelloWorldBuilder application. It simply renders a textbox for entering name.

    Jelly has its own syntax and supports HTML and Javascript as well. It has radio buttons, checkboxes, dropdown lists and so on.

    How does Jenkins manage to pull the data set by the user? This is where our Action class comes into the picture. If you see the structure of the sample application, it has a private field as name and a constructor.

    @DataBoundConstructor
    public HelloWorldBuilder(String name) {
    this.name = name;
    }

    This DataBoundConstructor annotation tells Jenkins to bind the value of jelly fields. If you notice there’s field as “name” in jelly and the same is used here to put the data. Note that, whatever name you set in field attribute of jelly same you should use here as they are tightly coupled.

    Also, add getters for these fields so that Jenkins can access the values.

    @Override
    public DescriptorImpl getDescriptor() {
    return (DescriptorImpl)super.getDescriptor();
    }

    This method gives you the instance of Descriptor class. So if you want to access methods or properties of Descriptor class in your Action class you can use this.

    3. Complex tasks:

    We now have a good idea on how the Jenkins plugin structure is and how it works. Now let’s start with some complex stuff.

    On the internet, there are examples on how to render a selection box(drop-down) with static data. What if you want to load in a dynamic manner? I came with the below solution. We will use Amazon’s publicly available REST API for getting the coupons and load that data in the selection box.

    Here, the objective is to load the data in the selection box. I have the response for REST API as below:

    "offers" : {    
      "AmazonChimeDialin" : {      
        "offerCode" : "AmazonChimeDialin",      
        "versionIndexUrl" : "/offers/v1.0/aws/AmazonChimeDialin/index.json",      
        "currentVersionUrl" : "/offers/v1.0/aws/AmazonChimeDialin/current/index.json",     
        "currentRegionIndexUrl" : "/offers/v1.0/aws/AmazonChimeDialin/current/region_index.json"    
       },    
       "mobileanalytics" : {      
        "offerCode" : "mobileanalytics",      
        "versionIndexUrl" : "/offers/v1.0/aws/mobileanalytics/index.json",      
        "currentVersionUrl" : "/offers/v1.0/aws/mobileanalytics/current/index.json",      
        "currentRegionIndexUrl" : "/offers/v1.0/aws/mobileanalytics/current/region_index.json"    
        }
        }

    I have taken all these offers and created one dictionary and rendered it on UI. Thus the user will see the list of coupon codes and can choose anyone of them.

    Let’s understand how to create the selection box and load the data into it.

    <f:entry title="select Offer From Amazon" field="getOffer">   
     <f:select id="offer-${editorId}" onfocus="getOffers(this.id)"/>  
     </f:entry>

    This is the code which will generate the selection box on configuration page.  Now you will see here “getOffer” field means there’s field with the same name in the Action class.

    When you create any selection box, Jenkins needs doFill{fieldname}Items method in your descriptor class. As we have seen, Descriptor class is configuration class it tries to load the data from this method when you click on the configuration of the job. So in this case, “doFillGetOfferItems” method is required.

    After this, selection box should pop up on the configuration page of your plugin.

    Now here as we need to do dynamic actions, we will perform some action and will load the data.

    As an example, we will click on the button and load the data in Selection Box.

    <f:validateButton title="Get Amazon Offers" progress="Fetching Offers..."method="getAmazonOffers"/>

    Above is the code to create a button. In method attribute, specify the backend method which should be present in your Descriptor class. So when you click on this button “getAmazonOffers” method will get called at the backend and it will get the data from API.

    Now when we click on the selection box, we need to show the contents. As I said earlier, Jelly does support HTML and Javascript. Yes, if you want to do dynamic action use Javascript simply. If you see in selection box code of jelly I have used onfocus() method of Javascript which is pointing to getOffers() function.

    Now you need to have this function, define script tag like this.

    <script> 
    function getOffers(){ 
    }
    </script>

    Now here get the data from backend and load it in the selection box. To do this we need to understand some objects of Jenkins.

    1. Descriptor: As you now know, Descriptor is configuration class this object points to. So from jelly at any point, you can call the method from your Descriptor class.
    2. Instance: This is the object currently being configured on the configuration page. Null if it’s a newly added instance. Means by using this you can call the methods from your Action class. Like getters of field attribute.

    Now how to use these objects? To use you need to first set them.

    <st:bind var="backend" value="${descriptor}"/>

    Here you are binding descriptor object to backend variable and this variable is now ready for use anywhere in config.jelly.  Similarly for instance, <st:bind var=”backend” value=”${instance}”>.</st:bind>

    To make calls use backend.{backend method name}() and it should call your backend method.

    But if you are using this from JavaScript then you need use @JavaScriptMethod annotation over the method being called.

    We can now get the REST data from backend function in JavaScript and to load the data into the element you can use the document object of JavaScript.

    E.g. var selection = document.getElementById(“element-id”); This part is normal Javascript.

    So after clicking on “Get Amazon Offers” button and clicking on Selection box it should now load the data.

    Multiple Plugin Instance: If we are creating a multiple Build Step plugin then you can create multiple instances of your plugin while configuring it. If you try to do what we have done up till now, it will fail to load the data in the second instance. This is because the same element already exists on the UI with the same id. JavaScript will get confused while putting the data. We need to have a mechanism to create the different ids of the same fields.

    I thought of one approach for this. Get the index from backend while configuring the fields and add as a suffix in id attribute.

    @JavaScriptMethod
    public synchronized String createEditorId() {
    return String.valueOf(lastEditorId++);
    }

    This is the method which just returns the id+1 each time it gets called. You know now how to call backend methods from Jelly.

    <j:set var="editorId" value="${descriptor.createEditorId()}" />

    In this manner, we set the ID value in variable “editorId” and this can be used while creation of fields.

    (Check out the selection box creation code above. I have appended this variable in ID attribute)

    Now create as many instances you want in configuration page it should work fine.

    Exposing Environment Variables:

    Environment variables are needed quite often in Jenkins. Your plugin may require the support of some environment variables or the use of the built-in environment variables provided by Jenkins.

    First, you need to create the Envvars object.

    EnvVars envVars = new EnvVars();
    ** Assign it to the build environment.
    envVars = build.getEnvironment(listener);
    ** Put the values which you wanted to expose as environment variable.
    envVars.put("offer", getOffer);

    If you print this then you will get all the default Jenkins environment variables as well as variables which you have exposed. Using this you can even use third party plugins like “Parameterized Trigger Plugin” to export the current build’s environment variable to different jobs.You can even get the value of any environment variable using this.

    4. Plugin Debugging and Deployment:

    You have now got an idea on how to write a plugin in Jenkins, now we move on to perform some complex tasks. We will see how to debug the issue and deploy the plugin. If you are using the IDE then debugging is same like you do for Java program similar to setting up the breakpoints and running the project.

    If you want to perform any validation on fields, in the configuration class you would need to have docheck{fieldname} method which will return FormValidation object. In this example, we are validating the “name” field from our sample “HelloWorldBuilder” example.

    public FormValidation doCheckName(@QueryParameter String value)
    throws IOException, ServletException {
    if (value.length() == 0)
    return FormValidation.error("Please set a name"); 
    if (value.length() < 4)
    return FormValidation.warning("Isn't the name too short?");
    return FormValidation.ok(); 
    }

    Plugin deployment:  

    We have now created the plugin, how are we going to deploy it? We have created the plugin using Netbeans IDE and as I said earlier if you want to deploy it on your local Jenkins setup you need to use the Maven command mvn install and copy .hpi to /plugins/ folder.

    But what if you want to deploy it on Jenkins Marketplace? Well, it’s a pretty long process and thankfully Jenkins has good documentation for it.

    In short, you need to have a jenkins-ci.org account. Your public Git repo will have the plugin source code. Raise an issue on JIRA to get space on their Git repo and in this operation, they will have forked your git repo. Finally, release the plugin using Maven. The above document explains well what exactly needs to be done.

    Conclusion:

    We went through the basics of Jenkins plugin development such as classes, configuration, and some complex tasks.

    Jenkins plugin development is not difficult, but I feel the poor documentation is what makes the task challenging. I have tried to cover my understanding while developing the plugin, however, it is advisable to create a plugin only if the required functionality does not already exist.

    Below are some important links on plugin development:

    1. Jenkins post build plugin development: This is a very good blog which covers things like setting up the environment, plugin classes and developing Post build action.
    2. Basic guide to use jelly: This covers how to use jelly files in Jenkins and attributes of jelly. 

    You can check the code of the sample application discussed in this blog here. I hope this helps you to build interesting Jenkins plugins. Happy Coding!!

  • MQTT Protocol Overview – Everything You Need To Know

    MQTT is the open protocol. This is used for asynchronous message queuing. This has been developed and matured over several years. MQTT is a machine to machine protocol. It’s been widely used with embedded devices. Microsoft is having its own MQTT tool with huge support. Here, we are going to overview the MQTT protocol & its details.

    MQTT Protocol:

    MQTT is a very simple publish / subscribe protocol. It allows you to send messages on a topic (channels) passed through a centralized message broker.

    The MQTT module of API will take care of the publish/ subscribe mechanism along with additional features like authentication, retaining messages and sending duplicate messages to unreachable clients.

    There are three parts of MQTT architecture –

    • MQTT Broker – All messages passed from the client to the server should be sent via the broker.
    • MQTT Server – The API acts as an MQTT server. The MQTT server will be responsible for publishing the data to the clients.
    • MQTT Client – Any third party client who wishes to subscribe to data published by API, is considered as an MQTT Client.

    The MQTT Client and the MQTT Server need to connect to the Broker in order to publish or subscribe messages.

    MQTT Communication Program

    Suppose our API is sending sensor data to get more ideas on MQTT.
    API gathers the sensor data through the Monitoring module, and the MQTT module publishes the data to provide different channels. On the successful connection of external client to the MQTT module of the API, the client would receive sensor data on the subscribed channel.

    Below diagram shows the flow of data from the API Module to the External clients.

    MQTT Broker – EMQTT:

    EMQTT (Erlang MQTT Broker) is a massively scalable and clusterable MQTT V3.1/V3.1.1 broker, written in Erlang/OTP.

    Main responsibilities of a Broker are-

    • Receive all messages
    • Filter messages
    • Decide which are interested clients
    • Publish messages to all the subscribed clients

    All messages published are passed through the broker. The broker generates the Client ID and Message ID, maintains the message queue, and publishes the message.

    There are several brokers that can be used. Default EMQTT broker developed in ErLang.

    MQTT Topics:

    A topic is a string(UTF-8). Using this string, Broker filters messages for all connected clients. One topic may consist of one or more topic levels. Forward slash(topic level separator) is used for separating each topic level.

     

    When API starts, the Monitoring API will monitor the sensor data and publish it in a combination of topics. The third party client can subscribe to any of those topics, based on the requirement.

    The topics are framed in such a way that it provides options for the user to subscribe at level 1, level 2, level 3, level 4, or individual sensor level data.

    While subscribing to each level of sensor data, the client needs to specify the hierarchy of the IDs. For e.g. to subscribe to level 4 sensor data, the client needs to specify level 1 id/ level 2 id/ level 3 id/ level 4 id.

    The user can subscribe to any type of sensor by specifying the sensor role as the last part of the topic.

    If the user doesn’t specify the role, the client will be subscribed to all types of sensors on that particular level.

    The user can also specify the sensor id that they wish to subscribe to. In that case, they need to specify the whole hierarchy of the sensor, starting from project id and ending with sensor id.

    Following is the list of topics exposed by API on startup.

     

    Features supported by MQTT:

    1. Authentication:

    EMQTT provides authentication of every user who intends to publish or subscribe to particular data. The user id and password is stored in the API database, into a separate collection called ‘mqtt

    While connecting to EMQTT broker, we provide the username name and password, and the MQTT Broker will validate the credentials based on the values present in the database.

    2. Access Control:

    EMQTT determines which user is allowed to access which topics. This information is stored in MongoDB under the table ‘mqtt_acl’

    By default, all users are allowed to access all topics by specifying ‘#’ as the allowed topic to publish and subscribe for all users.

    3. QoS:

    The Quality of Service (QoS) level is the Quality transfer of messages which ensures the delivery of messages between sending body & receiving body. There are 3 QoS levels in MQTT:

    • At most once(0) –The message is delivered at most once, or it is not delivered at all.
    • At least once(1) – The message is always delivered at least once.
    • Exactly once(2) – The message is always delivered exactly once.

    4. Last Will Message:

    MQTT uses the Last Will & Testament(LWT) mechanism to notify ungraceful disconnection of a client to other clients. In this mechanism, when a client is connecting to a broker, each client specifies its last will message which is a normal MQTT message with QoS, topic, retained flag & payload. This message is stored by the Broker until it it detects that the client has disconnected ungracefully.

    5. Retain Message:

    MQTT also has a feature of Message Retention. It is done by setting TRUE to retain the flag. It then retained the last message & QoS for the topic. When a client subscribes to a topic, the broker matches the topic with a retained message. Clients will receive messages immediately if the topic and the retained message are matched. Brokers only store one retained message for each topic.

    6. Duplicate Message:

    If a publisher doesn’t receive the acknowledgement of the published packet, it will resend the packet with DUP flag set to true. A duplicate message contains the same Message ID as the original message.

    7. Session:

    In general, when a client connects with a broker for the first time, the client needs to create subscriptions for all topics for which they are willing to receive data/messages from the broker. Suppose a session is not maintained, or there is no persistent session, or the client lost a connection with the broker, then users have to resubscribe to all the topics after reconnecting to the broker. For the clients with limited resources, it would be very tedious to subscribe to all topics again. So brokers use a persistent session mechanism, in which it saves all information relevant to the client. ‘clientId’ provided by client is used as ‘session identifier’, when the client establishes a connection with the broker.

    Features not-supported by MQTT:

    1. Not RESTful:

    MQTT does not allow a client to expose RESTful API endpoints. The only way to communicate is through the publish /subscribe mechanism.

    2. Obtaining Subscription List:

    The MQTT Broker doesn’t have the Client IDs and the subscribed topics by the clients. Hence, the API needs to publish all data to all possible combinations of topics. This would lead to a problem of network congestion in case of large data.

    MQTT Wildcards:

    MQTT clients can subscribe to one or more topics. At a time, one can subscribe to a single topic only. So we can use the following two wildcards to create a topic which can subscribe to many topics to receive data/message.

    1. Plus sign(+):

    This is a single level wildcard. This is used to match specific topic level. We can use this wildcard when we want to subscribe at topic level.

    Example: Suppose we want to subscribe for all Floor level ‘AL’(Ambient light) sensors, we can use Plus (+) sign level wild card instead of a specific zone level. We can use following topic:

    <project_id>/<building_id>/<floor_id>/+/AL</floor_id></building_id></project_id>

    2. Hash Sign(#):

    This is a multi level wildcard. This wildcard can be used only at the end of a topic. All data/messages get subscribed which match to left-hand side of the ‘#’ wildcard.

    Example: In case we want to receive all the messages related to all sensors for floor1 , we can use Hash sing(#) multi level wildcard after floor name & the slash( / ). We can use following topic-

    <level 1_id=””>/<level 2_id=””>/<level 3_id=””>/#</level></level></level>

    MQTT Test tools:

    Following are some popular open source testing tools for MQTT.

    1. MQTT Lens
    2. MQTT SPY
    3. MQTT FX

    Difference between MQTT & AMQP:

    MQTT is designed for lightweight devices like Embedded systems, where bandwidth is costly and the minimum overhead is required. MQTT uses byte stream to exchange data and control everything. Byte stream has optimized 2 byte fixed header, which is prefered for IoT.

    AMQP is designed with more advanced features and uses more system resources. It provides more advanced features related to messaging, topic-based publish & subscribe messaging, reliable queuing, transactions, flexible routing and security.

    Difference between MQTT & HTTP:

    MQTT is data-centric, whereas HTTP is document-centric. HTTP is a request-response protocol for client-server, on the other hand, MQTT uses publish-subscribe mechanism. Publish/subscribe model provides clients with the independent existence from one another and enhances the reliability of the whole system. Even if any of the client is out of network, the system keeps itself up and running

    As compared to HTTP, MQTT is lightweight (very short message header and the smallest packet message size of 2 bytes), and allows to compose lengthy headers and messages.

    MQTT Protocol ensures high delivery guarantees compared to HTTP.

    There are 3 levels of Quality of Services:

    at most once: it guarantees that message will be delivered with the best effort.

    at least once: It guarantees that message will be delivered at a minimum of one time. But the message can also be delivered again..

    exactly once: It guarantees that message will be delivered one and only one time.

    Last will & testament and Retained messages are the options provided by MQTT to users. With Last Will & Testament, in case of unexpected disconnection of a client, all subscribed clients will get a message from the broker. Newly subscribed clients will get immediate status updates via Retained message.

    HTTP Protocol has none of these abilities.

    Conclusion:

    MQTT is one of its kind message queuing protocols, best suited for embedded hardware devices. On the software level, it supports all major operating systems and platforms. It has proven its certainty as an ISO standard in IoT platforms because of its more pragmatic security and message reliability.

  • Building A Scalable API Testing Framework With Jest And SuperTest

    Focus on API testing

    Before starting off, below listed are the reasons why API testing should be encouraged:

    • Identifies bugs before it goes to UI
    • Effective testing at a lower level over high-level broad-stack testing
    • Reduces future efforts to fix defects
    • Time-saving

    Well, QA practices are becoming more automation-centric with evolving requirements, but identifying the appropriate approach is the primary and the most essential step. This implies choosing a framework or a tool to develop a test setup which should be:

    • Scalable 
    • Modular
    • Maintainable
    • Able to provide maximum test coverage
    • Extensible
    • Able to generate test reports
    • Easy to integrate with source control tool and CI pipeline

    To attain the goal, why not develop your own asset rather than relying on the ready-made tools like Postman, JMeter, or any? Let’s have a look at why you should choose ‘writing your own code’ over depending on the API testing tools available in the market:

    1. Customizable
    2. Saves you from the trap of limitations of a ready-made tool
    3. Freedom to add configurations and libraries as required and not really depend on the specific supported plugins of the tool
    4. No limit on the usage and no question of cost
    5. Let’s take Postman for example. If we are going with Newman (CLI of Postman), there are several efforts that are likely to evolve with growing or changing requirements. Adding a new test requires editing in Postman, saving it in the collection, exporting it again and running the entire collection.json through Newman. Isn’t it tedious to repeat the same process every time?

    We can overcome such annoyance and meet our purpose using a self-built Jest framework using SuperTest. Come on, let’s dive in!

    Source: school.geekwall

    Why Jest?

    Jest is pretty impressive. 

    • High performance
    • Easy and minimal setup
    • Provides in-built assertion library and mocking support
    • Several in-built testing features without any additional configuration
    • Snapshot testing
    • Brilliant test coverage
    • Allows interactive watch mode ( jest –watch or jest –watchAll )

    Hold on. Before moving forward, let’s quickly visit Jest configurations, Jest CLI commands, Jest Globals and Javascript async/await for better understanding of the coming content.

    Ready, set, go!

    Creating a node project jest-supertest in our local and doing npm init. Into the workspace, we will install Jest, jest-stare for generating custom test reports, jest-serial-runner to disable parallel execution (since our tests might be dependent) and save these as dependencies.

    npm install jest jest-stare jest-serial-runner --save-dev

    Tags to the scripts block in our package.json. 

    
    "scripts": {
        "test": "NODE_TLS_REJECT_UNAUTHORIZED=0 jest --reporters default jest-stare --coverage --detectOpenHandles --runInBand --testTimeout=60000",
        "test:watch": "jest --verbose --watchAll"
      }

    npm run test command will invoke the test parameter with the following:

    • NODE_TLS_REJECT_UNAUTHORIZED=0: ignores the SSL certificate
    • jest: runs the framework with the configurations defined under Jest block
    • –reporters: default jest-stare 
    • –coverage: invokes test coverage
    • –detectOpenHandles: for debugging
    • –runInBand: serial execution of Jest tests
    • –forceExit: to shut down cleanly
    • –testTimeout = 60000 (custom timeout, default is 5000 milliseconds)

    Jest configurations:

    [Note: This is customizable as per requirements]

    "jest": {
        "verbose": true,
        "testSequencer": "/home/abc/jest-supertest/testSequencer.js",
        "coverageDirectory": "/home/abc/jest-supertest/coverage/my_reports/",
        "coverageReporters": ["html","text"],
        "coverageThreshold": {
          "global": {
            "branches": 100,
            "functions": 100,
            "lines": 100,
            "statements": 100
          }
        }
      }

    testSequencer: to invoke testSequencer.js in the workspace to customize the order of running our test files

    touch testSequencer.js

    Below code in testSequencer.js will run our test files in alphabetical order.

    const Sequencer = require('@jest/test-sequencer').default;
    
    class CustomSequencer extends Sequencer {
      sort(tests) {
        // Test structure information
        // https://github.com/facebook/jest/blob/6b8b1404a1d9254e7d5d90a8934087a9c9899dab/packages/jest-runner/src/types.ts#L17-L21
        const copyTests = Array.from(tests);
        return copyTests.sort((testA, testB) => (testA.path > testB.path ? 1 : -1));
      }
    }
    
    module.exports = CustomSequencer;

    • verbose: to display individual test results
    • coverageDirectory: creates a custom directory for coverage reports
    • coverageReporters: format of reports generated
    • coverageThreshold: minimum and maximum threshold enforcements for coverage results

    Testing endpoints with SuperTest

    SuperTest is a node library, superagent driven, to extensively test Restful web services. It hits the HTTP server to send requests (GET, POST, PATCH, PUT, DELETE ) and fetch responses.

    Install SuperTest and save it as a dependency.

    npm install supertest --save-dev

    "devDependencies": {
        "jest": "^25.5.4",
        "jest-serial-runner": "^1.1.0",
        "jest-stare": "^2.0.1",
        "supertest": "^4.0.2"
      }

    All the required dependencies are installed and our package.json looks like:

    {
      "name": "supertestjest",
      "version": "1.0.0",
      "description": "",
      "main": "index.js",
      "jest": {
        "verbose": true,
        "testSequencer": "/home/abc/jest-supertest/testSequencer.js",
        "coverageDirectory": "/home/abc/jest-supertest/coverage/my_reports/",
        "coverageReporters": ["html","text"],
        "coverageThreshold": {
          "global": {
            "branches": 100,
            "functions": 100,
            "lines": 100,
            "statements": 100
          }
        }
      },
      "scripts": {
        "test": "NODE_TLS_REJECT_UNAUTHORIZED=0 jest --reporters default jest-stare --coverage --detectOpenHandles --runInBand --testTimeout=60000",
        "test:watch": "jest --verbose --watchAll"
      },
      "author": "",
      "license": "ISC",
      "devDependencies": {
        "jest": "^25.5.4",
        "jest-serial-runner": "^1.1.0",
        "jest-stare": "^2.0.1",
        "supertest": "^4.0.2"
      }
    }

    Now we are ready to create our Jest tests with some defined conventions:

    • describe block assembles multiple tests or its
    • test block – (an alias usually used is ‘it’) holds single test 
    • expect() –  performs assertions 

    It recognizes the test files in __test__/ folder

    • with .test.js extension
    • with .spec.js extension

    Here is a reference app for API tests.

    Let’s write commonTests.js which will be required by every test file. This hits the app through SuperTest, logs in (if required) and saves authorization token. The aliases are exported from here to be used in all the tests. 

    [Note: commonTests.js, be created or not, will vary as per the test requirements]

    touch commonTests.js

    var supertest = require('supertest'); //require supertest
    const request = supertest('https://reqres.in/'); //supertest hits the HTTP server (your app)
    
    /*
    This piece of code is for getting the authorization token after login to your app.
    const token;
    test("Login to the application", function(){
        return request.post(``).then((response)=>{
            token = response.body.token  //to save the login token for further requests
        })
    }); 
    */
    
    module.exports = 
    {
        request
            //, token     -- export if token is generated
    }

    Moving forward to writing our tests on POST, GET, PUT and DELETE requests for the basic understanding of the setup. For that, we are creating two test files to also see and understand if the sequencer works.

    mkdir __test__/
    touch __test__/postAndGet.test.js __test__/putAndDelete.test.js

    As mentioned above and sticking to Jest protocols, we have our tests written.

    postAndGet.test.js test file:

    • requires commonTests.js into ‘request’ alias
    • POST requests to api/users endpoint, calls supertest.post() 
    • GET requests to api/users endpoint, calls supertest.get()
    • uses file system to write globals and read those across all the tests
    • validates response returned on hitting the HTTP endpoints
    const request = require('../commonTests');
    const fs = require('fs');
    let userID;
    
    //Create a new user
    describe("POST request", () => {
      
      try{
        let userDetails;
        beforeEach(function () {  
            console.log("Input user details!")
            userDetails = {
              "name": "morpheus",
              "job": "leader"
          }; //new user details to be created
          });
        
        afterEach(function () {
          console.log("User is created with ID : ", userID)
        });
    
    	  it("Create user data", async done => {
    
            return request.request.post(`api/users`) //post() of supertest
                    //.set('Authorization', `Token $  {request.token}`) //Authorization token
                    .send(userDetails) //Request header
                    .expect(201) //response to be 201
                    .then((res) => {
                        expect(res.body).toBeDefined(); //test if response body is defined
                        //expect(res.body.status).toBe("success")
                        userID = res.body.id;
                        let jsonContent = JSON.stringify({userId: res.body.id}); // create a json
                        fs.writeFile("data.json", jsonContent, 'utf8', function (err) //write user id into global json file to be used 
                        {
                        if (err) {
                            return console.log(err);
                        }
                        console.log("POST response body : ", res.body)
                        done();
                        });
                      })
                    })
                  }
                  catch(err){
                    console.log("Exception : ", err)
                  }
            });
    
    //GET all users      
    describe("GET all user details", () => {
      
      try{
          beforeEach(function () {
            console.log("GET all users details ")
        });
              
          afterEach(function () {
            console.log("All users' details are retrieved")
        });
    
          test("GET user output", async done =>{
            await request.request.get(`api/users`) //get() of supertest
                                    //.set('Authorization', `Token ${request.token}`) 
                                    .expect(200).then((response) =>{
                                    console.log("GET RESPONSE : ", response.body);
                                    done();
                        })
          })
        }
      catch(err){
        console.log("Exception : ", err)
        }
    });

    putAndDelete.test.js file:

    • requires commonsTests into ‘request’ alias
    • calls data.json into ‘data’ alias which was created by the file system in our previous test to write global variables into it
    • PUT sto api/users/${data.userId} endpoint, calls supertest.put() 
    • DELETE requests to api/users/${data.userId} endpoint, calls supertest.delete() 
    • validates response returned by the endpoints
    • removes data.json (similar to unsetting global variables) after all the tests are done
    const request = require('../commonTests');
    const fs = require('fs'); //file system
    const data = require('../data.json'); //data.json containing the global variables
    
    //Update user data
    describe("PUT user details", () => {
    
        try{
            let newDetails;
            beforeEach(function () {
                console.log("Input updated user's details");
                newDetails = {
                    "name": "morpheus",
                    "job": "zion resident"
                }; // details to be updated
      
            });
            afterEach(function () {
                console.log("user details are updated");
            });
      
            test("Update user now", async done =>{
    
                console.log("User to be updated : ", data.userId)
    
                const response = await request.request.put(`api/users/${data.userId}`).send(newDetails) //call put() of supertest
                                    //.set('Authorization', `Token ${request.token}`) 
                                            .expect(200)
                expect(response.body.updatedAt).toBeDefined();
                console.log("UPDATED RESPONSE : ", response.body);
                done();
        })
      }
        catch(err){
            console.log("ERROR : ", err)
        }
    });
    
    //DELETE the user
    describe("DELETE user details", () =>{
        try{
            beforeAll(function (){
                console.log("To delete user : ", data.userId)
            });
    
            test("Delete request", async done =>{
                const response = await request.request.delete(`api/users/${data.userId}`) //invoke delete() of supertest
                                            .expect(204) 
                console.log("DELETE RESPONSE : ", response.body);
                done(); 
            });
    
            afterAll(function (){
                console.log("user is deleted!!")
                fs.unlinkSync('data.json'); //remove data.json after all tests are run
            });
        }
    
        catch(err){
            console.log("EXCEPTION : ", err);
        }
    });

    And we are done with setting up a decent framework and just a command away!

    npm test

    Once complete, the test results will be immediately visible on the terminal.

    Test results HTML report is also generated as index.html under jest-stare/ 

    And test coverage details are created under coverage/my_reports/ in the workspace.

    Similarly, other HTTP methods can also be tested, like OPTIONS – supertest.options() which allows dealing with CORS, PATCH – supertest.patch(), HEAD – supertest.head() and many more.

    Wasn’t it a convenient and successful journey?

    Conclusion

    So, wrapping it up with a note that API testing needs attention, and as a QA, let’s abide by the concept of a testing pyramid which is nothing but the mindset of a tester and how to combat issues at a lower level and avoid chaos at upper levels, i.e. UI. 

    Testing Pyramid

    I hope you had a good read. Kindly spread the word. Happy coding!

  • Test Automation in React Native apps using Appium and WebdriverIO

    React Native provides a mobile app development experience without sacrificing user experience or visual performance. And when it comes to mobile app UI testing, Appium is a great way to test indigenous React Native apps out of the box. Creating native apps from the same code and being able to do it using JavaScript has made Appium popular. Apart from this, businesses are attracted by the fact that they can save a lot of money by using this application development framework.

    In this blog, we are going to cover how to add automated tests for React native apps using Appium & WebdriverIO with a Node.js framework. 

    What are React Native Apps

    React Native is an open-source framework for building Android and iOS apps using React and local app capabilities. With React Native, you can use JavaScript to access the APIs on your platform and define the look and behavior of your UI using React components: lots of usable, non-compact code. In the development of Android and iOS apps, “viewing” is the basic building block of a UI: this small rectangular object on the screen can be used to display text, photos, or user input. Even the smallest detail of an app, such as a text line or a button, is a kind of view. Some views may contain other views.

    What is Appium

    Appium is an open-source tool for traditional automation, web, and hybrid apps on iOS, Android, and Windows desktop mobile platforms. Indigenous apps are those written using iOS and Android. Mobile web applications are accessed using a mobile browser (Appium supports Safari for iOS apps and Chrome or the built-in ‘Browser’ for Android apps). Hybrid apps have a wrapper around “web view”—a traditional controller that allows you to interact with web content. Projects like Apache Cordova make it easy to build applications using web technology and integrate it into a traditional wrapper, creating a hybrid application.

    Importantly, Appium is “cross-platform”, allowing you to write tests against multiple platforms (iOS, Android), using the same API. This enables code usage between iOS, Android, and Windows test suites. It runs on iOS and Android applications using the WebDriver protocol.

    Fig:- Appium Architecture

    What is WebDriverIO

    WebdriverIO is a next-gen browser and Node.js automated mobile testing framework. It allows you to customize any application written with modern web frameworks for mobile devices or browsers, such as React, Angular, Polymeror, and Vue.js.

    WebdriverIO is a widely used test automation framework in JavaScript. It has various features like it supports many reports and services, Test Frameworks, and WDIO CLI Test Runners

    The following are examples of supported services:

    • Appium Service
    • Devtools Service
    • Firefox Profile Service
    • Selenium Standalone Service
    • Shared Store Service
    • Static Server Service
    • ChromeDriver Service
    • Report Portal Service
    • Docker Service

    The followings are supported by the test framework:

    • Mocha
    • Jasmine
    • Cucumber 
    Fig:- WebdriverIO Architecture

    Key features of Appium & WebdriverIO

    Appium

    • Does not require application source code or library
    • Provides a strong and active community
    • Has multi-platform support, i.e., it can run the same test cases on multiple platforms
    • Allows the parallel execution of test scripts
    • In Appium, a small change does not require reinstallation of the application
    • Supports various languages like C#, Python, Java, Ruby, PHP, JavaScript with node.js, and many others that have a Selenium client library

    WebdriverIO 

    • Extendable
    • Compatible
    • Feature-rich 
    • Supports modern web and mobile frameworks
    • Runs automation tests both for web applications as well as native mobile apps.
    • Simple and easy syntax
    • Integrates tests to third-party tools such as Appium
    • ‘Wdio setup wizard’ makes the setup simple and easy
    • Integrated test runner

    Installation & Configuration

    $ mkdir Demo_Appium_Project

    • Create a sample Appium Project
    $ npm init
    $ package name: (demo_appium_project) demo_appium_test
    $ version: (1.0.0) 1.0.0
    $ description: demo_appium_practice
    $ entry point: (index.js) index.js
    $ test command: "./node_modules/.bin/wdio wdio.conf.js"
    $ git repository: 
    $ keywords: 
    $ author: Pushkar
    $ license: (ISC) ISC

    This will also create a package.json file for test settings and project dependencies.

    • Install node packages
    $ npm install

    • Install Appium through npm or as a standalone app.
    $ npm install -g appium or npm install --save appium

    $ npm install -g webdriverio or npm install --save-dev webdriverio @wdio/cli
    • Install Chai Assertion library 
    $ npm install -g chai or npm install --save chai

    Make sure you have following versions installed: 

    $ node --version - v.14.17.0
    $ npm --version - 7.17.0
    $ appium --version - 1.21.0
    $ java --version - java 16.0.1
    $ allure --version - 2.14.0

    WebdriverIO Configuration 

    The web driver configuration file must be created to apply the configuration during the test Generate command below project:

    $ npx wdio config

    With the following series of questions, install the required dependencies,

    $ Where is your automation backend located? - On my local machine
    $ Which framework do you want to use? - mocha	
    $ Do you want to use a compiler? No!
    $ Where are your test specs located? - ./test/specs/**/*.js
    $ Do you want WebdriverIO to autogenerate some test files? - Yes
    $ Do you want to use page objects (https://martinfowler.com/bliki/PageObject.html)? - No
    $ Which reporter do you want to use? - Allure
    $ Do you want to add a service to your test setup? - No
    $ What is the base url? - http://localhost

    This is how wdio.conf.js looks:

    exports.config = {
     port: 4724,
     path: '/wd/hub/',
     runner: 'local',
     specs: ['./test/specs/*.js'],
     maxInstances: 1,
     capabilities: [
       {
         platformName: 'Android',
         platformVersion: '11',
         appPackage: 'com.facebook.katana',
         appActivity: 'com.facebook.katana.LoginActivity',
         automationName: 'UiAutomator2'
       }
     ],
     services: [
       [
         'appium',
         {
           args: {
             relaxedSecurity: true
            },
           command: 'appium'
         }
       ]
     ],
     logLevel: 'debug',
     bail: 0,
     baseUrl: 'http://localhost',
     waitforTimeout: 10000,
     connectionRetryTimeout: 90000,
     connectionRetryCount: 3,
     framework: 'mocha',
     reporters: [
       [
         'allure',
         {
           outputDir: 'allure-results',
           disableWebdriverStepsReporting: true,
           disableWebdriverScreenshotsReporting: false
         }
       ]
     ],
     mochaOpts: {
       ui: 'bdd',
       timeout: 60000
     },
     afterTest: function(test, context, { error, result, duration, passed, retries }) {
       if (!passed) {
           browser.takeScreenshot();
       }
     }
    }
    view raw

    For iOS Automation, just add the following capabilities in wdio.conf.js & the Appium Configuration: 

    {
      "platformName": "IOS",
      "platformVersion": "14.5",
      "app": "/Your_PATH/wdioNativeDemoApp.app",
      "deviceName": "iPhone 12 Pro Max"
    }

    Launch the iOS Simulator from Xcode

    Install Appium Doctor for iOS by using following command:

    npm install -g appium-doctor

    Fig:- Appium Doctor Installed

    This is how package.json will look:

    {
     "name": "demo_appium_test",
     "version": "1.0.0",
     "description": "demo_appium_practice",
     "main": "index.js",
     "scripts": {
       "test": "./node_modules/.bin/wdio wdio.conf.js"
     },
     "author": "Pushkar",
     "license": "ISC",
     "dependencies": {
       "@wdio/sync": "^7.7.4",
       "appium": "^1.21.0",
       "chai": "^4.3.4",
       "webdriverio": "^7.7.4"
     },
     "devDependencies": {
       "@wdio/allure-reporter": "^7.7.3",
       "@wdio/appium-service": "^7.7.3",
       "@wdio/cli": "^7.7.4",
       "@wdio/local-runner": "^7.7.4",
       "@wdio/mocha-framework": "^7.7.4",
       "@wdio/selenium-standalone-service": "^7.7.4"
     }
    }

    Steps to follow if npm legacy peer deeps problem occurred:

    npm install --save --legacy-peer-deps
    npm config set legacy-peer-deps true
    npm i --legacy-peer-deps
    npm config set legacy-peer-deps true
    npm cache clean --force

    This is how the folder structure will look in Appium with the WebDriverIO Framework:

    Fig:- Appium Framework Outline

    Step-by-Step Configuration of Android Emulator using Android Studio

    Fig:- Android Studio Launch

     

    Fig:- Android Studio AVD Manager

     

    Fig:- Create Virtual Device

     

    Fig:- Choose a device Definition

     

    Fig:- Select system image

    Fig:- License Agreement

     

    Fig:- Component Installer

     

    Fig:- System Image Download

     

    Fig:- Configuration Verification

    Fig:- Virtual Device Listing

    ‍Appium Desktop Configuration

    Fig:- Appium Desktop Launch

    Setup of ANDROID_HOME + ANDROID_SDK_ROOT &  JAVA_HOME

    Follow these steps for setting up ANDROID_HOME: 

    vi ~/.bash_profile
    Add following 
    export ANDROID_HOME=/Users/pushkar/android-sdk 
    export PATH=$PATH:$ANDROID_HOME/platform-tools 
    export PATH=$PATH:$ANDROID_HOME/tools 
    export PATH=$PATH:$ANDROID_HOME/tools/bin 
    export PATH=$PATH:$ANDROID_HOME/emulator
    Save ~/.bash_profile 
    source ~/.bash_profile 
    echo $ANDROID_HOME
    /Users/pushkar/Library/Android/sdk

    Follow these steps for setting up ANDROID_SDK_ROOT:

    vi ~/.bash_profile
    Add following 
    export ANDROID_HOME=/Users/pushkar/Android/sdk
    export ANDROID_SDK_ROOT=/Users/pushkar/Android/sdk
    export ANDROID_AVD_HOME=/Users/pushkar/.android/avd
    Save ~/.bash_profile 
    source ~/.bash_profile 
    echo $ANDROID_SDK_ROOT
    /Users/pushkar/Library/Android/sdk

    Follow these steps for setting up JAVA_HOME:

    java --version
    vi ~/.bash_profile
    Add following 
    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk-16.0.1.jdk/Contents/Home.
    echo $JAVA_HOME
    /Library/Java/JavaVirtualMachines/jdk-16.0.1.jdk/Contents/Home

    Fig:- Environment Variables in Appium

     

    Fig:- Appium Server Starts 

     

    Fig:- Appium Start Inspector Session

    Fig:- Inspector Session Configurations

    Note – Make sure you need to install the app from Google Play Store. 

    Fig:- Android Emulator Launch  

     

    Fig: – Android Emulator with Facebook React Native Mobile App

     

    Fig:- Success of Appium with Emulator

     

    Fig:- Locating Elements using Appium Inspector

    How to write E2E React Native Mobile App Tests 

    Fig:- Test Suite Structure of Mocha

    ‍Here is an example of how to write E2E test in Appium:

    Positive Testing Scenario – Validate Login & Nav Bar

    1. Open Facebook React Native App 
    2. Enter valid email and password
    3. Click on Login
    4. Users should be able to login into Facebook 

    Negative Testing Scenario – Invalid Login

    1. Open Facebook React Native App
    2. Enter invalid email and password 
    3. Click on login 
    4. Users should not be able to login after receiving an “Incorrect Password” message popup

    Negative Testing Scenario – Invalid Element

    1. Open Facebook React Native App 
    2. Enter invalid email and  password 
    3. Click on login 
    4. Provide invalid element to capture message

    Make sure test_script should be under test/specs folder 

    var expect = require('chai').expect
    
    beforeEach(() => {
     driver.launchApp()
    })
    
    afterEach(() => {
     driver.closeApp()
    })
    
    describe('Verify Login Scenarios on Facebook React Native Mobile App', () => {
     it('User should be able to login using valid credentials to Facebook Mobile App', () => {   
       $(`~Username`).waitForDisplayed(20000)
       $(`~Username`).setValue('Valid-Email')
       $(`~Password`).waitForDisplayed(20000)
       $(`~Password`).setValue('Valid-Password')
       $('~Log In').click()
       browser.pause(10000)
     })
    
     it('User should not be able to login with invalid credentials to Facebook Mobile App', () => {
       $(`~Username`).waitForDisplayed(20000)
       $(`~Username`).setValue('Invalid-Email')
       $(`~Password`).waitForDisplayed(20000)
       $(`~Password`).setValue('Invalid-Password')   
       $('~Log In').click()
       $(
           '//android.widget.TextView[@resource-id="com.facebook.katana:id/(name removed)"]'
         )
         .waitForDisplayed(11000)
       const status = $(
         '//android.widget.TextView[@resource-id="com.facebook.katana:id/(name removed)"]'
       ).getText()
       expect(status).to.equal(
         `You Can't Use This Feature Right Now`     
       )
     })
    
     it('Test Case should Fail Because of Invalid Element', () => {
       $(`~Username`).waitForDisplayed(20000)
       $(`~Username`).setValue('Invalid-Email')
       $(`~Password`).waitForDisplayed(20000)
       $(`~Password`).setValue('Invalid-Pasword')   
       $('~Log In').click()
       $(
           '//android.widget.TextView[@resource-id="com.facebook.katana:id/(name removed)"'
         )
         .waitForDisplayed(11000)
       const status = $(
         '//android.widget.TextView[@resource-id="com.facebook.katana"'
       ).getText()
       expect(status).to.equal(
         `You Can't Use This Feature Right Now`     
       )
     })
    
    })

    How to Run Mobile Tests Scripts  

    $ npm test 
    This will create a Results folder with .xml report 

    Reporting

    The following are examples of the supported reporters:

    • Allure Reporter
    • Concise Reporter
    • Dot Reporter
    • JUnit Reporter
    • Spec Reporter
    • Sumologic Reporter
    • Report Portal Reporter
    • Video Reporter
    • HTML Reporter
    • JSON Reporter
    • Mochawesome Reporter
    • Timeline Reporter
    • CucumberJS JSON Reporter

    Here, we are using Allure Reporting. Allure Reporting in WebdriverIO is a plugin to create Allure Test Reports.

    The easiest way is to keep @wdio/allure-reporter as a devDependency in your package.json with

    $ npm install @wdio/allure-reporter --save-dev

    Reporter options can be specified in the wdio.conf.js configuration file 

    reporters: [
       [
         'allure',
         {
           outputDir: 'allure-results',
           disableWebdriverStepsReporting: true,
           disableWebdriverScreenshotsReporting: false
         }
       ]
     ]

    To convert Allure .xml report to .html report, run the following command: 

    $ allure generate && allure open
    Allure HTML report should be opened in browser

    This is what Allure Reports look like: 

    Fig:- Allure Report Overview 

     

    Fig:- Allure Categories

     

    Fig:- Allure Suites

     

    Fig: – Allure Graphs

     

    Fig:- Allure Timeline

     

    Fig:- Allure Behaviors

     

    Fig:- Allure Packages

    Limitations with Appium & WebDriverIO

    Appium 

    • Android versions lower than 4.2 are not supported for testing
    • Limited support for hybrid app testing
    • Doesn’t support image comparison.

    WebdriverIO

    • It has a custom implementation
    • It can be used for automating AngularJS apps, but it is not as customized as Protractor.

    Conclusion

    In the QA and developer ecosystem, using Appium to test React native applications is common. Appium makes it easy to record test cases on both Android and iOS platforms while working with React Native. Selenium, a basic web developer, acts as a bridge between Appium and mobile platforms for delivery and testing. Appium is a solid framework for automatic UI testing. This article explains that this framework is capable of conducting test cases quickly and reliably. Most importantly, it can test both Android and iOS apps developed by the React Native framework on the basis of a single code.

    Related Articles –

    References 

  • Building an ETL Workflow Using Apache NiFi and Hive

    The objective of this article is to design an ETL workflow using Apache NiFi that will scrape a web page with almost no code to get an endpoint, extract and transform the dataset, and load the transformed data into a Hive table.

    Problem Statement

    One potential use case where we need to create a data pipeline would be to capture the district level COVID-19 information from the COVID19-India API website, which gets updated daily. So, the aim is to create a flow that collates and loads a dataset into a warehouse system used by various downstream applications for further analysis, and the flow should be easily configurable for future changes.

    Prerequisites

    Before we start, we must have a basic understanding of Apache NiFi, and having it installed on a system would be a great start for this article. If you do not have it installed, please follow these quick steps. Apache Hive should be added to this architecture, which also requires a fully functional Hadoop framework. For this article, I am using Hive on a single cluster installed locally, but you can use a remote hive connection as well.

    Basic Terminologies

    Apache NiFi is an ETL tool with flow-based programming that comes with a web UI built to provide an easy way (drag & drop) to handle data flow in real-time. It also supports powerful and scalable means of data routing and transformation, which can be run on a single server or in a clustered mode across many servers.

    NiFi workflow consists of processors, the rectangular components that can process, verify, filter, join, split, or adjust data. They exchange pieces of information called FlowFiles through queues named connections, and the FlowFile Controller helps to manage the resources between those components.

    Web scraping is a process to extract and collect structured web data with automation. It includes extracting and processing underlying HTML code using CSS selectors and the extracted data gets stored into a database.

    Apache Hive is a warehouse system built on top of Hadoop used for data summarization, query, and ad-hoc analysis.

    Steps for ETL Workflow

    Fig:- End-to-End NiFi WorkFlow

    The above flow comprises multiple processors each performing different tasks at different stages to process data. The different stages are Collect (InvokeHTTP – API Web Page, InvokeHTTP – Download District Data), Filter (GetHTMLElement, ExtractEndPoints, RouteOnAttributeDistrict API, QueryRecord), Transform (ReplaceHeaders, ConvertJSONToSQL), Load (PutHiveQL), and Logging (LogAttribute). Each processor is connected through different relationship connections and gets triggered on success until the data gets loaded into the table. The entire flow is scheduled to run daily.

    So, let’s dig into each step to understand the flow better.

    1. Get the HTML document using the Remote URL

    The flow starts with an InvokeHTTP processor that sends an HTTP GET request to the COVID19-India API URL and returns an HTML page in the response queue for further inspection. The processor can be used to invoke multiple HTTP methods (GET, PUT, POST, or PATCH) as well.

    Fig:- InvokeHTTP – API Web Page Configuration

    2. Extract listed endpoints

    The second step occurs when the GETHTMLElement processor targets HTML table rows from the response where all the endpoints are listed inside anchor tags using the CSS selector as tr > td > a. and extracts data into FlowFiles.

    Fig:- GetHTMLElement Configuration

    After the success of the previous step, the ExtractText processor evaluates regular expressions against the content of the FlowFile to extract the URLs, which are then assigned to a FlowFile attribute named data_url.

    Fig:- ExtractEndPoints Configuration

    Note: The layout of the web page may have changed in the future. So, if you are reading this article in the future, configure the above processors as per the layout changes if any.

    3. Pick districts API and Download the dataset

    Here, the RouteOnAttribute processor filters out an API for district-level information and ignores other APIs using Apache NiFi Expression since we are only interested in district.csv

    Fig:- RouteOnAttribute – District API Configuration

    And this time, the InvokeHTTP processor downloads the data using the extracted API endpoint assigned to the attribute data_url surrounded with curly braces and the response data will be in the CSV format.

    Fig:- InvokeHTTP – Download District Data Configuration

    ‍4. Transform and Filter the dataset

    In this stage, the header of the response data is changed to lowercase using the ReplaceText processor with Literal Replace strategy, and the first field name is changed from date to recorded_date to avoid using reserved database keywords.

    Since the data is being updated daily on an incremental basis, we will only extract the data from the previous day using the QueryRecord processor. It will also convert the CSV data into JSON FlowFile using the CSVReader and JsonRecordSetWriter controller services.

    Please note that both the CSVReader and JsonRecordSetWriter services can have the default settings for our use. You can check out this blog for more reading on the controller services.

    And as mentioned, QueryRecord evaluates the below query to get data from the previous day out of the FlowFile and passes it to the next processor.

    select * from FlowFile where recorded_date=’${now():toNumber():minus(86400000):format(‘yyyy-MM-dd’)}’ 

    Fig:- ReplaceHeaders Configuration

    Fig:- QueryRecord Configuration

    ‍5. Establish JDBC connection pool for Hive and create a table

    Let’s set up the Hive JDBC driver for the NiFi flow using HiveConnectinPool with required local/remote configurations (database connection URL, user, and password).  Hive Configuration Resources property expects Hive configuration file path, i.e., hive-site.xml.

    Fig:- HiveConnectionPool Setup

    Now, we need an empty table to load the data from the NiFi flow, and to do so, you can use the DDL structure below:

    CREATE TABLE IF NOT EXISTS demo.districts (recorded_date string, state string, district string, confirmed string, recovered string, deceased string, other string, tested string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

    6. Load data into the Hive table

    In this step, the JSON-formatted FlowFile is converted into an SQL statement using ConvertJSONToSQL to provide a SQL query as the output FlowFile. We can configure the HiveConnectinPool for the JDBC Connection Pool property along with the table name and statement type before running the processor. In this case, the statement would be an insert type since we need to load the data into the table.

    Also, please note that when preparing a SQL command, the SQL Parameter Attribute Prefix property should be hiveql. Otherwise, the very next processor will not be able to identify it and will throw an error.

    Then, on success, PutHiveQL executes the input SQL command and loads the data into the table. The success of this processor marks the end of the workflow and the data can be verified by fetching the target table.

    Fig:- ConvertJSONToSQL Configurations

     

    Fig:- PutHiveQL Configuration

    ‍7. Schedule the flow for daily updates

    You can schedule the entire flow to run at any given time using different NiFi scheduling strategies. Since the first InvokeHTTP is the initiator of this flow, we can configure it to run daily at 2 AM.

    Fig:- Scheduling Strategy

    8. Log Management

    Almost every processor has been directed to the LogAttribute processor with a failure/success queue, which will write the state and information of all used attributes into the NiFi file, logs/nifi-app.log. By checking this file, we can debug and fix the issue in case of any failure. To extend it even further, we can also set up a flow to capture and notify error logs using Apache Kafka over email.

    9. Consume data for analysis

    You can use various open-source visualization tools to start off with the exploratory data analysis on the data stored in the Hive table.

    You can download the template covid_etl_workflow.xml and run it on your machine for reference.

    Future Scope

    There are different ways to build any workflow, and this was one of them. You can take this further by allowing multiple datasets (state_wise, test_datasets) from the list with different combinations of various processors/controllers as a part of the flow. 

    You can also try scraping data from a product listing page of multiple e-commerce websites for a comparison between goods and price or you can even extract movie reviews and ratings from the IMDb website and use it as a recommendation for users. 

    Conclusion

    In this article, we discussed Apache NiFi and created a workflow to extract, filter, transform, and load the data for analysis purposes. If you are more comfortable building logics and want to focus on the architecture with less code, then Apache NiFi is the tool for you.

  • How To Implement Chaos Engineering For Microservices Using Istio

    “Embrace Failures. Chaos and failures are your friends, not enemies.” A microservice ecosystem is going to fail at some point. The issue is not if you fail, but when you fail, will you notice or not. It’s between whether it will affect your users because all of your services are down, or it will affect only a few users and you can fix it at your own time.

    Chaos Engineering is a practice to intentionally introduce faults and failures into your microservice architecture to test the resilience and stability of your system. Istio can be a great tool to do so. Let’s have a look at how Istio made it easy.

    For more information on how to setup Istio and what are virtual service and Gateways, please have a look at the following blog, how to setup Istio on GKE.

    Fault Injection With Istio

    Fault injection is a testing method to introduce errors into your microservice architecture to ensure it can withstand the error conditions. Istio lets you injects errors at HTTP layer instead of delaying the packets or killing the pods at network layer. This way, you can generate various types of HTTP error codes and test the reaction of your services under those conditions. 

    Generating HTTP 503 Error

    Here we see that two pods are running two different versions of recommendation service using the recommended tutorial while installing the sample application.

    Currently, the traffic on the recommendation service is automatically load balanced between those two pods.

    kubectl get pods -l app=recommendation
    NAME                                  READY     STATUS    RESTARTS   AGE
    recommendation-v1-798bf87d96-d9d95   2/2       Running   0          1h
    recommendation-v2-7bc4f7f696-d9j2m   2/2       Running   0          1h

    Now let’s apply a fault injection using virtual service which will send 503 HTTP error codes in 30% of the traffic serving the above pods.

    To test whether it is working, check the output from the curl of customer service microservice endpoint. 

    You will find the 503 error on approximately 30% of the request coming to recommendation service.

    To restore normal operation, please delete the above virtual service using:

    kubectl delete -f recommendation-fault.yaml

    Delay

    The most common failure we see in production is not the down service, rather a delay service. To inject network latency as a chaos experiment, you can create another virtual service. Sometimes, it happens that your application doesn’t respond on time and creates chaos in the complete ecosystem. How to simulate that behavior, let’s have a look.

    Now, if you hit the URL of endpoints of the above service in a loop, you will see the delays in some of the requests. 

    Retry‍

    In some of the production services, we expect that instead of failing instantly, it should retry N number of times to get the desired output. If not succeeded, then only a request should be considered as failed.

    For that mechanism, you can insert retries on those services as follows:

    Now any request coming to recommendation will do 3 attempts before considering it as failed.

    Timeout‍

    In the real world, an application faces most failures due to timeouts. It can be because of more load on the application or any other latency in serving the request. Your application should have proper timeouts defined, before declaring any request as “Failed”. You can use Istio to simulate the timeout mechanism and give our application a limited amount of time to respond before giving up.

    Wait only for N seconds before failing and giving up.

    kind: VirtualService
    metadata:
      name: recommendation
    spec:
      hosts:
      - recommendation
      http:
      - route:
        - destination:
            host: recommendation
        timeout: 1.000s

    Conclusion‍

    Istio lets you inject faults at the HTTP layer for your application and improves its resilience and stability. But, the application must handle the failures and take appropriate course of action. Chaos Engineering is only effective when you know your application can take failures, otherwise, there is no point in testing for chaos if you know your application is definitely broken.

  • An Introduction to Asynchronous Programming in Python

    Introduction

    Asynchronous programming is a type of parallel programming in which a unit of work is allowed to run separately from the primary application thread. When the work is complete, it notifies the main thread about completion or failure of the worker thread. There are numerous benefits to using it, such as improved application performance and enhanced responsiveness.

    Asynchronous programming has been gaining a lot of attention in the past few years, and for good reason. Although it can be more difficult than the traditional linear style, it is also much more efficient.

    For example, instead of waiting for an HTTP request to finish before continuing execution, with Python async coroutines you can submit the request and do other work that’s waiting in a queue while waiting for the HTTP request to finish.

    Asynchronicity seems to be a big reason why Node.js so popular for server-side programming. Much of the code we write, especially in heavy IO applications like websites, depends on external resources. This could be anything from a remote database call to POSTing to a REST service. As soon as you ask for any of these resources, your code is waiting around with nothing to do. With asynchronous programming, you allow your code to handle other tasks while waiting for these other resources to respond.

    How Does Python Do Multiple Things At Once?

    1. Multiple Processes

    The most obvious way is to use multiple processes. From the terminal, you can start your script two, three, four…ten times and then all the scripts are going to run independently or at the same time. The operating system that’s underneath will take care of sharing your CPU resources among all those instances. Alternately you can use the multiprocessing library which supports spawning processes as shown in the example below.

    from multiprocessing import Process
    
    
    def print_func(continent='Asia'):
        print('The name of continent is : ', continent)
    
    if __name__ == "__main__":  # confirms that the code is under main function
        names = ['America', 'Europe', 'Africa']
        procs = []
        proc = Process(target=print_func)  # instantiating without any argument
        procs.append(proc)
        proc.start()
    
        # instantiating process with arguments
        for name in names:
            # print(name)
            proc = Process(target=print_func, args=(name,))
            procs.append(proc)
            proc.start()
    
        # complete the processes
        for proc in procs:
            proc.join()

    Output:

    The name of continent is :  Asia
    The name of continent is :  America
    The name of continent is :  Europe
    The name of continent is :  Africa

    2. Multiple Threads

    The next way to run multiple things at once is to use threads. A thread is a line of execution, pretty much like a process, but you can have multiple threads in the context of one process and they all share access to common resources. But because of this, it’s difficult to write a threading code. And again, the operating system is doing all the heavy lifting on sharing the CPU, but the global interpreter lock (GIL) allows only one thread to run Python code at a given time even when you have multiple threads running code. So, In CPython, the GIL prevents multi-core concurrency. Basically, you’re running in a single core even though you may have two or four or more.

    import threading
     
    def print_cube(num):
        """
        function to print cube of given num
        """
        print("Cube: {}".format(num * num * num))
     
    def print_square(num):
        """
        function to print square of given num
        """
        print("Square: {}".format(num * num))
     
    if __name__ == "__main__":
        # creating thread
        t1 = threading.Thread(target=print_square, args=(10,))
        t2 = threading.Thread(target=print_cube, args=(10,))
     
        # starting thread 1
        t1.start()
        # starting thread 2
        t2.start()
     
        # wait until thread 1 is completely executed
        t1.join()
        # wait until thread 2 is completely executed
        t2.join()
     
        # both threads completely executed
        print("Done!")

    Output:

    Square: 100
    Cube: 1000
    Done!

    3. Coroutines using yield:

    Coroutines are generalization of subroutines. They are used for cooperative multitasking where a process voluntarily yield (give away) control periodically or when idle in order to enable multiple applications to be run simultaneously. Coroutines are similar to generators but with few extra methods and slight change in how we use yield statement. Generators produce data for iteration while coroutines can also consume data.

    def print_name(prefix):
        print("Searching prefix:{}".format(prefix))
        try : 
            while True:
                    # yeild used to create coroutine
                    name = (yield)
                    if prefix in name:
                        print(name)
        except GeneratorExit:
                print("Closing coroutine!!")
     
    corou = print_name("Dear")
    corou.__next__()
    corou.send("James")
    corou.send("Dear James")
    corou.close()

    Output:

    Searching prefix:Dear
    Dear James
    Closing coroutine!!

    4. Asynchronous Programming

    The fourth way is an asynchronous programming, where the OS is not participating. As far as OS is concerned you’re going to have one process and there’s going to be a single thread within that process, but you’ll be able to do multiple things at once. So, what’s the trick?

    The answer is asyncio

    asyncio is the new concurrency module introduced in Python 3.4. It is designed to use coroutines and futures to simplify asynchronous code and make it almost as readable as synchronous code as there are no callbacks.

    asyncio uses different constructs: event loopscoroutines and futures.

    • An event loop manages and distributes the execution of different tasks. It registers them and handles distributing the flow of control between them.
    • Coroutines (covered above) are special functions that work similarly to Python generators, on await they release the flow of control back to the event loop. A coroutine needs to be scheduled to run on the event loop, once scheduled coroutines are wrapped in Tasks which is a type of Future.
    • Futures represent the result of a task that may or may not have been executed. This result may be an exception.

    Using Asyncio, you can structure your code so subtasks are defined as coroutines and allows you to schedule them as you please, including simultaneously. Coroutines contain yield points where we define possible points where a context switch can happen if other tasks are pending, but will not if no other task is pending.

    A context switch in asyncio represents the event loop yielding the flow of control from one coroutine to the next.

    In the example, we run 3 async tasks that query Reddit separately, extract and print the JSON. We leverage aiohttp which is a http client library ensuring even the HTTP request runs asynchronously.

    import signal  
    import sys  
    import asyncio  
    import aiohttp  
    import json
    
    loop = asyncio.get_event_loop()  
    client = aiohttp.ClientSession(loop=loop)
    
    async def get_json(client, url):  
        async with client.get(url) as response:
            assert response.status == 200
            return await response.read()
    
    async def get_reddit_top(subreddit, client):  
        data1 = await get_json(client, 'https://www.reddit.com/r/' + subreddit + '/top.json?sort=top&t=day&limit=5')
    
        j = json.loads(data1.decode('utf-8'))
        for i in j['data']['children']:
            score = i['data']['score']
            title = i['data']['title']
            link = i['data']['url']
            print(str(score) + ': ' + title + ' (' + link + ')')
    
        print('DONE:', subreddit + '\n')
    
    def signal_handler(signal, frame):  
        loop.stop()
        client.close()
        sys.exit(0)
    
    signal.signal(signal.SIGINT, signal_handler)
    
    asyncio.ensure_future(get_reddit_top('python', client))  
    asyncio.ensure_future(get_reddit_top('programming', client))  
    asyncio.ensure_future(get_reddit_top('compsci', client))  
    loop.run_forever()

    Output:

    50: Undershoot: Parsing theory in 1965 (http://jeffreykegler.github.io/Ocean-of-Awareness-blog/individual/2018/07/knuth_1965_2.html)
    12: Question about best-prefix/failure function/primal match table in kmp algorithm (https://www.reddit.com/r/compsci/comments/8xd3m2/question_about_bestprefixfailure_functionprimal/)
    1: Question regarding calculating the probability of failure of a RAID system (https://www.reddit.com/r/compsci/comments/8xbkk2/question_regarding_calculating_the_probability_of/)
    DONE: compsci
    
    336: /r/thanosdidnothingwrong -- banning people with python (https://clips.twitch.tv/AstutePluckyCocoaLitty)
    175: PythonRobotics: Python sample codes for robotics algorithms (https://atsushisakai.github.io/PythonRobotics/)
    23: Python and Flask Tutorial in VS Code (https://code.visualstudio.com/docs/python/tutorial-flask)
    17: Started a new blog on Celery - what would you like to read about? (https://www.python-celery.com)
    14: A Simple Anomaly Detection Algorithm in Python (https://medium.com/@mathmare_/pyng-a-simple-anomaly-detection-algorithm-2f355d7dc054)
    DONE: python
    
    1360: git bundle (https://dev.to/gabeguz/git-bundle-2l5o)
    1191: Which hashing algorithm is best for uniqueness and speed? Ian Boyd's answer (top voted) is one of the best comments I've seen on Stackexchange. (https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed)
    430: ARM launchesFactscampaign against RISC-V (https://riscv-basics.com/)
    244: Choice of search engine on Android nuked byAnonymous Coward” (2009) (https://android.googlesource.com/platform/packages/apps/GlobalSearch/+/592150ac00086400415afe936d96f04d3be3ba0c)
    209: Exploiting freely accessible WhatsApp data orWhy does WhatsApp web know my phones battery level?” (https://medium.com/@juan_cortes/exploiting-freely-accessible-whatsapp-data-or-why-does-whatsapp-know-my-battery-level-ddac224041b4)
    DONE: programming

    Using Redis and Redis Queue(RQ):

    Using asyncio and aiohttp may not always be in an option especially if you are using older versions of python. Also, there will be scenarios when you would want to distribute your tasks across different servers. In that case we can leverage RQ (Redis Queue). It is a simple Python library for queueing jobs and processing them in the background with workers. It is backed by Redis – a key/value data store.

    In the example below, we have queued a simple function count_words_at_url using redis.

    from mymodule import count_words_at_url
    from redis import Redis
    from rq import Queue
    
    
    q = Queue(connection=Redis())
    job = q.enqueue(count_words_at_url, 'http://nvie.com')
    
    
    ******mymodule.py******
    
    import requests
    
    def count_words_at_url(url):
        """Just an example function that's called async."""
        resp = requests.get(url)
    
        print( len(resp.text.split()))
        return( len(resp.text.split()))

    Output:

    15:10:45 RQ worker 'rq:worker:EMPID18030.9865' started, version 0.11.0
    15:10:45 *** Listening on default...
    15:10:45 Cleaning registries for queue: default
    15:10:50 default: mymodule.count_words_at_url('http://nvie.com') (a2b7451e-731f-4f31-9232-2b7e3549051f)
    322
    15:10:51 default: Job OK (a2b7451e-731f-4f31-9232-2b7e3549051f)
    15:10:51 Result is kept for 500 seconds

    Conclusion:

    Let’s take a classical example chess exhibition where one of the best chess players competes against a lot of people. And if there are 24 games with 24 people to play with and the chess master plays with all of them synchronically, it’ll take at least 12 hours (taking into account that the average game takes 30 moves, the chess master thinks for 5 seconds to come up with a move and the opponent – for approximately 55 seconds). But using the asynchronous mode gives chess master the opportunity to make a move and leave the opponent thinking while going to the next one and making a move there. This way a move on all 24 games can be done in 2 minutes and all of them can be won in just one hour.

    So, this is what’s meant when people talk about asynchronous being really fast. It’s this kind of fast. Chess master doesn’t play chess faster, the time is just more optimized and it’s not get wasted on waiting around. This is how it works.

    In this analogy, the chess master will be our CPU and the idea is that we wanna make sure that the CPU doesn’t wait or waits the least amount of time possible. It’s about always finding something to do.

    A practical definition of Async is that it’s a style of concurrent programming in which tasks release the CPU during waiting periods, so that other tasks can use it. In Python, there are several ways to achieve concurrency, based on our requirement, code flow, data manipulation, architecture design  and use cases we can select any of these methods.

  • Hear from Sema’s Founder & CEO, Matt Van Itallie on the Golden Rules of Code Reviews

    We had a wonderful time talking to Matt about Sema, a much-awaited online code review and developer portfolio tool, and how Velotio helped in building it. He also talks about the basics of writing more meaningful code reviews and his team’s experience of working with Velotio. 

    Spradha: Talk us through your entrepreneurial journey, and what made you build Sema?

    Matt: I learned to code from my parents, who were both programmers, and then worked on the organizational side of technology. I decided to start Sema because of the challenges I saw with lower code quality and companies not managing the engineers’ careers the right way. And so, I wanted to build a company to help solve that. 

    At Sema, we are a 50-person team with engineers from all over the globe. Being a global team is one of my favorite things at Sema because I get to learn so much from people with so many different backgrounds. We have been around since 2017. 

    We are building products to help improve code quality and to help engineers improve their careers and their knowledge. We think that code is a craft. It’s not a competition. And so, to build tools for engineers to improve the code and skills must begin with treating code as a craft. 

    Spradha: What advice do you have for developers when it comes to code reviews?

    Matt: Code reviews are a great way to improve the quality of code and help developers improve their skills. Anyone can benefit from doing code reviews, as well as, of course, receiving code reviews. Even junior developers reviewing the code of seniors can provide meaningful feedback. They can sometimes teach the seniors while having meaningful learning moments as a reviewer themselves. 

    And so, code reviews are incredibly important. There are six tips I would share. 

    • Treat code reviews as part of coding and prioritize it

    It might be obvious, but developers and the engineering team, and the entire company should treat code reviews as part of coding, not as something to do when devs have time. The reason is that an incomplete code review is a blocker for another engineer. So, the faster we can get high-quality code reviews, the faster other engineers can progress. 

    • Always remember – it’s a human being on the other end of the review

    Code reviews should be clear, concise, and communicated the right way. It’s also important to deliver the message with empathy. You can always consider reading your code review out loud and asking yourself, “Is this something I would want to be said to me?” If not, change the tone or content.

    • Give clear recommendations and suggestions

    Never tell someone that the code needs to be fixed without giving suggestions or recommendations on what to fix or how to fix it.

    • Always assume good intent

    Code may not be written how you would write it. Let’s say that more clearly: code is rarely written the same way by two different people. After all, code is a craft, not a task on an assembly line. Tap into a sense of curiosity and appreciation while reviewing – curiosity to understand what the reviewer had in mind and gratitude for what the coder did or tried to do.

    • Clarify the action and the level of importance

    If you are making an optional suggestion, for example, a “nit” that isn’t necessary before the code is approved for production, say so clearly.

    • Don’t forget that code feedback – and all feedback – includes praise.

    It goes without saying that a benefit of doing code reviews is to make the code better and fix issues. But that’s only half of it. On the flip side, code reviews present an excellent opportunity to appreciate your colleagues’ work. If someone has written particularly elegant or maintainable code or has made a great decision about using a library, let them know!

    It’s always the right time to share positive feedback.

    Spradha: How has Velotio helped you build the code review tool at Sema?

    Matt: We’ve been working with Velotio for over 18 months. We have several amazing colleagues from Velotio, including software developers, DevOps engineers, and product managers. 

    Our Velotio colleagues have been instrumental in building our new product, a free code review assistant and developer portfolio tool. It includes a Chrome extension that makes code reviews more clear, more robust, and reusable. The information is available in dashboards for further future exploration too. 

    Developers can now create portfolios of their work that goes far way beyond a traditional developer portfolio. It is based on what other reviewers have said about your code and what you have said as a reviewer on other people’s code. It allows you to really tell a clear story about what you have worked on and how you have grown. 

    Spradha: How has your experience been working with Velotio?

    Matt: We have had an extraordinary experience working with the Velotio team at Sema. We consider our colleagues from Velotio as core team members and leaders of our organization. I have been so impressed with the quality, the knowledge, the energy, and the commitment that Velotio colleagues have shown. And we would not have been able to achieve as much as we have without their contribution.

    I can think of three crucial moments in particular when talking about the impact our Velotio colleagues have made. First, one of their engineers played a major role in designing and building the Chrome extension that we use. 

    Secondly, a DevOps engineer from Velotio has radically improved the setup, reliability, and ease of use of our DevOps systems. 

    And third, a product manager from Velotio has been an extraordinary project leader for a critical feature of the Sema tool, a library of over 20,000 best practices that coders can insert into code reviews, which saves time and helps make the code reviews more robust. 

    We literally would not be where we are if it was not for the great work of our colleagues from Velotio. 

    Spradha: How can people learn more about Sema?

    Matt: You can visit our website at www.semasoftware.com. And for those interested in using our tool to help with code reviews, whether it’s a commercial project or an open-source project, you can sign up for free. 

  • Create CI/CD Pipeline in GitLab in under 10 mins

    Why Chose GitLab Over Other CI tools?

    If there are many tools available in the market, like CircleCI, Github Actions, Travis CI, etc., what makes GitLab CI so special? The easiest way for you to decide if GitLab CI is right for you is to take a look at following use-case:

    GitLab knocks it out of the park when it comes to code collaboration and version control. Monitoring the entire code repository along with all branches becomes manageable. With other popular tools like Jenkins, you can only monitor some branches. If your development teams are spread across multiple locations globally, GitLab serves a good purpose. Regarding price, while Jenkins is free, you need to have a subscription to use all of Gitlab’s features.

    In GitLab, every branch can contain the gitlab-ci.yml file, which makes it easy to modify the workflows. For example, if you want to run unit tests on branch A and perform functional testing on branch B, you can simply modify the YAML configuration for CI/CD, and the runner will take care of running the job for you. Here is a comprehensive list of Pros and Cons of Gitlab to help you make a better decision.

    Intro

    GitLab is an open-source collaboration platform that provides powerful features beyond hosting a code repository. You can track issues, host packages and registries, maintain Wikis, set up continuous integration (CI) and continuous deployment (CD) pipelines, and more.

    In this tutorial, you will configure a pipeline with three stages: build, deploy, test. The pipeline will run for each commit pushed to the repository.

    GitLab and CI/CD

    As we all are aware, a fully-fledged CI/CD pipeline primarily includes the following stages:

    • Build
    • Test
    • Deploy

    Here is a pictorial representation of how GitLab covers CI and CD:

    Source: gitlab.com

    Let’s take a look at an example of an automation testing pipeline. Here, CI empowers test automation and CD automates the release process to various environments. The below image perfectly demonstrates the entire flow.

    Source: xenonstack.com

    Let’s create the basic 3-stage pipeline

    Step 1: Create a project > Create a blank project

    Visit gitlab.com and create your account if you don’t have one already. Once done, click “New Project,” and on the following screen, click “Create Blank Project.” Name it My First Project, leave other settings to default for now, and click Create.
    Alternatively, if you already have your codebase in GitLab, proceed to Step 2.

    Step 2: Create a GitLab YAML

    To create a pipeline in GitLab, we need to define it in a YAML file. This yaml file should reside in the root directory of your project and should be named gitlab-ci.yml. GitLab provides a set of predefined keywords that are used to define a pipeline. 

    In order to design a basic pipeline, let’s understand the structure of a pipeline. If you are already familiar with the basic structure given below, you may want to jump below to the advanced pipeline outline for various environments.

    The hierarchy in GitLab has Pipeline > Stages > Jobs as shown below. The Source or SRC  is often a git commit or a CRON job, which triggers the pipeline on a defined branch.

    Now, let’s understand the commonly used keywords to design a pipeline:

    1. stages: This is used to define stages in the pipeline.
    2. variables: Here you can define the environment variables that can be accessed in all the jobs.
    3. before_script: This is a list of commands to be executed before each job. For example: creating specific directories, logging, etc.
    4. artifacts: If your job creates any artifacts, you can mention the path to find them here.
    5. after_script: This is a list of commands to be executed after each job. For example: cleanup.
    6. tags: This is a tag/label to identify the runner or a GitLab agent to assign your jobs to. If the tags are not specified, the jobs run on shared runners.
    7. needs: If you want your jobs to be executed in a certain order or you want a particular job to be executed before the current job, then you can set this value to the specific job name.
    8. only/except: These keywords are used to control when the job should be added to the pipeline. Use ‘only’ to define when a job should be added, whereas ‘except’ is used to define when a job should not be added. Alternatively, the ‘rules’ keyword is also used to add/exclude jobs based on conditions.

    You can find more keywords here.

    Let’s create a sample YAML file.

    stages:
        - build
        - deploy
        - test
    
    variables:
      RAILS_ENV: "test"
      NODE_ENV: "test"
      GIT_STRATEGY: "clone"
      CHROME_VERSION: "103"
      DOCKER_VERSION: "20.10.14"
    
    build-job:
      stage: build
      script:
        - echo "Check node version and build your binary or docker image."
        - node -v
        - bash buildScript.sh
    
    deploy-code:
      stage: deploy
      needs: build-job
      script:
        - echo "Deploy your code "
        - cd to/your/desired/folder
        - bash deployScript.sh
    
    test-code:
      stage: test
      needs: deploy-code
      script:
        - echo "Run your tests here."
        - cd to/your/desired/folder
        - npm run test

    As you can see, if you have your scripts in a bash file, you can run them from here providing the correct path. 

    Once your YAML is ready, commit the file. 

    Step 3: Check Pipeline Status

    Navigate to CICD > Pipelines from the left navigation bar. You can check the status of the pipeline on this page.

    Here, you can check the commit ID, branch, the user who triggered the pipeline, stages, and their status.

    If you click on the status, you will get a detailed view of pipeline execution.

    If you click on a job under any stage, you can check console logs in detail.

    If you have any artifacts created in your pipeline jobs, you can find them by clicking on the 3 dots for the pipeline instance.

    Advanced Pipeline Outline

    For an advanced pipeline that consists of various environments, you can refer to the below YAML. Simply remove the echo statements and replace them with your set of commands.

    image: your-repo:tag
    variables:
    DOCKER_DRIVER: overlay2
    DOCKER_TLS_CERTDIR: ""
    DOCKER_HOST: tcp://localhost:2375
    SAST_DISABLE_DIND: "true"
    DS_DISABLE_DIND: "false"
    GOCACHE: "$CI_PROJECT_DIR/.cache"
    cache: # this section is used to cache libraries etc between pipeline runs thus reducing the amount of time required for pipeline to run
    key: ${CI_PROJECT_NAME}
    paths:
      - cache-path/
    #include:
     #- #echo "You can add other projects here."
     #- #project: "some/other/important/project"
       #ref: main
       #file: "src/project.yml"
    default:
    tags:
      - your-common-instance-tag
    stages:
    - build
    - test
    - deploy_dev
    - dev_tests
    - deploy_qa
    - qa_tests
    - rollback_qa
    - prod_gate
    - deploy_prod
    - rollback_prod
    - cleanup
    build:
    stage: build
    services:
      - docker:19.03.0-dind
    before_script:
      - echo "Run your pre-build commnadss here"
      - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    script:
      - docker build -t $CI_REGISTRY/repo:$DOCKER_IMAGE_TAG  --build-arg GITLAB_USER=$GITLAB_USER --build-arg GITLAB_PASSWORD=$GITLAB_PASSWORD -f ./Dockerfile .
      - docker push $CI_REGISTRY/repo:$DOCKER_IMAGE_TAG
      - echo "Run your builds here"
    unit_test:
    stage: test
    image: your-repo:tag
    script:
      - echo "Run your unit tests here"
    linting:
    stage: test
    image: your-repo:tag
    script:
      - echo "Run your linting tests here"
    sast:
    stage: test
    image: your-repo:tag
    script:
      - echo "Run your Static application security testing here "
    deploy_dev:
    stage: deploy_dev
    image: your-repo:tag
    before_script:
      - source file.sh
      - export VARIABLE="$VALUE"
      - echo "deploy on dev"
    script:
      - echo "deploy on dev"
    after_script:
      #if deployment fails run rollback on dev
      - echo "Things to do after deployment is run"
    only:
      - master #Depends on your branching strategy
    integration_test_dev:
    stage: dev_tests
    image: your-repo:tag
    script:
      - echo "run test  on dev"
    only:
      - master
    allow_failure: true # In case failures are allowed
    deploy_qa:
    stage: deploy_qa
    image: your-repo:tag
    before_script:
      - source file.sh
      - export VARIABLE="$VALUE"
      - echo "deploy on qa"
    script:
      - echo "deploy on qa
    after_script:
      #if deployment fails run rollback on qa
      - echo "Things to do after deployment script is complete "
    only:
      - master
    needs: ["integration_test_dev", "deploy_dev"]
    allow_failure: false
    integration_test_qa:
    stage: qa_tests
    image: your-repo:tag
    script:
      - echo "deploy on qa
    only:
      - master
    allow_failure: true # in case you want to allow failures
     
    rollback_qa:
    stage: rollback_qa
    image: your-repo:tag
    before_script:
      - echo "Things to rollback after qa integration failure"
    script:
      - echo "Steps to rollback"
    after_script:
      - echo "Things to do after rollback"
    only:
      - master
    needs:
      [
        "deploy_qa",
      ]
    when: on_failure #This will run in case the qa deploy job fails
    allow_failure: false
     
    prod_gate: # this is manual gate for prod approval
    before_script:
      - echo "your commands here"
    stage: prod_gate
    only:
      - master
    needs:
      - deploy_qa
    when: manual
     
     
    deploy_prod:
    stage: deploy_prod
    image: your-repo:tag
    tags:
      - some-tag
    before_script:
      - source file.sh
      - echo "your commands here"
    script:
      - echo "your commands here"
    after_script:
      #if deployment fails
      - echo "your commands here"
    only:
      - master
    needs: [ "deploy_qa"]
    allow_failure: false
    rollback_prod: # This stage should be run only when prod deployment fails
    stage: rollback_prod
    image: your-repo:tag
    before_script:
      - export VARIABLE="$VALUE"
      - echo "your commands here"
    script:
      - echo "your commands here"
    only:
      - master
    needs: [ "deploy_prod"]
    allow_failure: false
    when: on_failure
    cleanup:
    stage: cleanup
    script:
      - echo "run cleanup"
      - rm -rf .cache/
    when: always

    Conclusion

    If you have worked with Jenkins, you know the pain points of working with groovy code. Thus, GitLab CI makes it easy to design, understand, and maintain the pipeline code. 

    Here are some pros and cons of using GitLab CI that will help you decide if this is the right tool for you!