Category: Cloud & DevOps

  • Getting Started With Kubernetes Operators (Helm Based) – Part 1

    Introduction

    The concept of operators was introduced by CoreOS in the last quarter of 2016, and since the introduction of the Operator Framework last year, operators have rapidly become the standard way of managing applications on Kubernetes, especially stateful ones. In this blog post, we will learn what an operator is, why operators are needed, and what problems they solve. We will also create a Helm based operator as an example.

    This is the first part of our Kubernetes Operator Series. In the second part, getting started with Kubernetes operators (Ansible based), and the third part, getting started with Kubernetes operators (Golang based), you can learn how to build Ansible and Golang based operators.

    What is an Operator?

    Whenever we deploy our application on Kubernetes we leverage multiple Kubernetes objects like deployment, service, role, ingress, config map, etc. As our application gets complex and our requirements become non-generic, managing our application only with the help of native Kubernetes objects becomes difficult and we often need to introduce manual intervention or some other form of automation to make up for it.

    Operators solve this problem by making our application a first-class Kubernetes object. That is, we no longer deploy our application as a set of native Kubernetes objects but as a custom object/resource of its own kind with a more domain-specific schema, and we then bake the “operational intelligence” or “domain-specific knowledge” into the controller responsible for maintaining the desired state of this object. For example, the etcd operator makes the etcd cluster a first-class object, and to deploy the cluster we create an object of the EtcdCluster kind. With operators, we are able to extend Kubernetes functionality for custom use cases and manage our applications in a Kubernetes-native way, allowing us to leverage Kubernetes APIs and kubectl tooling.

    Operators combine CRDs (custom resource definitions) and custom controllers. They intend to eliminate the need for manual intervention (a human operator) while performing tasks like upgrades, failure recovery and scaling for complex (often stateful) applications, making them more resilient and self-sufficient.

    How to Build Operators?

    For building and managing operators, we mostly leverage the Operator Framework, an open-source toolkit that allows us to build operators in a highly automated, scalable and effective way. The Operator Framework comprises three subcomponents:

    1. Operator SDK: The Operator SDK is the most important component of the Operator Framework. It allows us to bootstrap our operator project in minutes. It exposes higher-level APIs and abstractions, saving developers from digging deep into Kubernetes APIs and letting them focus on building the operational logic. It performs common tasks, like getting the controller to watch the custom resource (CR) for changes, as part of the project setup process.
    2. Operator Lifecycle Manager: Operators also run on the same Kubernetes cluster in which they manage applications, and more often than not we create multiple operators for multiple applications. The Operator Lifecycle Manager (OLM) provides a declarative way to install, upgrade and manage all the operators and their dependencies in our cluster.
    3. Operator Metering:  Operator metering is currently an alpha project. It records historical cluster usage and can generate usage reports showing usage breakdown by pod or namespace over arbitrary time periods.

    Types of Operators

    Currently, there are three different types of operators we can build:

    1. Helm based operators: Helm based operators allow us to build operators from our existing Helm charts. They are quite easy to build and are preferred for deploying stateless applications using the operator pattern.
    2. Ansible based operators: Ansible based operators allow us to build operators from our existing Ansible playbooks and roles. They are also easy to build and generally preferred for stateless applications.
    3. Go based operators: Go based operators are built to solve the most complex use cases and are generally preferred for stateful applications. In the case of a Golang based operator, we build the controller logic ourselves, providing it with all our custom requirements. These operators are also relatively complex to build.

    Building a Helm based operator

    1. Let’s first install the operator sdk

    go get -d github.com/operator-framework/operator-sdk
    cd $GOPATH/src/github.com/operator-framework/operator-sdk
    git checkout master
    make dep
    make install

    Now we will have the operator-sdk binary in the $GOPATH/bin folder.      
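
    If the installation succeeded, the binary should respond to a version check (the exact output depends on the commit you built):

    operator-sdk version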

    2. Set up the project

    For building a Helm based operator, we can use an existing Helm chart. We will be using the book-store Helm chart, which deploys a simple Python app and a MongoDB instance. The app allows us to perform CRUD operations via REST endpoints.

    Now we will use the operator-sdk to create our Helm based bookstore-operator project.

    operator-sdk new bookstore-operator --api-version=velotio.com/v1alpha1 --kind=BookStore --type=helm --helm-chart=book-store --helm-chart-repo=https://akash-gautam.github.io/helmcharts/

    In the above command, bookstore-operator is the name of our operator/project, --kind specifies the kind of objects this operator will watch, and --api-version is used for versioning this object. The operator-sdk takes only this much information and creates the custom resource definition (CRD) and a custom resource (CR) of its type for us (remember the high-level abstraction the Operator SDK provides). The above command bootstraps a project with the below folder structure.

    bookstore-operator/
    |
    |- build/ # Contains the Dockerfile to build the operator image
    |- deploy/ # Contains the crd,cr and manifest files for deploying operator
    |- helm-charts/ # Contains the helm chart we used while creating the project
    |- watches.yaml # Specifies the resource the operator watches (maintains the state of)

    We discussed earlier that the operator-sdk automates setting up operator projects, and that is exactly what we observe here. Under the build folder, we have the Dockerfile to build our operator image. Under the deploy folder, we have a crds folder containing both the CRD and the CR. The deploy folder also has the operator.yaml file, with which we will run the operator in our cluster, along with manifest files for the role, role binding and service account to be used while deploying the operator. We have our book-store Helm chart under helm-charts. The watches.yaml file looks like this:

    ---
    - version: v1alpha1
      group: velotio.com
      kind: BookStore
      chart: /opt/helm/helm-charts/book-store

    We can see that the bookstore-operator watches events related to BookStore kind objects and executes the helm chart specified.

    If we take a look at the CR file (velotio_v1alpha1_bookstore_cr.yaml) under the deploy/crds folder, we can see that it looks just like the values.yaml file of our book-store Helm chart.

    apiVersion: velotio.com/v1alpha1
    kind: BookStore
    metadata:
      name: example-bookstore
    spec:
      # Default values copied from <project_dir>/helm-charts/book-store/values.yaml
      
      # Default values for book-store.
      # This is a YAML-formatted file.
      # Declare variables to be passed into your templates.
      
      replicaCount: 1
      
      image:
        app:
          repository: akash125/pyapp
          tag: latest
          pullPolicy: IfNotPresent
        mongodb:
          repository: mongo
          tag: latest
          pullPolicy: IfNotPresent
          
      service:
        app:
          type: LoadBalancer
          port: 80
          targetPort: 3000
        mongodb:
          type: ClusterIP
          port: 27017
          targetPort: 27017
      
      
      resources: {}
        # We usually recommend not to specify default resources and to leave this as a conscious
        # choice for the user. This also increases chances charts run on environments with little
        # resources, such as Minikube. If you do want to specify resources, uncomment the following
        # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
        # limits:
        #  cpu: 100m
        #  memory: 128Mi
        # requests:
        #  cpu: 100m
        #  memory: 128Mi
      
      nodeSelector: {}
      
      tolerations: []
      
      affinity: {}

    In the case of Helm charts, we use the values.yaml file to pass parameters to our Helm releases; a Helm based operator converts all these configurable parameters into the spec of our custom resource. This allows us to express the values.yaml as a custom resource (CR) which, as a native Kubernetes object, gets the benefits of RBAC and an audit trail. Now when we want to update our deployed application, we can simply modify the CR and apply it, and the operator will ensure that the changes we made are reflected in our app.
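
    As a quick sketch (the replica value here is just an example), scaling the app means editing the CR and re-applying it:

    # deploy/crds/velotio_v1alpha1_bookstore_cr.yaml (snippet)
    # spec:
    #   replicaCount: 3   # example: scale the app to 3 replicas

    kubectl apply -f deploy/crds/velotio_v1alpha1_bookstore_cr.yaml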

    For each object of `BookStore` kind, the bookstore-operator will perform the following actions:

    1. Create the bookstore app deployment if it doesn’t exist.
    2. Create the bookstore app service if it doesn’t exist.
    3. Create the mongodb deployment if it doesn’t exist.
    4. Create the mongodb service if it doesn’t exist.
    5. Ensure the deployments and services match their desired configurations, such as the replica count, image tag, service port, etc.

    3. Build the Bookstore-operator Image

    The Dockerfile for building the operator image is already in our build folder. We need to run the below command from the root folder of our operator project to build the image.

    operator-sdk build akash125/bookstore-operator:v0.0.1

    4. Run the Bookstore-operator

    As our operator image is ready, we can now go ahead and run it. The deployment file for the operator (operator.yaml under the deploy folder) was created as part of our project setup; we just need to set the image for this deployment to the one we built in the previous step.

    After updating the image in the operator.yaml we are ready to deploy the operator.
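
    For reference, the relevant part of operator.yaml is the container image of the operator deployment; a sketch (the surrounding fields may differ slightly in your generated file):

    # deploy/operator.yaml (snippet)
    spec:
      template:
        spec:
          containers:
            - name: bookstore-operator
              image: akash125/bookstore-operator:v0.0.1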

    kubectl create -f deploy/service_account.yaml
    kubectl create -f deploy/role.yaml
    kubectl create -f deploy/role_binding.yaml
    kubectl create -f deploy/operator.yaml

    Note: The role created might have more permissions than actually required for the operator, so it is always a good idea to review it and trim down the permissions in production setups.

    Verify that the operator pod is in running state.
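
    A quick check (the pod name and labels will differ in your cluster):

    kubectl get pods
    # the bookstore-operator pod should show STATUS Running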

    5. Deploy the Bookstore App

    Now that we have the bookstore-operator running in our cluster, we just need to create the custom resource for deploying our bookstore app.

    Before we can create the bookstore CR, we need to register its CRD.

    kubectl apply -f deploy/crds/velotio_v1alpha1_bookstore_crd.yaml

    Now we can create the bookstore object.

    kubectl apply -f deploy/crds/velotio_v1alpha1_bookstore_cr.yaml

    Now we can see that our operator has deployed our book-store app.

    Now let’s grab the external IP of the app and make some requests to store details of books.
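
    Since the app service is of type LoadBalancer (as in the CR spec above), its external IP can be fetched with kubectl; the exact service name depends on the chart:

    kubectl get svc
    # note the EXTERNAL-IP of the LoadBalancer service created for the app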

    Let’s hit the external IP on the browser and see if it lists the books we just stored:

    The bookstore operator build is available here.

    Conclusion

    Since its early days, Kubernetes has been considered a great tool for managing stateless applications, but managing stateful applications on Kubernetes was always considered difficult. Operators are a big leap towards managing stateful applications and other complex distributed, multi (poly) cloud workloads with the same ease with which we manage stateless applications. In this blog post, we learned the basics of Kubernetes operators and built a simple Helm based operator. In the next installment of this blog series, we will build an Ansible based Kubernetes operator, and in the last blog we will build a full-fledged Golang based operator for managing stateful workloads.


  • OPA On Kubernetes: An Introduction For Beginners

    Introduction:

    More often than not organizations need to apply various kinds of policies on the environments where they run their applications. These policies might be required to meet compliance requirements, achieve a higher degree of security, achieve standardization across multiple environments, etc. This calls for an automated/declarative way to define and enforce these policies. Policy engines like OPA help us achieve the same. 

    Motivation behind Open Policy Agent (OPA)

    When we run our application, it generally comprises multiple subsystems. Even in the simplest of cases, we will have an API gateway/load balancer, one or two applications and a database. Generally, all these subsystems will have different mechanisms for authorizing requests. For example, the application might be using JWT tokens to authorize a request, while the database uses grants; it is also possible that the application accesses some third-party APIs or cloud services, which will again have a different way of authorizing requests. Add to this your CI/CD servers, your log server, etc., and you can see how many different ways of authorization can exist even in a small system.

    The existence of so many authorization models in our system makes life difficult when we need to meet compliance or information security requirements or even some self-imposed organizational policies. For example, if we need to adhere to some new compliance requirements then we need to understand and implement the same for all the components which do authorization in our system.

    “The main motivation behind OPA is to achieve unified policy enforcement across the stack.”

    What are Open Policy Agent (OPA) and OPA Gatekeeper

    The OPA is an open-source, general-purpose policy engine that can be used to enforce policies on various types of software systems like microservices, CI/CD pipelines, gateways, Kubernetes, etc. OPA was developed by Styra and is currently a part of CNCF.

    OPA provides us with REST APIs which our system can call to check whether the policies are met for a request payload. It also provides us with a high-level declarative language, Rego, which allows us to specify the policies we want to enforce as code. This gives us a lot of flexibility while defining our policies.

    The above image shows the architecture of OPA. OPA exposes APIs which any service that needs to make an authorization or policy decision can call (policy query); OPA then makes a decision based on the Rego code for the policy and returns it to the service, which processes the request accordingly. The enforcement is done by the actual service itself; OPA is responsible only for making the decision. This is how OPA becomes a general-purpose policy engine and supports a large number of services.
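
    For illustration, a hypothetical policy query against OPA's Data API might look like this (the policy path and input document are made up for this example):

    curl -X POST http://localhost:8181/v1/data/httpapi/authz \
      -d '{"input": {"user": "alice", "method": "GET", "path": "/finance/salary"}}'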

    The Gatekeeper project is a Kubernetes specific implementation of the OPA. Gatekeeper allows us to use OPA in a Kubernetes native way to enforce the desired policies. 

    How Gatekeeper enforces policies

    On the Kubernetes cluster, the Gatekeeper is installed as a ValidatingAdmissionWebhook. The Admission Controllers can intercept requests after they have been authenticated and authorized by the K8s API server, but before they are persisted in the database. If any of the admission controllers rejects the request then the overall request is rejected. The limitation of admission controllers is that they need to be compiled into the kube-apiserver and can be enabled only when the apiserver starts up. 

    To overcome this rigidity of admission controllers, admission webhooks were introduced. Once we enable admission webhook controllers in our cluster, they can send admission requests to external HTTP callbacks and receive admission responses. Admission webhooks can be of two types: MutatingAdmissionWebhook and ValidatingAdmissionWebhook. The difference between the two is that mutating webhooks can modify the objects they receive while validating webhooks cannot. The below image roughly shows the flow of an API request once both mutating and validating admission controllers are enabled.

     

    The role of Gatekeeper is to simply check if the request meets the defined policy or not, that is why it is installed as a validating webhook.

    Demo:

    Install Gatekeeper:

    kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/deploy/gatekeeper.yaml

    Now we have Gatekeeper up and running in our cluster. The above installation also created a CRD named `constrainttemplates.templates.gatekeeper.sh`. This CRD allows us to create constraint templates for the policies we want to enforce. In a constraint template, we define the constraint logic using Rego code along with its schema. Once the constraint template is created, we can create constraints, which are instances of the constraint template created for specific resources. Think of it as a function and its function calls: constraint templates are like functions that are invoked with different parameter values (resource kind and other values) by constraints.

    To get a better understanding, let’s go ahead and create constraint templates and constraints.

    The policy that we want to enforce is to prevent developers from creating a service of type LoadBalancer in the `dev` namespace of the cluster, where they verify their code. Creating services of type LoadBalancer in the dev environment adds unnecessary cost.

    Below is the constraint template for the same.

    apiVersion: templates.gatekeeper.sh/v1beta1
    kind: ConstraintTemplate
    metadata:
      name: lbtypesvcnotallowed
    spec:
      crd:
        spec:
          names:
            kind: LBTypeSvcNotAllowed
            listKind: LBTypeSvcNotAllowedList
            plural: lbtypesvcnotallowed
            singular: lbtypesvcnotallowed
      targets:
        - target: admission.k8s.gatekeeper.sh
          rego: |
            package kubernetes.admission
            violation[{"msg": msg}] {
                        input.review.kind.kind = "Service"
                        input.review.operation = "CREATE"
                        input.review.object.spec.type = "LoadBalancer"
                        msg := "LoadBalancer Services are not permitted"
            }

    In the constraint template spec, we define a new object kind/type which we will use while creating the constraints, then in the target, we specify the Rego code which will verify if the request meets the policy or not. In the Rego code, we specify a violation that if the request is to create a service of type LoadBalancer then the request should be denied.

    Using the above template, we can now define constraints:

    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: LBTypeSvcNotAllowed
    metadata:
      name: deny-lb-type-svc-dev-ns
    spec:
      match:
        kinds:
          - apiGroups: [""]
            kinds: ["Service"]
        namespaces:
          - "dev"

    Here we have specified the kind of the Kubernetes object (Service) on which we want to apply the constraint and we have specified the namespace as dev because we want the constraint to be enforced only on the dev namespace.

    Let’s go ahead and create the constraint template and constraint:
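
    Assuming the manifests above are saved as constraint_template.yaml and constraint.yaml, they can be applied with kubectl:

    kubectl apply -f constraint_template.yaml
    kubectl apply -f constraint.yaml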

    Note: After creating the constraint template, please check whether its status is true, otherwise you will get an error while creating the constraints. It is also advisable to verify the Rego code snippet before using it in the constraint template.

    Now let’s try to create a service of type LoadBalancer in the dev namespace:

    kind: Service
    apiVersion: v1
    metadata:
      name: opa-service
    spec:
      type: LoadBalancer
      selector:
        app: opa-app
      ports:
      - protocol: TCP
        port: 80
        targetPort: 8080

    When we tried to create a service of type LoadBalancer in the dev namespace, we got an error that the request was denied by the admission webhook due to the `deny-lb-type-svc-dev-ns` constraint, but when we tried to create the service in the default namespace, we were able to do so.

    Here we are not passing any parameters to the Rego policy from our constraints, but we can certainly do so to make our policy more generic. For example, we can add a field named servicetype to the constraint template and, in the policy code, deny all requests where the servicetype value defined in the constraint matches the value in the request. With this, we will be able to deny services of types other than LoadBalancer as well, in any namespace of our cluster.
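
    A rough sketch of how such a parameterized template and constraint might look (the kind, field names and values here are illustrative, not part of the original example):

    # ConstraintTemplate (snippet): declare a parameter schema and use input.parameters in Rego
    spec:
      crd:
        spec:
          names:
            kind: SvcTypeNotAllowed
          validation:
            openAPIV3Schema:
              properties:
                servicetype:
                  type: string
      targets:
        - target: admission.k8s.gatekeeper.sh
          rego: |
            package kubernetes.admission
            violation[{"msg": msg}] {
              input.review.kind.kind == "Service"
              input.review.object.spec.type == input.parameters.servicetype
              msg := sprintf("%v Services are not permitted", [input.parameters.servicetype])
            }
    ---
    # Constraint (snippet): pass the concrete value
    spec:
      match:
        kinds:
          - apiGroups: [""]
            kinds: ["Service"]
      parameters:
        servicetype: LoadBalancer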

    Gatekeeper also provides auditing for resources that were created before the constraint was applied. The information is available in the status of the constraint objects. This helps us in identifying which objects in our cluster are not compliant with our constraints. 
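
    For example, the audit results for the constraint created above can be inspected in its status (the exact field names may vary with the Gatekeeper version):

    kubectl get lbtypesvcnotallowed deny-lb-type-svc-dev-ns -o yaml
    # look at status.totalViolations / status.violations for pre-existing non-compliant objects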

    Conclusion:

    OPA allows us to apply fine-grained policies in our Kubernetes clusters and can be instrumental in improving the overall security of Kubernetes clusters which has always been a concern for many organizations while adopting or migrating to Kubernetes. It also makes meeting the compliance and audit requirements much simpler. There is some learning curve as we need to get familiar with Rego to code our policies, but the language is very simple and there are quite a few good examples to help in getting started.

  • Managing a TLS Certificate for Kubernetes Admission Webhook

    A Kubernetes admission controller is a great way of handling an incoming request, whether to add or modify fields or deny the request as per the rules/configuration defined. To extend the native functionalities, these admission webhook controllers call a custom-configured HTTP callback (webhook server) for additional checks. But the API server only communicates over HTTPS with the admission webhook servers and needs TLS cert’s CA information. This poses a problem for how we handle this webhook server certificate and how to pass CA information to the API server automatically.

    One way to handle the TLS certificate and CA is to use Kubernetes cert-manager. However, cert-manager is itself a big application and consists of many CRDs to handle its operation. It is not a good idea to install cert-manager just to handle the admission webhook TLS certificate and CA. The second and possibly easier way is to use a self-signed certificate and handle the CA on our own using an init container. This eliminates the dependency on other applications, like cert-manager, and gives us the flexibility to control our application flow.

    How is a custom admission webhook written? We will not cover this in depth; only a basic overview of admission controllers and how they work will be covered. The main focus of this blog is to cover the second approach step by step: handling the admission webhook server TLS certificate and CA on our own using an init container, so that the API server can communicate with our custom webhook.

    To understand the in-depth working of Admission Controllers, these articles are great: 

    Prerequisites:

    • Knowledge of Kubernetes admission controllers,  MutatingAdmissionWebhook, ValidatingAdmissionWebhook
    • Knowledge of Kubernetes resources like pods and volumes

    Basic Overview: 

    Admission controllers intercept requests to the Kubernetes API server before objects are persisted in etcd. These controllers are bundled and compiled into the kube-apiserver binary. Among the list of controllers, there are two special ones: MutatingAdmissionWebhook and ValidatingAdmissionWebhook. MutatingAdmissionWebhook, as the name suggests, mutates/adds/modifies some fields in the request object by creating a patch, and ValidatingAdmissionWebhook validates the request by checking whether the request object fields are valid, whether the operation is allowed, etc., as per custom logic.

    The main reason for these types of controllers is to dynamically add new checks along with the native existing checks in Kubernetes to allow a request, just like the plug-in model. To understand this more clearly, let’s say we want all the deployments in the cluster to have certain required labels. If the deployment does not have required labels, then the create deployment request should be denied. This functionality can be achieved in two ways: 

    1) Add these extra checks natively in Kubernetes API server codebase, compile a new binary, and run with the new binary. This is a tedious process, and every time new checks are needed, a new binary is required. 

    2) Create a custom admission webhook, a simple HTTP server, for these additional checks, and register this admission webhook with the API server using the AdmissionRegistration API. To register, two configurations, MutatingWebhookConfiguration and ValidatingWebhookConfiguration, are used. The second approach is recommended and it’s quite easy as well. We will be discussing it here in detail.

    Custom Admission Webhook Server:

    As mentioned earlier, a custom admission webhook server is a simple HTTP server with TLS that exposes endpoints for mutation and validation. Depending on the endpoint hit, the corresponding handler processes the mutation or validation. Once a custom webhook server is ready and deployed in a cluster as a deployment, along with a webhook service, the next step is to register it with the API server so that the API server can communicate with the custom webhook server. To register, MutatingWebhookConfiguration and ValidatingWebhookConfiguration are used. These configurations have a section for the custom webhook related information.

    apiVersion: admissionregistration.k8s.io/v1
    kind: MutatingWebhookConfiguration
    metadata:
      name: mutation-config
    webhooks:
      - admissionReviewVersions:
        - v1beta1
        name: mapplication.kb.io
        clientConfig:
          caBundle: ${CA_BUNDLE}
          service:
            name: webhook-service
            namespace: default
            path: /mutate
        rules:
          - apiGroups:
              - apps
          - apiVersions:
              - v1
            resources:
              - deployments
        sideEffects: None

    Here, the service field gives information about the name, namespace, and endpoint path of the running webhook server. An important field to note is the CA bundle. A custom admission webhook is required to run the HTTP server with TLS because the API server only communicates over HTTPS. So the webhook server runs with a server certificate and key, and the “caBundle” field in the configuration carries the CA (Certificate Authority) information so that the API server can verify the server certificate.

    The problem here is how to handle this server certificate and key—and how to get this CA bundle and pass the information to the API server using MutatingWebhookConfiguration or ValidatingWebhookConfiguration. This will be the main focus of the following part.

    Here, we are going to use a self-signed certificate for the webhook server. This self-signed certificate can be made available to the webhook server in different ways. Two possible ways are:

    • Create a Kubernetes secret containing the certificate and key and mount it as a volume on the server pod
    • Create the certificate and key in a volume, e.g., an emptyDir volume, and have the server consume them from that volume

    However, even after following either of the two approaches above, the remaining important part is to add the CA bundle in the mutation/validation configs.

    So, instead of doing all these steps manually, we will make use of Kubernetes init containers to perform all these functions for us.

    Custom Admission Webhook Server Init Container:

    The main function of this init container will be to create a self-signed webhook server certificate and provide the CA bundle to the API server via mutation/validation configs. How the webhook server consumes this certificate (via secret volume or emptyDir volume), depends on the use-case. This init container will run a simple Go binary to perform all these functions.

    package main
    
    import (
    	"bytes"
    	cryptorand "crypto/rand"
    	"crypto/rsa"
    	"crypto/x509"
    	"crypto/x509/pkix"
    	"encoding/pem"
    	"fmt"
    	log "github.com/sirupsen/logrus"
    	"math/big"
    	"os"
    	"time"
    )
    
    func main() {
    	var caPEM, serverCertPEM, serverPrivKeyPEM *bytes.Buffer
    	// CA config
    	ca := &x509.Certificate{
    		SerialNumber: big.NewInt(2020),
    		Subject: pkix.Name{
    			Organization: []string{"velotio.com"},
    		},
    		NotBefore:             time.Now(),
    		NotAfter:              time.Now().AddDate(1, 0, 0),
    		IsCA:                  true,
    		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth, x509.ExtKeyUsageServerAuth},
    		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
    		BasicConstraintsValid: true,
    	}
    
    	// CA private key
    	caPrivKey, err := rsa.GenerateKey(cryptorand.Reader, 4096)
    	if err != nil {
    		fmt.Println(err)
    	}
    
    	// Self signed CA certificate
    	caBytes, err := x509.CreateCertificate(cryptorand.Reader, ca, ca, &caPrivKey.PublicKey, caPrivKey)
    	if err != nil {
    		fmt.Println(err)
    	}
    
    	// PEM encode CA cert
    	caPEM = new(bytes.Buffer)
    	_ = pem.Encode(caPEM, &pem.Block{
    		Type:  "CERTIFICATE",
    		Bytes: caBytes,
    	})
    
    	dnsNames := []string{"webhook-service",
    		"webhook-service.default", "webhook-service.default.svc"}
    	commonName := "webhook-service.default.svc"
    
    	// server cert config
    	cert := &x509.Certificate{
    		DNSNames:     dnsNames,
    		SerialNumber: big.NewInt(1658),
    		Subject: pkix.Name{
    			CommonName:   commonName,
    			Organization: []string{"velotio.com"},
    		},
    		NotBefore:    time.Now(),
    		NotAfter:     time.Now().AddDate(1, 0, 0),
    		SubjectKeyId: []byte{1, 2, 3, 4, 6},
    		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth, x509.ExtKeyUsageServerAuth},
    		KeyUsage:     x509.KeyUsageDigitalSignature,
    	}
    
    	// server private key
    	serverPrivKey, err := rsa.GenerateKey(cryptorand.Reader, 4096)
    	if err != nil {
    		fmt.Println(err)
    	}
    
    	// sign the server cert
    	serverCertBytes, err := x509.CreateCertificate(cryptorand.Reader, cert, ca, &serverPrivKey.PublicKey, caPrivKey)
    	if err != nil {
    		fmt.Println(err)
    	}
    
    	// PEM encode the  server cert and key
    	serverCertPEM = new(bytes.Buffer)
    	_ = pem.Encode(serverCertPEM, &pem.Block{
    		Type:  "CERTIFICATE",
    		Bytes: serverCertBytes,
    	})
    
    	serverPrivKeyPEM = new(bytes.Buffer)
    	_ = pem.Encode(serverPrivKeyPEM, &pem.Block{
    		Type:  "RSA PRIVATE KEY",
    		Bytes: x509.MarshalPKCS1PrivateKey(serverPrivKey),
    	})
    
    	err = os.MkdirAll("/etc/webhook/certs/", 0666)
    	if err != nil {
    		log.Panic(err)
    	}
    	err = WriteFile("/etc/webhook/certs/tls.crt", serverCertPEM)
    	if err != nil {
    		log.Panic(err)
    	}
    
    	err = WriteFile("/etc/webhook/certs/tls.key", serverPrivKeyPEM)
    	if err != nil {
    		log.Panic(err)
    	}
    
    }
    
    // WriteFile writes data in the file at the given path
    func WriteFile(filepath string, sCert *bytes.Buffer) error {
    	f, err := os.Create(filepath)
    	if err != nil {
    		return err
    	}
    	defer f.Close()
    
    	_, err = f.Write(sCert.Bytes())
    	if err != nil {
    		return err
    	}
    	return nil
    }

    The steps to generate a self-signed CA and sign the webhook server certificate using this CA in Golang are:

    • Create a config for the CA, ca in the code above.
    • Create an RSA private key for this CA, caPrivKey in the code above.
    • Generate a self-signed CA, caBytes, and caPEM above. Here caPEM is the PEM encoded caBytes and will be the CA bundle given to the API server.
    • Create a config for the webhook server certificate, cert in the code above. The important fields in this configuration are DNSNames and CommonName. These must include the full DNS name of the webhook service so that the API server can reach the webhook pod.
    • Create an RSA private key for the webhook server, serverPrivKey in the code above.
    • Create the server certificate using ca and caPrivKey, serverCertBytes in the code above.
    • Now, PEM encode serverPrivKey and serverCertBytes. The resulting serverPrivKeyPEM and serverCertPEM are the TLS key and certificate that will be consumed by the webhook server.

    At this point, we have generated the required certificate, key, and CA bundle using init container. Now we will share this server certificate and key with the actual webhook server container in the same pod. 

    • One approach is to create an empty secret resource beforehand and create the webhook deployment, passing the secret name as an environment variable. The init container will generate the server certificate and key and populate the empty secret with the certificate and key information. This secret will be mounted onto the webhook server container to start the HTTP server with TLS.
    • The second approach (used in the code above) is to use Kubernetes’ native pod-scoped emptyDir volume. This volume is shared between both containers. In the code above, we can see the init container writing the certificate and key information to files on a particular path. This path is where the emptyDir volume is mounted, and the webhook server container will read the certificate and key for its TLS configuration from that path and start the HTTP webhook server (a server-side sketch follows the pod spec below).

    The pod spec will look something like this:

    spec:
      initContainers:
        - image: <webhook init-image name>
          imagePullPolicy: IfNotPresent
          name: webhook-init
          volumeMounts:
            - mountPath: /etc/webhook/certs
              name: webhook-certs
      containers:
        - image: <webhook server image name>
          imagePullPolicy: IfNotPresent
          name: webhook-server
          volumeMounts:
            - mountPath: /etc/webhook/certs
              name: webhook-certs
              readOnly: true
      volumes:
        - name: webhook-certs
          emptyDir: {}
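
    On the webhook server side, a minimal sketch (not part of the original code; the handler names and port are placeholders) of how the container might consume these files and start the HTTPS server:

    package main
    
    import (
    	"log"
    	"net/http"
    )
    
    // handleMutate and handleValidate are placeholders for the AdmissionReview handlers.
    func handleMutate(w http.ResponseWriter, r *http.Request)   { /* build a patch response */ }
    func handleValidate(w http.ResponseWriter, r *http.Request) { /* build an allow/deny response */ }
    
    func main() {
    	mux := http.NewServeMux()
    	// Paths must match the ones registered in the webhook configurations
    	mux.HandleFunc("/mutate", handleMutate)
    	mux.HandleFunc("/validate", handleValidate)
    
    	server := &http.Server{Addr: ":8443", Handler: mux}
    	// Read the certificate and key written by the init container into the shared emptyDir volume
    	log.Fatal(server.ListenAndServeTLS("/etc/webhook/certs/tls.crt", "/etc/webhook/certs/tls.key"))
    }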

    The only part remaining is to give this CA bundle information to the API server using mutation/validation configs. This can be done in two ways:

    • Patch the CA bundle in the existing MutatingWebhookConfiguration or ValidatingWebhookConfiguration using Kubernetes go-client in the init container.
    • Create MutatingWebhookConfiguration or ValidatingWebhookConfiguration in the init container itself with CA bundle information in configs.

    Here, we will create the configs through the init container. To get certain parameters, like the mutation config name, webhook service name, and webhook namespace, dynamically, we can take these values from the init container’s env:

    initContainers:
      - image: <webhook init-image name>
        imagePullPolicy: IfNotPresent
        name: webhook-init
        volumeMounts:
          - mountPath: /etc/webhook/certs
            name: webhook-certs
        env:
          - name: MUTATE_CONFIG
            value: mutating-webhook-configuration
          - name: VALIDATE_CONFIG
            value: validating-webhook-configuration
          - name: WEBHOOK_SERVICE
            value: webhook-service
          - name: WEBHOOK_NAMESPACE
            value: default

    To create the MutatingWebhookConfiguration, we will add the below piece of code to the init container, after the certificate generation code.

    package main
    
    import (
    	"bytes"
    	admissionregistrationv1 "k8s.io/api/admissionregistration/v1"
    	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/client-go/kubernetes"
    	"os"
    	ctrl "sigs.k8s.io/controller-runtime"
    )
    
    func createMutationConfig(caCert *bytes.Buffer) {
    
    	var (
    		webhookNamespace, _ = os.LookupEnv("WEBHOOK_NAMESPACE")
    		mutationCfgName, _  = os.LookupEnv("MUTATE_CONFIG")
    		// validationCfgName, _ = os.LookupEnv("VALIDATE_CONFIG") Not used here in below code
    		webhookService, _ = os.LookupEnv("WEBHOOK_SERVICE")
    	)
    	config := ctrl.GetConfigOrDie()
    	kubeClient, err := kubernetes.NewForConfig(config)
    	if err != nil {
    		panic("failed to set up go-client")
    	}
    
    	path := "/mutate"
    	fail := admissionregistrationv1.Fail
    
    	mutateconfig := &admissionregistrationv1.MutatingWebhookConfiguration{
    		ObjectMeta: metav1.ObjectMeta{
    			Name: mutationCfgName,
    		},
    		Webhooks: []admissionregistrationv1.MutatingWebhook{{
    			Name: "mapplication.kb.io",
    			ClientConfig: admissionregistrationv1.WebhookClientConfig{
    				CABundle: caCert.Bytes(), // CA bundle created earlier
    				Service: &admissionregistrationv1.ServiceReference{
    					Name:      webhookService,
    					Namespace: webhookNamespace,
    					Path:      &path,
    				},
    			},
    			Rules: []admissionregistrationv1.RuleWithOperations{{Operations: []admissionregistrationv1.OperationType{
    				admissionregistrationv1.Create},
    				Rule: admissionregistrationv1.Rule{
    					APIGroups:   []string{"apps"},
    					APIVersions: []string{"v1"},
    					Resources:   []string{"deployments"},
    				},
    			}},
    			FailurePolicy: &fail,
    		}},
    	}
      
    	// Note: newer client-go versions also require a context and metav1.CreateOptions as arguments to Create
    	if _, err := kubeClient.AdmissionregistrationV1().MutatingWebhookConfigurations().Create(mutateconfig); err != nil {
    		panic(err)
    	}
    }

    The code above is just sample code to create a MutatingWebhookConfiguration. First, we import the required packages. Then, we read the environment variables like webhookNamespace, etc. Next, we define the MutatingWebhookConfiguration struct with the CA bundle information (created earlier) and other required information. Finally, we create the configuration using the go-client. The same approach can be followed for creating the ValidatingWebhookConfiguration. For cases of pod restart or deletion, we can add extra logic in the init container, like deleting the existing configs before creating them, or updating only the CA bundle if the configs already exist.

    For certificate rotation, the approach differs depending on how the certificate is served to the server container:

    • If we are using emptyDir volume, then the approach will be to just restart the webhook pod. As emptyDir volume is ephemeral and bound to the lifecycle of the pod, on restart, a new certificate will be generated and served to the server container. A new CA bundle will be added in configs if configs already exist.
    • If we are using secret volume, then, while restarting the webhook pod, the expiration of the existing certificate from the secret can be checked to decide whether to use the existing certificate for the server or create a new one.

    In both cases, a webhook pod restart is required to trigger the certificate rotation/renewal process. When and how the webhook pod is restarted will vary depending on the use case. A few possible ways are using a cron job, a controller, etc.

    Now, our custom webhook is registered, the API server can read CA bundle information through configs, and the webhook server is ready to serve the mutation/validation requests as per rules defined in configs. 

    Conclusion:

    We covered how to add additional mutation/validation checks by registering our own custom admission webhook server. We also covered how to automatically handle the webhook server TLS certificate and key using init containers and how to pass the CA bundle information to the API server through mutation/validation configs.

    Related Articles:

    1. OPA On Kubernetes: An Introduction For Beginners

    2. Prow + Kubernetes – A Perfect Combination To Execute CI/CD At Scale

  • A Primer on HTTP Load Balancing in Kubernetes using Ingress on Google Cloud Platform

    Containerized applications and Kubernetes adoption in cloud environments are on the rise. One of the challenges while deploying applications in Kubernetes is exposing these containerized applications to the outside world. This blog explores the different options via which applications can be accessed externally, with a focus on Ingress – a new feature in Kubernetes that provides an external load balancer. This blog also provides a simple hands-on tutorial on Google Cloud Platform (GCP).

    Ingress is a new feature (currently in beta) from Kubernetes that aspires to be an application load balancer, intending to simplify exposing your applications and services to the outside world. It can be configured to give services externally-reachable URLs, load balance traffic, terminate SSL, offer name-based virtual hosting, etc. Before we dive into Ingress, let’s look at some of the alternatives currently available that help expose your applications, their complexities/limitations, and then try to understand Ingress and how it addresses these problems.

    Current ways of exposing applications externally:

    There are certain ways in which you can expose your applications externally. Let’s look at each of them:

    EXPOSE Pod:

    You can expose your application directly from your pod by using a port on the node which is running your pod, mapping that port to a port exposed by your container, and using the combination HOST-IP:HOST-PORT to access your application externally. This is similar to what you would do when running Docker containers directly without Kubernetes. In Kubernetes, you can use the hostPort setting on a container port in the pod spec, which does the same thing. Another approach is to set hostNetwork: true in the pod spec to use the host’s network interface from your pod.
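
    As a rough sketch (the pod and image names are placeholders), hostPort is set on a container port in the pod spec, while hostNetwork is a pod-level field:

    apiVersion: v1
    kind: Pod
    metadata:
      name: web
    spec:
      # hostNetwork: true   # alternative: share the node's network namespace directly
      containers:
        - name: web
          image: nginx
          ports:
            - containerPort: 80
              hostPort: 8080   # reachable externally as HOST-IP:8080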

    Limitations:

    • In both scenarios you should take extra care to avoid port conflicts on the host, and possibly some issues with packet routing and name resolution.
    • This limits you to running only one replica of the pod per cluster node, as the hostPort you use is unique on a node and can be bound by only one pod.

    EXPOSE Service:

    Kubernetes services primarily work to interconnect different pods which constitute an application. You can scale the pods of your application very easily using services. Services are not primarily intended for external access, but there are some accepted ways to expose services to the external world.

    Basically, services provide a routing, balancing and discovery mechanism for the pod’s endpoints. Services target pods using selectors, and can map container ports to service ports. A service exposes one or more ports, although usually, you will find that only one is defined.

    A service can be exposed using 3 ServiceType choices:

    • ClusterIP: Exposes the service on a cluster-internal IP. Choosing this value makes the service only reachable from within the cluster. This is the default ServiceType.
    • NodePort: Exposes the service on each Node’s IP at a static port (the NodePort). A ClusterIP service, to which the NodePort service will route, is automatically created. You’ll be able to contact the NodePort service from outside the cluster by requesting <NodeIP>:<NodePort>. Here the NodePort remains fixed, and NodeIP can be any node IP of your Kubernetes cluster (see the sketch after this list).
    • LoadBalancer: Exposes the service externally using a cloud provider’s load balancer (eg. AWS ELB). NodePort and ClusterIP services, to which the external load balancer will route, are automatically created.
    • ExternalName: Maps the service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record with its value. No proxying of any kind is set up. This requires version 1.7 or higher of kube-dns.
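
    For example, a minimal NodePort Service manifest (the names and ports are placeholders) looks like this:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-app
    spec:
      type: NodePort
      selector:
        app: my-app
      ports:
        - port: 80          # service port inside the cluster
          targetPort: 8080  # container port on the pods
          nodePort: 30080   # static port opened on every node (30000-32767)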

    Limitations:

    • If we choose NodePort to expose our services, Kubernetes will assign ports corresponding to the ports of your pods in the range 30000-32767. You will need to add an external proxy layer that uses DNAT to expose more friendly ports. The external proxy layer will also have to take care of load balancing so that you leverage the power of your pod replicas. It would also not be easy to add TLS or simple host-header routing rules to the external service.
    • ClusterIP and ExternalName, while similarly easy to use, have the limitation that we cannot add any routing or load balancing rules.
    • Choosing LoadBalancer is probably the easiest of all methods to get your service exposed to the internet. The problem is that there is no standard way of telling a Kubernetes service about the elements a balancer requires; again, TLS and host headers are left out. Another limitation is the reliance on an external load balancer (AWS’s ELB, GCP’s Cloud Load Balancer, etc.).

    Endpoints

    Endpoints are usually created automatically by services, unless you are using headless services and adding the endpoints manually. An endpoint is a host:port tuple registered in Kubernetes, and in the service context it is used to route traffic. The service keeps the endpoints up to date as pods that match the selector are created, deleted and modified. Individually, endpoints are not useful for exposing services, since they are to some extent ephemeral objects.

    Summary

    If you can rely on your cloud provider to correctly implement the LoadBalancer for their API, to keep up-to-date with Kubernetes releases, and you are happy with their management interfaces for DNS and certificates, then setting up your services as type LoadBalancer is quite acceptable.

    On the other hand, if you want to manage load balancing systems manually and set up port mappings yourself, NodePort is a low-complexity solution. If you are directly using Endpoints to expose external traffic, perhaps you already know what you are doing (but consider that you might have made a mistake, there could be another option).

    Given that none of these elements has been originally designed to expose services to the internet, their functionality may seem limited for this purpose.

    Understanding Ingress

    Traditionally, you would create a LoadBalancer service for each public application you want to expose. Ingress gives you a way to route requests to services based on the request host or path, centralizing a number of services into a single entrypoint.

    Ingress is split up into two main pieces. The first is an Ingress resource, which defines how you want requests routed to the backing services, and the second is the Ingress controller, which does the routing and also keeps track of changes at the service level.

    Ingress Resources

    The Ingress resource is a set of rules that map to Kubernetes services. Ingress resources are defined purely within Kubernetes as an object that other entities can watch and respond to.

    Ingress supports defining the following rules in the beta stage:

    • host header:  Forward traffic based on domain names.
    • paths: Looks for a match at the beginning of the path.
    • TLS: If the Ingress enables TLS, HTTPS will be used with a certificate configured through a secret.

    When no host header rules are included at an Ingress, requests without a match will use that Ingress and be mapped to the backend service. You will usually do this to send a 404 page to requests for sites/paths which are not sent to the other services. Ingress tries to match requests to rules, and forwards them to backends, which are composed of a service and a port.
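
    For example, a host-header rule plus a default backend in the same beta API could look like this (the hostnames and service names are placeholders):

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: host-based-ingress
    spec:
      backend:                  # default backend for requests matching no rule
        serviceName: default-service
        servicePort: 80
      rules:
      - host: foo.example.com   # matched against the request's Host header
        http:
          paths:
          - path: /
            backend:
              serviceName: foo-service
              servicePort: 80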

    Ingress Controllers

    The Ingress controller is the entity which grants (or removes) access based on changes in the services, pods and Ingress resources. The Ingress controller gets the state-change data by directly calling the Kubernetes API.

    Ingress controllers are applications that watch Ingresses in the cluster and configure a balancer to apply those rules. You can configure any of the third-party balancers like HAProxy, NGINX, Vulcand or Traefik to create your version of the Ingress controller. The Ingress controller should track changes in Ingress resources, services and pods and accordingly update the configuration of the balancer.

    Ingress controllers will usually track and communicate with endpoints behind services instead of using services directly. This way some network plumbing is avoided, and we can also manage the balancing strategy from the balancer. Some of the open source implementations of Ingress Controllers can be found here.

    Now, let’s do an exercise of setting up a HTTP Load Balancer using Ingress on Google Cloud Platform (GCP), which has already integrated the ingress feature in it’s Container Engine (GKE) service.

    Ingress-based HTTP Load Balancer in Google Cloud Platform

    The tutorial assumes that you have your GCP account set up and a default project created. We will first create a Container Engine cluster, followed by deployment of an nginx server service and an echoserver service. Then we will set up an Ingress resource for both services, which will configure the HTTP Load Balancer provided by GCP.

    Basic Setup

    Get your project ID by going to the “Project info” section in your GCP dashboard. Start the Cloud Shell terminal, set your project id and the compute/zone in which you want to create your cluster.

    $ gcloud config set project glassy-chalice-129514
    $ gcloud config set compute/zone us-east1-d
    # Create a 3 node cluster with name “loadbalancedcluster”
    $ gcloud container clusters create loadbalancedcluster

    Fetch the cluster credentials for the kubectl tool:

    $ gcloud container clusters get-credentials loadbalancedcluster --zone us-east1-d --project glassy-chalice-129514

    Step 1: Deploy an nginx server and echoserver service

    $ kubectl run nginx --image=nginx --port=80
    $ kubectl run echoserver --image=gcr.io/google_containers/echoserver:1.4 --port=8080
    $ kubectl get deployments
    NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    echoserver   1         1         1            1           15s
    nginx        1         1         1            1           26m

    Step 2: Expose your nginx and echoserver deployment as a service internally

    Create a Service resource to make the nginx and echoserver deployment reachable within your container cluster:

    $ kubectl expose deployment nginx --target-port=80  --type=NodePort
    $ kubectl expose deployment echoserver --target-port=8080 --type=NodePort

    When you create a Service of type NodePort with this command, Container Engine makes your Service available on a randomly-selected high port number (e.g. 30746) on all the nodes in your cluster. Verify the Service was created and a node port was allocated:

    $ kubectl get service nginx
    NAME      CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
    nginx     10.47.245.54   <nodes>       80:30746/TCP   20s
    $ kubectl get service echoserver
    NAME         CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
    echoserver   10.47.251.9   <nodes>       8080:32301/TCP   33s

    In the output above, the node port for the nginx Service is 30746 and for the echoserver Service it is 32301. Also, note that there is no external IP allocated for these Services. Since the Container Engine nodes are not externally accessible by default, creating these Services does not make your application accessible from the Internet. To make your HTTP(S) web server application publicly accessible, you need to create an Ingress resource.

    Step 3: Create an Ingress resource

    On Container Engine, Ingress is implemented using Cloud Load Balancing. When you create an Ingress in your cluster, Container Engine creates an HTTP(S) load balancer and configures it to route traffic to your application. Container Engine has internally defined an Ingress Controller, which takes the Ingress resource as input for setting up proxy rules and talk to Kubernetes API to get the service related information.

    The following config file defines an Ingress resource that directs traffic to your nginx and echoserver server:

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: fanout-ingress
    spec:
      rules:
      - http:
          paths:
          - path: /
            backend:
              serviceName: nginx
              servicePort: 80
          - path: /echo
            backend:
              serviceName: echoserver
              servicePort: 8080

    To deploy this Ingress resource run in the cloud shell:

    $ kubectl apply -f basic-ingress.yaml

    Step 4: Access your application

    Find out the external IP address of the load balancer serving your application by running:

    $ kubectl get ingress fanout-ingress
    NAME             HOSTS     ADDRESS          PORTS     AGE
    fanout-ingress   *         130.211.36.168   80        36s

     

    Use http://<external-ip-address> and http://<external-ip-address>/echo to access nginx and the echoserver.

    Summary

    Ingresses are simple, very easy to deploy, and really fun to play with. However, Ingress is currently in the beta phase and lacks some features, which may restrict it from production use. Stay tuned for updates on the Kubernetes Ingress page and the GitHub repo.


  • Elasticsearch 101: Fundamentals & Core Components

    Elasticsearch is currently the most popular way to implement free text search and analytics in applications. It is highly scalable and can easily manage petabytes of data. It supports a variety of use cases like allowing users to easily search through any portal, collect and analyze log data, build business intelligence dashboards to quickly analyze and visualize data.  

    This blog acts as an introduction to Elasticsearch and covers the basic concepts of clusters, nodes, index, document and shards.

    What is Elasticsearch?

    Elasticsearch (ES) is a combination of an open-source, distributed, highly scalable data store and Lucene – a search engine that supports extremely fast full-text search. It is beautifully crafted software which hides the internal complexities and provides full-text search capabilities with simple REST APIs. Elasticsearch is written in Java with Apache Lucene at its core. It should be clear that Elasticsearch is not like a traditional RDBMS. It is not suitable for your transactional database needs, and hence, in my opinion, it should not be your primary data store. It is a common practice to use a relational database as the primary data store and inject only the required data into Elasticsearch.

    Elasticsearch is meant for fast text search. There are several functionalities, which make it different from RDBMS. Unlike RDBMS, Elasticsearch stores data in the form of a JSON document, which is denormalized and doesn’t support transactions, referential integrity, joins, and subqueries.

    Elasticsearch works with structured, semi-structured, and unstructured data as well. In the next section, let’s walk through the various components in Elasticsearch.

    Elasticsearch Components

    Cluster

    One or more servers collectively providing indexing and search capabilities form an Elasticsearch cluster. The cluster size can vary from a single node to thousands of nodes, depending on the use cases.

    Node

    Node is a single physical or virtual machine that holds full or part of your data and provides computing power for indexing and searching your data. Every node is identified with a unique name. If the node identifier is not specified, a random UUID is assigned as a node identifier at the startup. Every node configuration has the property `cluster.name`. The cluster will be formed automatically with all the nodes having the same `cluster.name` at startup.
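
    For example, a minimal elasticsearch.yml sketch (the values are placeholders) setting these properties:

    # elasticsearch.yml
    cluster.name: my-es-cluster   # all nodes with the same value form one cluster
    node.name: node-1             # optional; a random UUID is used if omitted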

    A node has to accomplish several duties such as:

    • storing the data
    • performing operations on data (indexing, searching, aggregation, etc.)
    • maintaining the health of the cluster

    Each node in a cluster can do all these operations. Elasticsearch provides the capability to split responsibilities across different nodes. This makes it easy to scale, optimize, and maintain the cluster. Based on the responsibilities, the following are the different types of nodes that are supported:

    Data Node

    A data node has both storage and computation capability. It stores part of the data in the form of shards (explained in a later section) and participates in CRUD, search, and aggregation operations. These operations are resource-intensive, so it is a good practice to have dedicated data nodes that are free of the additional load of cluster administration. By default, every node of the cluster is a data node.

    Master Node

    Master nodes are reserved for administrative tasks. They track the availability and failure of data nodes and are responsible for creating and deleting indices (explained in a later section).

    This makes the master node a critical part of the Elasticsearch cluster; it has to be stable and healthy. A single master node for a cluster is certainly a single point of failure, so Elasticsearch provides the capability to have multiple master-eligible nodes, which participate in an election to elect the master node. It is recommended to have a minimum of three master-eligible nodes in the cluster to avoid a split-brain situation. By default, all nodes are both data nodes and master-eligible nodes. However, through explicit configuration, some nodes can be made master-eligible only.

    Coordinating-Only Node

    Any node that is neither a master node nor a data node is a coordinating node. Coordinating nodes act as smart load balancers: they are exposed to end-user requests and route each request to the appropriate data and master nodes.

    For example, a user’s search request is sent to several data nodes. Each data node searches locally and sends its result back to the coordinating node, which aggregates the results and returns them to the user.

    There are a few concepts that are core to Elasticsearch. Understanding these basic concepts will tremendously ease the learning process.

    Index

    An index is a container that stores data, similar to a database in the relational world. An index contains a collection of documents that have similar characteristics or are logically related. Taking an e-commerce website as an example, there will be one index for products, one for customers, and so on. Indices are identified by a lowercase name, which is required to perform add, update, and delete operations on the documents.
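
    As a quick illustration, creating and listing indices is a single REST call each; a minimal sketch, assuming a node on localhost:9200 and a hypothetical index named products:

    # Create an index (index names must be lowercase).
    curl -X PUT "http://localhost:9200/products"

    # List all indices along with their health, shard count and size.
    curl -s "http://localhost:9200/_cat/indices?v"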

    Type

    A type is a logical grouping of documents within an index. In the previous example of the product index, we can further group documents into types like electronics, fashion, furniture, etc. Types are defined on the basis of documents having similar properties. It isn’t always easy to decide when to use a type over an index; indices have more overhead, so sometimes it is better to use different types in the same index for better performance. There are a couple of restrictions on types as well. For example, two fields with the same name in different types of the same index must have the same datatype (string, date, etc.).

    Document

    A document is the basic unit of information indexed by Elasticsearch and is represented in JSON format. We can add as many documents as we want to an index. The following snippet shows how to create a document of type mobile in the index store. We will cover the individual fields of the document in the Mapping Types section.

    HTTP POST <hostname:port>/store/mobile/
    {
      "name": "Motorola G5",
      "model": "XT3300",
      "release_date": "2016-01-01",
      "features": "16 GB ROM | Expandable Upto 128 GB | 5.2 inch Full HD Display | 12MP Rear Camera | 5MP Front Camera | 3000 mAh Battery | Snapdragon 625 Processor",
      "ram_gb": "3",
      "screen_size_inches": "5.2"
    }

    Mapping Types

    To create different types in an index, we need mapping types (or simply mappings) to be specified during index creation. Mappings can be defined as a list of directives given to Elasticsearch about how the data is supposed to be stored and retrieved. It is important to provide mapping information at the time of index creation based on how we want to retrieve our data later. In the context of relational databases, think of mappings as a table schema.

    A mapping provides information on how to treat each JSON field. For example, a field can be of type date, geolocation, or person name. Mappings also allow specifying which fields will participate in the full-text search, and which analyzers are used to transform and decorate data before storing it into an index. If no mapping is provided, Elasticsearch tries to identify the schema itself, which is known as Dynamic Mapping.

    Each mapping type has Meta Fields and Properties. The snippet below shows the mapping of the type mobile.

    {
      "mappings": {
        "mobile": {
          "properties": {
            "name":               { "type": "keyword" },
            "model":              { "type": "keyword" },
            "release_date":       { "type": "date" },
            "features":           { "type": "text" },
            "ram_gb":             { "type": "short" },
            "screen_size_inches": { "type": "float" }
          }
        }
      }
    }
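
    To apply such a mapping, it has to be supplied when the index is created; a minimal sketch, assuming the snippet above is saved as store-mapping.json and a node is running on localhost:9200:

    curl -X PUT "http://localhost:9200/store" \
         -H 'Content-Type: application/json' \
         -d @store-mapping.json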

    Meta Fields

    As the name indicates, meta fields store additional information about the document. Meta fields are mostly meant for internal usage, and it is unlikely that the end-user has to deal with them. Meta field names start with an underscore. There are around ten meta fields in total; we will talk about some of them here:

    _index

    It stores the name of the index the document belongs to. It is used internally to store/search the document within an index.

    _type

    It stores the type of the document and is often included in search queries to get better performance.

    _id

    This is the unique id of the document. It is used to access a specific document directly via the HTTP GET API.

    _source

    This holds the original JSON document before applying any analyzers/transformations. It is important to note that Elasticsearch can only query fields that are indexed (i.e., fields for which a mapping is provided). The _source field itself is not indexed and hence cannot be queried, but it is returned in the final search result.
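
    The meta fields are easiest to see in the response of a document lookup; a minimal sketch, assuming a document with _id 1 exists in the store index created earlier:

    # The response wraps the original JSON in _source,
    # alongside _index, _type and _id.
    curl -s "http://localhost:9200/store/mobile/1"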

    Fields Or Properties

    The list of fields specifies which JSON fields of the document are included in a particular type. In the e-commerce website example, mobile can be a type with fields like operating_system, camera_specification, ram_size, etc.

    Fields also carry data type information with them, which directs Elasticsearch to store and search the specific fields in a particular way. Data types are similar to what we see in any other programming language. We will talk about a few of them here.

    Simple Data Types

    Text

    This data type is used to store full text, like a product description. These fields participate in full-text search and are analyzed while storing, which enables searching them by the individual words in them. Such fields are not used in sorting and aggregation queries.

    Keywords

    This type is also used to store text data, but unlike text, it is stored without being analyzed. It is suitable for information like a user’s mobile number, city, or age. These fields are used in filter, aggregation, and sorting queries; for example, list all users from a particular city and filter them by age.
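
    As a rough sketch of the city/age example above (assuming a hypothetical users index where city is mapped as keyword and age as a numeric type):

    curl -s "http://localhost:9200/users/_search" \
         -H 'Content-Type: application/json' -d '
    {
      "query": {
        "bool": {
          "filter": [
            { "term":  { "city": "Pune" } },
            { "range": { "age": { "gte": 25 } } }
          ]
        }
      }
    }'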

    Numeric

    Elasticsearch supports a wide range of numeric types: long, integer, short, byte, double, and float.

    There are a few more data types to support dates, booleans (true/false, on/off, 1/0), and IP addresses.

    Special Data Types

    Geo Point

    This data type is used to store geographical locations as latitude/longitude pairs. For example, it can be used to arrange a user’s photo library by location or to display locations trending on social media.

    Geo Shape

    It allows storing arbitrary geometric shapes like rectangle, polygon, etc.

    Completion Suggester

    This data type is used to provide an auto-completion feature over a specific field. As the user types text, the completion suggester can guide them toward relevant results.

    Complex Data Type

    Object

    If you know JSON well, this concept won’t be new for you. Elasticsearch also allows storing nested JSON object structure as a document.

    Nested

    The Object data type has a limitation due to its underlying data representation in the Lucene index: Lucene does not support inner JSON objects, so ES flattens the original JSON before storing it. As a result, fields of multiple inner objects get merged into one, which can lead to wrong search results. Most of the time, you may want to use the Nested data type instead of Object.
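
    A small illustration of why flattening is a problem (the accounts index and field names here are hypothetical):

    # Document: {"user": [{"first": "John", "last": "Smith"},
    #                     {"first": "Alice", "last": "White"}]}
    # With the default object mapping, Lucene flattens this to
    #   user.first: ["John", "Alice"]  and  user.last: ["Smith", "White"],
    # so a query for first = Alice AND last = Smith would wrongly match.
    # Mapping the field as nested keeps each inner object separate:
    curl -X PUT "http://localhost:9200/accounts" \
         -H 'Content-Type: application/json' -d '
    {
      "mappings": {
        "account": {
          "properties": {
            "user": { "type": "nested" }
          }
        }
      }
    }'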

    Shards

    Shards are what enable Elasticsearch to scale horizontally. An index can store millions of documents and occupy terabytes of data, which can cause problems with performance, scalability, and maintenance. Let’s see how shards help achieve scalability.

    Indices are divided into multiple units called shards (refer to the diagram below). A shard is a fully functional subset of an index, and shards of the same index can reside on the same or different nodes of the cluster. The shard count determines the degree of parallelism for search and indexing operations and allows the cluster to grow horizontally. The number of shards per index can be specified at the time of index creation; by default, 5 shards are created. Once the index is created, however, the number of shards cannot be changed; to change it, the data has to be reindexed into a new index.
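
    A minimal sketch of specifying the shard and replica count at index creation time (the orders index name is just an example):

    curl -X PUT "http://localhost:9200/orders" \
         -H 'Content-Type: application/json' -d '
    {
      "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 1
      }
    }'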

    Replication

    Hardware can fail at any time. To ensure fault tolerance and high availability, ES provides the ability to replicate shards. The shard being copied is called the primary shard, and the copy of a primary shard is called a replica shard, or simply a replica. Like the number of shards, the number of replicas can also be specified at the time of index creation. Replication serves two purposes:

    • High Availability – A replica is never created on the same node as its primary shard. This ensures that data remains available through the replica shard even if the entire node fails.
    • Performance – Replicas also contribute to search capability: search queries can be executed in parallel across replicas.

    To summarize, the index is split into multiple shards to achieve high availability and performance, and in a production environment multiple replicas are created for every index. In a replicated index, only primary shards can serve write requests, but all shards (primary as well as replicas) can serve read/query requests. The replication factor is defined at the time of index creation and can be changed later if required. Choosing the number of shards is an important exercise because, once defined, it cannot be changed; in critical scenarios, changing the number of shards requires creating a new index with the required shard count and reindexing the old data.
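
    Since only the replica count can be changed on a live index, increasing it later is a single settings update; a minimal sketch against the orders index created above:

    curl -X PUT "http://localhost:9200/orders/_settings" \
         -H 'Content-Type: application/json' \
         -d '{ "index": { "number_of_replicas": 2 } }'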

    Summary

    In this blog, we have covered the basic but important aspects of Elasticsearch. In the following posts, I will talk about how indexing and searching work in detail. Stay tuned!

  • Mesosphere DC/OS Masterclass : Tips and Tricks to Make Life Easier

    DC/OS is an open-source operating system and distributed system for the data center, built on the Apache Mesos distributed systems kernel. As a distributed system, it is a cluster of master nodes and private/public agent nodes, where each node also has a host operating system that manages the underlying machine.

    It enables the management of multiple machines as if they were a single computer. It automates resource management, schedules process placement, facilitates inter-process communication, and simplifies the installation and management of distributed services. Its included web interface and available command-line interface (CLI) facilitate remote management and monitoring of the cluster and its services.

    • Distributed System : DC/OS is a distributed system with a group of private and public nodes coordinated by master nodes.
    • Cluster Manager : DC/OS is responsible for running tasks on agent nodes and providing the required resources to them. DC/OS uses Apache Mesos to provide cluster management functionality.
    • Container Platform : All DC/OS tasks are containerized. DC/OS uses two different container runtimes, Docker and Mesos, so containers can be started from Docker images or can be native executables (binaries or scripts) that are containerized at runtime by Mesos.
    • Operating System : As the name suggests, DC/OS is an operating system that abstracts cluster hardware and software resources and provides common services to applications.

    Unlike Linux, DC/OS is not a host operating system. DC/OS spans multiple machines, but relies on each machine to have its own host operating system and host kernel.

    The high level architecture of DC/OS can be seen below :

    For the detailed architecture and components of DC/OS, please click here.

    Adoption and usage of Mesosphere DC/OS:

    Mesosphere customers include :

    • 30% of the Fortune 50 U.S. Companies
    • 5 of the top 10 North American Banks
    • 7 of the top 12 Worldwide Telcos
    • 5 of the top 10 Highest Valued Startups

    Some companies using DC/OS are :

    • Cisco
    • Yelp
    • Tommy Hilfiger
    • Uber
    • Netflix
    • Verizon
    • Cerner
    • NIO

    Installing and using DC/OS

    A guide to installing DC/OS can be found here. After installing DC/OS on any platform, install the dcos CLI by following the documentation found here.

    Using the dcos CLI, we can manage cluster nodes, manage Marathon tasks and services, and install/remove packages from the Universe. It also provides great support for automation, as each CLI command can output JSON.

    NOTE: The tasks below were executed and tested with the following tools:

    • DC/OS 1.11 Open Source
    • DC/OS cli 0.6.0
    • jq:1.5-1-a5b5cbe

    DC/OS commands and scripts

    Setup DC/OS cli with DC/OS cluster

    dcos cluster setup <CLUSTER URL>

    Example :

    dcos cluster setup http://dcos-cluster.com

    The above command will give you a link for OAuth authentication and prompt for an auth token (provided OAuth is enabled). You can authenticate with a Google, GitHub, or Microsoft account, then paste the generated token into the CLI prompt.

    DC/OS authentication token

    dcos config show core.dcos_acs_token

    DC/OS cluster url

    dcos config show core.dcos_url

    DC/OS cluster name

    dcos config show cluster.name

    Access Mesos UI

    <DC/OS_CLUSTER_URL>/mesos

    Example:

    http://dcos-cluster.com/mesos

    Access Marathon UI

    <DC/OS_CLUSTER_URL>/service/marathon

    Example:

    http://dcos-cluster.com/service/marathon

    Access any DC/OS service, like Marathon, Kafka, Elastic, Spark etc.[DC/OS Services]

    <DC/OS_CLUSTER_URL>/service/<SERVICE_NAME>

    Example:

    http://dcos-cluster.com/service/marathon
    http://dcos-cluster.com/service/kafka

    Access DC/OS slaves info in json using Mesos API [Mesos Endpoints]

    curl -H "Authorization: Bearer $(dcos config show 
    core.dcos_acs_token)" $(dcos config show 
    core.dcos_url)/mesos/slaves | jq

    Access DC/OS slaves info in json using DC/OS cli

    dcos node --json

    Note : The DC/OS cli command 'dcos node --json' is equivalent to querying the Mesos slaves endpoint (/mesos/slaves).

    Access DC/OS private slaves info using DC/OS cli

    dcos node --json | jq '.[] | select(.type | contains("agent")) | select(.attributes.public_ip == null) | "Private Agent : " + .hostname ' -r

    Access DC/OS public slaves info using DC/OS cli

    dcos node --json | jq '.[] | select(.type | contains("agent")) | select(.attributes.public_ip != null) | "Public Agent : " + .hostname ' -r

    Access DC/OS private and public slaves info using DC/OS cli

    dcos node --json | jq '.[] | select(.type | contains("agent")) | if (.attributes.public_ip != null) then "Public Agent : " else "Private Agent : " end + " - " + .hostname ' -r | sort

    Get public IP of all public agents

    #!/bin/bash
    
    for id in $(dcos node --json | jq --raw-output '.[] | select(.attributes.public_ip == "true") | .id'); 
    do 
          dcos node ssh --option StrictHostKeyChecking=no --option LogLevel=quiet --master-proxy --mesos-id=$id "curl -s ifconfig.co"
    done 2>/dev/null

    Note: 'dcos node ssh' requires your private key to be available to ssh, so make sure you add it as an ssh identity using:

    ssh-add </path/to/private/key/file/.pem>

    Get public IP of master leader

    dcos node ssh --option StrictHostKeyChecking=no --option LogLevel=quiet --master-proxy --leader "curl -s ifconfig.co" 2>/dev/null

    Get all master nodes and their private ip

    dcos node --json | jq '.[] | select(.type | contains("master")) | .ip + " = " + .type' -r

    Get list of all users who have access to DC/OS cluster

    curl -s -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)"
    "$(dcos config show core.dcos_url)/acs/api/v1/users" | jq ‘.array[].uid’ -r

    Add users to cluster using Mesosphere script (Run this on master)

    Users to add are given in list.txt, each user on new line

    for i in `cat list.txt`; do echo $i;
    sudo -i dcos-shell /opt/mesosphere/bin/dcos_add_user.py $i; done

    Add users to cluster using DC/OS API

    #!/bin/bash
    
    # Usage: dcosAddUsers.sh <users to add are listed in users.list, one user per line>
    for i in `cat users.list`; 
    do 
      echo $i
      curl -X PUT -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)" "$(dcos config show core.dcos_url)/acs/api/v1/users/$i" -d "{}"
    done

    Delete users from DC/OS cluster organization

    #!/bin/bash
    
    # Usage: dcosDeleteUsers.sh <users to delete are listed in users.list, one user per line>
    
    for i in `cat users.list`; 
    do 
      echo $i
      curl -X DELETE -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)" "$(dcos config show core.dcos_url)/acs/api/v1/users/$i" -d "{}"
    done

    Offers/resources from individual DC/OS agent

    In recent versions of many DC/OS services, a scheduler endpoint at

    http://yourcluster.com/service/<service-name>/v1/debug/offers

    will display an HTML table containing a summary of recently evaluated offers. The table’s contents are currently very similar to what can be found in the logs, but in a slightly more accessible format. Alternatively, we can look at the scheduler’s logs on stdout. An offer is a set of resources from one individual DC/OS agent.

    <DC/OS_CLUSTER_URL>/service/<service_name>/v1/debug/offers

    Example:

    http://dcos-cluster.com/service/kafka/v1/debug/offers
    http://dcos-cluster.com/service/elastic/v1/debug/offers
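
    As with the other endpoints above, the same bearer-token pattern works from the command line; a minimal sketch, assuming the kafka service is installed:

    curl -s -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)" \
      "$(dcos config show core.dcos_url)/service/kafka/v1/debug/offers"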

    Save JSON configs of all running Marathon apps

    #!/bin/bash
    
    # Save marathon configs in json format for all marathon apps
    # Usage : saveMarathonConfig.sh
    
    for service in `dcos marathon app list --quiet | tr -d "/" | sort`; do
      dcos marathon app show $service | jq '. | del(.tasks, .version, .versionInfo, .tasksHealthy, .tasksRunning, .tasksStaged, .tasksUnhealthy, .deployments, .executor, .lastTaskFailure, .args, .ports, .residency, .secrets, .storeUrls, .uris, .user)' >& $service.json
    done

    Get a report of Marathon apps with details like container type, Docker image, tag, or service version used by each Marathon app.

    #!/bin/bash
    
    TMP_CSV_FILE=$(mktemp /tmp/dcos-config.XXXXXX.csv)
    TMP_CSV_FILE_SORT="${TMP_CSV_FILE}_sort"
    #dcos marathon app list --json | jq '.[] | if (.container.docker.image != null ) then .id + ",Docker Application," + .container.docker.image else .id + ",DCOS Service," + .labels.DCOS_PACKAGE_VERSION end' -r > $TMP_CSV_FILE
    dcos marathon app list --json | jq '.[] | .id + if (.container.type == "DOCKER") then ",Docker Container," + .container.docker.image else ",Mesos Container," + if(.labels.DCOS_PACKAGE_VERSION !=null) then .labels.DCOS_PACKAGE_NAME+":"+.labels.DCOS_PACKAGE_VERSION  else "[ CMD ]" end end' -r > $TMP_CSV_FILE
    sed -i "s|^/||g" $TMP_CSV_FILE
    sort -t "," -k2,2 -k3,3 -k1,1 $TMP_CSV_FILE > ${TMP_CSV_FILE_SORT}
    cnt=1
    printf '%.0s=' {1..150}
    printf "n  %-5s%-35s%-23s%-40s%-20sn" "No" "Application Name" "Container Type" "Docker Image" "Tag / Version"
    printf '%.0s=' {1..150}
    while IFS=, read -r app typ image; 
    do
            tag=`echo $image | awk -F':' -v im="$image" '{tag=(im=="[ CMD ]")?"NA":($2=="")?"latest":$2; print tag}'`
            image=`echo $image | awk -F':' '{print $1}'`
            printf "n  %-5s%-35s%-23s%-40s%-20s" "$cnt" "$app" "$typ" "$image" "$tag"
            cnt=$((cnt + 1))
            sleep 0.3
    done < $TMP_CSV_FILE_SORT
    printf "n"
    printf '%.0s=' {1..150}
    printf "n"

    Get DC/OS nodes with more information like node type, node ip, attributes, number of running tasks, free memory, free cpu etc.

    #!/bin/bash
    
    printf "n  %-15s %-18s%-18s%-10s%-15s%-10sn" "Node Type" "Node IP" "Attribute" "Tasks" "Mem Free (MB)" "CPU Free"
    printf '%.0s=' {1..90}
    printf "n"
    TAB=`echo -e "t"`
    dcos node --json | jq '.[] | if (.type | contains("leader")) then "Master (leader)" elif ((.type | contains("agent")) and .attributes.public_ip != null) then "Public Agent" elif ((.type | contains("agent")) and .attributes.public_ip == null) then "Private Agent" else empty end + "t"+ if(.type |contains("master")) then .ip else .hostname end + "t" +  (if (.attributes | length !=0) then (.attributes | to_entries[] | join(" = ")) else "NA" end) + "t" + if(.type |contains("agent")) then (.TASK_RUNNING|tostring) + "t" + ((.resources.mem - .used_resources.mem)| tostring) + "tt" +  ((.resources.cpus - .used_resources.cpus)| tostring)  else "ttNAtNAttNA"  end' -r | sort -t"$TAB" -k1,1d -k3,3d -k2,2d
    printf '%.0s=' {1..90}
    printf "n"

    Framework Cleaner

    Uninstall a framework and clean up any reserved resources left behind after the framework is deleted/uninstalled. (This is only needed on DC/OS 1.9 or older; on 1.10 and higher, the uninstall command alone is sufficient.)

    SERVICE_NAME=<service-name>
    dcos package uninstall $SERVICE_NAME
    dcos node ssh --option StrictHostKeyChecking=no --master-proxy --leader \
      "docker run mesosphere/janitor /janitor.py -r ${SERVICE_NAME}-role -p ${SERVICE_NAME}-principal -z dcos-service-${SERVICE_NAME}"

    Get DC/OS apps and their placement constraints

    dcos marathon app list --json | jq '.[] |
    if (.constraints != null) then .id, .constraints else empty end'

    Run shell command on all slaves

    #!/bin/bash
    
    # Run any shell command on all slave nodes (private and public)
    
    # Usage : dcosRunOnAllSlaves.sh <CMD= any shell command to run, Ex: ulimit -a >
    CMD=$1
    for i in `dcos node | egrep -v "TYPE|master" | awk '{print $1}'`; do 
       echo -e "n###> Running command [ $CMD ] on $i"
       dcos node ssh --option StrictHostKeyChecking=no --option LogLevel=quiet --master-proxy --private-ip=$i "$CMD"
       echo -e "======================================n"
    done

    Run shell command on master leader

    CMD="<shell command, e.g. ulimit -a>"
    dcos node ssh --option StrictHostKeyChecking=no --option LogLevel=quiet --master-proxy --leader "$CMD"

    Run shell command on all master nodes

    #!/bin/bash
    
    # Run any shell command on all master nodes
    
    # Usage : dcosRunOnAllMasters.sh <CMD= any shell command to run, Ex: ulimit -a >
    CMD=$1
    for i in `dcos node | egrep -v "TYPE|agent" | awk '{print $2}'`
    do 
      echo -e "\n###> Running command [ $CMD ] on $i"
      dcos node ssh --option StrictHostKeyChecking=no --option LogLevel=quiet --master-proxy --private-ip=$i "$CMD"
      echo -e "======================================\n"
    done

    Add node attributes to dcos nodes and run apps on nodes with required attributes using placement constraints

    #!/bin/bash
    
    #1. SSH on node 
    #2. Create or edit file /var/lib/dcos/mesos-slave-common
    #3. Add contents as :
    #    MESOS_ATTRIBUTES=<key>:<value>
    #    Example:
    #    MESOS_ATTRIBUTES=TYPE:DB;DB_TYPE:MONGO;
    #4. Stop dcos-mesos-slave service
    #    systemctl stop dcos-mesos-slave
    #5. Remove link for latest slave metadata
    #    rm -f /var/lib/mesos/slave/meta/slaves/latest
    #6. Start dcos-mesos-slave service
    #    systemctl start dcos-mesos-slave
    #7. Wait for some time, node will be in HEALTHY state again.
    #8. Add app placement constraint with field = key and value = value
    #9. Verify attributes, run on any node
    #    curl -s http://leader.mesos:5050/state | jq '.slaves[]| .hostname ,.attributes'
    #    OR Check DCOS cluster UI
    #    Nodes => Select any Node => Details Tab
    
    tmpScript=$(mktemp "/tmp/addDcosNodeAttributes-XXXXXXXX")
    
    # key:value paired attributes, separated by ;
    ATTRIBUTES=NODE_TYPE:GPU_NODE
    
    cat <<EOF > ${tmpScript}
    echo "MESOS_ATTRIBUTES=${ATTRIBUTES}" | sudo tee /var/lib/dcos/mesos-slave-common
    sudo systemctl stop dcos-mesos-slave
    sudo rm -f /var/lib/mesos/slave/meta/slaves/latest
    sudo systemctl start dcos-mesos-slave
    EOF
    
    # Add the private ip of the nodes on which you want to add attributes, one ip per line.
    for i in `cat nodes.txt`; do 
        echo $i
        dcos node ssh --master-proxy --option StrictHostKeyChecking=no --private-ip $i <$tmpScript
        sleep 10
    done

    Install DC/OS Datadog metrics plugin on all DC/OS nodes

    #!/bin/bash
    
    # Usage : bash installDCOSDataDogMetricsPlugin.sh <Datadog API KEY>
    
    DDAPI=$1
    
    if [[ -z $DDAPI ]]; then
        echo "[Datadog Plugin] Need datadog API key as parameter."
        echo "[Datadog Plugin] Usage : bash installDCOSDataDogMetricsPlugin.sh <Datadog API KEY>."
    fi
    tmpScriptMaster=$(mktemp "/tmp/installDatadogPlugin-XXXXXXXX")
    tmpScriptAgent=$(mktemp "/tmp/installDatadogPlugin-XXXXXXXX")
    
    declare agent=$tmpScriptAgent
    declare master=$tmpScriptMaster
    
    for role in "agent" "master"
    do
    cat <<EOF > ${!role}
    curl -s -o /opt/mesosphere/bin/dcos-metrics-datadog -L https://downloads.mesosphere.io/dcos-metrics/plugins/datadog
    chmod +x /opt/mesosphere/bin/dcos-metrics-datadog
    echo "[Datadog Plugin] Downloaded dcos datadog metrics plugin."
    export DD_API_KEY=$DDAPI
    export AGENT_ROLE=$role
    sudo curl -s -o /etc/systemd/system/dcos-metrics-datadog.service https://downloads.mesosphere.io/dcos-metrics/plugins/datadog.service
    echo "[Datadog Plugin] Downloaded dcos-metrics-datadog.service."
    sudo sed -i "s/--dcos-role master/--dcos-role \$AGENT_ROLE/g;s/--datadog-key .*/--datadog-key \$DD_API_KEY/g" /etc/systemd/system/dcos-metrics-datadog.service
    echo "[Datadog Plugin] Updated dcos-metrics-datadog.service with DD API Key and agent role."
    sudo systemctl daemon-reload
    sudo systemctl start dcos-metrics-datadog.service
    echo "[Datadog Plugin] dcos-metrics-datadog.service is started !"
    servStatus=\$(sudo systemctl is-failed dcos-metrics-datadog.service)
    echo "[Datadog Plugin] dcos-metrics-datadog.service status : \${servStatus}"
    #sudo systemctl status dcos-metrics-datadog.service | head -3
    #sudo journalctl -u dcos-metrics-datadog
    EOF
    done
    
    echo "[Datadog Plugin] Temp script for master saved at : $tmpScriptMaster"
    echo "[Datadog Plugin] Temp script for agent saved at : $tmpScriptAgent"
    
    for i in `dcos node | egrep -v "TYPE|master" | awk '{print $1}'` 
    do 
        echo -e "\n###> Node - $i"
        dcos node ssh --option LogLevel=quiet --option StrictHostKeyChecking=no --master-proxy --private-ip=$i < $tmpScriptAgent
        echo -e "======================================================="
    done
    
    for i in `dcos node | egrep -v "TYPE|agent" | awk '{print $2}'` 
    do 
        echo -e "\n###> Master Node - $i"
        dcos node ssh --option LogLevel=quiet --option StrictHostKeyChecking=no --master-proxy --private-ip=$i < $tmpScriptMaster
        echo -e "======================================================="
    done
    
    # Check status of dcos-metrics-datadog.service on all nodes.
    #for i in `dcos node | egrep -v "TYPE|master" | awk '{print $1}'` ; do  echo -e "\n###> $i"; dcos node ssh --option StrictHostKeyChecking=no --option LogLevel=quiet --master-proxy --private-ip=$i "sudo systemctl is-failed dcos-metrics-datadog.service"; echo -e "======================================\n"; done

    Get app / node metrics fetched by dcos-metrics component using metrics API

    • Get the DC/OS node id [dcos node]
    • Get node metrics (CPU, memory, local filesystems, networks, etc.) : <DC/OS_CLUSTER_URL>/system/v1/agent/<agent_id>/metrics/v0/node
    • Get the ids of all containers running on that agent : <DC/OS_CLUSTER_URL>/system/v1/agent/<agent_id>/metrics/v0/containers
    • Get resource allocation and usage for a given container id : <DC/OS_CLUSTER_URL>/system/v1/agent/<agent_id>/metrics/v0/containers/<container_id>
    • Get application-level metrics from the container (shipped in StatsD format using the listener available at STATSD_UDP_HOST and STATSD_UDP_PORT) : <DC/OS_CLUSTER_URL>/system/v1/agent/<agent_id>/metrics/v0/containers/<container_id>/app

    Get app / node metrics fetched by dcos-metrics component using dcos cli

    • Summary of container metrics for a specific task
    dcos task metrics summary <task-id>

    • All metrics in details for a specific task
    dcos task metrics details <task-id>

    • Summary of Node metrics for a specific node
    dcos node metrics summary <mesos-node-id>

    • All Node metrics in details for a specific node
    dcos node metrics details <mesos-node-id>

    NOTE – All of the above commands support a '--json' flag so they can be used programmatically.
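
    Combining the two, a small sketch that prints a metrics summary for every agent node in the cluster (assuming the dcos CLI is already set up against the cluster):

    #!/bin/bash
    
    for id in $(dcos node --json | jq -r '.[] | select(.type | contains("agent")) | .id'); do
        echo "###> Node $id"
        dcos node metrics summary $id
    done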

    Launch / run command inside container for a task

    The 'dcos task exec' CLI only supports Mesos containers; this script supports both Mesos and Docker containers.

    #!/bin/bash
    
    echo "DCOS Task Exec 2.0"
    if [ "$#" -eq 0 ]; then
            echo "Need task name or id as input. Exiting."
            exit 1
    fi
    taskName=$1
    taskCmd=${2:-bash}
    TMP_TASKLIST_JSON=/tmp/dcostasklist.json
    dcos task --json > $TMP_TASKLIST_JSON
    taskExist=`cat $TMP_TASKLIST_JSON | jq --arg tname $taskName '.[] | if(.name == $tname ) then .name else empty end' -r | wc -l`
    if [[ $taskExist -eq 0 ]]; then 
            echo "No task with name $taskName exists."
            echo "Do you mean ?"
            dcos task | grep $taskName | awk '{print $1}'
            exit 1
    fi
    taskType=`cat $TMP_TASKLIST_JSON | jq --arg tname $taskName '[.[] | select(.name == $tname)][0] | .container.type' -r`
    TaskId=`cat $TMP_TASKLIST_JSON | jq --arg tname $taskName '[.[] | select(.name == $tname)][0] | .id' -r`
    if [[ $taskExist -ne 1 ]]; then
            echo -e "More than one instances. Please select task ID for executing command.n"
            #allTaskIds=$(dcos task $taskName | tee /dev/tty | grep -v "NAME" | awk '{print $5}' | paste -s -d",")
            echo ""
            read TaskId
    fi
    if [[ $taskType !=  "DOCKER" ]]; then
            echo "Task [ $taskName ] is of type MESOS Container."
            execCmd="dcos task exec --interactive --tty $TaskId $taskCmd"
            echo "Running [$execCmd]"
            $execCmd
    else
            echo "Task [ $taskName ] is of type DOCKER Container."
            taskNodeIP=`dcos task $TaskId | awk 'FNR == 2 {print $2}'`
            echo "Task [ $taskName ] with task Id [ $TaskId ] is running on node [ $taskNodeIP ]."
            taskContID=`dcos node ssh --option LogLevel=quiet --option StrictHostKeyChecking=no --private-ip=$taskNodeIP --master-proxy "docker ps -q --filter "label=MESOS_TASK_ID=$TaskId"" 2> /dev/null`
            taskContID=`echo $taskContID | tr -d '\r'`
            echo "Task Docker Container ID : [ $taskContID ]"
            echo "Running [ docker exec -it $taskContID $taskCmd ]"
            dcos node ssh --option StrictHostKeyChecking=no --option LogLevel=quiet --private-ip=$taskNodeIP --master-proxy "docker exec -it $taskContID $taskCmd" 2>/dev/null
    fi

    Get DC/OS tasks by node

    #!/bin/bash 
    
    function tasksByNodeAPI
    {
        echo "DC/OS Tasks By Node"
        if [ "$#" -eq 0 ]; then
            echo "Need node ip as input. Exiting."
            exit 1
        fi
        nodeIp=$1
        mesosId=`dcos node | grep $nodeIp | awk '{print $3}'`
        if [ -z "mesosId" ]; then
            echo "No node found with ip $nodeIp. Exiting."
            exit 1
        fi
        curl -s -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)" "$(dcos config show core.dcos_url)/mesos/tasks?limit=10000" | jq --arg mesosId $mesosId '.tasks[] | select (.slave_id == $mesosId and .state == "TASK_RUNNING") | .name + "ttt" + .id'  -r
    }
    
    function tasksByNodeCLI
    {
            echo "DC/OS Tasks By Node"
            if [ "$#" -eq 0 ]; then
                    echo "Need node ip as input. Exiting."
                    exit 1
            fi
            nodeIp=$1
            dcos task | egrep "HOST|$nodeIp"
    }

    Get cluster metadata – cluster Public IP and cluster ID

    curl -s -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)"           
    $(dcos config show core.dcos_url)/metadata 

    Sample Output:

    {
    "PUBLIC_IPV4": "123.456.789.012",
    "CLUSTER_ID": "abcde-abcde-abcde-abcde-abcde-abcde"
    }

    Get DC/OS metadata – DC/OS version

    curl -s -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)"
    $(dcos config show core.dcos_url)/dcos-metadata/dcos-version.jsonq  

    Sample Output:

    {
    "version": "1.11.0",
    "dcos-image-commit": "b6d6ad4722600877fde2860122f870031d109da3",
    "bootstrap-id": "a0654657903fb68dff60f6e522a7f241c1bfbf0f"
    }

    Get Mesos version

    curl -s -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)"
    $(dcos config show core.dcos_url)/mesos/version

    Sample Output:

    {
    "build_date": "2018-02-27 21:31:27",
    "build_time": 1519767087.0,
    "build_user": "",
    "git_sha": "0ba40f86759307cefab1c8702724debe87007bb0",
    "version": "1.5.0"
    }

    Access DC/OS cluster exhibitor UI (Exhibitor supervises ZooKeeper and provides a management web interface)

    <CLUSTER_URL>/exhibitor

    Access DC/OS cluster data from cluster zookeeper using Zookeeper Python client – Run inside any node / container

    from kazoo.client import KazooClient
    
    zk = KazooClient(hosts='leader.mesos:2181', read_only=True)
    zk.start()
    
    clusterId = ""
    # Here we can give znode path to retrieve its decoded data,
    # for ex to get cluster-id, use
    # data, stat = zk.get("/cluster-id")
    # clusterId = data.decode("utf-8")
    
    # Get cluster Id
    if zk.exists("/cluster-id"):
        data, stat = zk.get("/cluster-id")
        clusterId = data.decode("utf-8")
    
    zk.stop()
    
    print (clusterId)

    Access dcos cluster data from cluster zookeeper using exhibitor rest API

    # Get znode data using endpoint :
    # /exhibitor/exhibitor/v1/explorer/node-data?key=/path/to/node
    # Example : Get znode data for path = /cluster-id
    curl -s -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)"
    $(dcos config show core.dcos_url)/exhibitor/exhibitor/v1/explorer/node-data?key=/cluster-id

    Sample Output:

    {
    "bytes": "3333-XXXXXX",
    "str": "abcde-abcde-abcde-abcde-abcde-",
    "stat": "XXXXXX"
    }

    Get cluster name using Mesos API

    curl -s -H "Authorization: Bearer $(dcos config show core.dcos_acs_token)"
    $(dcos config show core.dcos_url)/mesos/state-summary | jq .cluster -r

    Mark Mesos node as decommissioned

    Sometimes the instances running as DC/OS nodes get terminated and cannot come back online; AWS EC2 instances, for example, cannot be started again once terminated. When Mesos detects that a node has stopped, it puts the node in the UNREACHABLE state, because Mesos does not know whether the node is only temporarily down and will come back online or is gone permanently. In such cases, we can explicitly tell Mesos to put a node in the GONE state if we know it will not come back.

    dcos node decommission <mesos-agent-id>

    Conclusion

    We learned about Mesosphere DC/OS, its functionality, and its roles. We also learned how to set up and use the DC/OS CLI, how to use HTTP authentication to access the DC/OS APIs, and how to use the CLI for automating tasks.

    We went through different API endpoints such as Mesos, Marathon, DC/OS metrics, Exhibitor, and DC/OS cluster organization. Finally, we looked at various tricks and scripts to automate DC/OS work, like fetching node details, executing commands inside task containers, generating Docker image reports, and authenticating DC/OS API calls.