Tag: cloud native

  • A Practical Guide to Deploying Multi-tier Applications on Google Container Engine (GKE)

    Introduction

    Most programmers today can attest that containerization has afforded more flexibility and allows us to build truly cloud-native applications. Containers provide portability: the ability to easily move applications across environments. Complex applications, however, comprise many (tens or hundreds of) containers. Managing such applications is a real challenge, and that’s where container orchestration and scheduling platforms like Kubernetes, Mesosphere, Docker Swarm, etc. come into the picture.
    Kubernetes, backed by Google, is leading the pack given that Red Hat, Microsoft and now Amazon are putting their weight behind it.

    Kubernetes can run on any cloud or bare metal infrastructure. Setting up and managing Kubernetes can be a challenge, but Google provides an easy way to use Kubernetes through the Google Container Engine (GKE) service.

    What is GKE?

    Google Container Engine (since renamed Google Kubernetes Engine) is a management and orchestration system for containers. In short, it is hosted Kubernetes. The goal of GKE is to increase the productivity of DevOps and development teams by hiding the complexity of setting up the Kubernetes cluster, the overlay network, etc.

    Why GKE? What are the things that GKE does for the user?

    • GKE abstracts away the complexity of managing a highly available Kubernetes cluster.
    • GKE takes care of the overlay network.
    • GKE provides built-in authentication.
    • GKE provides built-in auto-scaling.
    • GKE provides easy integration with the Google storage services.

    In this blog, we will see how to create your own Kubernetes cluster in GKE and how to deploy a multi-tier application in it. The blog assumes you have a basic understanding of Kubernetes and have used it before. It also assumes you have created an account with Google Cloud Platform. If you are not familiar with Kubernetes, this guide from Deis is a good place to start.

    Google provides a command-line interface (gcloud) to interact with all Google Cloud Platform products and services. The gcloud tool can be used in scripts to automate tasks, or directly from the command line. Follow this guide to install the gcloud tool.

    Now let’s begin! The first step is to create the cluster.

    Basic Steps to create cluster

    In this section, I will explain how to create a GKE cluster. We will use the gcloud command-line tool to set up the cluster.

    Set the zone in which you want to deploy the cluster

    $ gcloud config set compute/zone us-west1-a

    Create the cluster using the following command:

    $ gcloud container --project <project-name> \
        clusters create <cluster-name> \
        --machine-type n1-standard-2 \
        --image-type "COS" --disk-size "50" \
        --num-nodes 2 --network default \
        --enable-cloud-logging --no-enable-cloud-monitoring

    Let’s try to understand what each of these parameters means:

    --project: Project name.

    --machine-type: Type of the machine, like n1-standard-2 or n1-standard-4.

    --image-type: OS image. “COS” is the Container-Optimized OS from Google. More info here.

    --disk-size: Disk size of each instance.

    --num-nodes: Number of nodes in the cluster.

    --network: Network that you want to use for the cluster. In this case, we are using the default network.

    Apart from the above options, you can also use the following to provide specific requirements while creating the cluster:

    --scopes: Scopes let containers access Google services directly without needing separate credentials. You can specify a comma-separated list of scopes. For example:

    • Compute: Lets you view and manage your Google Compute Engine resources
    • Logging.write: Submit log data to Stackdriver.

    You can find all the scopes that Google supports here.

    --additional-zones: Specify additional zones for high availability, as a comma-separated list without spaces. Eg. --additional-zones us-west1-b,us-west1-c. The additional zones must be in the same region as the primary zone (us-west1-a above). Here GKE will create the cluster across 3 zones (1 specified at the beginning and the additional 2 here).

    --enable-autoscaling: Enables the autoscaling option. If you specify this option, you also have to specify the minimum and maximum number of nodes. You can read more about how auto-scaling works here. Eg: --enable-autoscaling --min-nodes=15 --max-nodes=50
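
    When cluster creation is scripted, the options above can be assembled programmatically. The following is a minimal Python sketch, not part of the original walkthrough: the function name and its defaults are illustrative assumptions, and it only builds the argument list from the flags described above (it does not run gcloud).

```python
import shlex


def build_create_command(project, cluster, machine_type="n1-standard-2",
                         image_type="COS", disk_size_gb=50, num_nodes=2,
                         network="default", min_nodes=None, max_nodes=None):
    """Return the `gcloud container clusters create` argv as a list of strings."""
    cmd = [
        "gcloud", "container", "--project", project,
        "clusters", "create", cluster,
        "--machine-type", machine_type,
        "--image-type", image_type,
        "--disk-size", str(disk_size_gb),
        "--num-nodes", str(num_nodes),
        "--network", network,
    ]
    # Autoscaling needs both bounds, as described above.
    if (min_nodes is None) != (max_nodes is None):
        raise ValueError("set both min_nodes and max_nodes, or neither")
    if min_nodes is not None:
        if min_nodes > max_nodes:
            raise ValueError("min_nodes must not exceed max_nodes")
        cmd += ["--enable-autoscaling",
                f"--min-nodes={min_nodes}", f"--max-nodes={max_nodes}"]
    return cmd


# Hypothetical project and cluster names, for illustration only.
argv = build_create_command("my-project", "my-first-cluster",
                            min_nodes=15, max_nodes=50)
print(shlex.join(argv))
```

    Returning a list rather than a single string keeps each argument separate, so the result can be passed to subprocess.run without shell quoting issues.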

    Next, fetch the credentials of the created cluster. This step updates the credentials in the kubeconfig file, so that kubectl will point to the required cluster.

    $ gcloud container clusters get-credentials my-first-cluster --project project-name

    Now, your first Kubernetes cluster is ready. Let’s check the cluster information & health.

    $ kubectl get nodes
    NAME    STATUS    AGE   VERSION
    gke-first-cluster-default-pool-d344484d-vnj1  Ready  2h  v1.6.4
    gke-first-cluster-default-pool-d344484d-kdd7  Ready  2h  v1.6.4
    gke-first-cluster-default-pool-d344484d-ytre2  Ready  2h  v1.6.4

    After creating the cluster, let’s see how to deploy a multi-tier application on it. Let’s use a simple Python Flask app which will greet the user, store employee data & get employee data.

    Application Deployment

    I have created a simple Python Flask application to deploy on the K8S cluster created using GKE. You can go through the source code here. If you check the source code, you will find the following directory structure:

    TryGKE/
    ├── Dockerfile
    ├── mysql-deployment.yaml
    ├── mysql-service.yaml
    ├── src
    │   ├── app.py
    │   └── requirements.txt
    ├── testapp-deployment.yaml
    └── testapp-service.yaml

    In this, I have written a Dockerfile for the Python Flask application in order to build our own image to deploy. For MySQL, we won’t build an image of our own. We will use the latest MySQL image from the public docker repository.

    Before deploying the application, let’s re-visit some of the important Kubernetes terms:

    Pods:

    A pod is a container or a group of containers deployed together on the same host machine. It acts as a single unit of deployment.

    Deployments:

    Deployment is an entity which manages the ReplicaSets and provides declarative updates to pods. It is recommended to use Deployments instead of directly using ReplicaSets. We can use deployment to create, remove and update ReplicaSets. Deployments have the ability to rollout and rollback the changes.

    Services:

    Service in K8S is an abstraction which will connect you to one or more pods. You can connect to pod using the pod’s IP Address but since pods come and go, their IP Addresses change.  Services get their own IP & DNS and those remain for the entire lifetime of the service. 

    Each tier in an application is represented by a Deployment. A Deployment is described by a YAML file. We have two YAML files – one for MySQL and one for the Python application.

    1. MySQL Deployment YAML

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: mysql
    spec:
      template:
        metadata:
          labels:
            app: mysql
        spec:
          containers:
            - env:
                - name: MYSQL_DATABASE
                  value: admin
                - name: MYSQL_ROOT_PASSWORD
                  value: admin
              image: 'mysql:latest'
              name: mysql
              ports:
                - name: mysqlport
                  containerPort: 3306
                  protocol: TCP

    2. Python Application Deployment YAML

    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: test-app
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - name: test-app
            image: ajaynemade/pymy:latest
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 5000
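
    The two manifests above use the extensions/v1beta1 and apps/v1beta1 API versions, which matched the v1.6 cluster shown earlier; Kubernetes removed both for Deployments in v1.16. As a hedged sketch (not the author’s files), the equivalent apps/v1 objects can be generated from Python and saved as JSON, which kubectl also accepts. The helper name `deployment` is an assumption made for illustration.

```python
import json


def deployment(name, image, container_port, replicas=1, env=None):
    """Build an apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        "ports": [{"containerPort": container_port}],
    }
    if env:
        container["env"] = [{"name": k, "value": v} for k, v in env.items()]
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # apps/v1 requires an explicit selector that matches the pod labels.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [container]},
            },
        },
    }


mysql = deployment("mysql", "mysql:latest", 3306,
                   env={"MYSQL_DATABASE": "admin", "MYSQL_ROOT_PASSWORD": "admin"})
test_app = deployment("test-app", "ajaynemade/pymy:latest", 5000)
print(json.dumps(test_app, indent=2))
```

    The main difference from the older versions is the mandatory spec.selector, which must match the labels in the pod template.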

    Each Service is also represented by a YAML file as follows:

    1. MySQL service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: mysql-service
    spec:
      ports:
      - port: 3306
        targetPort: 3306
        protocol: TCP
        name: http
      selector:
        app: mysql

    2. Python Application service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: test-service
    spec:
      type: LoadBalancer
      ports:
      - name: test-service
        port: 80
        protocol: TCP
        targetPort: 5000
      selector:
        app: test-app
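
    The selector field is what ties each Service to its pods: a pod receives traffic from the Service when its labels contain every key/value pair in the selector. A small Python sketch of that matching rule follows; the pod names are hypothetical and the code covers only the equality-based selectors used in these manifests.

```python
def matches(selector, pod_labels):
    """True when every selector key/value pair is present in the pod's labels."""
    if not selector:
        # A Service without a selector does not pick any pods automatically.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector, pods):
    """Return the names of the pods a Service with this selector would target."""
    return [name for name, labels in pods.items() if matches(selector, labels)]


# Hypothetical pods, labelled the way the Deployments above label them.
pods = {
    "mysql-0001": {"app": "mysql"},
    "test-app-0001": {"app": "test-app"},
    "test-app-0002": {"app": "test-app"},
}
print(endpoints_for({"app": "test-app"}, pods))
```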

    You will find a ‘kind’ field in each YAML file. It is used to specify whether the given configuration is for deployment, service, pod, etc.

    In the Python app service YAML, I am using type = LoadBalancer. In GKE, there are two types of cloud load balancers available to expose the application to the outside world.

    1. TCP load balancer: This is a network (layer 4) load balancer that forwards TCP traffic to the cluster nodes; a Service of type LoadBalancer provisions one. We will use this in our example.
    2. HTTP(S) load balancer: It can be created using Ingress. For more information, refer to this post that talks about Ingress in detail.

    In the MySQL service, I’ve not specified any type, so the default type ‘ClusterIP’ is used. It exposes the MySQL container only inside the cluster, where the Python app can access it.

    If you check app.py, you can see that I have used “mysql-service.default” as the hostname. “mysql-service.default” is the DNS name of the service (service name, then namespace). The Python application refers to that DNS name when accessing the MySQL database.
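
    To make the naming scheme concrete, here is a short Python sketch listing the DNS names under which a Service is normally reachable from inside the cluster. It assumes the default cluster domain cluster.local, which an administrator can change; the function name is illustrative.

```python
def service_dns_names(service, namespace="default", cluster_domain="cluster.local"):
    """Return the usual DNS names for a Service, shortest first."""
    return [
        service,                                    # only from pods in the same namespace
        f"{service}.{namespace}",                   # the form used in app.py
        f"{service}.{namespace}.svc",
        f"{service}.{namespace}.svc.{cluster_domain}",  # fully qualified name
    ]


for name in service_dns_names("mysql-service"):
    print(name)
```

    The shorter forms work because each pod’s DNS search path appends the namespace and cluster domain suffixes automatically.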

    Now, let’s actually set up the components from the configurations. We will first create the services, followed by the deployments.

    Services:

    $ kubectl create -f mysql-service.yaml
    $ kubectl create -f testapp-service.yaml

    Deployments:

    $ kubectl create -f mysql-deployment.yaml
    $ kubectl create -f testapp-deployment.yaml

    Check the status of the pods and services. Wait until all pods are in the Running state and the Python application service gets an external IP, like below:

    $ kubectl get services
    NAME            CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
    kubernetes      10.55.240.1     <none>        443/TCP        5h
    mysql-service   10.55.240.57    <none>        3306/TCP       1m
    test-service    10.55.246.105   35.185.225.67     80:32546/TCP   11s
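
    If you script this wait, the external IP can be read out of the table. The sketch below parses the output shown above; the function name is an assumption, and it treats placeholder values such as <none> or <pending> as “no IP yet”.

```python
SAMPLE = """\
NAME            CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
kubernetes      10.55.240.1     <none>        443/TCP        5h
mysql-service   10.55.240.57    <none>        3306/TCP       1m
test-service    10.55.246.105   35.185.225.67     80:32546/TCP   11s
"""


def external_ip(table, service):
    """Return the EXTERNAL-IP of a service from `kubectl get services` output."""
    lines = table.strip().splitlines()
    # Locate the column by header name so extra columns do not break parsing.
    column = lines[0].split().index("EXTERNAL-IP")
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == service:
            value = fields[column]
            return None if value.startswith("<") else value
    raise KeyError(service)


print(external_ip(SAMPLE, "test-service"))
```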

    Once you get the external IP, you should be able to make API calls using simple curl requests.

    Eg. To Store Data :

    curl -H "Content-Type: application/x-www-form-urlencoded" -X POST  http://35.185.225.67:80/storedata -d id=1 -d name=NoOne

    Eg. To Get Data :

    curl 35.185.225.67:80/getdata/1
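
    The same two calls can be made from Python with only the standard library. This is a sketch under the assumption that the endpoints behave as the curl examples above suggest; the helper names are illustrative, and `send` is defined but not called here so that nothing touches the network on import.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://35.185.225.67:80"  # external IP from the output above


def store_request(emp_id, name, base_url=BASE_URL):
    """Build the POST /storedata request with a form-encoded body."""
    body = urlencode({"id": emp_id, "name": name}).encode("ascii")
    return Request(
        f"{base_url}/storedata",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def get_request(emp_id, base_url=BASE_URL):
    """Build the GET /getdata/<id> request."""
    return Request(f"{base_url}/getdata/{emp_id}", method="GET")


def send(request, timeout=10):
    """Send a prepared request and return (status code, response text)."""
    with urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode()
```

    For example, send(store_request(1, "NoOne")) followed by send(get_request(1)) mirrors the two curl commands.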

    At this stage your application is completely deployed and is externally accessible.

    Manual scaling of pods

    Scaling your application up or down in Kubernetes is quite straightforward. Let’s scale up the test-app deployment.

    $ kubectl scale deployment test-app --replicas=3

    Deployment configuration for test-app will get updated and you will see 3 replicas of test-app running. Verify it using:

    $ kubectl get pods

    In the same manner, you can scale down your application by reducing the replica count.

    Cleanup :

    Un-deploying an application from Kubernetes is also quite straightforward. All we have to do is delete the services and delete the deployments. The only caveat is that the deletion of the load balancer is an asynchronous process. You have to wait until it gets deleted.

    $ kubectl delete service mysql-service
    $ kubectl delete service test-service

    The second command above deallocates the load balancer which was created as a part of test-service. You can check the status of the load balancer with the following command.

    $ gcloud compute forwarding-rules list
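
    Because the deletion is asynchronous, a cleanup script has to poll. Below is a minimal, generic polling helper in Python. The checker `forwarding_rules_gone` is a hypothetical example: it assumes gcloud is installed and authenticated, and it looks at every forwarding rule in the project rather than only the one belonging to test-service.

```python
import json
import subprocess
import time


def wait_until(check, timeout=300.0, interval=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def forwarding_rules_gone():
    """Hypothetical checker: True when the project lists no forwarding rules."""
    result = subprocess.run(
        ["gcloud", "compute", "forwarding-rules", "list", "--format=json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout or "[]") == []
```

    A script would then call wait_until(forwarding_rules_gone) before deleting the deployments; the sleep and clock parameters exist only to make the helper easy to test.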

    Once the load balancer is deleted, you can clean up the deployments as well.

    $ kubectl delete deployments test-app
    $ kubectl delete deployments mysql

    Delete the Cluster:

    $ gcloud container clusters delete my-first-cluster

    Conclusion

    In this blog, we saw how easy it is to deploy, scale & terminate applications on Google Container Engine. Google Container Engine abstracts away all the complexity of Kubernetes and gives us a robust platform to run containerised applications. I am super excited about what the future holds for Kubernetes!


  • How To Get Started With Logging On Kubernetes?

    In distributed systems like Kubernetes, logging is critical for monitoring and providing observability and insight into an application’s operations. With the ever-increasing complexity of distributed systems and the proliferation of cloud-native solutions, monitoring and observability have become critical components in knowing how the systems are functioning.

    Logs don’t lie! They have been one of our greatest companions when investigating a production incident.

    How is logging in Kubernetes different?

    Log aggregation in Kubernetes differs greatly from logging on traditional servers or virtual machines, owing to the way it manages its applications (pods).

    When an app crashes on a virtual machine, its logs remain accessible until they are deleted. When pods are evicted, crashed, deleted, or scheduled on a different node in Kubernetes, the container logs are lost. The system is self-cleaning. As a result, you are left with no knowledge of why the anomaly occurred. Because default logging in Kubernetes is transient, a centralized log management solution is essential.

    Kubernetes is highly distributed and dynamic in nature; hence, in production, you’ll most certainly be working with multiple machines that have multiple containers each, which can crash at any time. Kubernetes clusters add to the complexity by introducing new layers that must be monitored, each of which generates its own type of log.

    We’ve curated some of the best tools to help you achieve this, alongside a simple guide on how to get started with each of them, as well as a comparison of these tools to match your use case.

    PLG Stack

    Introduction:

    Promtail is an agent that ships the logs from the local system to the Loki cluster.

    Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It indexes only metadata and doesn’t index the content of the log. This design decision makes it very cost-effective and easy to operate.

    Grafana is the visualisation tool which consumes data from the Loki data source.

    Loki is like Prometheus, but for logs: it takes a multidimensional, label-based approach to indexing and is a single-binary, easy-to-operate system with no dependencies. Loki differs from Prometheus by focusing on logs instead of metrics, and by delivering logs via push instead of pull.
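
    The idea of indexing only labels and not log content can be illustrated with a toy in-memory store. This is a conceptual Python sketch, not how Loki is implemented: streams are keyed by their label set, and any search inside the log text is a plain scan over the selected streams.

```python
from collections import defaultdict


class TinyLogStore:
    """Toy log store: the index holds label sets only, never log content."""

    def __init__(self):
        self._streams = defaultdict(list)  # frozenset of label pairs -> lines

    def push(self, labels, line):
        """Append a log line to the stream identified by its labels."""
        self._streams[frozenset(labels.items())].append(line)

    def query(self, selector, contains=None):
        """Select streams by label, then optionally scan lines for a substring."""
        wanted = set(selector.items())
        result = []
        for stream_labels, lines in self._streams.items():
            if wanted <= stream_labels:
                result.extend(
                    line for line in lines
                    if contains is None or contains in line
                )
        return result


store = TinyLogStore()
store.push({"app": "test-app", "pod": "a"}, "GET /getdata/1 200")
store.push({"app": "test-app", "pod": "b"}, "error: db down")
store.push({"app": "mysql"}, "ready for connections")
print(store.query({"app": "test-app"}, contains="error"))
```

    Keeping the index this small is what makes the approach cheap; the trade-off is that content searches must read every line of the matching streams.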

    Configuration Options:

    Installation with Helm chart –

    # Create a namespace to deploy PLG stack :
    
    kubectl create ns loki
    
    # Add Grafana's Helm Chart repository and Update repo :
    
    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update
    
    # Deploy the Loki stack :
    
    helm upgrade --install loki-stack grafana/loki-stack -n loki --set grafana.enabled=true
    
    # Retrieve password to log into Grafana with user admin
    
    kubectl get secret loki-stack-grafana -n loki -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
    
    # Finally execute command below to access the Grafana UI on http://localhost:3000
    
    kubectl port-forward -n loki service/loki-stack-grafana 3000:80

    Log in with user name “admin” and the password you retrieved earlier.

    Query Methods:

    Using CLI :

    Curl command to fetch logs directly from Loki

    curl -G -s "http://localhost:3100/loki/api/v1/query" \
      --data-urlencode 'query={job="shruti/logging-golang"}' | jq
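
    The --data-urlencode flag matters because the LogQL selector contains braces, quotes and a slash. A short Python sketch of building the same URL by hand follows; the function name is illustrative, and the base URL assumes Loki is reachable on localhost:3100 as in the curl command.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def loki_query_url(logql, base_url="http://localhost:3100"):
    """Return the instant-query URL with the LogQL expression URL-encoded."""
    return f"{base_url}/loki/api/v1/query?{urlencode({'query': logql})}"


url = loki_query_url('{job="shruti/logging-golang"}')
print(url)
# Decoding the query string gives back the original expression.
print(parse_qs(urlsplit(url).query)["query"][0])
```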

    Using LogQL :

    • LogQL provides the functionality to filter logs through operators.

    For example :

    {container="kube-apiserver"} |= "error" != "timeout"

    • LogCLI is the command-line interface to Grafana Loki. It facilitates running LogQL queries against a Loki instance.

    For example :

    logcli query '{job="shruti/logging-golang"}'
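
    In the filter example above, |= "error" keeps only lines that contain “error”, and != "timeout" then drops lines that contain “timeout”. The Python sketch below emulates those two line filters on a few made-up log lines, purely to show the semantics.

```python
def line_filter(lines, contains=(), not_contains=()):
    """Keep lines containing every `contains` term and none of `not_contains`."""
    return [
        line for line in lines
        if all(term in line for term in contains)
        and not any(term in line for term in not_contains)
    ]


# Hypothetical log lines, for illustration only.
logs = [
    "error: connection refused",
    "error: timeout talking to etcd",
    "request served in 12ms",
]
print(line_filter(logs, contains=["error"], not_contains=["timeout"]))
```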

    Using Dashboard :

    Click on the Explore tab on the left side. Select Loki from the data source dropdown.

    EFK Stack

    Introduction :

    The Elastic Stack contains most of the tools required for log management:

    • Elasticsearch is an open source, distributed, RESTful and scalable search engine. It is a NoSQL database, used here primarily to store and retrieve the logs shipped by Fluentd.
    • Log shippers such as Logstash, Fluentd and Fluent Bit. Fluentd is an open source log collection agent which supports multiple data sources and output formats. It can forward logs to solutions like Stackdriver, CloudWatch, Splunk, BigQuery, etc.
    • Kibana is the UI tool for querying, data visualisation and dashboards. It can build virtually any type of dashboard. Kibana Query Language (KQL) is used for querying Elasticsearch data.
    • Fluentd: Deployed as a DaemonSet, as it needs to collect the container logs from all the nodes. It connects to the Elasticsearch service endpoint to forward the logs.
    • Elasticsearch: Deployed as a StatefulSet, as it holds the log data. A service endpoint is also exposed for Fluentd and Kibana to connect to it.
    • Kibana: Deployed as a Deployment; it connects to the Elasticsearch service endpoint.

    Configuration Options :

    It can be installed through Helm charts, as a stack or as individual components.

    • Add the Elastic Helm charts repo:
     helm repo add elastic https://helm.elastic.co && helm repo update

    • More information related to deploying these Helm Charts can be found here

    After installation is complete and the Kibana dashboard is accessible, we need to define index patterns to be able to see logs in Kibana.

    From the homepage, type Kibana / Index Patterns into the search bar. Go to the Index patterns page and click Create index pattern in the right corner. You will see the list of index patterns here.

    Add the required patterns for your indices, then click Discover in the left side menu and check your logs 🙂

    Query Methods :

    • Elasticsearch can be queried directly on any index.
    • For example:
    curl -XGET 'localhost:9200/my_index/_count?q=field:value&pretty'

    More information can be found here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html

    Using Dashboard :

    Graylog Stack

    Introduction

    Graylog is a leading centralised log management solution built to open standards for capturing, storing, and enabling real-time analysis of terabytes of machine data. It supports a master-slave architecture. The Graylog stack consists of Graylog v3, Elasticsearch v6 and MongoDB v3.

    Graylog is an open-source log management tool, using Elasticsearch as its storage. Unlike the ELK stack, which is built from individual components (Elasticsearch, Logstash, Kibana), Graylog is built as a complete package that can do everything.

    • One package with all the essentials of log processing: collect, parse, buffer, index, search, analyze
    • Additional features that you don’t get with the open-source ELK stack, such as role-based access control and alerts
    • Fits the needs of most centralized log management use-cases in one package
    • Easily scale both the storage (Elasticsearch) and the ingestion pipeline
    • Graylog’s extractors allow you to extract fields from log messages using many methods, such as grok expressions, regex and JSON

    Cons :

    • Visualization capabilities are limited, at least compared to ELK’s Kibana
    • Can’t use the whole ELK ecosystem, because those tools can’t directly access the Elasticsearch API. Instead, Graylog has its own API
    • It is not implemented for Kubernetes directly; rather, it supports logging via fluent-bit/logstash/fluentd

    Configurations Options

    Graylog is flexible in that it supports multiple inputs (data sources), for example:

    • GELF TCP.
    • GELF Kafka.
    • AWS Logs.

    as well as multiple outputs (how Graylog nodes forward messages), for example:

    • GELF Output.
    • STDOUT.
    • Query via HTTP / REST API.

    Connecting External GrayLog Stack:

    • Host and port (12201): a TCP input used to push logs to the Graylog stack directly
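
    As an illustration of what a client pushes to such an input, here is a minimal Python sketch that builds a GELF 1.1 message for a GELF TCP input. It assumes the input uses the default null-byte frame delimiter; the helper names and the example field values are hypothetical, and `send_gelf` is defined but not called.

```python
import json
import socket
import time


def gelf_frame(host, short_message, level=6, **extra):
    """Build one GELF 1.1 message as bytes, terminated by a null byte."""
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity; 6 is informational
    }
    for key, value in extra.items():
        message["_" + key] = value  # additional fields carry a leading underscore
    return json.dumps(message).encode("utf-8") + b"\x00"


def send_gelf(frame, graylog_host, port=12201):
    """Send a prepared frame to a GELF TCP input."""
    with socket.create_connection((graylog_host, port), timeout=5) as sock:
        sock.sendall(frame)


frame = gelf_frame("test-app-pod", "stored employee 1", pod="test-app")
print(frame)
```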

    Query Methods

    Using CLI:

    curl -u admin:password -H 'X-Requested-By: cli' "http://GRAYLOG_IP_OR_HOSTNAME/api/search/universal/relative?query=*&range=3600&limit=100&sort=timestamp:desc&pretty=true" -H "Accept: application/json" -H "Content-Type: application/json"

    Where:

    • query=* – replace * with your desired string
    • range=3600 – replace 3600 with time range (in seconds)
    • limit=100 – replace 100 with number of returned results
    • sort=timestamp:desc – replace timestamp:desc with the field and order you want to sort by
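
    The parameters listed above can be assembled in code rather than edited into a long URL by hand. The following Python sketch builds the same request as the curl command; the function name is an assumption, and the request is only constructed, not sent.

```python
import base64
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def graylog_search_request(base_url, user, password, query="*",
                           range_s=3600, limit=100, sort="timestamp:desc"):
    """Build a relative-time search request against the Graylog REST API."""
    params = urlencode({
        "query": query,     # search string
        "range": range_s,   # time range in seconds
        "limit": limit,     # number of returned results
        "sort": sort,       # field and direction
    })
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        f"{base_url}/api/search/universal/relative?{params}",
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/json",
            "X-Requested-By": "cli",
        },
    )


request = graylog_search_request("http://GRAYLOG_IP_OR_HOSTNAME", "admin", "password",
                                 query="error", range_s=900)
print(request.full_url)
```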

    Using Dashboard:

    One can easily navigate the filter section and perform search with the help of labels generated by log collectors.

    Splunk Stack

    Introduction

    Splunk is used for monitoring and searching through big data. It indexes and correlates information in a container that makes it searchable, and makes it possible to generate alerts, reports and visualisations.

    Configuration Options

    1. Helm based Installation as well as Operator based Installation is supported

    2. Splunk Connect for Kubernetes provides a way to import and search your Kubernetes logging, object, and metrics data in your Splunk platform deployment. Splunk Connect for Kubernetes supports importing and searching your container logs on ECS, EKS, AKS, GKE and OpenShift

    3. Splunk Connect for Kubernetes supports installation using Helm.

    4. Splunk Connect for Kubernetes deploys a DaemonSet, which runs a Fluentd container on each node to do the collecting job. Splunk Connect for Kubernetes collects three types of data – Logs, Objects and Metrics

    5. We need a minimum of two Splunk platform indexes

    One events index, which will handle logs and objects (you may also create two separate indexes for logs and objects).

    One metrics index. If you do not configure these indexes, Splunk Connect for Kubernetes uses the defaults created in your HTTP Event Collector (HEC) token.

    6. An HEC token will be required, before moving on to installation

    7. To install and configure defaults with Helm :

    Add Splunk chart repo

    helm repo add splunk https://splunk.github.io/splunk-connect-for-kubernetes/

    Get the values file in your working directory and prepare it.

    helm show values splunk/splunk-connect-for-kubernetes > values.yaml

    Once you have a values file, you can simply install the chart by running

    helm install my-splunk-connect -f values.yaml splunk/splunk-connect-for-kubernetes

    To learn more about using and modifying charts, see: https://github.com/splunk/splunk-connect-for-kubernetes/tree/main/helm-chart
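
    The HEC token from step 6 is what authorises the collectors to write into Splunk. As a hedged illustration of how such a token is used, the Python sketch below builds a single-event request for the HTTP Event Collector. The host name, token and index values are placeholders, the function name is an assumption, and the request is constructed but not sent.

```python
import json
from urllib.request import Request


def hec_event_request(hec_url, token, event, index="main", sourcetype="_json"):
    """Build a POST request that submits one event to Splunk HEC."""
    body = json.dumps({
        "event": event,
        "index": index,
        "sourcetype": sourcetype,
    }).encode("utf-8")
    return Request(
        f"{hec_url}/services/collector/event",
        data=body,
        headers={
            "Authorization": f"Splunk {token}",  # HEC uses the "Splunk <token>" scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder endpoint and token, for illustration only.
request = hec_event_request("https://splunk.example.com:8088", "YOUR-HEC-TOKEN",
                            {"message": "hello from kubernetes"}, index="events")
print(request.full_url)
```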

    The values file for logging

    Query Methods

    Using CLI :

    curl --location -k --request GET 'https://localhost:8089/services/search/jobs/export?search=search%20index=%22event%22%20sourcetype=%22kube:container:docker-log-generator%22&output_mode=json' -u admin:Admin123!

    Using Dashboard :

    Logging Stack

    Comparison of Tools

    Some other tools are not open source but are too good not to talk about, and offer end-to-end functionality for all your logging needs:

    Sumo Logic :

    This log management tool can store logs as well as metrics. It has a powerful search syntax, where you can define operations similarly to UNIX pipes.

    • Powerful query language
    • Capability to detect common log patterns and trends
    • Centralized management of agents
    • Supports Log Archival & Retention
    • Ability to perform Audit Trails and Compliance Tracking

    Configuration Options :

    • A subscription to Sumo Logic will be required
    • Helm installation
    • Provides options to install side-by-side existing Prometheus Operator

    More information can be found here!

    Cons :

    • Performance can be bad for searches over large data sets or long timeframes.
    • Deployment only available on Cloud, SaaS, and Web-Based
    • Expensive – Pricing is per ingested byte, so it forces you to pick and choose what you log, rather than ingesting everything and figuring it out later

    Datadog:

    Datadog is a SaaS that started up as a monitoring (APM) tool and later added log management capabilities as well.

    You can send logs via HTTP(S) or syslog, either via existing log shippers (rsyslog, syslog-ng, Logstash, etc.) or through Datadog’s own agent. With it, you can observe your logs in real time using Live Tail, without indexing them. You can also ingest all of the logs from your applications and infrastructure, decide what to index dynamically with filters, and then store them in an archive.

    It features Logging without Limits™, which is a double-edged sword: it’s harder to predict and manage costs, but you get pay-as-you-use pricing combined with the fact that you can archive and restore from archive.

    • Log processing pipelines have the ability to process millions of logs per minute or petabytes per month seamlessly.
    • Automatically detects common log patterns
    • Can archive logs to AWS/Azure/Google Cloud storage and rehydrate them later
    • Easy search with good autocomplete (based on facets)
    • Integration with Datadog metrics and traces
    • Affordable, especially for short retention and/or if you rely on the archive for a few searches going back

    Configuration options :

    Using CLI :

    curl -X GET "https://api.datadoghq.com/api/v2/logs/events" -H "Content-Type: application/json" -H "DD-API-KEY: ${DD_API_KEY}" -H "DD-APPLICATION-KEY: ${DD_APP_KEY}"

    Cons :

    • Not available on premises
    • It is a bit complicated to set up for the first time, and it is not easy at first to discover all the features that Datadog has. The interface is tricky and can sometimes be a hindrance. In addition, if application fields are not mapped correctly, filters are not that useful.
    • Datadog per host pricing can be very expensive.

    Conclusion :

    As one can see, each tool has its own benefits and downsides. Grafana’s Loki is more lightweight than the Elastic Stack overall and supports persistent storage options.

    That being said, the right solution platform really depends on each administrator’s needs.

    That’s all! Thank you.


  • Cloud Native Applications — The Why, The What & The How

    Cloud-native is an approach to building & running applications that can leverage the advantages of the cloud computing model: on-demand computing power & a pay-as-you-go pricing model. These applications are built and deployed in a rapid cadence to the cloud platform and offer organizations greater agility, resilience, and portability across clouds.

    This blog explains the importance, the benefits and how to go about building Cloud Native Applications.

    CLOUD NATIVE – The Why?

    Early technology adopters like FANG (Facebook, Amazon, Netflix & Google) have some common themes when it comes to shipping software. They have invested heavily in building capabilities that enable them to release new features regularly (weekly, daily or in some cases even hourly). They have achieved this rapid release cadence while supporting safe and reliable operation of their applications, in turn allowing them to respond more effectively to their customers’ needs.

    They have achieved this level of agility by moving beyond ad-hoc automation and by adopting cloud native practices that deliver these predictable capabilities. DevOps, Continuous Delivery, microservices & containers form the 4 main tenets of Cloud Native patterns. All of them have the same overarching goal of making application development and operations teams more efficient through automation.

    At this point though, these techniques have only been successfully proven at the aforementioned software-driven companies. Smaller, more agile companies are also realising the value here. However, as per Joe Beda (co-creator of Kubernetes & CTO at Heptio), there are very few examples of this philosophy being applied outside these technology-centric companies.

    Any team/company shipping products should seriously consider adopting Cloud Native practices if they want to ship software faster while reducing risk and in turn delighting their customers.

    CLOUD NATIVE – The What?

    Cloud Native practices comprise 4 main tenets.

    Figure: Cloud native main tenets

    • DevOps is the collaboration between software developers and IT operations with the goal of automating the process of software delivery & infrastructure changes.
    • Continuous Delivery enables applications to be released quickly, reliably & frequently, with less risk.
    • Microservices is an architectural approach to building an application as a collection of small independent services that run on their own and communicate over HTTP APIs.
    • Containers provide light-weight virtualization by dynamically dividing a single server into one or more isolated containers. Containers offer both efficiency & speed compared to standard Virtual Machines (VMs). Containers provide the ability to manage and migrate the application dependencies along with the application, while abstracting away the OS and the underlying cloud platform in many cases.

    The benefits that can be reaped by adopting these methodologies include:

    1. Self-managing infrastructure through automation: The Cloud Native practice goes beyond ad-hoc automation built on top of virtualization platforms. Instead, it focuses on orchestration, management and automation of the entire infrastructure right up to the application tier.
    2. Reliable infrastructure & application: Cloud Native practice makes it much easier to handle churn, replace failed components and recover from unexpected events & failures.
    3. Deeper insights into complex applications: Cloud Native tooling provides visualization for health management, monitoring and notifications with audit logs, making applications easy to audit & debug.
    4. Security: Enables developers to build security into applications from the start rather than as an afterthought.
    5. More efficient use of resources: Containers are lighter in weight than full systems. Deploying applications in containers leads to increased resource utilization.

    Software teams have grown in size, and the number of applications and tools that a company needs to build has grown 10x over the last few years. Microservices break large complex applications into smaller pieces so that they can be developed, tested and managed independently. This enables a single microservice to be updated or rolled back without affecting other parts of the application. Also, software teams nowadays are distributed, and microservices enable each team to own a small piece, with service contracts acting as the communication layer.

    CLOUD NATIVE – The How?

    Now, let’s look at the various building blocks of the cloud native stack that help achieve the goals described above. Here, we have grouped tools & solutions by the problem they solve. We start with the infrastructure layer at the bottom, then the tools used to provision the infrastructure, followed by the container runtime environment. Above that we have tools to manage clusters of container environments, and at the very top we have the tools and frameworks to develop the applications.

    1. Infrastructure: At the very bottom, we have the infrastructure layer which provides the compute, storage, network & operating system, usually provided by the cloud (AWS, GCP, Azure, OpenStack, VMware).

    2. Provisioning: The provisioning layer consists of automation tools that help in provisioning the infrastructure, managing images and deploying the application. Chef, Puppet & Ansible are the DevOps tools that give teams the ability to manage configuration & environments. Spinnaker, Terraform and CloudFormation provide workflows to provision the infrastructure. Twistlock and Clair provide the ability to harden container images.

    3. Runtime: The Runtime provides the environment in which the application runs. It consists of the container engines where the application runs, along with the associated storage & networking. containerd & rkt are the most widely used container engines. Flannel and OpenContrail provide the necessary overlay networking for containers to interact with each other and the outside world, while Datera, Portworx, AppOrbit etc. provide the necessary persistent storage, enabling easy movement of containers across clouds.

    4. Orchestration and Management: Tools like Kubernetes, Docker Swarm and Apache Mesos abstract the management of container clusters, allowing easy scheduling & orchestration of containers across multiple hosts. etcd and Consul provide service registries for discovery, while AVI and Envoy provide proxy and load-balancing services.

    5. Application Definition & Development: We can build microservices for applications across multiple languages: Python, Spring/Java, Ruby, Node. Packer, Habitat & Bitnami provide image management for the application to run across all infrastructure, container or otherwise. Jenkins, TravisCI, CircleCI and other build automation servers provide the capability to set up continuous integration and delivery pipelines.

    6. Monitoring, Logging & Auditing: One of the key features of managing Cloud Native Infrastructure is the ability to monitor & audit the applications & underlying infrastructure.

    All modern monitoring platforms like Datadog, New Relic and AppDynamics support monitoring of containers & microservices.

    Splunk, Elasticsearch & fluentd help in log aggregation, while OpenTracing and Zipkin help in debugging applications.

    7. Culture: Adopting cloud native practices needs a cultural change where teams no longer work in independent silos. End-to-end automation of software delivery pipelines is only possible when there is increased collaboration between development and IT operations teams, with shared responsibility.

    When we put all the pieces together we get the complete Cloud Native Landscape as shown below.

    Cloud Native Landscape

    I hope this post gives an idea of why Cloud Native is important and what the main benefits are. As you may have noticed in the above infographic, there are several projects, tools & companies trying to solve similar problems. The next questions on your mind will most likely be: How do I get started? Which tools are right for me? I will cover these topics and more in my following blog posts. Stay tuned!

    Please let us know what you think by adding comments to this blog or reaching out to chirag_jog or Velotio on Twitter.

  • Container Security: Let’s Secure Your Enterprise Container Infrastructure!

    Introduction

    Containerized applications are becoming more popular with each passing year. A reason for this rise in popularity could be the pivotal role they play in Continuous Delivery by enabling fast and automated deployment of software services.

    Security still remains a major concern, mainly because of the way container images are being used. In the world of VMs, infrastructure/security teams used to validate the OS images and installed packages for vulnerabilities. But with the adoption of containers, developers are building their own container images. Images are rarely built from scratch. They are typically built on some base image, which is itself built on top of other base images. When developers build a container image, they typically grab a base image and other layers from public third-party sources. These images and libraries may contain obsolete or vulnerable packages, thereby putting your infrastructure at risk. An added complexity is that many existing vulnerability-scanning tools may not work with containers, nor do they support container delivery workflows including registries and CI/CD pipelines. In addition, you can’t simply scan for vulnerabilities: you must scan, manage vulnerability fixes and enforce vulnerability-based policies.

    The Container Security Problem

    The table below shows the number of vulnerabilities found in the images available on dockerhub. Note that (as of April 2016) the worst offending community images contained almost 1,800 vulnerabilities! Official images were much better, but still contained 392 vulnerabilities in the worst case.

    If we look at the distribution of vulnerability severities, we see that pretty much all of them are high severity, for both official and community images. What we’re not told is the underlying distribution of vulnerability severities in the CVE database, so this could simply be a reflection of that distribution.

    Over 80% of the latest versions of official images contained at least one high severity vulnerability!

    • There are so many docker images readily available on dockerhub – are you sure the ones you are using are safe?
    • Do you know where your containers come from?
    • Are your developers downloading container images and libraries from unknown and potentially harmful sources?
    • Do the containers use third party library code that is obsolete or vulnerable?

    In this blog post, I will explain some of the solutions available that can help with these challenges: ‘Docker scanning services‘, ‘Twistlock Trust’ and ‘Clair‘, an open-source solution from CoreOS. These can help in scanning for and fixing vulnerabilities, making your container images secure.

    Clair

    Clair is an open source project for the static analysis of vulnerabilities in application containers. It works as an API that analyzes every container layer to find known vulnerabilities, using the package managers of existing distributions such as Debian (dpkg), Ubuntu (dpkg) and CentOS (rpm). It can also be used from the command line. It provides a list of vulnerabilities that threaten a container, and can notify users when new vulnerabilities that affect existing containers become known. At regular intervals, Clair ingests vulnerability metadata from a configured set of sources and stores it in the database. Clients use the Clair API to index their container images; this parses a list of installed source packages and stores them in the database. Clients then use the Clair API to query the database; vulnerabilities are correlated with the indexed data at query time, rather than served from a cached result that needs re-scanning.

    Clair identifies security issues that developers introduce in their container images. The vanilla process for using Clair is as follows:

    1. A developer programmatically submits their container image to Clair
    2. Clair analyzes the image, looking for security vulnerabilities
    3. Clair returns a detailed report of security vulnerabilities present in the image
    4. Developer acts based on the report

    How to use Clair

    Docker is required to follow along with this demonstration. Once Docker is installed, use the Dockerfile below to create an Ubuntu image that contains a version of SSL that is susceptible to Heartbleed attacks.

    #Dockerfile
    FROM ubuntu:precise-20160303
    #Install WGet
    RUN apt-get update
    RUN apt-get -f install
    RUN apt-get install -y wget
    #Install an OpenSSL vulnerable to Heartbleed (CVE-2014-0160)
    RUN wget --no-check-certificate https://launchpad.net/~ubuntu-security/+archive/ubuntu/ppa/+build/5436462/+files/openssl_1.0.1-4ubuntu5.11_amd64.deb
    RUN dpkg -i openssl_1.0.1-4ubuntu5.11_amd64.deb

    Build the image using the command below:

    $ docker build . -t madhurnawandar/heartbeat

    After creating the insecure Docker image, the next step is to download and install Clair from here. The installation choice used for this demonstration was the Docker Compose solution. Once Clair is installed, it can be used via querying its API or through the clairctl command line tool. Submit the insecure Docker image created above to Clair for analysis and it will catch the Heartbleed vulnerability.

    $ clairctl analyze --local madhurnawandar/heartbeat
    Image: /madhurnawandar/heartbeat:latest
    9 layers found 
    ➜ Analysis [f3ce93f27451] found 0 vulnerabilities. 
    ➜ Analysis [738d67d10278] found 0 vulnerabilities. 
    ➜ Analysis [14dfb8014dea] found 0 vulnerabilities. 
    ➜ Analysis [2ef560f052c7] found 0 vulnerabilities. 
    ➜ Analysis [69a7b8948d35] found 0 vulnerabilities. 
    ➜ Analysis [a246ec1b6259] found 0 vulnerabilities. 
    ➜ Analysis [fc298ae7d587] found 0 vulnerabilities. 
    ➜ Analysis [7ebd44baf4ff] found 0 vulnerabilities. 
    ➜ Analysis [c7aacca5143d] found 52 vulnerabilities.
    $ clairctl report --local --format json madhurnawandar/heartbeat
    JSON report at reports/json/analysis-madhurnawandar-heartbeat-latest.json

    You can view the detailed report here.
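
    The per-layer counts printed by clairctl analyze can also be tallied by a script, for example to spot which layer introduced the vulnerable package. The following is a minimal Python sketch, assuming the text output keeps the format shown above. The function name vulnerabilities_per_layer and the embedded sample are ours for illustration and are not part of clairctl.

```python
import re

# Matches lines such as "Analysis [c7aacca5143d] found 52 vulnerabilities."
# (assumption: the analyze output keeps the format shown in this post).
LAYER_LINE = re.compile(
    r"Analysis \[(?P<layer>[0-9a-f]+)\] found (?P<count>\d+) vulnerabilit"
)


def vulnerabilities_per_layer(analyze_output: str) -> dict:
    """Map each layer ID in the analyze text output to its vulnerability count."""
    counts = {}
    for line in analyze_output.splitlines():
        match = LAYER_LINE.search(line)
        if match:
            counts[match.group("layer")] = int(match.group("count"))
    return counts


# Shortened copy of the output shown above, used only as sample input.
SAMPLE_OUTPUT = """\
Image: /madhurnawandar/heartbeat:latest
9 layers found
Analysis [7ebd44baf4ff] found 0 vulnerabilities.
Analysis [c7aacca5143d] found 52 vulnerabilities.
"""

if __name__ == "__main__":
    per_layer = vulnerabilities_per_layer(SAMPLE_OUTPUT)
    affected = {layer: n for layer, n in per_layer.items() if n > 0}
    print("total vulnerabilities:", sum(per_layer.values()))
    print("affected layers:", affected)
```

    In our run, only the last layer (the one that installs the vulnerable OpenSSL package) reports vulnerabilities, which is what a tally like this makes easy to see.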

    Docker Security Scanning

    Docker Cloud and Docker Hub can scan images in private repositories to verify that they are free from known security vulnerabilities or exposures, and report the results of the scan for each image tag. Docker Security Scanning is available as an add-on to Docker hosted private repositories on both Docker Cloud and Docker Hub.

    Security scanning is enabled on a per-repository basis and is only available for private repositories. Scans run each time a build pushes a new image to your private repository. They also run when you add a new image or tag. The scan traverses each layer of the image, identifies the software components in each layer, and indexes the SHA of each component.

    The scan compares the SHA of each component against the Common Vulnerabilities and Exposures (CVE®) database. The CVE is a “dictionary” of known information security vulnerabilities. When the CVE database is updated, the service reviews the indexed components for any that match the new vulnerability. If the new vulnerability is detected in an image, the service sends an email alert to the maintainers of the image.
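
    To make the index-then-match idea concrete, here is a toy Python model of it. This is only a sketch of the concept described above and not Docker's implementation: the fingerprint scheme, the function names and the in-memory dictionary standing in for the CVE database are all assumptions made for illustration.

```python
import hashlib


def component_sha(name: str, version: str) -> str:
    """Stand-in fingerprint for a software component (hypothetical scheme)."""
    return hashlib.sha256(f"{name}={version}".encode()).hexdigest()


def index_image(layers):
    """Walk every layer once and index each component by its SHA."""
    index = {}
    for layer in layers:
        for name, version in layer:
            index[component_sha(name, version)] = (name, version)
    return index


def match_vulnerabilities(index, cve_db):
    """Return (component, CVE ID) pairs for indexed components found in the CVE data."""
    findings = []
    for sha, component in index.items():
        for cve_id in cve_db.get(sha, []):
            findings.append((component, cve_id))
    return findings


if __name__ == "__main__":
    # Two layers; the wget version string is illustrative only.
    layers = [
        [("wget", "1.0-example")],
        [("openssl", "1.0.1-4ubuntu5.11")],
    ]
    index = index_image(layers)

    cve_db = {}  # toy stand-in for the CVE database
    print(match_vulnerabilities(index, cve_db))

    # When the database is updated, the existing index is simply re-checked;
    # the image itself does not need to be scanned again.
    cve_db[component_sha("openssl", "1.0.1-4ubuntu5.11")] = ["CVE-2014-0160"]
    print(match_vulnerabilities(index, cve_db))
```

    The point of keeping an index is the second call: a newly published vulnerability can be matched against components that were fingerprinted earlier.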

    A single component can contain multiple vulnerabilities or exposures and Docker Security Scanning reports on each one. You can click an individual vulnerability report from the scan results and navigate to the specific CVE report data to learn more about it.

    Twistlock

    Twistlock is a rule-based access control policy system for Docker and Kubernetes containers. Twistlock can be fully integrated with Docker, with out-of-the-box security policies that are ready to use.

    Security policies can set the conditions for users to, say, create new containers but not delete them; or, they can launch containers but aren’t allowed to push code to them. Twistlock features the same policy management rules as those on Kubernetes, wherein a user can modify management policies but cannot delete them.
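
    As a rough illustration of what such a rule-based policy looks like, the Python sketch below models the two examples above as a lookup table. It is a hypothetical model and not Twistlock's policy format or API; the user names, action names and the deny-by-default choice are assumptions.

```python
# Hypothetical rule table: (user, action) -> allowed?
POLICY = {
    ("alice", "create_container"): True,
    ("alice", "delete_container"): False,
    ("bob", "launch_container"): True,
    ("bob", "push_code"): False,
}


def is_allowed(policy: dict, user: str, action: str) -> bool:
    """Deny by default: only explicitly allowed (user, action) pairs pass."""
    return policy.get((user, action), False)


if __name__ == "__main__":
    for user, action in [
        ("alice", "create_container"),
        ("alice", "delete_container"),
        ("bob", "push_code"),
        ("carol", "create_container"),  # no rule at all for this user
    ]:
        verdict = "allowed" if is_allowed(POLICY, user, action) else "denied"
        print(f"{user} -> {action}: {verdict}")
```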

    Twistlock also handles image scanning. Users can scan an entire container image, including any packaged Docker application. Twistlock has done its due-diligence in this area, correlating with Red Hat and Mirantis to ensure no container is left vulnerable while a scan is running.

    Twistlock also deals with image scanning of containers within the registries themselves. In runtime environments, Twistlock features a Docker proxy running on the same server with an application’s other containers. This is essentially traffic filtering, whereupon the application container calling the Docker daemon is then re-routed through Twistlock. This approach enforces access control, allowing for safer configuration where no containers are set to run as root. It’s also able to SSH into an instance, for example. In order to delve into these layers of security, Twistlock enforces the policy at runtime.

    When a new image is built, an event is pushed to the Twistlock API and the new image is deposited into the registry along with its unique IDs. Twistlock then pulls the image and scans it to ensure it complies with the security policies in place. Twistlock deposits the scan result into the CI process so that developers can view the result for debugging purposes.

    Integrating these vulnerability scanning tools into your CI/CD Pipeline:

    These tools become more interesting when paired with a CI server like Jenkins, TravisCI, etc. Given proper configuration, the process becomes:

    1. A developer submits application code to source control
    2. Source control triggers a Jenkins build
    3. Jenkins builds the software containers necessary for the application
    4. Jenkins submits the container images to the vulnerability scanning tool
    5. The tool identifies security vulnerabilities in the container
    6. Jenkins receives the security report, identifies a high vulnerability in the report, and stops the build
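
    Step 6 can be as small as a script that reads the report and returns a non-zero exit code, which is enough for a CI server to mark the build as failed. Below is a minimal Python sketch. The report layout (a top-level "vulnerabilities" list whose entries carry "id", "severity" and "package") is an assumption for illustration, so the field names would need to be adapted to whatever your scanning tool actually emits.

```python
import json
import sys

# Severities that should stop the build (assumption: the tool labels them "High").
BLOCKING_SEVERITIES = {"High"}


def blocking_findings(report: dict) -> list:
    """Return the entries of the report whose severity should fail the build."""
    return [
        vuln
        for vuln in report.get("vulnerabilities", [])
        if vuln.get("severity") in BLOCKING_SEVERITIES
    ]


def main(report_path: str) -> int:
    """Print blocking findings and return the exit code for the CI job."""
    with open(report_path) as handle:
        report = json.load(handle)
    findings = blocking_findings(report)
    for vuln in findings:
        print(f"{vuln.get('id')} ({vuln.get('severity')}) in {vuln.get('package')}")
    return 1 if findings else 0


# Runs only when a report path is passed, e.g. from a Jenkins shell step.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

    Saved under a name of your choice (say, a hypothetical gate.py), it would be called from the build with the report path as its only argument.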

    Conclusion

    There are many solutions like ‘Docker scanning services’, ‘Twistlock Trust’, ‘Clair‘, etc. to secure your containers. It’s critical for organizations to adopt such tools in their CI/CD pipelines. But this by itself is not going to make containers secure. There are a lot of guidelines available in the CIS Benchmark for containers, such as tuning kernel parameters, setting proper network configurations for inter-container connectivity, securing access to host-level directories and others. I will cover these items in the next set of blogs. Stay tuned!

  • Continuous Deployment with Azure Kubernetes Service, Azure Container Registry & Jenkins

    Introduction

    Containerization has taken the application development world by storm. Kubernetes has become the standard way of deploying new containerized distributed applications and is used by the largest enterprises in a wide range of industries for mission-critical tasks. It has become one of the biggest open-source success stories.

    Although Google Cloud has been providing Kubernetes as a service since November 2014 (note that it started as a beta project), Microsoft with AKS (Azure Kubernetes Service) and Amazon with EKS (Elastic Kubernetes Service) jumped onto the scene in the second half of 2017.

    Prior to these services, wrapper tools were available that would help a user create a Kubernetes cluster, but management and maintenance (such as monitoring and upgrades) still needed effort.

    Example:

    AWS had KOPS.

    Azure had Azure Container Service.

    Azure Container Registry:

    With container demand growing, there is always a need in the market for storing and protecting container images. Microsoft provides a private registry as a service, with a geo-replication feature, named Azure Container Registry.

    Azure Container Registry is a registry offering from Microsoft for hosting container images privately. It integrates well with orchestrators like Azure Container Service, including Docker Swarm, DC/OS, and the new Azure Kubernetes service. Moreover, ACR  provides capabilities such as Azure Active Directory-based authentication, webhook support, and delete operations.

    The coolest feature provided is Geo-Replication. This will create multiple copies of your image and distribute it across the globe and the container when spawned will have access to the image which is nearest.

    Although Microsoft has good documentation on how to set up ACR  in your Azure Subscription, we did encounter some issues and hence decided to write a blog on the precautions and steps required to configure the Registry in the correct manner.

    Note: We tried this using a free trial account. You can set it up by referring to the following link

    Prerequisites:

    • Make sure you have resource groups created in a supported region.
      Supported regions: eastus, westeurope, centralus, canadacentral, canadaeast
    • If you are using the Azure CLI for operations, please make sure you use version 2.0.23 or 2.0.25 (This was the latest version at the time of writing this blog)

    Steps to install Azure CLI 2.0.23 or 2.0.25 (Ubuntu 16.04 workstation):

    echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ wheezy main" | sudo tee /etc/apt/sources.list.d/azure-cli.list
    sudo apt-key adv --keyserver packages.microsoft.com --recv-keys 52E16F86FEE04B979B07E28DB02C46DF417A0893
    sudo apt-get install apt-transport-https
    sudo apt-get update && sudo apt-get install azure-cli
    
    Install a specific version (run only the line for the version you want):

    sudo apt install azure-cli=2.0.23-1
    sudo apt install azure-cli=2.0.25-1

    Steps for Container Registry Setup:

    • Log in to your Azure account:
    az login --username <USERNAME> --password <PASSWORD>

    • Create a resource group:
    az group create --name <RESOURCE-GROUP-NAME>  --location eastus
    Example : az group create --name acr-rg  --location eastus

    • Create a Container Registry:
    az acr create --resource-group <RESOURCE-GROUP-NAME> --name <CONTAINER-REGISTRY-NAME> --sku Basic --admin-enabled true
    Example : az acr create --resource-group acr-rg --name testacr --sku Basic --admin-enabled true

    Note: The SKU defines the capacity of the registry. For the Basic SKU, the available storage is 10 GB with 1 webhook, and the billing amount is 11 Rs/day.

    For detailed information on the different SKUs available, visit the following link

    • Login to the registry :
    az acr login --name <CONTAINER-REGISTRY-NAME>
    Example :az acr login --name testacr

    • Sample Dockerfile for a Node application:
    FROM node:carbon
    # Create app directory
    WORKDIR /usr/src/app
    # Copy the dependency manifests and install the dependencies
    COPY package*.json ./
    RUN npm install
    # Copy the application source
    COPY . .
    EXPOSE 8080
    CMD [ "npm", "start" ]

    • Build the Docker image (note the trailing dot, which is the build context):
    docker build -t <image-name>:<tag> .
    Example: docker build -t base:node8 .

    • Get the login server value for your ACR :
    az acr list --resource-group acr-rg --query "[].{acrLoginServer:loginServer}" --output table
    Output  :testacr.azurecr.io

    • Tag the image with the login server value:
      Note: Get the image ID from the docker images command.

    Example:

    docker tag <image-id> testacr.azurecr.io/base:node8

    • Push the image to the Azure Container Registry:

    Example:

    docker push testacr.azurecr.io/base:node8

    Microsoft does provide a GUI option to create the ACR.

    • List Images in the Registry:

    Example:

    az acr repository list --name testacr --output table

    • List tags for the Images:

    Example:

    az acr repository show-tags --name testacr --repository <name> --output table

    • How to use the ACR image in a Kubernetes deployment: Use the login server name + the image name

    Example:

    containers:
    - name: demo
      image: testacr.azurecr.io/base:node8
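
    If you generate deployment files from a script, the same "login server + image name" rule can be captured in a small helper. The Python sketch below is only an illustration; the function names are ours, and the values are the ones used in the examples above.

```python
def acr_image_reference(login_server: str, repository: str, tag: str = "latest") -> str:
    """Build the full image name: <login server>/<repository>:<tag>."""
    if not login_server or "/" in login_server:
        raise ValueError("login_server must be a bare host name such as testacr.azurecr.io")
    if not repository:
        raise ValueError("repository must not be empty")
    return f"{login_server}/{repository}:{tag}"


def container_entry(name: str, login_server: str, repository: str, tag: str) -> dict:
    """Return one item for the 'containers' list of a pod spec."""
    return {"name": name, "image": acr_image_reference(login_server, repository, tag)}


if __name__ == "__main__":
    print(container_entry("demo", "testacr.azurecr.io", "base", "node8"))
```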

    Azure Kubernetes Service

    Microsoft released the public preview of Managed Kubernetes for Azure Container Service (AKS) on October 24, 2017. This service simplifies the deployment, management, and operations of Kubernetes. It features an Azure-hosted control plane, automated upgrades, self-healing and easy scaling.

    Similar to Google GKE and Amazon EKS, this new service allows access to the nodes only, and the master is managed by the cloud provider. For more information visit the following link.

    Let’s now get our hands dirty and deploy an AKS infrastructure to play with:

    • Enable AKS preview for your Azure Subscription: At the time of writing this blog, AKS is in preview mode and requires a feature flag on your subscription.
    az provider register -n Microsoft.ContainerService

    • Kubernetes cluster creation command:
      Note: A new, separate resource group should be created for the Kubernetes service. Since the service is in preview, it is available only in certain regions.

    Make sure you create the resource group in one of the following regions:

    eastus, westeurope, centralus, canadacentral, canadaeast

    az group create --name <RESOURCE-GROUP-NAME> --location eastus
    Example : az group create --name aks-rg --location eastus
    az aks create --resource-group <RESOURCE-GROUP-NAME> --name <CLUSTER-NAME> --node-count 2 --generate-ssh-keys
    Example : az aks create --resource-group aks-rg --name akscluster --node-count 2 --generate-ssh-keys

    Example with different arguments :

    Create a Kubernetes cluster with a specific version.

    az aks create -g MyResourceGroup -n MyManagedCluster --kubernetes-version 1.8.1

    Create a Kubernetes cluster with a larger node pool.

    az aks create -g MyResourceGroup -n MyManagedCluster --node-count 7

    Install the kubectl CLI:

    To connect to the Kubernetes cluster from a client computer, the kubectl command-line client is required.

    sudo az aks install-cli

    Note: If you’re using Azure Cloud Shell, kubectl is already installed. If you want to install it locally, run the above command.

    • To configure kubectl to connect to your Kubernetes cluster :
    az aks get-credentials --resource-group=<RESOURCE-GROUP-NAME> --name=<CLUSTER-NAME>

    Example :

    https://gist.github.com/velotiotech/ac40b6014a435271f49ca0e3779e800f

    • Verify the connection to the cluster :
    kubectl get nodes -o wide 

    • For all the command line features available for Azure check the link: https://docs.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest

    We encountered a few issues while setting up the AKS cluster at the time of writing this blog. We list them here along with the workaround/fix:

    az aks create --resource-group aks-rg --name akscluster  --node-count 2 --generate-ssh-keys

    Error: Operation failed with status: ‘Bad Request’.

    Details: The resource provider registrations Microsoft.Compute, Microsoft.Storage and Microsoft.Network are needed, so we need to enable them.

    Fix: If you are using the trial account, click on subscriptions and check whether the following providers are registered:

    • Microsoft.Compute
    • Microsoft.Storage
    • Microsoft.Network
    • Microsoft.ContainerRegistry
    • Microsoft.ContainerService
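
    Providers can also be registered from the command line with the same az provider register -n command used earlier for Microsoft.ContainerService. The Python sketch below only builds and prints one command per provider from the list above, so you can review them before running anything. The helper name register_commands is ours.

```python
# Providers listed in the fix above.
REQUIRED_PROVIDERS = [
    "Microsoft.Compute",
    "Microsoft.Storage",
    "Microsoft.Network",
    "Microsoft.ContainerRegistry",
    "Microsoft.ContainerService",
]


def register_commands(namespaces):
    """Return one 'az provider register' argument list per provider namespace."""
    return [["az", "provider", "register", "-n", namespace] for namespace in namespaces]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing changes on the
    # subscription until you copy them into a shell yourself.
    for command in register_commands(REQUIRED_PROVIDERS):
        print(" ".join(command))
```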

    Error: We also encountered the following open issues at the time of writing this blog.

    1. Issue-1
    2. Issue-2
    3. Issue-3

    Jenkins setup for CI/CD with ACR, AKS

    Microsoft provides a solution template which will install the latest stable Jenkins version on a Linux (Ubuntu 14.04 LTS) VM along with tools and plugins configured to work with Azure. This includes:

    • git for source control
    • Azure Credentials plugin for connecting securely
    • Azure VM Agents plugin for elastic build, test and continuous integration
    • Azure Storage plugin for storing artifacts
    • Azure CLI to deploy apps using scripts

    Refer to the link below to bring up the instance

    Pipeline plan for Spinning up a Nodejs Application using ACR – AKS – Jenkins

    What the pipeline accomplishes :

    Stage 1:

    The code gets pushed to GitHub. The Jenkins job gets triggered automatically. The Dockerfile is checked out from GitHub.

    Stage 2:

    Docker builds an image from the Dockerfile and then the image is tagged with the build number. Additionally, the latest tag is also attached to the image for the containers to use.

    Stage 3:

    We have default deployment and service YAML files stored on the Jenkins server. Jenkins makes a copy of the default YAML files, makes the necessary changes according to the build and puts them in a separate folder.

    Stage 4:

    kubectl was initially configured at the time of setting up AKS on the Jenkins server. The YAML files are fed to the kubectl util which in turn creates pods and services.
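
    Stage 3 is the part that varies most between applications, so here is one possible way to do it, sketched in Python. It assumes the default deployment YAML is kept as a template with placeholders and that each build gets its own folder. The manifest content, the placeholder names and the folder layout (build-<number>/deployment.yaml) are illustrative assumptions and not the files we used.

```python
from pathlib import Path
from string import Template

# Hypothetical default manifest; the apiVersion and field values are assumptions.
DEFAULT_DEPLOYMENT = Template("""\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $app_name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: $app_name
  template:
    metadata:
      labels:
        app: $app_name
    spec:
      containers:
      - name: $app_name
        image: $registry/$app_name:$build_number
        ports:
        - containerPort: 8080
""")


def render_deployment(app_name: str, registry: str, build_number: int) -> str:
    """Fill the default manifest with the values for one build."""
    return DEFAULT_DEPLOYMENT.substitute(
        app_name=app_name, registry=registry, build_number=build_number
    )


def write_manifest(base_dir: Path, app_name: str, registry: str, build_number: int) -> Path:
    """Write the rendered manifest into a separate folder for this build."""
    build_dir = base_dir / f"build-{build_number}"
    build_dir.mkdir(parents=True, exist_ok=True)
    target = build_dir / "deployment.yaml"
    target.write_text(render_deployment(app_name, registry, build_number))
    return target


if __name__ == "__main__":
    print(render_deployment("demo", "testacr.azurecr.io", 42))
```

    Pinning the image to the build number, rather than to latest, makes it clear which build a running pod came from.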

    Sample Jenkins pipeline code :

    node {
        // Checkout 'stage'
        stage('Checkout the Dockerfile from GitHub') {
            git branch: 'docker-file', credentialsId: 'git_credentials', url: 'https://gitlab.com/demo.git'
        }
        // Build and push to ACR 'stage'
        stage('Build the Image and Push to Azure Container Registry') {
            app = docker.build('testacr.azurecr.io/demo')
            withDockerRegistry([credentialsId: 'acr_credentials', url: 'https://testacr.azurecr.io']) {
                // Push one tag per build number, plus 'latest' for the containers to use
                app.push("${env.BUILD_NUMBER}")
                app.push('latest')
            }
        }
        stage('Build the Kubernetes YAML Files for New App') {
            // The code here will differ depending on the YAMLs used for the application
        }
        stage('Deploying the App on Azure Kubernetes Service') {
            app = docker.image('testacr.azurecr.io/demo:latest')
            withDockerRegistry([credentialsId: 'acr_credentials', url: 'https://testacr.azurecr.io']) {
                app.pull()
                sh "kubectl create -f ."
            }
        }
    }

    What we achieved:

    • We created a private Docker registry on Azure with ACR, using az-cli 2.0.25.
    • We spun up a private Kubernetes cluster on Azure with 2 nodes.
    • We set up Jenkins using a pre-cooked template which had all the plugins necessary for communication with ACR and AKS.
    • We orchestrated a Continuous Deployment pipeline in Jenkins which uses Docker features.