Tag: containers

  • Real Time Analytics for IoT Data using Mosquitto, AWS Kinesis and InfluxDB

    The Internet of Things (IoT) is maturing rapidly and finding applications across various industries. Everyday devices are turning into smart devices, which are essentially IoT devices. These devices capture various parameters in and around their environment, generating a huge amount of data. This data needs to be collected, processed, stored and analyzed in order to extract actionable insights from it. To do so, we need to build a data pipeline. In this blog we will build such a pipeline using Mosquitto, Kinesis, InfluxDB and Grafana. We will discuss each component of the pipeline and the steps to build it.

    Why the Analysis of IoT data is different

    In an IoT setup, the data is generated by sensors distributed across various locations. In order to use the data they generate, we first need to bring it to a common location from where the various applications that want to process it can read it.

    Network Protocol

    IoT devices have low computational and network resources. Moreover, these devices write data at very short intervals, so high throughput is expected on the network. For transferring IoT data it is desirable to use lightweight network protocols. A protocol like HTTP uses a complex structure for communication, consuming more resources and making it unsuitable for IoT data transfer. One lightweight protocol suitable for IoT data is MQTT, which we are using in our pipeline. MQTT is designed for machine-to-machine (M2M) connectivity. It uses a publisher/subscriber communication model and helps clients distribute telemetry data with very low network resource consumption. Beyond IoT, MQTT has been found useful in other fields as well.

    Other similar protocols include Constrained Application Protocol (CoAP), Advanced Message Queuing Protocol (AMQP) etc.

    Datastore   

    IoT devices generally collect telemetry about their environment, usually through sensors. In most IoT scenarios, we try to analyze how things have changed over a period of time. Storing this data in a time series database makes our analysis simpler and better. InfluxDB is a popular time series database which we will use in our pipeline. More about time series databases can be read here.

    Pipeline Overview

    The first thing we need for a data pipeline is data. As shown in the image above, the data generated by various sensors is written to a topic in the MQTT message broker. To mimic sensors, we will use a program that uses an MQTT client to write data to the MQTT broker.

    The next component is Amazon Kinesis, which is used for streaming data analysis. It closely resembles Apache Kafka, an open source tool used for similar purposes. Kinesis brings the data generated by a number of clients to a single location from where different consumers can pull it for processing. We are using Kinesis so that multiple consumers can read data from a single location. This approach scales well even if we have multiple message brokers.

    Once the data is written to the MQTT broker, a Kinesis producer subscribes to the topic, pulls the data from the broker and writes it to the Kinesis stream. From the Kinesis stream, the data is pulled by Kinesis consumers, which process it and write it to InfluxDB, a time series database.

    Finally, we use Grafana, a well-known tool for analytics and monitoring. We can connect it to many popular databases and perform analytics and monitoring. Another popular tool in this space is Kibana (the K of the ELK stack).

    Setting up an MQTT Message Broker Server:

    For the MQTT message broker we will use Mosquitto, a popular open source message broker that implements MQTT. The details of downloading and installing Mosquitto for various platforms are available here.

    For Ubuntu, it can be installed using the following commands

    sudo apt-add-repository ppa:mosquitto-dev/mosquitto-ppa
    sudo apt-get update
    sudo apt-get install mosquitto
    service mosquitto status
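Before wiring the broker into the pipeline, it helps to verify it works end to end. A minimal smoke test, assuming the mosquitto-clients package is installed and the broker runs on localhost (the topic name here is just an example, not the one the pipeline uses):

```shell
# In one terminal, subscribe to a test topic (blocks and prints incoming messages):
mosquitto_sub -h localhost -t "sensor/data"

# In a second terminal, publish a sample JSON payload to the same topic:
mosquitto_pub -h localhost -t "sensor/data" -m '{"id":"test","temperature":20.04,"humidity":32.06}'
```

If the payload appears in the subscriber terminal, the broker is ready.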

    Setting up InfluxDB and Grafana

    The simplest way to set up both these components is to use their Docker images directly:

    docker run --name influxdb -p 8083:8083 -p 8086:8086 influxdb:1.0
    docker run --name grafana -p 3000:3000 --link influxdb grafana/grafana:3.1.1

    For InfluxDB we have mapped two ports: port 8086 is the HTTP API endpoint, while 8083 is the administration web server’s port. We need to create a database where we will write our data.

    For creating a database we can directly go to the console at <influxdb-ip>:8083 and run the command:

    CREATE DATABASE "iotdata"

    Or we can do it via an HTTP request:

    curl -XPOST "http://localhost:8086/query" --data-urlencode "q=CREATE DATABASE iotdata"

    Creating a Kinesis stream

    In Kinesis, we create streams to which the Kinesis producers write the data coming from various sources and from which the Kinesis consumers read it. Within a stream, the data is distributed across shards. For our purpose, one shard is enough.
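The stream can be created from the AWS console, or with the AWS CLI as sketched below (assumes configured credentials and a default region; the stream name is our choice):

```shell
# Create a stream with a single shard:
aws kinesis create-stream --stream-name iot-data-stream --shard-count 1

# Block until the stream becomes ACTIVE:
aws kinesis wait stream-exists --stream-name iot-data-stream

# Verify the stream status:
aws kinesis describe-stream --stream-name iot-data-stream --query 'StreamDescription.StreamStatus'
```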

    Creating the MQTT client

    We will use the Golang client available in this repository to connect to our message broker server and write data to a specific topic. We will first create a new MQTT client. Here we can see the list of options we have for configuring our MQTT client.

    Once we create the options object we can pass it to the NewClient() method, which returns the MQTT client. Now we can write data to the MQTT server. We have defined the structure of the data in the SensorData struct. To mimic two sensors writing telemetry data to the MQTT broker, we run two goroutines which push data to the MQTT server every five seconds.

    package publisher
    
    import (
    	"config"
    	"encoding/json"
    	"fmt"
    	"log"
    	"math/rand"
    	"os"
    	"time"
    
    	"github.com/eclipse/paho.mqtt.golang"
    )
    
    type SensorData struct {
    	Id          string  `json:"id"`
    	Temperature float64 `json:"temperature"`
    	Humidity    float64 `json:"humidity"`
    	Timestamp   int64   `json:"timestamp"`
    	City        string  `json:"city"`
    }
    
    func StartMQTTPublisher() {
    	fmt.Println("MQTT publisher Started")
    	mqtt.DEBUG = log.New(os.Stdout, "", 0)
    	mqtt.ERROR = log.New(os.Stdout, "", 0)
    	opts := mqtt.NewClientOptions().AddBroker(config.GetMqttServerurl()).SetClientID("MqttPublisherClient")
    	opts.SetKeepAlive(2 * time.Second)
    	opts.SetPingTimeout(1 * time.Second)
    	c := mqtt.NewClient(opts)
    	if token := c.Connect(); token.Wait() && token.Error() != nil {
    		panic(token.Error())
    	}
    
    	go func() {
    		t := 20.04
    		h := 32.06
    		for i := 0; i < 100; i++ {
    			sensordata := SensorData{
    				Id:          "CITIMUM",
    				Temperature: t,
    				Humidity:    h,
    				Timestamp:   time.Now().Unix(),
    				City:        "Mumbai",
    			}
    			requestBody, err := json.Marshal(sensordata)
    			if err != nil {
    				fmt.Println(err)
    			}
    			token := c.Publish(config.GetMQTTTopicName(), 0, false, requestBody)
    			token.Wait()
    			if i < 50 {
    				t = t + 1*rand.Float64()
    				h = h + 1*rand.Float64()
    			} else {
    				t = t - 1*rand.Float64()
    				h = h - 1*rand.Float64()
    			}
    			time.Sleep(5 * time.Second)
    		}
    	}()
    	go func() {
    		t := 16.02
    		h := 24.04
    		for i := 0; i < 100; i++ {
    			sensordata := SensorData{
    				Id:          "CITIPUN",
    				Temperature: t,
    				Humidity:    h,
    				Timestamp:   time.Now().Unix(),
    				City:        "Pune",
    			}
    			requestBody, err := json.Marshal(sensordata)
    			if err != nil {
    				fmt.Println(err)
    			}
    			token := c.Publish(config.GetMQTTTopicName(), 0, false, requestBody)
    			token.Wait()
    			if i < 50 {
    				t = t + 1*rand.Float64()
    				h = h + 1*rand.Float64()
    			} else {
    				t = t - 1*rand.Float64()
    				h = h - 1*rand.Float64()
    			}
    			time.Sleep(5 * time.Second)
    		}
    	}()
    	time.Sleep(1000 * time.Second)
    	c.Disconnect(250)
    
    }

    Create a Kinesis Producer

    Now we will create a Kinesis producer which subscribes to the topic our MQTT client writes to, pulls the data from the broker and pushes it to the Kinesis stream. Just like in the previous section, here too we first create an MQTT client which connects to the message broker and subscribes to the topic to which our clients/publishers write data. In the client options, we can define a function which will be called whenever data is written to this topic. We have created a function postDataTokinesisStream() which connects to Kinesis using the Kinesis client and writes data to the Kinesis stream every time a message is pushed to the topic.

    package producer
    
    import (
    	"config"
    	"fmt"
    
    	"os"
    	"time"
    
    	"github.com/aws/aws-sdk-go/service/kinesis"
    
    	mqtt "github.com/eclipse/paho.mqtt.golang"
    )
    
    func postDataTokinesisStream(client mqtt.Client, message mqtt.Message) {
    	fmt.Printf("Received message on topic: %s\nMessage: %s\n", message.Topic(), message.Payload())
    	streamName := config.GetKinesisStreamName()
    	kclient := config.GetKinesisClient()
    	var putRecordInput kinesis.PutRecordInput
    	partitionKey := message.Topic()
    	putRecordInput.PartitionKey = &partitionKey
    	putRecordInput.StreamName = &streamName
    	putRecordInput.Data = message.Payload()
    	putRecordOutput, err := kclient.PutRecord(&putRecordInput)
    	if err != nil {
    		fmt.Println(err)
    	} else {
    		fmt.Println(putRecordOutput)
    	}
    
    }
    
    func StartKinesisProducer() {
    	fmt.Println("Kinesis Producer Started")
    	c := make(chan os.Signal, 1)
    	opts := mqtt.NewClientOptions().AddBroker(config.GetMqttServerurl()).SetClientID("MqttSubscriberClient")
    	opts.SetKeepAlive(2 * time.Second)
    	opts.SetPingTimeout(1 * time.Second)
    	opts.OnConnect = func(c mqtt.Client) {
    		if token := c.Subscribe(config.GetMQTTTopicName(), 0, postDataTokinesisStream); token.Wait() && token.Error() != nil {
    			panic(token.Error())
    		}
    	}
    
    	client := mqtt.NewClient(opts)
    	if token := client.Connect(); token.Wait() && token.Error() != nil {
    		panic(token.Error())
    	} else {
    		fmt.Printf("Connected to %s\n", config.GetMqttServerurl())
    	}
    
    	// Block forever so the subscriber keeps receiving messages.
    	<-c
    }

    Create a Kinesis Consumer

    Now that the data is available in our Kinesis stream, we can pull it for processing. In the Kinesis consumer, we create a Kinesis client just like we did in the previous section and then pull data from it. We first call the DescribeStream method, which returns the shard ID; we then use this shard ID to get a ShardIterator, and finally we fetch the records by passing the ShardIterator to the GetRecords() method. GetRecords() also returns a NextShardIterator which we can use to continuously poll for records in the shard until NextShardIterator becomes nil.

    package consumer
    
    import (
    	"config"
    	"fmt"
    
    	"github.com/aws/aws-sdk-go/service/kinesis"
    	"velotio.com/dao"
    )
    
    func StartKinesisConsumer() {
    	fmt.Println("Kinesis Consumer Started")
    	client := config.GetKinesisClient()
    	streamName := config.GetKinesisStreamName()
    
    	// Describe the stream to discover its shards.
    	var describeStreamInput kinesis.DescribeStreamInput
    	describeStreamInput.StreamName = &streamName
    	describeStreamOutput, err := client.DescribeStream(&describeStreamInput)
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	fmt.Println(*describeStreamOutput.StreamDescription.Shards[0].ShardId)
    
    	// Get an iterator positioned at the oldest record in the shard.
    	var getShardIteratorInput kinesis.GetShardIteratorInput
    	getShardIteratorInput.ShardId = describeStreamOutput.StreamDescription.Shards[0].ShardId
    	getShardIteratorInput.StreamName = &streamName
    	shardIteratorType := "TRIM_HORIZON"
    	getShardIteratorInput.ShardIteratorType = &shardIteratorType
    	getShardIteratorOutput, err := client.GetShardIterator(&getShardIteratorInput)
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	fmt.Println(*getShardIteratorOutput.ShardIterator)
    
    	// Poll the shard, following NextShardIterator on every call.
    	var getRecordsInput kinesis.GetRecordsInput
    	getRecordsInput.ShardIterator = getShardIteratorOutput.ShardIterator
    	getRecordsOutput, err := client.GetRecords(&getRecordsInput)
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	for getRecordsOutput.NextShardIterator != nil {
    		for i := 0; i < len(getRecordsOutput.Records); i++ {
    			sdf := &dao.SensorDataFiltered{}
    			sdf.PostDataToInfluxDB(getRecordsOutput.Records[i].Data)
    		}
    		getRecordsInput.ShardIterator = getRecordsOutput.NextShardIterator
    		getRecordsOutput, err = client.GetRecords(&getRecordsInput)
    		if err != nil {
    			fmt.Println(err)
    			return
    		}
    	}
    }

    Processing the data and writing it to InfluxDB

    Now we do a simple processing step of filtering the data. The data that we get from the sensor has the fields sensorId, temperature, humidity, city, and timestamp, but we are interested only in the values of temperature and humidity for a city, so we have created a new struct SensorDataFiltered which contains only the fields we need.

    For every record that the Kinesis consumer receives, it creates an instance of the SensorDataFiltered type and calls the PostDataToInfluxDB() method, where the record received from the Kinesis stream is unmarshaled into the SensorDataFiltered type and sent to InfluxDB. Here we need to provide the name of the database we created earlier in the variable dbName, and the InfluxDB host and port values in dbHost and dbPort respectively.

    In the InfluxDB request body, the first value that we provide is used as the measurement, which is InfluxDB’s construct for storing similar data together. Then come the tags; we have used `city` as our tag so that we can filter the data based on it. After the tags come the actual field values. For more details on the InfluxDB data write format please refer here.
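The write request body follows the InfluxDB line protocol: a measurement name, optional comma-separated tags, a space, then the field key=value pairs. A quick sketch of the string the code below assembles (the values here are sample data):

```shell
city="Mumbai"
humidity="32.06"
temperature="20.04"
# Line protocol shape: measurement,tag_set field_set
line="sensordata,city=${city} humidity=${humidity},temperature=${temperature}"
echo "$line"
# prints: sensordata,city=Mumbai humidity=32.06,temperature=20.04
```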

    package dao
    
    import (
    	"bytes"
    	"crypto/tls"
    	"encoding/json"
    	"fmt"
    	"net/http"
    )
    
    type SensorDataFiltered struct {
    	Temperature float64 `json:"temperature"`
    	Humidity    float64 `json:"humidity"`
    	City        string  `json:"city"`
    }
    
    var dbName = "iotdata"
    var dbHost = "184.73.62.30"
    var dbPort = "8086"
    
    func (sdf *SensorDataFiltered) PostDataToInfluxDB(data []byte) {
    	if err := json.Unmarshal(data, sdf); err != nil {
    		fmt.Println(err)
    		return
    	}
    	fmt.Println(sdf.Temperature, sdf.Humidity)
    	url := "http://" + dbHost + ":" + dbPort + "/write?db=" + dbName
    	humidity := fmt.Sprintf("%.2f", sdf.Humidity)
    	temperature := fmt.Sprintf("%.2f", sdf.Temperature)
    	// InfluxDB line protocol: measurement,tag_set field_set
    	requestBody := "sensordata,city=" + sdf.City + " humidity=" + humidity + ",temperature=" + temperature
    	req, err := http.NewRequest("POST", url, bytes.NewBuffer([]byte(requestBody)))
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	httpclient := &http.Client{
    		Transport: &http.Transport{
    			TLSClientConfig: &tls.Config{
    				InsecureSkipVerify: true,
    			},
    		},
    	}
    	resp, err := httpclient.Do(req)
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	defer resp.Body.Close()
    	fmt.Println("Status code for influxdb data write request =", resp.StatusCode)
    }

    Once the data is written to InfluxDB we can see it in the web console by querying the measurement created in our database.
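The same query can also be issued over InfluxDB's HTTP API; a sketch, assuming InfluxDB runs on localhost and the iotdata database created earlier:

```shell
# Fetch a few points from the sensordata measurement for one city:
curl -G "http://localhost:8086/query" \
  --data-urlencode "db=iotdata" \
  --data-urlencode "q=SELECT temperature, humidity FROM sensordata WHERE city='Mumbai' LIMIT 5"
```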

    Putting everything together in our main function

    Now we simply need to call the functions we discussed above and run our main program. Note that we have used `go` before the first two function calls, which makes them goroutines so that they execute concurrently.

    On running the code you will see the logs for all the stages of our pipeline written to stdout. This closely resembles real-life scenarios where data written by IoT devices gets processed in near real time.

    package main
    
    import (
    	"time"
    
    	"velotio.com/consumer"
    	"velotio.com/producer"
    	"velotio.com/publisher"
    )
    
    func main() {
    
    	go producer.StartKinesisProducer()
    	go publisher.StartMQTTPublisher()
    	time.Sleep(5 * time.Second)
    	consumer.StartKinesisConsumer()
    
    }

    Visualization through Grafana

    We can access the Grafana web console at port 3000 of the machine on which it is running. First, we need to add our InfluxDB as a data source to it under the data sources option.

    For creating dashboard go to the dashboard option and choose new. Once the dashboard is created we can start by adding a panel.

    We need to add Influxdb data source that we added earlier as the panel data source and write queries as shown in the image below.

    We can repeat the same process for adding another panel to the dashboard this time choosing a different city in our query.

    Conclusion:

    IoT data analytics is a fast-evolving and interesting space. The number of IoT devices is growing rapidly, and there is a great opportunity to get valuable insights from the huge amount of data generated by these devices. In this blog, I tried to help you grab that opportunity by building a near-real-time data pipeline for IoT data. If you liked it, please share and subscribe to our blog.

  • Taking Amazon’s Elastic Kubernetes Service for a Spin

    With the introduction of Elastic Kubernetes Service at AWS re:Invent last year, AWS finally threw their hat into the ever-booming space of managed Kubernetes services. In this blog post, we will learn the basic concepts of EKS, launch an EKS cluster and deploy a multi-tier application on it.

    What is Elastic Kubernetes service (EKS)?

    Kubernetes works on a master-slave architecture, where the master is also referred to as the control plane. If the master goes down it brings the entire cluster down with it, so ensuring high availability of the master is absolutely critical: it can be a single point of failure. Keeping the master highly available and managing all the worker nodes along with it is a cumbersome task, so it is desirable for organizations to have a managed Kubernetes cluster and focus on the most important task, running their applications, rather than managing the cluster. Other cloud providers like Google Cloud and Azure already had their managed Kubernetes services, named GKE and AKS respectively. Now, with EKS, Amazon has also rolled out its managed Kubernetes offering to provide a seamless way to run Kubernetes workloads.

    Key EKS concepts:

    EKS takes full advantage of the fact that it runs on AWS: instead of building Kubernetes-specific features from scratch, it reuses or plugs in existing AWS services to achieve Kubernetes-specific functionality. Here is a brief overview:

    IAM integration: Amazon EKS integrates IAM authentication with Kubernetes RBAC (the role-based access control system native to Kubernetes) with the help of the Heptio Authenticator, a tool that uses AWS IAM credentials to authenticate to a Kubernetes cluster. Here we can directly attach an RBAC role to an IAM entity, which saves the pain of managing another set of credentials at the cluster level.

    Container Interface: AWS has developed an open source CNI plugin which takes advantage of the fact that multiple network interfaces can be attached to a single EC2 instance, and that these interfaces can have multiple secondary private IPs associated with them. These secondary IPs are used to give pods running on EKS real IP addresses from the VPC CIDR pool. This improves latency for inter-pod communication, as the traffic flows without any overlay.

    ELB Support: We can use any of the AWS ELB offerings (Classic, Network, Application) to route traffic to services running on the worker nodes.

    Auto Scaling: The number of worker nodes in the cluster can grow and shrink using the EC2 Auto Scaling service.

    Route 53: With the help of the ExternalDNS project and AWS Route 53, we can manage the DNS entries for the load balancers that get created when we create an ingress object or a Service of type LoadBalancer in our EKS cluster. This way the DNS names are always in sync with the load balancers and we don’t have to give them separate attention.

    Shared responsibility for the cluster: The responsibilities for an EKS cluster are shared between AWS and the customer. AWS takes care of the most critical part, managing the control plane (API server and etcd database), and customers manage the worker nodes. Amazon EKS automatically runs Kubernetes with three masters across three Availability Zones to protect against a single point of failure. Control plane nodes are also monitored and replaced if they fail, and are patched and updated automatically. This ensures high availability of the cluster and makes it extremely simple to migrate existing workloads to EKS.

    Prerequisites for launching an EKS cluster:

    1.  IAM role to be assumed by the cluster: Create an IAM role that allows EKS to manage a cluster on your behalf. Choose EKS as the service which will assume this role and attach the AWS managed policies ‘AmazonEKSClusterPolicy’ and ‘AmazonEKSServicePolicy’ to it.

    2.  VPC for the cluster: We need to create the VPC where our cluster is going to reside, with subnets, internet gateways and other components configured. We can use an existing VPC if we wish, create one using the CloudFormation script provided by AWS here, or use the Terraform script available here. The scripts take the CIDR block of the VPC and of three subnets as arguments.

    Launching an EKS cluster:

    1.  Using the web console: With the prerequisites in place, we can go to the EKS console and launch an EKS cluster. When launching, we need to provide a name for the EKS cluster, choose the Kubernetes version to use, provide the IAM role we created in step one, and choose a VPC. Once we choose a VPC, we also need to select the subnets where we want our worker nodes to be launched; by default all the subnets in the VPC are selected. Finally, we need to provide a security group, which is applied to the elastic network interfaces (ENIs) that EKS creates to allow the control plane to communicate with the worker nodes.

    NOTE: A couple of things to note here: the subnets must be in at least two different availability zones, and the security group that we provided is later updated when we create the worker node cluster, so it is better not to use this security group with any other entity, or else to be completely sure of the changes happening to it.

    2. Using the AWS CLI:

    aws eks create-cluster --name eks-blog-cluster --role-arn arn:aws:iam::XXXXXXXXXXXX:role/eks-service-role \
    --resources-vpc-config subnetIds=subnet-0b8da2094908e1b23,subnet-01a46af43b2c5e16c,securityGroupIds=sg-03fa0c02886c183d4

    {
        "cluster": {
            "status": "CREATING",
            "name": "eks-blog-cluster",
            "certificateAuthority": {},
            "roleArn": "arn:aws:iam::XXXXXXXXXXXX:role/eks-service-role",
            "resourcesVpcConfig": {
                "subnetIds": [
                    "subnet-0b8da2094908e1b23",
                    "subnet-01a46af43b2c5e16c"
                ],
                "vpcId": "vpc-0364b5ed9f85e7ce1",
                "securityGroupIds": [
                    "sg-03fa0c02886c183d4"
                ]
            },
            "version": "1.10",
            "arn": "arn:aws:eks:us-east-1:XXXXXXXXXXXX:cluster/eks-blog-cluster",
            "createdAt": 1535269577.147
        }
    }

    In the response, we see that the cluster is in the CREATING state. It will take a few minutes before it is available. We can check the status using the command below:

    aws eks describe-cluster --name=eks-blog-cluster
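If we only care about the status field, we can narrow the output with a query, or block until the cluster is ready (both are standard AWS CLI invocations; adjust the cluster name to yours):

```shell
# Print just the cluster status (CREATING / ACTIVE / ...):
aws eks describe-cluster --name eks-blog-cluster --query 'cluster.status' --output text

# Block until the cluster reaches ACTIVE:
aws eks wait cluster-active --name eks-blog-cluster
```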

    Configure kubectl for EKS:

    We know that in Kubernetes we interact with the control plane by making requests to the API server. The most common way to interact with the API server is via the kubectl command line utility. As our cluster is ready, we now need to install kubectl.

    1.  Install the kubectl binary

    curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl

    Give executable permission to the binary.

    chmod +x ./kubectl

    Move the kubectl binary to a folder in your system’s $PATH.

    sudo cp ./kubectl /usr/local/bin/kubectl

    As discussed earlier EKS uses AWS IAM Authenticator for Kubernetes to allow IAM authentication for your Kubernetes cluster. So we need to download and install the same.

    2.  Install aws-iam-authenticator

    curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/linux/amd64/aws-iam-authenticator

    Give executable permission to the binary

    chmod +x ./aws-iam-authenticator

    Move the aws-iam-authenticator binary to a folder in your system’s $PATH.

    sudo cp ./aws-iam-authenticator /bin/aws-iam-authenticator

    3.  Create the kubeconfig file

    First create the directory.

    mkdir -p ~/.kube

    Open a config file in the folder created above

    vi ~/.kube/config-eks-blog-cluster

    Paste the below code in the file

    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        server: https://DBFE36D09896EECAB426959C35FFCC47.sk1.us-east-1.eks.amazonaws.com
        certificate-authority-data: "...................."
      name: kubernetes
    contexts:
    - context:
        cluster: kubernetes
        user: aws
      name: aws
    current-context: aws
    preferences: {}
    users:
    - name: aws
      user:
        exec:
          apiVersion: client.authentication.k8s.io/v1alpha1
          command: aws-iam-authenticator
          args:
          - "token"
          - "-i"
          - "eks-blog-cluster"

    Replace the values of server and certificate-authority-data with the values for your cluster and certificate, and also update the cluster name in the args section. You can get these values from the web console or by using the command:

    aws eks describe-cluster --name=eks-blog-cluster

    Save and exit.

    Add that file path to your KUBECONFIG environment variable so that kubectl knows where to look for your cluster configuration.

    export KUBECONFIG=$KUBECONFIG:~/.kube/config-eks-blog-cluster

    To verify that kubectl is now properly configured:

    kubectl get all
    NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    service/kubernetes   ClusterIP   172.20.0.1   <none>        443/TCP   50m

    Launch and configure worker nodes:

    Now we need to launch worker nodes before we can start deploying apps. We can create the worker node cluster using the CloudFormation script provided by AWS, which is available here, or the Terraform script available here. The scripts take the following parameters:

    • ClusterName: Name of the Amazon EKS cluster we created earlier.
    • ClusterControlPlaneSecurityGroup: Id of the security group we used in EKS cluster.
    • NodeGroupName: Name for the worker node auto scaling group.
    • NodeAutoScalingGroupMinSize: Minimum number of worker nodes that you always want in your cluster.
    • NodeAutoScalingGroupMaxSize: Maximum number of worker nodes that you want in your cluster.
    • NodeInstanceType: Type of worker node you wish to launch.
    • NodeImageId: AWS provides an Amazon EKS-optimized AMI to be used for worker nodes. Currently EKS is available in only two AWS regions, Oregon and N. Virginia, and the AMI IDs are ami-02415125ccd555295 and ami-048486555686d18a0 respectively.
    • KeyName: Name of the key you will use to ssh into the worker node.
    • VpcId: Id of the VPC that we created earlier.
    • Subnets: Subnets from the VPC we created earlier.
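As a sketch, the CloudFormation stack can also be launched from the CLI. The parameter keys below are the ones listed above; the stack name, template URL placeholder, instance type and key name are hypothetical examples, not values from the post:

```shell
aws cloudformation create-stack \
  --stack-name eks-blog-worker-nodes \
  --template-url <worker-node-template-url> \
  --capabilities CAPABILITY_IAM \
  --parameters \
    ParameterKey=ClusterName,ParameterValue=eks-blog-cluster \
    ParameterKey=ClusterControlPlaneSecurityGroup,ParameterValue=sg-03fa0c02886c183d4 \
    ParameterKey=NodeGroupName,ParameterValue=eks-blog-nodes \
    ParameterKey=NodeAutoScalingGroupMinSize,ParameterValue=1 \
    ParameterKey=NodeAutoScalingGroupMaxSize,ParameterValue=3 \
    ParameterKey=NodeInstanceType,ParameterValue=t2.medium \
    ParameterKey=NodeImageId,ParameterValue=ami-048486555686d18a0 \
    ParameterKey=KeyName,ParameterValue=<your-key-name> \
    ParameterKey=VpcId,ParameterValue=vpc-0364b5ed9f85e7ce1 \
    ParameterKey=Subnets,ParameterValue='subnet-0b8da2094908e1b23\,subnet-01a46af43b2c5e16c'
```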

    To enable worker nodes to join your cluster, we need to download, edit and apply the AWS authenticator config map.

    Download the config map:

    curl -O https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/aws-auth-cm.yaml

    Open it in an editor

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: aws-auth
      namespace: kube-system
    data:
      mapRoles: |
        - rolearn: <ARN of instance role (not instance profile)>
          username: system:node:{{EC2PrivateDNSName}}
          groups:
            - system:bootstrappers
            - system:nodes

    Edit the value of rolearn with the ARN of the role of your worker nodes. This value is available in the output of the script that you ran. Save the change and then apply it:

    kubectl apply -f aws-auth-cm.yaml

    Now you can check if the nodes have joined the cluster or not.

    kubectl get nodes
    NAME                         STATUS   ROLES    AGE   VERSION
    ip-10-0-2-171.ec2.internal   Ready    <none>   12s   v1.10.3
    ip-10-0-3-58.ec2.internal    Ready    <none>   14s   v1.10.3

    Deploying an application:

    As our cluster is now completely ready, we can start deploying applications on it. We will deploy a simple books API application which connects to a MongoDB database and allows users to store, list and delete book information.

    1. MongoDB Deployment YAML

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: mongodb
    spec:
      template:
        metadata:
          labels:
            app: mongodb
        spec:
          containers:
          - name: mongodb
            image: mongo
            ports:
            - name: mongodbport
              containerPort: 27017
              protocol: TCP

    2. Test Application Deployment YAML

    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: test-app
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - name: test-app
            image: akash125/pyapp
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 3000

    3. MongoDB Service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: mongodb-service
    spec:
      ports:
      - port: 27017
        targetPort: 27017
        protocol: TCP
        name: mongodbport
      selector:
        app: mongodb

    4. Test Application Service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: test-service
    spec:
      type: LoadBalancer
      ports:
      - name: test-service
        port: 80
        protocol: TCP
        targetPort: 3000
      selector:
        app: test-app

    Services

    $ kubectl create -f mongodb-service.yaml
    $ kubectl create -f testapp-service.yaml

    Deployments

    $ kubectl create -f mongodb-deployment.yaml
    $ kubectl create -f testapp-deployment.yaml
    $ kubectl get services
    NAME              TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
    kubernetes        ClusterIP      172.20.0.1      <none>          443/TCP        12m
    mongodb-service   ClusterIP      172.20.55.194   <none>          27017/TCP      4m
    test-service      LoadBalancer   172.20.188.77   a7ee4f4c3b0ea   80:31427/TCP   3m

    In the EXTERNAL-IP column of test-service we see the DNS name of a load balancer. We can now access the application from outside the cluster using this DNS name.

    To Store Data :

    curl -X POST -d '{"name":"A Game of Thrones (A Song of Ice and Fire)", "author":"George R.R. Martin","price":343}' http://a7ee4f4c3b0ea11e8b0f912f36098e4d-672471149.us-east-1.elb.amazonaws.com/books
    {"id":"5b8fab49fa142b000108d6aa","name":"A Game of Thrones (A Song of Ice and Fire)","author":"George R.R. Martin","price":343}

    To Get Data :

    curl -X GET http://a7ee4f4c3b0ea11e8b0f912f36098e4d-672471149.us-east-1.elb.amazonaws.com/books
    [{"id":"5b8fab49fa142b000108d6aa","name":"A Game of Thrones (A Song of Ice and Fire)","author":"George R.R. Martin","price":343}]

    We can also put the URL used in the curl commands above directly into our browser; we will get the same response.

    Now our application is deployed on EKS and can be accessed by the users.

    Comparison between GKE, ECS and EKS:

    Cluster creation: Creating a GKE or ECS cluster is much simpler than creating an EKS cluster, with GKE being the simplest of the three.

    Cost: With both GKE and ECS we pay only for the infrastructure that is visible to us, i.e., servers, volumes, ELBs, etc.; there is no cost for master nodes or other cluster management services. With EKS, however, there is a charge of $0.20 per hour for the control plane.

    Add-ons: GKE provides the option of using Calico as the network plugin, which helps in defining network policies for controlling inter-pod communication (by default, all pods in k8s can communicate with each other).
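
    As a hedged sketch (the policy name and label selectors here are assumptions based on the labels used earlier in this post, not something from the original setup), a Kubernetes NetworkPolicy that Calico can enforce to restrict inter-pod traffic might look like:

    ```yaml
    # Illustrative only: allow mongodb pods to receive traffic solely from test-app pods.
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-testapp-to-mongodb
    spec:
      podSelector:
        matchLabels:
          app: mongodb
      ingress:
      - from:
        - podSelector:
            matchLabels:
              app: test-app
        ports:
        - protocol: TCP
          port: 27017
    ```

    Any pod without the app: test-app label would then be unable to reach MongoDB on port 27017.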

    Serverless: An ECS cluster can be created using Fargate, which is the Containers as a Service (CaaS) offering from AWS. Similarly, EKS is also expected to support Fargate very soon.

    In terms of availability and scalability, all three services are on par with each other.

    Conclusion:

    In this blog post we learned the basic concepts of EKS, launched our own EKS cluster and deployed an application on it as well. EKS is a much-awaited service from AWS, especially for folks who were already running their Kubernetes workloads on AWS, as now they can easily migrate to EKS and have a fully managed Kubernetes control plane. EKS is expected to be adopted by many organisations in the near future.


  • Continuous Integration & Delivery (CI/CD) for Kubernetes Using CircleCI & Helm

    Introduction

    Kubernetes is getting adopted rapidly across the software industry and is becoming the most preferred option for deploying and managing containerized applications. Once we have a fully functional Kubernetes cluster, we need an automated process to deploy our applications on it. In this blog post, we will create a fully automated “commit to deploy” pipeline for Kubernetes. We will use CircleCI and Helm for it.

    What is CircleCI?

    CircleCI is a fully managed SaaS offering which allows us to build, test, or deploy our code on every check-in. To get started with CircleCI, we log into their web console with our GitHub or Bitbucket credentials, add a project for the repository we want to build, and then add the CircleCI config file to that repository. The CircleCI config file is a YAML file which lists the steps we want to execute every time code is pushed to the repository.

    Some salient features of CircleCI are:

    1. Little or no operational overhead, as the infrastructure is managed completely by CircleCI.
    2. User authentication is done via GitHub or Bitbucket, so user management is quite simple.
    3. It automatically notifies the build status to the GitHub/Bitbucket email ids of the users who are following the project on CircleCI.
    4. The UI is quite simple and gives a holistic view of builds.
    5. It can be integrated with Slack, HipChat, Jira, etc.

    What is Helm?

    Helm is a chart manager, where a chart is a package of Kubernetes resources. Helm allows us to bundle related Kubernetes objects into charts and treat them as a single unit of deployment referred to as a release. For example, say you have an application app1 which you want to run on Kubernetes. For app1 you create multiple Kubernetes resources like a deployment, service, ingress, horizontal pod autoscaler, etc. While deploying the application, you would need to create all these Kubernetes resources separately by applying their manifest files. Helm allows us to group all those files into one chart (a Helm chart), and then we just need to deploy the chart. This also makes deleting and upgrading the resources quite simple.

    Some other benefits of Helm are:

    1. It makes the deployment highly configurable. Thus, just by changing the parameters, we can use the same chart for deploying to multiple environments like staging/prod or multiple cloud providers.
    2. We can roll back to a previous release with a single helm command.
    3. It makes managing and sharing Kubernetes-specific applications much simpler.

    Note: Helm is composed of two components: the Helm client and the Tiller server. Tiller is the component which runs inside the cluster as a deployment and serves the requests made by the Helm client. Tiller has potential security vulnerabilities, thus we will use tillerless helm in our pipeline, which runs Tiller only when we need it.

    Building the Pipeline

    Overview:

    We will create the pipeline for a Golang application. The pipeline will first build the binary, create a docker image from it, push the image to ECR, then deploy it on the Kubernetes cluster using its helm chart.

    We will use a simple app which just exposes a `hello` endpoint and returns the hello world message:

    package main
    
    import (
    	"encoding/json"
    	"net/http"
    	"log"
    	"github.com/gorilla/mux"
    )
    
    type Message struct {
    	Msg string
    }
    
    func helloWorldJSON(w http.ResponseWriter, r *http.Request) {
    	m := Message{"Hello World"}
    	response, _ := json.Marshal(m)
    	w.Header().Set("Content-Type", "application/json")
    	w.WriteHeader(http.StatusOK)
    	w.Write(response)
    }
    func main() {
    	r := mux.NewRouter()
    	r.HandleFunc("/hello", helloWorldJSON).Methods("GET")
    	if err := http.ListenAndServe(":8080", r); err != nil {
    		log.Fatal(err)
    	}
    }

    We will create a docker image for hello app using the following Dockerfile:

    FROM centos/systemd
    
    MAINTAINER "Akash Gautam" <akash.gautam@velotio.com>
    
    COPY hello-app  /
    
    ENTRYPOINT ["/hello-app"]

    Creating Helm Chart:

    Now we need to create the helm chart for hello app.

    First, we create the Kubernetes manifest files. We will create a deployment and a service file:

    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: helloapp
    spec:
      replicas: 1
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 1
      template:
        metadata:
          labels:
            app: helloapp
            env: {{ .Values.labels.env }}
            cluster: {{ .Values.labels.cluster }}
        spec:
          containers:
          - name: helloapp
            image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
            imagePullPolicy: {{ .Values.image.imagePullPolicy }}
            readinessProbe:
              httpGet:
                path: /hello
                port: 8080
              initialDelaySeconds: 5
              periodSeconds: 5
              successThreshold: 1

    apiVersion: v1
    kind: Service
    metadata:
      name: helloapp
    spec:
      type: {{ .Values.service.type }}
      ports:
      - name: helloapp
        port: {{ .Values.service.port }}
        protocol: TCP
        targetPort: {{ .Values.service.targetPort }}
      selector:
        app: helloapp

    In the above files, you must have noticed that we have used the .Values object. All the values that we specify in the values.yaml file of our Helm chart can be accessed via the .Values object inside the templates.

    Let’s create the helm chart now:

    helm create helloapp

    The above command will create the Helm chart folder structure for us.

    helloapp/
    |
    |- .helmignore # Contains patterns to ignore when packaging Helm charts.
    |
    |- Chart.yaml # Information about your chart
    |
    |- values.yaml # The default values for your templates
    |
    |- charts/ # Charts that this chart depends on
    |
    |- templates/ # The template files

    We can remove the charts/ folder inside our helloapp chart, as our chart won’t have any sub-charts. Now we need to move our Kubernetes manifest files to the templates/ folder and update values.yaml and Chart.yaml.

    Our values.yaml looks like:

    image:
      tag: 0.0.1
      repository: 123456789870.dkr.ecr.us-east-1.amazonaws.com/helloapp
      imagePullPolicy: Always
    
    labels:
      env: "staging"
      cluster: "eks-cluster-blog"
    
    service:
      port: 80
      targetPort: 8080
      type: LoadBalancer

    This allows us to make our deployment more configurable. For example, here we have set our service type as LoadBalancer in values.yaml, but if we want to change it to NodePort we just need to set it while installing the chart (--set service.type=NodePort). Similarly, we have set the image pull policy as Always, which is fine for a development/staging environment, but when we deploy to production we may want to set it as IfNotPresent. In our chart, we need to identify the parameters/values which may change from one environment to another and make them configurable. This allows us to be flexible with our deployment and reuse the same chart.

    Finally, we need to update the Chart.yaml file. This file mostly contains metadata about the chart like the name, version, maintainer, etc., where name & version are the two mandatory fields for Chart.yaml.

    version: 1.0.0
    appVersion: 0.0.1
    name: helloapp
    description: Helm chart for helloapp
    source:
      - https://github.com/akash-gautam/helloapp

    Now that our Helm chart is ready, we can start with the pipeline. We need to create a folder named .circleci in the root folder of our repository and create a file named config.yml in it. In our config.yml we have defined two jobs: build&pushImage and deploy.

    Configure the pipeline:

    build&pushImage:
        working_directory: /go/src/hello-app (1)
        docker:
          - image: circleci/golang:1.10 (2)
        steps:
          - checkout (3)
          - run: (4)
              name: build the binary
              command: go build -o hello-app
          - setup_remote_docker: (5)
              docker_layer_caching: true
          - run: (6)
              name: Set the tag for the image, we will concatenate the app version and CircleCI build number with a `-` char in between
              command:  echo 'export TAG=$(cat VERSION)-$CIRCLE_BUILD_NUM' >> $BASH_ENV
          - run: (7)
              name: Build the docker image
              command: docker build . -t ${CIRCLE_PROJECT_REPONAME}:$TAG
          - run: (8)
              name: Install AWS cli
              command: export TZ=Europe/Minsk && sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone && sudo apt-get update && sudo apt-get install -y awscli
          - run: (9)
              name: Login to ECR
              command: $(aws ecr get-login --region $AWS_REGION | sed -e 's/-e none//g')
          - run: (10)
              name: Tag the image with ECR repo name 
              command: docker tag ${CIRCLE_PROJECT_REPONAME}:$TAG ${HELLOAPP_ECR_REPO}:$TAG    
          - run: (11)
              name: Push the image to the ECR repo
              command: docker push ${HELLOAPP_ECR_REPO}:$TAG

    1. We set the working directory for our job; we set it on the GOPATH so that we don’t need to do anything additional.
    2. We set the Docker image inside which we want the job to run; as our app is built using Golang, we use an image which already has Golang installed in it.
    3. This step checks out our repository into the working directory.
    4. In this step, we build the binary.
    5. Here we set up Docker with the help of the setup_remote_docker key provided by CircleCI.
    6. In this step we create the tag we will use while building the image; we take the app version available in the VERSION file and append the $CIRCLE_BUILD_NUM value to it, separated by a dash (`-`).
    7. Here we build and tag the image.
    8. We install the AWS CLI to interact with ECR later.
    9. Here we log into ECR.
    10. We tag the image built in step 7 with the ECR repository name.
    11. Finally, we push the image to ECR.
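
    The tag construction from step 6 can be tried locally; this is a sketch where the VERSION file content and the build number are assumed values:

    ```shell
    # Simulate what happens in the CircleCI job: VERSION holds the app version,
    # and CIRCLE_BUILD_NUM is injected by CircleCI for every build (42 is a made-up value).
    echo "0.0.1" > VERSION
    CIRCLE_BUILD_NUM=42
    TAG=$(cat VERSION)-$CIRCLE_BUILD_NUM
    echo "$TAG"   # prints 0.0.1-42
    ```

    This is the image tag that the docker build and docker push steps then reuse.
    
    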

    Now we will deploy our helm charts. For this, we have a separate job deploy.

    deploy:
        docker: (1)
            - image: circleci/golang:1.10
        steps: (2)
          - checkout
          - run: (3)
              name: Install AWS cli
              command: export TZ=Europe/Minsk && sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone && sudo apt-get update && sudo apt-get install -y awscli
          - run: (4)
              name: Set the tag for the image, we will concatenate the app version and CircleCI build number with a `-` char in between
              command:  echo 'export TAG=$(cat VERSION)-$CIRCLE_PREVIOUS_BUILD_NUM' >> $BASH_ENV
          - run: (5)
              name: Install and configure kubectl
              command: sudo curl -L https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl && sudo chmod +x /usr/local/bin/kubectl
          - run: (6)
              name: Install and configure aws-iam-authenticator
              command: curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/linux/amd64/aws-iam-authenticator && sudo chmod +x ./aws-iam-authenticator && sudo cp ./aws-iam-authenticator /bin/aws-iam-authenticator
          - run: (7)
              name: Install latest awscli version
              command: sudo apt install unzip && curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip" && unzip awscli-bundle.zip && ./awscli-bundle/install -b ~/bin/aws
          - run: (8)
              name: Get the kubeconfig file
              command: export KUBECONFIG=$HOME/.kube/kubeconfig && /home/circleci/bin/aws eks --region $AWS_REGION update-kubeconfig --name $EKS_CLUSTER_NAME
          - run: (9)
              name: Install and configure helm
              command: sudo curl -L https://storage.googleapis.com/kubernetes-helm/helm-v2.11.0-linux-amd64.tar.gz | tar xz && sudo mv linux-amd64/helm /bin/helm && sudo rm -rf linux-amd64
          - run: (10)
              name: Initialize helm
              command:  helm init --client-only --kubeconfig=$HOME/.kube/kubeconfig
          - run: (11)
              name: Install tiller plugin
              command: helm plugin install https://github.com/rimusz/helm-tiller --kubeconfig=$HOME/.kube/kubeconfig        
          - run: (12)
              name: Release helloapp using helm chart 
              command: bash scripts/release-helloapp.sh $TAG

    1. Set the Docker image inside which we want to execute the job.
    2. Check out the code using the `checkout` key.
    3. Install the AWS CLI.
    4. Set the value of the tag just like we did in the build&pushImage job. Note that here we use the CIRCLE_PREVIOUS_BUILD_NUM variable, which gives us the build number of the build&pushImage job and ensures that the tag values are the same.
    5. Download kubectl and make it executable.
    6. Install aws-iam-authenticator; this is required because our k8s cluster is on EKS.
    7. Here we install the latest version of the AWS CLI; EKS is a relatively new service from AWS and older versions of the AWS CLI don’t support it.
    8. Here we fetch the kubeconfig file. This step will vary depending upon where the k8s cluster has been set up. As our cluster is on EKS, we get the kubeconfig file via the AWS CLI; similarly, if your cluster is on GKE then you need to configure gcloud and use the command `gcloud container clusters get-credentials <cluster-name> --zone=<zone-name>`. We can also keep the kubeconfig file on some other secure storage system and fetch it from there.
    9. Download Helm and make it executable.
    10. Initialize Helm; note that we initialize it in client-only mode so that it doesn’t start the Tiller server.
    11. Download the tillerless helm plugin.
    12. Execute the release-helloapp.sh shell script and pass it the TAG value from step 4.

    In the release-helloapp.sh script we first start Tiller, then check whether the release is already present: if it is, we upgrade it; otherwise we make a new release. Here we override the image tag value in the chart by setting it to the tag of the newly built image. Finally, we stop the Tiller server.

    #!/bin/bash
    TAG=$1
    echo "start tiller"
    export KUBECONFIG=$HOME/.kube/kubeconfig
    helm tiller start-ci
    export HELM_HOST=127.0.0.1:44134
    if helm ls | grep -q helloapp; then
       helm upgrade --timeout 180 helloapp --set image.tag=$TAG charts/helloapp
    else
       helm install --timeout 180 --name helloapp --set image.tag=$TAG charts/helloapp
    fi
    echo "stop tiller"
    helm tiller stop 

    The complete CircleCI config.yml file looks like:

    version: 2
    
    jobs:
      build&pushImage:
        working_directory: /go/src/hello-app
        docker:
          - image: circleci/golang:1.10
        steps:
          - checkout
          - run:
              name: build the binary
              command: go build -o hello-app
          - setup_remote_docker:
              docker_layer_caching: true
          - run:
              name: Set the tag for the image, we will concatenate the app version and CircleCI build number with a `-` char in between
              command:  echo 'export TAG=$(cat VERSION)-$CIRCLE_BUILD_NUM' >> $BASH_ENV
          - run:
              name: Build the docker image
              command: docker build . -t ${CIRCLE_PROJECT_REPONAME}:$TAG
          - run:
              name: Install AWS cli
              command: export TZ=Europe/Minsk && sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone && sudo apt-get update && sudo apt-get install -y awscli
          - run:
              name: Login to ECR
              command: $(aws ecr get-login --region $AWS_REGION | sed -e 's/-e none//g')
          - run:
              name: Tag the image with ECR repo name
              command: docker tag ${CIRCLE_PROJECT_REPONAME}:$TAG ${HELLOAPP_ECR_REPO}:$TAG
          - run:
              name: Push the image to the ECR repo
              command: docker push ${HELLOAPP_ECR_REPO}:$TAG
      deploy:
        docker:
            - image: circleci/golang:1.10
        steps:
          - attach_workspace:
              at: /tmp/workspace
          - checkout
          - run:
              name: Install AWS cli
              command: export TZ=Europe/Minsk && sudo ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ | sudo tee /etc/timezone && sudo apt-get update && sudo apt-get install -y awscli
          - run:
              name: Set the tag for the image, we will concatenate the app version and CircleCI build number with a `-` char in between
              command:  echo 'export TAG=$(cat VERSION)-$CIRCLE_PREVIOUS_BUILD_NUM' >> $BASH_ENV
          - run:
              name: Install and configure kubectl
              command: sudo curl -L https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl && sudo chmod +x /usr/local/bin/kubectl
          - run:
              name: Install and configure aws-iam-authenticator
              command: curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/linux/amd64/aws-iam-authenticator && sudo chmod +x ./aws-iam-authenticator && sudo cp ./aws-iam-authenticator /bin/aws-iam-authenticator
          - run:
              name: Install latest awscli version
              command: sudo apt install unzip && curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip" && unzip awscli-bundle.zip && ./awscli-bundle/install -b ~/bin/aws
          - run:
              name: Get the kubeconfig file
              command: export KUBECONFIG=$HOME/.kube/kubeconfig && /home/circleci/bin/aws eks --region $AWS_REGION update-kubeconfig --name $EKS_CLUSTER_NAME
          - run:
              name: Install and configure helm
              command: sudo curl -L https://storage.googleapis.com/kubernetes-helm/helm-v2.11.0-linux-amd64.tar.gz | tar xz && sudo mv linux-amd64/helm /bin/helm && sudo rm -rf linux-amd64
          - run:
              name: Initialize helm
              command:  helm init --client-only --kubeconfig=$HOME/.kube/kubeconfig
          - run:
              name: Install tiller plugin
              command: helm plugin install https://github.com/rimusz/helm-tiller --kubeconfig=$HOME/.kube/kubeconfig        
          - run:
              name: Release helloapp using helm chart 
              command: bash scripts/release-helloapp.sh $TAG
    workflows:
      version: 2
      primary:
        jobs:
          - build&pushImage
          - deploy:
              requires:
                - build&pushImage

    At the end of the file we see the workflows section. Workflows control the order in which the jobs specified in the file are executed, and establish dependencies and conditions between jobs. For example, we want our deploy job to trigger only after the build job is complete, so we added a dependency between them. Similarly, we may want to exclude jobs from running on some particular branch; we can specify those types of conditions as well.
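
    For instance (a hedged sketch using CircleCI’s branch filters; the master-only restriction is an assumption, not part of this post’s config), limiting the pipeline to the master branch would look like:

    ```yaml
    workflows:
      version: 2
      primary:
        jobs:
          # Only build images for commits pushed to master.
          - build&pushImage:
              filters:
                branches:
                  only: master
          # deploy still runs only after build&pushImage succeeds.
          - deploy:
              requires:
                - build&pushImage
    ```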

    We have used a few environment variables in our pipeline configuration; some of them were created by us and some are made available by CircleCI. We created the AWS_REGION, HELLOAPP_ECR_REPO, EKS_CLUSTER_NAME, AWS_ACCESS_KEY_ID & AWS_SECRET_ACCESS_KEY variables. These variables are set via the CircleCI web console by going to the project’s settings. The other variables that we have used are made available by CircleCI as part of its environment setup process. The complete list of environment variables set by CircleCI can be found here.

    Verify the working of the pipeline:

    Once everything is set up properly, our application will get deployed on the k8s cluster and should be available for access. Get the external IP of the helloapp service and make a curl request to the hello endpoint:

    $ curl http://a31e25e7553af11e994620aebe144c51-242977608.us-west-2.elb.amazonaws.com/hello && printf "\n"
    
    {"Msg":"Hello World"}

    Now update the code and change the message “Hello World” to “Hello World Returns” and push your code. It will take a few minutes for the pipeline to complete execution and once it is complete make the curl request again to see the changes getting reflected.

    $ curl http://a31e25e7553af11e994620aebe144c51-242977608.us-west-2.elb.amazonaws.com/hello && printf "\n"
    
    {"Msg":"Hello World Returns"}

    Also, verify that a new tag has been created for the helloapp Docker image on ECR.

    Conclusion

    In this blog post, we explored how we can set up a CI/CD pipeline for Kubernetes and got basic exposure to CircleCI and Helm. Although Helm is not absolutely necessary for building a pipeline, it has lots of benefits and is widely used across the industry. We can extend the pipeline to handle cases where we have multiple environments like dev, staging & production and make the pipeline deploy the application to any of them depending upon some conditions. We can also add more jobs, like integration tests. All the code used in this blog post is available here.

    Related Reads:

    1. Continuous Deployment with Azure Kubernetes Service, Azure Container Registry & Jenkins
    2. Know Everything About Spinnaker & How to Deploy Using Kubernetes Engine
  • Automated Containerization and Migration of On-premise Applications to Cloud Platforms

    Containerized applications are becoming more popular with each passing year. Enterprise applications are adopting container technology as they modernize their IT systems. Migrating your applications from VMs or physical machines to containers comes with multiple advantages like optimal resource utilization, faster deployment times, replication, quick cloning, less lock-in and so on. Various container orchestration platforms like Kubernetes, Google Container Engine (GKE) and Amazon EC2 Container Service (Amazon ECS) help in quick deployment and easy management of your containerized applications. But in order to use these platforms, you need to migrate your legacy applications to containers or rewrite/redeploy your applications from scratch with the containerization approach. Rearchitecting your applications using the containerization approach is preferable, but is that possible for complex legacy applications? Is your deployment team capable enough to list down each and every detail about the deployment process of your application? Do you have the patience to author a Dockerfile for each of the components of your complex application stack?

    Automated migrations!

    Velotio has been helping customers with automated migration of VMs and bare-metal servers to various container platforms. We have developed automation to convert these migrated applications as containers on various container deployment platforms like GKE, Amazon ECS and Kubernetes. In this blog post, we will cover one such migration tool developed at Velotio which will migrate your application running on a VM or physical machine to Google Container Engine (GKE) by running a single command.

    Migration tool details

    We have named our migration tool A2C (Anything to Container). It can migrate applications running on any Unix or Windows operating system.

    The migration tool requires the following information about the server to be migrated:

    • IP of the server
    • SSH User, SSH Key/Password of the application server
    • Configuration file containing data paths for application/database/components (more details below)
    • Required name of your docker image (The docker image that will get created for your application)
    • GKE Container Cluster details

    In order to store persistent data, volumes can be defined in the container definition. Data changes made on a volume path remain persistent even if the container is killed or crashes. Volumes are basically filesystem paths from the host machine on which your container is running, NFS shares, or cloud storage. The container mounts the filesystem path from the host machine, so data changes are written to the host machine’s filesystem instead of the container’s filesystem. Our migration tool supports data volumes, which can be defined in the configuration file. It will automatically create disks for the defined volumes and copy data from your application server to these disks in a consistent way.

    The configuration file we have been talking about is basically a YAML file containing filesystem level information about your application server. A sample of this file can be found below:

    includes:
    - /
    volumes:
    - var/log/httpd
    - var/log/mariadb
    - var/www/html
    - var/lib/mysql
    excludes:
    - mnt
    - var/tmp
    - etc/fstab
    - proc
    - tmp

    The configuration file contains 3 sections: includes, volumes and excludes:

    • Includes contains filesystem paths on your application server which you want to add to your container image.
    • Volumes contain filesystem paths on your application server which stores your application data. Generally, filesystem paths containing database files, application code files, configuration files, log files are good candidates for volumes.
    • The excludes section contains filesystem paths which you don’t want to make part of the container. This may include temporary filesystem paths like /proc and /tmp, and also NFS-mounted paths. Ideally, you would include everything by giving “/” in the includes section and exclude specifics in the excludes section.

    The Docker image name to be given as input to the migration tool is the Docker registry path in which the image will be stored, followed by the name and tag of the image. A Docker registry is like GitHub for Docker images, where you can store all your images. Different versions of the same image can be stored by giving a version-specific tag to the image. GKE also provides a Docker registry. Since in this demo we are migrating to GKE, we will also store our image in the GKE registry.

    The GKE container cluster details to be given as input to the migration tool contain GKE-specific details like the GKE project name, container cluster name and region name. A container cluster can be created in GKE to host the container applications. We have a separate set of scripts to perform the cluster creation operation; container cluster creation can also be done easily through the GKE UI. For now, we will assume that we have a 3-node cluster created in GKE, which we will use to host our application.

    Tasks performed under migration

    Our migration tool (A2C), performs the following set of activities for migrating the application running on a VM or physical machine to GKE Container Cluster:

    1. Install the A2C migration tool with all its dependencies on the target application server

    2. Create a docker image of the application server, based on the filesystem level information given in the configuration file

    3. Capture metadata from the application server like configured services information, port usage information, network configuration, external services, etc.

    4. Push the docker image to the GKE container registry

    5. Create a disk in Google Cloud for each volume path defined in the configuration file and prepopulate the disks with data from the application server

    6. Create a deployment spec for the container application in the GKE container cluster, which will open the required ports, configure required services, add multi-container dependencies, attach the prepopulated disks to containers, etc.

    7. Deploy the application, after which you will have your application running as containers in GKE with the application software in a running state. The new application URLs will be given as output.

    8. Load balancing and HA will be configured for your application.
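
    As a rough illustration of what the generated deployment spec in step 6 might contain (the container image path and volume names below are assumptions for illustration, not A2C’s actual output), attaching a prepopulated GCE disk to a container looks like:

    ```yaml
    # Illustrative fragment: mount the prepopulated disk migrate-lamp-3 at MySQL's data path.
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: migrate-lamp
    spec:
      template:
        metadata:
          labels:
            app: migrate-lamp
        spec:
          containers:
          - name: migrate-lamp
            image: gcr.io/glassy-chalice-XXXXX/migrate-lamp
            volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
          volumes:
          - name: mysql-data
            gcePersistentDisk:
              pdName: migrate-lamp-3
              fsType: ext4
    ```

    Because the data lives on the GCE disk rather than in the container filesystem, the MySQL data survives pod restarts and rescheduling.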

    Demo

    For demonstration purposes, we will deploy a LAMP stack (Apache + PHP + MySQL) on a CentOS 7 VM and will run the migration utility for the VM, which will migrate the application to our GKE cluster. After the migration we will show our application running on GKE, preconfigured with the same data as on our VM.

    Step 1

    We set up a LAMP stack using Apache, PHP and MySQL on a CentOS 7 VM in GCP. The PHP application can be used to list, add, delete or edit user data. The data is stored in a MySQL database. We added some data to the database using the application, and the UI would show the following:

    Step 2

    Now we run the A2C migration tool, which will migrate this application stack running on a VM into a container and auto-deploy it to GKE.

    # ./migrate.py -c lamp_data_handler.yml -d "tcp://35.202.201.247:4243" -i migrate-lamp -p glassy-chalice-XXXXX -u root -k ~/mykey -l a2c-host --gcecluster a2c-demo --gcezone us-central1-b 130.211.231.58

    Pushing converter binary to target machine
    Pushing data config to target machine
    Pushing installer script to target machine
    Running converter binary on target machine
    [130.211.231.58] out: creating docker image
    [130.211.231.58] out: image created with id 6dad12ba171eaa8615a9c353e2983f0f9130f3a25128708762228f293e82198d
    [130.211.231.58] out: Collecting metadata for image
    [130.211.231.58] out: Generating metadata for cent7
    [130.211.231.58] out: Building image from metadata
    Pushing the docker image to GCP container registry
    Initiate remote data copy
    Activated service account credentials for: [glassy-chaliceXXXXX@appspot.gserviceaccount.com]
    for volume var/log/httpd
    Creating disk migrate-lamp-0
    Disk Created Successfully
    transferring data from source
    for volume var/log/mariadb
    Creating disk migrate-lamp-1
    Disk Created Successfully
    transferring data from source
    for volume var/www/html
    Creating disk migrate-lamp-2
    Disk Created Successfully
    transferring data from source
    for volume var/lib/mysql
    Creating disk migrate-lamp-3
    Disk Created Successfully
    transferring data from source
    Connecting to GCP cluster for deployment
    Created service file /tmp/gcp-service.yaml
    Created deployment file /tmp/gcp-deployment.yaml
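    The log above creates one persistent disk per volume path, named `migrate-lamp-0` through `migrate-lamp-3`. A rough sketch of how that per-volume naming could map to `gcloud` disk-creation commands (the `--size` and `--zone` flags here are illustrative assumptions, not values taken from the tool):

    ```python
    # Volume paths discovered on the source VM, in the order seen in the log.
    volumes = ["var/log/httpd", "var/log/mariadb", "var/www/html", "var/lib/mysql"]
    image_name = "migrate-lamp"

    commands = []
    for i, vol in enumerate(volumes):
        disk = f"{image_name}-{i}"  # migrate-lamp-0 .. migrate-lamp-3
        # Illustrative gcloud invocation; size and zone are assumed values.
        commands.append(
            f"gcloud compute disks create {disk} --size=10GB --zone=us-central1-b"
        )
        print(f"for volume {vol}")
        print(f"Creating disk {disk}")
    ```

    Keeping the disk index tied to the volume's position in the config file is what lets the generated deployment spec attach each disk back to the right mount path.
    
    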

    Deploying to GKE

    $ kubectl get pod
    
    NAME                            READY   STATUS              RESTARTS   AGE
    migrate-lamp-3707510312-6dr5g   0/1     ContainerCreating   0          58s

    $ kubectl get deployment
    
    NAME           DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    migrate-lamp   1         1         1            0           1m

    $ kubectl get service
    
    NAME           CLUSTER-IP     EXTERNAL-IP     PORT(S)                                    AGE
    kubernetes     10.59.240.1    <none>          443/TCP                                    23h
    migrate-lamp   10.59.248.44   35.184.53.100   3306:31494/TCP,80:30909/TCP,22:31448/TCP   53s

    You can access your application using above connection details!
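    With a `LoadBalancer` service, the application URL follows directly from the `EXTERNAL-IP` column and the exposed service port. A small illustrative parser for a `kubectl get service` row (not part of the migration tool):

    ```python
    def service_url(line, port=80):
        """Derive an http URL for a given port from a 'kubectl get service' row."""
        name, cluster_ip, external_ip, ports, age = line.split()
        # PORT(S) looks like "3306:31494/TCP,80:30909/TCP,22:31448/TCP";
        # the number before each ':' is the service port.
        exposed = [p.split(":")[0] for p in ports.split(",")]
        if str(port) not in exposed:
            raise ValueError(f"port {port} not exposed by service {name}")
        return f"http://{external_ip}:{port}"

    row = "migrate-lamp 10.59.248.44 35.184.53.100 3306:31494/TCP,80:30909/TCP,22:31448/TCP 53s"
    print(service_url(row))  # http://35.184.53.100:80
    ```
    
    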

    Step 3

    Access the LAMP stack on GKE using the IP 35.184.53.100 on the default port 80, just as on the source machine.

    Here is the Docker image created in the GCP Container Registry:

    We can also see that disks named migrate-lamp-x were created as part of this automated migration.

    A load balancer was also provisioned in GCP as part of the migration process.

    The following service and deployment files were created by our migration tool to deploy the application on GKE:

    # cat /tmp/gcp-service.yaml
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: migrate-lamp
      name: migrate-lamp
    spec:
      ports:
      - name: migrate-lamp-3306
        port: 3306
      - name: migrate-lamp-80
        port: 80
      - name: migrate-lamp-22
        port: 22
      selector:
        app: migrate-lamp
      type: LoadBalancer

    # cat /tmp/gcp-deployment.yaml
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      labels:
        app: migrate-lamp
      name: migrate-lamp
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: migrate-lamp
      template:
        metadata:
          labels:
            app: migrate-lamp
        spec:
          containers:
          - image: us.gcr.io/glassy-chalice-129514/migrate-lamp
            name: migrate-lamp
            ports:
            - containerPort: 3306
            - containerPort: 80
            - containerPort: 22
            securityContext:
              privileged: true
            volumeMounts:
            - mountPath: /var/log/httpd
              name: migrate-lamp-var-log-httpd
            - mountPath: /var/www/html
              name: migrate-lamp-var-www-html
            - mountPath: /var/log/mariadb
              name: migrate-lamp-var-log-mariadb
            - mountPath: /var/lib/mysql
              name: migrate-lamp-var-lib-mysql
          volumes:
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-0
            name: migrate-lamp-var-log-httpd
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-2
            name: migrate-lamp-var-www-html
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-1
            name: migrate-lamp-var-log-mariadb
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-3
            name: migrate-lamp-var-lib-mysql
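    One property worth sanity-checking in a generated spec like this is that every `volumeMount` references a declared volume, since the disks are attached by name rather than by position. A small illustrative check (the spec fragment is hand-copied from the file above, not loaded from it):

    ```python
    # volumeMounts and volumes as they appear in /tmp/gcp-deployment.yaml.
    volume_mounts = [
        {"mountPath": "/var/log/httpd",   "name": "migrate-lamp-var-log-httpd"},
        {"mountPath": "/var/www/html",    "name": "migrate-lamp-var-www-html"},
        {"mountPath": "/var/log/mariadb", "name": "migrate-lamp-var-log-mariadb"},
        {"mountPath": "/var/lib/mysql",   "name": "migrate-lamp-var-lib-mysql"},
    ]
    volumes = [
        {"name": "migrate-lamp-var-log-httpd",   "gcePersistentDisk": {"pdName": "migrate-lamp-0"}},
        {"name": "migrate-lamp-var-www-html",    "gcePersistentDisk": {"pdName": "migrate-lamp-2"}},
        {"name": "migrate-lamp-var-log-mariadb", "gcePersistentDisk": {"pdName": "migrate-lamp-1"}},
        {"name": "migrate-lamp-var-lib-mysql",   "gcePersistentDisk": {"pdName": "migrate-lamp-3"}},
    ]

    declared = {v["name"] for v in volumes}
    missing = [m["name"] for m in volume_mounts if m["name"] not in declared]
    assert not missing, f"volumeMounts without a matching volume: {missing}"
    print("all volumeMounts resolve to a declared volume")
    ```

    Note that the `pdName` order (0, 2, 1, 3) differs from the mount order; the `name` field, not position, is what ties a mount to its disk.
    
    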

    Conclusion

    Migrations are always hard for IT and development teams. At Velotio, we have been helping customers migrate to cloud and container platforms using streamlined processes and automation. Feel free to reach out to us at contact@rsystems.com to learn more about our cloud and container adoption and migration offerings.