Category: Cloud & DevOps

  • The Ultimate Guide to Disaster Recovery for Your Kubernetes Clusters

Kubernetes allows us to run containerized applications at scale without drowning in the details of application load balancing. You can ensure high availability for your applications running on Kubernetes by running multiple replicas (pods) of each application. All the complexity of container orchestration is hidden away safely so that you can focus on developing your application instead of deploying it. Learn more about high availability of Kubernetes clusters and how you can use kubeadm for high availability in Kubernetes here.

But using Kubernetes has its own challenges, and getting it up and running takes some real work. If you are not familiar with getting Kubernetes up and running, you might want to take a look here.

    Kubernetes allows us to have a zero downtime deployment, yet service interrupting events are inevitable and can occur at any time. Your network can go down, your latest application push can introduce a critical bug, or in the rarest case, you might even have to face a natural disaster.

    When you are using Kubernetes, sooner or later, you need to set up a backup. In case your cluster goes into an unrecoverable state, you will need a backup to go back to the previous stable state of the Kubernetes cluster.

    Why Backup and Recovery?

    There are three reasons why you need a backup and recovery mechanism in place for your Kubernetes cluster. These are:

1. To recover from disasters: for example, someone accidentally deleted the namespace where your deployments reside.
2. To replicate the environment: you want to replicate your production environment to a staging environment before any major upgrade.
3. To migrate a Kubernetes cluster: let’s say you want to migrate your Kubernetes cluster from one environment to another.

    What to Backup?

Now that you know why, let’s see what exactly you need to back up. The two things you need to back up are:

1. Your Kubernetes control plane state is stored in etcd, so you need to back up the etcd state to capture all the Kubernetes resources.
2. If you have stateful containers (which you will have in the real world), you need a backup of the persistent volumes as well.

    How to Backup?

There have been various tools, like Heptio Ark (now Velero) and kube-backup, to back up and restore Kubernetes clusters on cloud providers. But what if you are not using a managed Kubernetes cluster? You might have to get your hands dirty if you are running Kubernetes on bare metal, just like we are.

We are running a three-master Kubernetes cluster with a three-member etcd cluster, one member on each master. If we lose one master, we can still recover because etcd quorum is intact. But if we lose two masters, we need a mechanism to recover from that situation as well for production-grade clusters.
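To see why losing two masters is unrecoverable without a backup, recall that etcd stays writable only while a quorum of members is up; for an n-member cluster that is floor(n/2)+1. A quick sketch:

```shell
# etcd quorum for an n-member cluster is floor(n/2) + 1.
# A 3-member cluster has quorum 2, so it tolerates exactly one failed
# member; losing two masters therefore loses quorum.
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 3   # -> 2 (tolerates 1 failure)
quorum 5   # -> 3 (tolerates 2 failures)
```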

Want to know how to set up a multi-master Kubernetes cluster? Keep reading!

    Taking etcd backup:

The mechanism for taking an etcd backup depends on how the etcd cluster is set up in your Kubernetes environment.

There are two ways to set up an etcd cluster in a Kubernetes environment:

1. Internal etcd cluster: you run etcd as containers/pods inside the Kubernetes cluster, and it is Kubernetes’ responsibility to manage those pods.
2. External etcd cluster: you run the etcd cluster outside of Kubernetes, mostly in the form of Linux services, and provide its endpoints to the Kubernetes cluster to write to.

    Backup Strategy for Internal Etcd Cluster:

To take a backup from inside an etcd pod, we will use the Kubernetes CronJob functionality, which does not require any etcdctl client to be installed on the host.

Following is the definition of a Kubernetes CronJob that will take an etcd backup every minute:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: backup
  namespace: kube-system
spec:
  # activeDeadlineSeconds: 100
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            # Same image as in /etc/kubernetes/manifests/etcd.yaml
            image: k8s.gcr.io/etcd:3.2.24
            env:
            - name: ETCDCTL_API
              value: "3"
            command: ["/bin/sh"]
            args: ["-c", "etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key snapshot save /backup/etcd-snapshot-$(date +%Y-%m-%d_%H:%M:%S_%Z).db"]
            volumeMounts:
            - mountPath: /etc/kubernetes/pki/etcd
              name: etcd-certs
              readOnly: true
            - mountPath: /backup
              name: backup
          restartPolicy: OnFailure
          hostNetwork: true
          volumes:
          - name: etcd-certs
            hostPath:
              path: /etc/kubernetes/pki/etcd
              type: DirectoryOrCreate
          - name: backup
            hostPath:
              path: /data/backup
              type: DirectoryOrCreate

    Backup Strategy for External Etcd Cluster:

If you are running your etcd cluster on Linux hosts as a service, you should set up a Linux cron job to back up your cluster.

Run the following command to save an etcd snapshot:

    ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT snapshot save /path/for/backup/snapshot.db
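In practice, that command would be run from cron. A sketch of such a crontab entry is below; the schedule, etcdctl path, endpoint, and backup directory are assumptions to adapt to your environment (note that % must be escaped as \% inside a crontab):

```shell
# Write an illustrative crontab entry to a file for review; the paths,
# endpoint, and hourly schedule here are placeholders, not prescriptions.
cat > /tmp/etcd-backup.cron <<'EOF'
0 * * * * ETCDCTL_API=3 /usr/local/bin/etcdctl --endpoints https://127.0.0.1:2379 snapshot save /var/backup/etcd-snapshot-$(date +\%F_\%H\%M).db
EOF
# Review the entry, then install it with: crontab /tmp/etcd-backup.cron
cat /tmp/etcd-backup.cron
```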

    Disaster Recovery

Now, let’s say the Kubernetes cluster has gone down completely and we need to recover it from the etcd snapshot.

Normally, you would start the etcd cluster and run kubeadm init on the master node with the etcd endpoints.

Make sure you put the backed-up certificates into the /etc/kubernetes/pki folder before running kubeadm init, so that it picks up the same certificates.

    Restore Strategy for Internal Etcd Cluster:

docker run --rm \
-v '/data/backup:/backup' \
-v '/var/lib/etcd:/var/lib/etcd' \
--env ETCDCTL_API=3 \
k8s.gcr.io/etcd:3.2.24 \
/bin/sh -c "etcdctl snapshot restore '/backup/etcd-snapshot-2018-12-09_11:12:05_UTC.db' ; mv /default.etcd/member/ /var/lib/etcd/"

kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd

Restore Strategy for External Etcd Cluster:

Restore etcd on the three nodes using the following commands:

ETCDCTL_API=3 etcdctl snapshot restore snapshot-188.db \
--name master-0 \
--initial-cluster master-0=http://10.0.1.188:2380,master-1=http://10.0.1.136:2380,master-2=http://10.0.1.155:2380 \
--initial-cluster-token my-etcd-token \
--initial-advertise-peer-urls http://10.0.1.188:2380

ETCDCTL_API=3 etcdctl snapshot restore snapshot-136.db \
--name master-1 \
--initial-cluster master-0=http://10.0.1.188:2380,master-1=http://10.0.1.136:2380,master-2=http://10.0.1.155:2380 \
--initial-cluster-token my-etcd-token \
--initial-advertise-peer-urls http://10.0.1.136:2380

ETCDCTL_API=3 etcdctl snapshot restore snapshot-155.db \
--name master-2 \
--initial-cluster master-0=http://10.0.1.188:2380,master-1=http://10.0.1.136:2380,master-2=http://10.0.1.155:2380 \
--initial-cluster-token my-etcd-token \
--initial-advertise-peer-urls http://10.0.1.155:2380

The above three commands will give you three restored folders on the three nodes, named master-0.etcd, master-1.etcd, and master-2.etcd.

Now, stop the etcd service on all the nodes, replace each node’s etcd data folder with its restored folder, and start the etcd service again. Now you can see all the nodes, but after some time you will notice that only the master node is Ready while the other nodes have gone into the NotReady state. You need to join those two nodes again using the existing ca.crt file (you should have a backup of it).

    Run the following command on master node:

    kubeadm token create --print-join-command

It will give you the kubeadm join command. Add an --ignore-preflight-errors flag and run that command on the other two nodes to bring them into the Ready state.

    Conclusion

One way to deal with master failure is to set up a multi-master Kubernetes cluster, but even that does not let you eliminate Kubernetes etcd backup and restore entirely, since it is still possible to accidentally destroy data in an HA environment.

    Need help with disaster recovery for your Kubernetes Cluster? Connect with the experts at Velotio!

    For more insights into Kubernetes Disaster Recovery check out here.

  • Simplifying MySQL Sharding with ProxySQL: A Step-by-Step Guide

    Introduction:

    ProxySQL is a powerful SQL-aware proxy designed to sit between database servers and client applications, optimizing database traffic with features like load balancing, query routing, and failover. This article focuses on simplifying the setup of ProxySQL, especially for users implementing data-based sharding in a MySQL database.

    What is Sharding?

    Sharding involves partitioning a database into smaller, more manageable pieces called shards based on certain criteria, such as data attributes. ProxySQL supports data-based sharding, allowing users to distribute data across different shards based on specific conditions.
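As a toy illustration of the idea (the host names and continent values here are placeholders that mirror the example used later in this guide), data-based sharding ultimately boils down to a routing lookup on a data attribute:

```shell
# Toy data-based sharding: pick a shard from the value of a data
# attribute (here, a continent). Host names are illustrative only.
route_shard() {
  case $1 in
    'Asia'|'North America')   echo mysql_host_1 ;;
    'Africa'|'South America') echo mysql_host_2 ;;
    'Europe')                 echo mysql_host_3 ;;
    *)                        echo default_host ;;
  esac
}

route_shard 'Europe'   # -> mysql_host_3
route_shard 'Asia'     # -> mysql_host_1
```

ProxySQL does essentially this, except the "attribute" is extracted from the SQL text by regex-based query rules, as we will see below.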

    Understanding the Need for ProxySQL:

    ProxySQL is an intermediary layer that enhances database management, monitoring, and optimization. With features like data-based sharding, ProxySQL is an ideal solution for scenarios where databases need to be distributed based on specific data attributes, such as geographic regions.

Installation & Setup:

ProxySQL can be installed in two ways: via packages or by running it in a Docker container. For this guide, we will focus on the Docker installation.

    1. Install ProxySQL and MySQL Docker Images:

    To start, pull the necessary Docker images for ProxySQL and MySQL using the following commands:

    docker pull mysql:latest
    docker pull proxysql/proxysql

    2. Create Docker Network:

    Create a Docker network for communication between MySQL containers:

    docker network create multi-tenant-network

Note: The ProxySQL setup needs connections to multiple MySQL servers, so we will set up multiple MySQL containers inside a Docker network.

    Containers within the same Docker network can communicate with each other using their container names or IP addresses.

    You can check the list of all the Docker networks currently present by running the following command:

    docker network ls

    3. Set Up MySQL Containers:

    Now, create three MySQL containers within the network:

    Note: We can create any number of MySQL containers.

    docker run -d --name mysql_host_1 --network=multi-tenant-network -p 3307:3306 -e MYSQL_ROOT_PASSWORD=pass123 mysql:latest 
    docker run -d --name mysql_host_2 --network=multi-tenant-network -p 3308:3306 -e MYSQL_ROOT_PASSWORD=pass123 mysql:latest 
    docker run -d --name mysql_host_3 --network=multi-tenant-network -p 3309:3306 -e MYSQL_ROOT_PASSWORD=pass123 mysql:latest

    Note: Adjust port numbers as necessary. 

The default MySQL protocol port is 3306, but since we cannot expose all three of our MySQL containers on the same host port, we have mapped them to 3307, 3308, and 3309. Internally, all the MySQL containers still connect using port 3306.

--network=multi-tenant-network specifies that the container should be created inside the specified network.

We have also specified the root password used to log in to each MySQL container; the username is “root” and the password is “pass123” for all three of them.

    After running the above three commands, three MySQL containers will start running inside the network. You can connect to these three hosts using host = localhost or 127.0.0.1 and port = 3307 / 3308 / 3309.

To check that a port is reachable, use the following command:

for macOS:

nc -zv 127.0.0.1 3307

for Windows (PowerShell):

Test-NetConnection 127.0.0.1 -Port 3307

for Linux:

telnet 127.0.0.1 3307

    Reference Image

    4. Create Users in MySQL Containers:

    Create “user_shard” and “monitor” users in each MySQL container.

    The “user_shard” user will be used by the proxy to make queries to the DB.

    The “monitor” user will be used by the proxy to monitor the DB.

    Note: To access the MySQL container mysql_host_1, use the command:

    docker exec -it mysql_host_1 mysql -uroot -ppass123

    Use the following commands inside the MySQL container to create the user:

    CREATE USER 'user_shard'@'%' IDENTIFIED BY 'pass123'; 
    GRANT ALL PRIVILEGES ON *.* TO 'user_shard'@'%' WITH GRANT OPTION; 
    FLUSH PRIVILEGES;
    
    CREATE USER monitor@'%' IDENTIFIED BY 'pass123'; 
    GRANT ALL PRIVILEGES ON *.* TO monitor@'%' WITH GRANT OPTION; 
    FLUSH PRIVILEGES;

    Repeat the above steps for mysql_host_2 & mysql_host_3.

    If, at any point, you need to drop the user, you can use the following command:

DROP USER monitor@'%';

    5. Prepare ProxySQL Configuration:

    To prepare the configuration, we will need the IP addresses of the MySQL containers. To find those, we can use the following command:

docker inspect mysql_host_1
docker inspect mysql_host_2
docker inspect mysql_host_3

Running these commands prints the details of each MySQL container; the “IPAddress” field under your network is the IP address of that particular MySQL container.

    Example:
    mysql_host_1: 172.19.0.2

    mysql_host_2: 172.19.0.3

    mysql_host_3: 172.19.0.4

    Reference image for IP address of mysql_host_1: 172.19.0.2

    Now, create a ProxySQL configuration file named proxysql.cnf. Include details such as IP addresses of MySQL containers, administrative credentials, and MySQL users.

    Below is the content that needs to be added to the proxysql.cnf file:

    datadir="/var/lib/proxysql"
    
    admin_variables=
    {
        admin_credentials="admin:admin;radmin:radmin"
        mysql_ifaces="0.0.0.0:6032"
        refresh_interval=2000
        hash_passwords=false
    }
    
    mysql_variables=
    {
        threads=4
        max_connections=2048
        default_query_delay=0
        default_query_timeout=36000000
        have_compress=true
        poll_timeout=2000
        interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
        default_schema="information_schema"
        stacksize=1048576
        server_version="5.1.30"
        connect_timeout_server=10000
        monitor_history=60000
        monitor_connect_interval=200000
        monitor_ping_interval=200000
        ping_interval_server_msec=10000
        ping_timeout_server=200
        commands_stats=true
        sessions_sort=true
        monitor_username="monitor"
        monitor_password="pass123"
    }
    
    mysql_servers =
    (
        { address="172.19.0.2" , port=3306 , hostgroup=10, max_connections=100 },
        { address="172.19.0.3" , port=3306 , hostgroup=20, max_connections=100 },
        { address="172.19.0.4" , port=3306 , hostgroup=30, max_connections=100 }
    )
    
    
    mysql_users =
    (
        { username = "user_shard" , password = "pass123" , default_hostgroup = 10 , active = 1 },
        { username = "user_shard" , password = "pass123" , default_hostgroup = 20 , active = 1 },
        { username = "user_shard" , password = "pass123" , default_hostgroup = 30 , active = 1 }
    )

    Most of the settings are default; we won’t go into much detail for each setting. 

    admin_variables: These variables are used for ProxySQL’s administrative interface. It allows you to connect to ProxySQL and perform administrative tasks such as configuring runtime settings, managing servers, and monitoring performance.

Within mysql_variables, monitor_username and monitor_password specify the user that ProxySQL will use when connecting to MySQL servers for monitoring purposes. This monitoring user executes queries and gathers statistics about the health and performance of the MySQL servers. This is the user we created during step 4.

mysql_servers contains all the MySQL servers we want ProxySQL to connect to. Each entry has the IP address of the MySQL container, the port, a host group, and max_connections. mysql_users holds all the users we created during step 4.

6. Run ProxySQL Container:

    Inside the same directory where the proxysql.cnf file is located, run the following command to start ProxySQL:

    docker run -d --rm -p 6032:6032 -p 6033:6033 -p 6080:6080 --name=proxysql --network=multi-tenant-network -v $PWD/proxysql.cnf:/etc/proxysql.cnf proxysql/proxysql

    Here, port 6032 is used for ProxySQL’s administrative interface. It allows you to connect to ProxySQL and perform administrative tasks such as configuring runtime settings, managing servers, and monitoring performance.

    Port 6033 is the default port for ProxySQL’s MySQL protocol interface. It is used for handling MySQL client connections. Our application will use it to access the ProxySQL db and make SQL queries.

    The above command will make ProxySQL run on our Docker with the configuration provided in the proxysql.cnf file.

    Inside ProxySQL Container:

7. Access ProxySQL Admin Console:

    Now, to access the ProxySQL Docker container, use the following command:

    docker exec -it proxysql bash

    Now, once you’re inside the ProxySQL Docker container, you can access the ProxySQL admin console using the command:

    mysql -u admin -padmin -h 127.0.0.1 -P 6032

    You can run the following queries to get insights into your ProxySQL server:

    i) To get the list of all the connected MySQL servers:

    SELECT * FROM mysql_servers;

    ii) Verify the status of the MySQL backends in the monitor database tables in ProxySQL admin using the following command:

    SHOW TABLES FROM monitor;


If this returns an empty set, it means that the monitor username and password are not set correctly. You can set them using the commands below:

UPDATE global_variables SET variable_value='monitor' WHERE variable_name='mysql-monitor_username'; 
UPDATE global_variables SET variable_value='pass123' WHERE variable_name='mysql-monitor_password';
LOAD MYSQL VARIABLES TO RUNTIME; 
SAVE MYSQL VARIABLES TO DISK;

Then restart the ProxySQL Docker container.

    iii) Check the status of DBs connected to ProxySQL using the following command:

    SELECT * FROM monitor.mysql_server_connect_log ORDER BY time_start_us DESC;

    iv) To get a list of all the ProxySQL global variables, use the following command:

    SELECT * FROM global_variables; 

    v) To get all the queries made on ProxySQL, use the following command:

SELECT * FROM stats_mysql_query_digest;

Note: Whenever we change any of these rows, use the commands below to load the changes:

    Change in variables:

    LOAD MYSQL VARIABLES TO RUNTIME; 
    SAVE MYSQL VARIABLES TO DISK;
    
    Change in mysql_servers:
    LOAD MYSQL SERVERS TO RUNTIME;
    SAVE MYSQL SERVERS TO DISK;
    
    Change in mysql_query_rules:
    LOAD MYSQL QUERY RULES TO RUNTIME;
    SAVE MYSQL QUERY RULES TO DISK;

Then restart the ProxySQL Docker container.

    IMPORTANT:

    To connect to ProxySQL’s admin console, first get into the Docker container using the following command:

    docker exec -it proxysql bash

    Then, to access the ProxySQL admin console, use the following command:

    mysql -u admin -padmin -h 127.0.0.1 -P6032

    To access the ProxySQL MySQL console, we can directly access it using the following command without going inside the Docker ProxySQL container:

    mysql -u user_shard -ppass123 -h 127.0.0.1 -P6033

    To make queries to the database, we make use of ProxySQL’s 6033 port, where MySQL is being accessed.

8. Define Query Rules:

    We can add custom query rules inside the mysql_query_rules table to redirect queries to specific databases based on defined patterns. Load the rules to runtime and save to disk.

9. Sharding Example:

    Now, let’s illustrate how to leverage ProxySQL’s data-based sharding capabilities through a practical example. We’ll create three MySQL containers, each containing data from different continents in the “world” database, specifically within the “countries” table.

    Step 1: Create 3 MySQL containers named mysql_host_1, mysql_host_2 & mysql_host_3.

    Inside all containers, create a database named “world” with a table named “countries”.

    i) Inside mysql_host_1: Insert countries using the following query:

    INSERT INTO `countries` VALUES (1,'India','Asia'),(2,'Japan','Asia'),(3,'China','Asia'),(4,'USA','North America'),(5,'Cuba','North America'),(6,'Honduras','North America');

    ii) Inside mysql_host_2: Insert countries using the following query:

INSERT INTO `countries` VALUES (1,'Kenya','Africa'),(2,'Ghana','Africa'),(3,'Morocco','Africa'),(4,'Brazil','South America'),(5,'Chile','South America'),(6,'Argentina','South America');

    iii) Inside mysql_host_3: Insert countries using the following query:

INSERT INTO `countries` VALUES (1,'Italy','Europe'),(2,'Germany','Europe'),(3,'France','Europe');

Now, we have distinct data sets: Asia & North America in mysql_host_1, Africa & South America in mysql_host_2, and Europe in mysql_host_3.

    Step 2: Define Query Rules for Sharding

    Let’s create custom query rules to redirect queries based on the continent specified in the SQL statement.

    For example, if the query contains the continent “Asia,” we want it to be directed to mysql_host_1.

-- Query Rule for Asia and North America 

INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, destination_hostgroup, apply) VALUES (10, 1, 'user_shard', "\s*continent\s*=\s*.*?(Asia|North America).*?\s*", 10, 0);

-- Query Rule for Africa and South America

INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, destination_hostgroup, apply) VALUES (20, 1, 'user_shard', "\s*continent\s*=\s*.*?(Africa|South America).*?\s*", 20, 0);

-- Query Rule for Europe 

INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, destination_hostgroup, apply) VALUES (30, 1, 'user_shard', "\s*continent\s*=\s*.*?(Europe).*?\s*", 30, 0);
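Before loading rules like these, it can help to sanity-check what a match_pattern would catch. The sketch below approximates the three rules with grep -E; the patterns are simplified POSIX-ERE stand-ins (using [[:space:]] instead of \s and a greedy .*), not ProxySQL's actual regex engine:

```shell
# Approximate the ProxySQL match_pattern checks with grep -E to see
# which hostgroup a query would be routed to. Patterns here are
# simplified stand-ins for the rules defined above.
route() {
  if   echo "$1" | grep -Eq "continent[[:space:]]*=[[:space:]]*.*(Asia|North America)"; then echo 10
  elif echo "$1" | grep -Eq "continent[[:space:]]*=[[:space:]]*.*(Africa|South America)"; then echo 20
  elif echo "$1" | grep -Eq "continent[[:space:]]*=[[:space:]]*.*(Europe)"; then echo 30
  else echo default
  fi
}

route "SELECT * FROM countries WHERE continent = 'Asia'"    # -> 10
route "SELECT * FROM countries WHERE continent = 'Europe'"  # -> 30
```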

    Step 3: Apply and Save Query Rules

    After adding the query rules, ensure they take effect by running the following commands:

    LOAD MYSQL QUERY RULES TO RUNTIME; 
    SAVE MYSQL QUERY RULES TO DISK;

    Step 4: Test Sharding

    Now, access the MySQL server using the ProxySQL port and execute queries:

    mysql -u user_shard -ppass123 -h 127.0.0.1 -P 6033

    use world;

-- Example Queries:

SELECT * FROM countries WHERE id = 1 AND continent = 'Asia';

-- This will return id=1, name=India, continent=Asia

SELECT * FROM countries WHERE id = 1 AND continent = 'Africa';

-- This will return id=1, name=Kenya, continent=Africa.

SELECT * FROM countries WHERE id = 1 AND continent = 'Europe';

-- This will return id=1, name=Italy, continent=Europe.

    Based on the defined query rules, the queries will be redirected to the specified MySQL host groups. If no rules match, the default host group that’s specified in mysql_users inside proxysql.cnf will be used.

    Conclusion:

    ProxySQL simplifies access to distributed data through effective sharding strategies. Its flexible query rules, combined with regex patterns and host group definitions, offer significant flexibility with relative simplicity.

    By following this step-by-step guide, users can quickly set up ProxySQL and leverage its capabilities to optimize database performance and achieve efficient data distribution.

    References:

    Download and Install ProxySQL – ProxySQL

    How to configure ProxySQL for the first time – ProxySQL

    Admin Variables – ProxySQL

  • Streamline Kubernetes Storage Upgrades

    Introduction:

    As technology advances, organizations are constantly seeking ways to optimize their IT infrastructure to enhance performance, reduce costs, and gain a competitive edge. One such approach involves migrating from traditional storage solutions to more advanced options that offer superior performance and cost-effectiveness. 

    In this blog post, we’ll explore a recent project (On Azure) where we successfully migrated our client’s applications from Disk type Premium SSD to Premium SSD v2. This migration led to performance improvements and cost savings for our client.

    Prerequisites:

    Before initiating this migration, ensure the following prerequisites are in place:

    1. Kubernetes Cluster: Ensure you have a working K8S cluster to host your applications.
    2. Velero Backup Tool: Install Velero, a widely-used backup and restoration tool tailored for Kubernetes environments.

    Overview of Velero:

    Velero stands out as a powerful tool designed for robust backup, restore, and migration solutions within Kubernetes clusters. It plays a crucial role in ensuring data safety and continuity during complex migration operations.

    Refer to the article on Velero installation and configuration.

    Strategic Plan Overview:

There are two methods for upgrading storage classes:

    • Migration via Velero and CSI Integration: 

    This approach leverages Velero’s capabilities in conjunction with CSI integration to achieve a seamless and efficient migration.

    • Using Cloud Methods: 

    This method involves leveraging cloud provider-specific procedures. It includes steps like taking a snapshot of the disk, creating a new disk from the snapshot, and then establishing a Kubernetes volume using disk referencing. 

    Step-by-Step Guide:

    Migration via Velero and CSI Integration:

Step 1: Storage Class for Premium SSD v2

    Define a new storage class that supports Azure Premium SSD v2 disks. This storage class will be used to provision new persistent volumes during the restore process.

    # We have taken azure storage class example
    
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
     name: premium-ssd-v2
    parameters:
     cachingMode: None
     skuName: PremiumV2_LRS # (Disk Type)
    provisioner: disk.csi.azure.com
    reclaimPolicy: Delete
    volumeBindingMode: WaitForFirstConsumer
    allowVolumeExpansion: true

    Step 2: Volume Snapshot Class

    Introduce a Volume Snapshot Class to enable snapshot creation for persistent volumes. This class will be utilized for capturing the current state of persistent volumes before restoring them using Premium SSD v2.

    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshotClass
    metadata:
      name: disk-snapshot-class
    driver: disk.csi.azure.com
    deletionPolicy: Delete
    parameters:
      incremental: "false"

    Step 3: Update Velero Deployment and Daemonset

    Enable CSI (Container Storage Interface) support in both the Velero deployment and the node-agent daemonset. This modification allows Velero to interact with the Cloud Disk CSI driver for provisioning and managing persistent volumes. Additionally, configure the Velero client to utilize the CSI plugin, ensuring that Velero utilizes the Cloud Disk CSI driver for backup and restore operations.

    # Enable CSI Server side features 
    
$ kubectl -n velero edit deployment/velero
    $ kubectl -n velero edit daemonset/restic
    
# Add the --features=EnableCSI flag below in both resources
    
        spec:
          containers:
          - args:
            - server
            - --features=EnableCSI
    
    # Enable client side features 
    
    $ velero client config set features=EnableCSI

    Step 4: Take Velero Backup

Create a Velero backup of all existing persistent volumes stored on Premium SSD disks. These backups serve as a safety net in case of any unforeseen issues during the migration process. You can use the include and exclude flags with the velero backup commands to scope the backup.

    Reference Article : https://velero.io/docs/v1.12/resource-filtering 

    # run the below command for taking backup 
    $ velero backup create backup_name --include-namespaces namespace_name

    Step 5: ConfigMap Deployment 

    Deploy a ConfigMap in the Velero namespace. This ConfigMap defines the mapping between the old storage class (Premium SSD) and the new storage class (Premium SSD v2). During the restore process, Velero will use this mapping to recreate the persistent volumes using the new storage class.

    apiVersion: v1
    data:
  # Map the old storage class name to the new one, e.g.:
  # managed-premium: premium-ssd-v2
  old-storage-class-name: new-storage-class-name
    kind: ConfigMap
    metadata:
      labels:
        velero.io/change-storage-class: RestoreItemAction
        velero.io/plugin-config: ""
      name: storage-class-config
      namespace: velero

    Step 6: Velero Restore Operation

    Initiate the Velero restore process. This will replace the existing persistent volumes with new ones provisioned using Disk Premium SSD v2. The ConfigMap will ensure that the restored persistent volumes utilize the new storage class. 

    Reference article: https://velero.io/docs/v1.12/restore-reference 

    # run the below command for restoring from backups to different namespace 
    $ velero restore create restore-name --from-backup backup-name --namespace-mappings namespace1:namespace2
    # verify the new restored resources in namespace2
    $ kubectl get pvc,pv,pod -n namespace2

    Step 7: Verification & Testing

    Verify that all applications continue to function correctly after the restore process. Check for any performance improvements and cost savings as a result of the migration to Premium SSD v2.

    Step 8: Post-Migration Cleanup

Remove any temporary resources created during the migration process, such as the volume snapshots and the custom Volume Snapshot Class. Then delete the old persistent volume claims (PVCs) that were associated with the Premium SSD disks. This will trigger the automatic deletion of the corresponding persistent volumes (PVs) and Azure Disk storage.

    Impact:

This approach is less risky because all new objects are created while the snapshots retain copies of the old data. During scheduling of new pods, the new Premium SSD v2 disks will be provisioned in the same zone as the node where each pod is scheduled. Some downtime is expected while the content of the new disks is restored from the snapshots; its duration depends on the size of the disks being restored.

    Conclusion:

Migrating from any storage class to a newer, more performant one using Velero can provide significant benefits for your organization, whether you’re upgrading from Premium SSD to Premium SSD v2 or transitioning to a completely different storage provider. By leveraging Velero’s comprehensive backup and restore functionality, you can migrate your applications to the new storage class while maintaining data integrity and application functionality. By adopting this approach, organizations can reap the rewards of enhanced performance, reduced costs, and simplified storage management.

  • Unlocking Key Insights in NATS Development: My Journey from Novice to Expert – Part 1

    By examining my personal journey from a NATS novice to mastering its intricacies, this long-form article aims to showcase the importance and applicability of NATS in the software development landscape. Through comprehensive exploration of various topics, readers will gain a solid foundation, advanced techniques, and best practices for leveraging NATS effectively in their projects.

    Introduction

Today’s topic is how to get started with NATS. We assume that you already know why you need NATS and want to learn its concepts, along with a walkthrough of how to deploy those concepts/components in your organization.

The first part covers the basic concepts, an installation and setup guide, admin-related CRUD operations, and shell scripts that might not be needed immediately but are good to have in your arsenal. The second part will be more developer-focused: applying NATS in an application, and so on. Let’s begin.

    Understanding NATS

    In this section, we will delve into the fundamentals of NATS and its key components.

    A. Definition and Overview

NATS, which originally stood for “Neural Autonomic Transport System,” is a lightweight, high-performance messaging system known for its simplicity and scalability. It enables the exchange of messages between applications in a distributed architecture, allowing for seamless communication and increased efficiency.

    B. Architecture Diagram

To better understand the inner workings of NATS, let’s take a closer look at its architecture. The diagram below illustrates the key components involved in a typical NATS deployment:

    C. Key Features

    NATS offers several key features that make it a powerful messaging system. These include:

• Publish-Subscribe Model: NATS follows a publish-subscribe model where publishers send messages to subjects and subscribers receive those messages based on their interest in specific subjects. This model allows for flexible and decoupled communication between different parts of an application or across multiple applications.
    • Scalability: With support for horizontal scaling, NATS can handle high loads of message traffic, making it suitable for large-scale systems.
    • Performance: NATS is built for speed, providing low-latency message delivery and high throughput.
    • Reliability: NATS ensures that messages are reliably delivered to subscribers, even in the presence of network interruptions or failures.
    • Security: NATS supports secure communication through various authentication and encryption mechanisms, protecting sensitive data.

    D. Use Cases and Applications

    NATS’ simplicity and versatility make it suitable for a wide range of use cases and applications. Some common use cases include:

    • Real-time data streaming and processing
    • Event-driven architectures
    • Microservices communication
    • IoT (Internet of Things) systems
    • Distributed systems and cloud-native applications

    E. Concepts

    To better grasp the various components and terminologies associated with NATS, let’s explore some key concepts:

    1. NATS server: The NATS server acts as the central messaging infrastructure, responsible for routing messages between publishers and subscribers.
    2. NATS CLI: The NATS command-line interface (CLI) is a tool that provides developers with a command-line interface to interact with the NATS server and perform various administrative tasks.
3. NATS clients: The NATS CLI and NATS clients are different things. A NATS client is an API/code-based way to access the NATS server. Clients are not as powerful as the CLI and are mainly used within source code to achieve a specific goal. We won’t be covering them, as they are out of scope here.
    4. Routes: Routes allow NATS clusters to bridge and share messages with other nodes within and outside clusters, enabling communication across geographically distributed systems.
    5. Accounts: Accounts in NATS provide isolation and access control mechanisms, ensuring that messages are exchanged securely and only between authorized parties.
    6. Gateway: Gateways list all the servers in different clusters that you want to connect in order to create a supercluster.
    7. SuperCluster: SuperCluster is a powerful feature that allows scaling NATS horizontally across multiple clusters, providing enhanced performance and fault tolerance.

    F. System Requirements

    Before diving into NATS, it’s important to ensure that our system meets the necessary requirements. The system requirements for NATS will vary depending on the specific deployment scenario and use case. However, in general, the minimum requirements include:

    Hardware:

    Network:

    • All the VMs should be part of the same cluster.
• Ports 4222, 8222, 4248, and 7222 should be open for inter-server and client connections.
    • Whitelisting of GitHub EMU account on prod servers (Phase 2).
    • Get AVI VIP for all the clusters from the network team.

    Logs:

    By default, logs will be disabled, but the configuration file will have placeholders for logs enablement. Some of the important changes include:

• debug: Shows system logs in verbose mode.
• trace: Records every message processed on NATS.
• logtime, logfile_size_limit, log_file: As the names suggest, these control whether timestamps are recorded in the logs, the size limit for individual log files (once a file fills up, rotation is done automatically by NATS), and the name of the log file, respectively.
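For reference, here is what these options might look like once enabled in nats.conf (the path and size limit are illustrative placeholders, not values from this setup):

```
# Illustrative logging block for /nats/nats.conf
debug: false                      # verbose system logs
trace: false                      # record every message processed (very noisy)
logtime: true                     # timestamp each log line
log_file: "/nats/logs/nats.log"   # log file name
logfile_size_limit: 100MB         # rotate once an individual file reaches this size
```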

    TLS:

I will show how to configure and use the certs. Keep in mind that this setup is for a development environment, which allows more flexibility in explaining and executing things.

    Getting Started with NATS

    In this section, we will guide you through the installation and setup process for NATS.

    Building the Foundation

    First, we will focus on building a strong foundation in NATS by understanding its core concepts and implementing basic messaging patterns.

    A. Understanding NATS Subjects

    In NATS, subjects serve as identifiers that help publishers and subscribers establish communication channels. They are represented as hierarchical strings, allowing for flexibility in message routing and subscription matching.

    B. Exploring Messages, Publishers, and Subscribers

    Messages are the units of data exchanged between applications through NATS. Publishers create and send messages, while subscribers receive and process them based on their subscribed subjects of interest.

    C. Implementing Basic Pub/Sub Pattern

    The publish-subscribe pattern is a fundamental messaging pattern in NATS. It allows publishers to distribute messages to multiple subscribers interested in specific subjects, enabling decoupled and efficient communication between different parts of the system.
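To make the pattern concrete, here is a toy, in-memory sketch of pub/sub in Python. It is purely illustrative – a real application would use a NATS client library talking to nats-server, and ToyBroker and the subject names are invented for this example:

```python
# Toy, in-memory model of NATS pub/sub -- illustrative only. A real app
# would use a NATS client library (e.g. nats-py) against a nats-server.
from collections import defaultdict

class ToyBroker:
    def __init__(self):
        # subject -> list of subscriber callbacks
        self.subscribers = defaultdict(list)

    def subscribe(self, subject, callback):
        self.subscribers[subject].append(callback)

    def publish(self, subject, msg):
        # every subscriber interested in this subject receives the message
        for cb in self.subscribers[subject]:
            cb(msg)

broker = ToyBroker()
seen = []
broker.subscribe("orders.created", lambda m: seen.append(("svc-a", m)))
broker.subscribe("orders.created", lambda m: seen.append(("svc-b", m)))
broker.publish("orders.created", "order #42")
# both subscribers received the message; the publisher knows nothing about them
```

The publisher never references its subscribers directly; the subject is the only coupling between them, which is exactly what makes the model decoupled.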

    D. JetStream

    JetStream is an advanced addition to NATS that provides durable, persistent message storage and retention policies. It is designed to handle high-throughput streaming scenarios while ensuring data integrity and fault tolerance.

    E. Single Cluster vs. SuperCluster

    NATS supports both single clusters and superclusters. Single clusters are ideal for smaller deployments, whereas superclusters provide the ability to horizontally scale NATS across multiple clusters, enhancing performance and fault tolerance.

    Implementation

As this blog approaches deploying NATS from an admin perspective, we will be using only shell scripts for this purpose.

    Let’s start with the implementation process:

    Prerequisite

These commands need to be run on all the servers hosting nats-server. In this blog, we will cover a 3-node cluster running JetStream.

    Installing NatsCLI and Nats-server:

    mkdir -p rpm

    # NATSCLI

    curl -o rpm/nats-0.0.35.rpm  -L https://github.com/nats-io/natscli/releases/download/v0.0.35/nats-0.0.35-amd64.rpm

    sudo yum install -y rpm/nats-0.0.35.rpm

    # NATS-server

    curl -o rpm/nats-server-2.9.20.rpm  -L https://github.com/nats-io/nats-server/releases/download/v2.9.20/nats-server-v2.9.20-amd64.rpm

    sudo yum install -y rpm/nats-server-2.9.20.rpm

    Local Machine Setup for JetStream:

    # Create User

sudo useradd --system --home /nats --shell /bin/false nats

    # Jetstream Storage

    sudo mkdir -p /nats/storage

    # Certs

    sudo mkdir -p /nats/certs

    # Logs

    sudo mkdir -p /nats/logs

    # Setting Right Permission

sudo chown --recursive nats:nats /nats

    sudo chmod 777 /nats

    sudo chmod 777 /nats/storage

    Next, we will create the service file in the servers at /etc/systemd/system/nats.service

sudo bash -c 'cat <<EOF > /etc/systemd/system/nats.service

    [Unit]

    Description=NATS Streaming Daemon

    Requires=network-online.target

    After=network-online.target

    ConditionFileNotEmpty=/nats/nats.conf

    [Service]

    #Type=notify

    User=nats

    Group=nats

    ExecStart=/usr/local/bin/nats-server -config=/nats/nats.conf

    #KillMode=process

    Restart=always

    RestartSec=10

    StandardOutput=syslog

    StandardError=syslog

    #TimeoutSec=900

    #LimitNOFILE=65536

    #LimitMEMLOCK=infinity

    [Install]

    WantedBy=multi-user.target

EOF'

    Full File will look like:

    #!/bin/bash
    
    mkdir -p rpm
    
    # NATSCLI
    curl -o rpm/nats-0.0.35.rpm  -L https://github.com/nats-io/natscli/releases/download/v0.0.35/nats-0.0.35-amd64.rpm
    sudo yum install -y rpm/nats-0.0.35.rpm
    
    # NATS-server
    curl -o rpm/nats-server-2.9.20.rpm  -L https://github.com/nats-io/nats-server/releases/download/v2.9.20/nats-server-v2.9.20-amd64.rpm
    sudo yum install -y rpm/nats-server-2.9.20.rpm
    
    # Create User
    sudo useradd --system --home /nats --shell /bin/false nats
    
    # Jetstream Storage
    sudo mkdir -p /nats/storage
    
    # Certs
    sudo mkdir -p /nats/certs
    
    # Logs
    sudo mkdir -p /nats/logs
    
    # Setting Right Permission
    sudo chown --recursive nats:nats /nats
    sudo chmod 777 /nats
    sudo chmod 777 /nats/storage
    sudo bash -c 'cat <<EOF > /etc/systemd/system/nats.service
    [Unit]
    Description=NATS Streaming Daemon
    Requires=network-online.target
    After=network-online.target
    ConditionFileNotEmpty=/nats/nats.conf
    [Service]
    #Type=notify
    User=nats
    Group=nats
    ExecStart=/usr/local/bin/nats-server -config=/nats/nats.conf
    #KillMode=process
    Restart=always
    RestartSec=10
    StandardOutput=syslog
    StandardError=syslog
    #TimeoutSec=900
    #LimitNOFILE=65536
    #LimitMEMLOCK=infinity
    [Install]
    WantedBy=multi-user.target
    EOF'

    Creating conf file at all the servers at /nats directory

    Server setup

    server_name=nts0

    listen: <IP/DNS-First>:4222 # For other servers edit the IP/DNS remaining in the cluster

    https: <DNS-First>:8222

#http: <IP/DNS-First>:8222 # Uncomment this if you are running without tls certs

    JetStream Configuration

    jetstream {

      store_dir=/nats/storage

      max_mem_store: 6GB

      max_file_store: 90GB

    }

    Intra Cluster Setup

    cluster {

      name: dev-nats # Super Cluster should have unique Cluster names

      host: <IP/DNS-First>

      port: 4248

      routes = [

        nats-route://<IP/DNS-First>:4248

        nats-route://<IP/DNS-Second>:4248

        nats-route://<IP/DNS-Third>:4248

      ]

    }

    Account Setup

    accounts: {

      $SYS: {

        users: [

          { user: admin, password: password }

        ]

      },

      B: {

        users: [

          {user: b, password: b}

        ],

        jetstream: enabled,

        imports: [

# {stream: {account: "$G"}}

        ]

      },

      C: {

        users: [

          {user: c, password: c}

        ],

        jetstream: enabled,

        imports: [

        ]

      },

      E: {

        users: [

          {user: e, password: e}

        ],

        jetstream: enabled,

        imports: [

        ]

      }

    }

    no_auth_user: e # Change this on every server to have a user in the system which does not need password, allowing local account in supercluster

We can use “Accounts” to provide local and global stream separation. The configuration is identical except for no_auth_user, which must be unique for each cluster, making the stream accessible only from the given cluster without the need to provide credentials explicitly.

    Gateway Setup: 

The Intra Cluster/Route Setup and Account Setup remain similar and also need to be present in the other cluster, with that cluster having the name “new-dev-nats.”

    gateway {

      name: dev-nats

      listen: <IP/DNS-First>:7222

      gateways: [

        {name: dev-nats, urls: [nats://<IP/DNS-First>:7222, nats://<IP/DNS-Second>:7222, nats://<IP/DNS-Third>:7222]},

        {name: new-dev-nats, urls: [nats://<NEW-IP/DNS-First>:7222, nats://<NEW-IP/DNS-Second>:7222, nats://<NEW-IP/DNS-Third>:7222]}

      ]

    }

    TLS setup

tls: {

  cert_file: "/nats/certs/natsio.crt"

  key_file: "/nats/certs/natsio.key"

  ca_file: "/nats/certs/natsio_rootCA.pem"

}

NOTE: no_auth_user is a special directive within NATS. If you choose to keep it different for each cluster, you can have a “local account” setup in the supercluster. This is beneficial when you want to publish data that should not be accessible by any other cluster.

    Complete conf file on <IP/DNS-First> machine would look like this:

    # `server_name`: Unique name for your node; attaching a number with increment value is recommended
    # listen: DNS name for the current node:4222
    # https: DNS name for the current node:8222
    # cluster.name: This is the name of your cluster. It is compulsory for them to be the same across all nodes.
    # cluster.host: DNS name for the current node
    # cluster.routes: List of all the DNS entries which will be part of the cluster in separate lines:4248
    # account.user: Make sure to use proper names here and also keep the same across all the nodes which will be involved as a super cluster
    # no_auth_user: To be unique for individual cluster
    # gateway.name: Should be for the current cluster the node is part of. (Best to match with cluster.name mentioned above)
    # gateway.listen: The same logic mentioned for listen is applicable here with port 7222
    # gateways:Mention all the nodes in all the cluster here with nodes separated logically by the cluster they are part of via name
    # tls: Make sure to have the certs ready to place at /nats/certs
    
    server_name=nts0
    listen: <IP/DNS-First>:4222 # For other servers edit the IP/DNS remaining in the cluster
    https: <DNS-First>:8222
#http: <IP/DNS-First>:8222 # Uncomment this if you are running without tls certs
    
    jetstream {
      store_dir=/nats/storage
      max_mem_store: 6GB
      max_file_store: 90GB
    }
    
    cluster {
      name: dev-nats # Super Cluster should have unique Cluster names
      host: <IP/DNS-First>
      port: 4248
      routes = [
        nats-route://<IP/DNS-First>:4248
        nats-route://<IP/DNS-Second>:4248
        nats-route://<IP/DNS-Third>:4248
      ]
    }
    
    accounts: {
      $SYS: {
        users: [
          { user: admin, password: password }
        ]
      },
      B: {
        users: [
          {user: b, password: b}
        ],
        jetstream: enabled,
        imports: [
        # {stream: {account: "$G"}}
        ]
      },
      C: {
        users: [
          {user: c, password: c}
        ],
        jetstream: enabled,
        imports: [
        ]
      },
      E: {
        users: [
          {user: e, password: e}
        ],
        jetstream: enabled,
        imports: [
        ]
      }
    }
    
    no_auth_user: e # Change this on every server to have a user in the system which does not need password, allowing local account in supercluster
    
    gateway {
      name: dev-nats
      listen: <IP/DNS-First>:7222
      gateways: [
        {name: dev-nats, urls: [nats://<IP/DNS-First>:7222, nats://<IP/DNS-Second>:7222, nats://<IP/DNS-Third>:7222]},
        {name: new-dev-nats, urls: [nats://<NEW-IP/DNS-First>:7222, nats://<NEW-IP/DNS-Second>:7222, nats://<NEW-IP/DNS-Third>:7222]}
      ]
    }
    
    tls: {
      cert_file: "/nats/certs/natsio.crt"
      key_file: "/nats/certs/natsio.key"
      ca_file: "/nats/certs/natsio_rootCA.pem"
    }

    Recap on Conf File Changes

The configuration file on all the nodes in all the environments will need to be updated to support “gateway” and “accounts.”

• Individual changes to all the conf files need to be made.
• Changes for the gateway will be almost identical except for the name, which is specific to the local cluster the given node is part of.
• Changes for an “account” will be almost identical except for the “no_auth_user” parameter, which is specific to the local cluster the given node is part of.
• The “nats-server --signal reload” command should pick up the changes.

    Starting Service

    After adding the certs, re-own the files:

sudo chown --recursive nats:nats /nats

    Creating firewall rules:

sudo firewall-cmd --permanent --add-port=4222/tcp

sudo firewall-cmd --permanent --add-port=8222/tcp

sudo firewall-cmd --permanent --add-port=4248/tcp

sudo firewall-cmd --permanent --add-port=7222/tcp

sudo firewall-cmd --reload

    Start the service:

    sudo systemctl start nats.service

    sudo systemctl enable nats.service

    Check status:

    sudo systemctl status nats.service -l

Note: Remember to check the status command logs on node2 and node3; they should show a connection with node1 and confirm that node1 has been elected the leader.

    Setting up the context:

    Setting up context will help us in managing our cluster better with NATSCLI.

# pass the --tlsca flag in dev because we do not have the DNS registered. In staging and Production the `tlsca` flag will not be needed because certs will be registered.

nats context add nats --server <IP/DNS-First>:4222,<IP/DNS-Second>:4222,<IP/DNS-Third>:4222 --description "Awesome Nats Servers List" --tlsca /nats/certs/natsio_rootCA.pem --select

    nats context ls

    nats account info

The complete file for starting the service would look like this:

    #!/bin/bash
    
    # Own the files
    sudo chown --recursive nats:nats /nats
    
    # Create Firewall Rules
    sudo firewall-cmd --permanent --add-port=4222/tcp
    sudo firewall-cmd --permanent --add-port=8222/tcp
    sudo firewall-cmd --permanent --add-port=4248/tcp
    sudo firewall-cmd --permanent --add-port=7222/tcp
    sudo firewall-cmd --reload
    
    # Start Service
    sudo systemctl start nats.service
    sudo systemctl enable nats.service
    sudo systemctl status nats.service -l
    
    # Setup Context
    # pass the --tlsca flag in dev because we do not have the DNS registered. In staging and Production the `tlsca` flag will not be needed because certs will be registered.
    nats context add nats --server <IP/DNS-First>:4222,<IP/DNS-Second>:4222,<IP/DNS-Third>:4222 --description "Awesome Nats Servers List" --tlsca /nats/certs/natsio_rootCA.pem --select
    
    nats context ls
    nats account info

    Validation: 

    The account info command should shuffle among the servers in the Connected URL string.

    Stream Listing:

Streams that are available across regions require credentials. The creds should be common across all clusters:

    The same info can be obtained from the different clusters when the same command is fired:

    To fetch local streams that are present under the no_auth_user:

    And from the different clusters using the same command (without credentials), we should get a different stream:

    Advanced Messaging Patterns with NATS

    In this section, we will explore advanced messaging patterns that leverage the capabilities of NATS for more complex communication scenarios.

    A. Request-Reply Pattern

    The request-reply pattern allows applications to send requests and receive corresponding responses through NATS. It enables synchronous communication, making it suitable for scenarios where immediate responses are required.
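As a hedged sketch of the mechanics (not a real client – ToyBroker and the subject names are invented for illustration), the requester publishes to a subject with a unique reply inbox, and the responder publishes its answer back to that inbox:

```python
# Toy sketch of NATS request-reply -- illustrative only.
import uuid

class ToyBroker:
    def __init__(self):
        self.handlers = {}  # subject -> handler(msg, reply_subject)

    def subscribe(self, subject, handler):
        self.handlers[subject] = handler

    def publish(self, subject, msg, reply=None):
        handler = self.handlers.get(subject)
        if handler:
            handler(msg, reply)

    def request(self, subject, msg):
        # a unique inbox subject collects the single reply
        inbox = f"_INBOX.{uuid.uuid4().hex}"
        result = []
        self.subscribe(inbox, lambda m, _reply: result.append(m))
        self.publish(subject, msg, reply=inbox)
        return result[0] if result else None

broker = ToyBroker()
# Responder: replies with the upper-cased request payload.
broker.subscribe("svc.echo", lambda msg, reply: broker.publish(reply, msg.upper()))
print(broker.request("svc.echo", "ping"))  # prints: PING
```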

    B. Publish-Subscribe Pattern with Wildcards

NATS introduces the concept of wildcards to the publish-subscribe pattern, allowing subscribers to receive messages based on pattern matching. This enables greater flexibility in subscription matching and expands the possibilities of message distribution.
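The matching rules can be sketched in a few lines of Python. This is a simplified model of the documented semantics – subjects are dot-separated tokens, `*` matches exactly one token, and `>`, which must be the last token, matches one or more trailing tokens:

```python
# Simplified model of NATS subject wildcard matching (illustrative):
# '*' matches exactly one token; '>' matches one or more trailing tokens.
def subject_matches(pattern, subject):
    p_toks, s_toks = pattern.split("."), subject.split(".")
    for i, tok in enumerate(p_toks):
        if tok == ">":
            return len(s_toks) > i  # '>' needs at least one remaining token
        if i >= len(s_toks):
            return False
        if tok != "*" and tok != s_toks[i]:
            return False
    return len(p_toks) == len(s_toks)

assert subject_matches("time.us.*", "time.us.east")
assert not subject_matches("time.us.*", "time.us.east.atlanta")  # '*' is one token
assert subject_matches("time.>", "time.us.east.atlanta")         # '>' spans the rest
```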

    C. Queue Groups for Load Balancing and Fault Tolerance

    Queue groups provide load balancing and fault tolerance capabilities in NATS. By grouping subscribers together, NATS ensures that messages are distributed evenly across the subscribers within the group, preventing any single subscriber from being overwhelmed.
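A toy Python model of the idea (illustrative only; the worker names are made up): the server delivers each message on the subject to exactly one member of the queue group, rather than to every subscriber:

```python
# Toy model of a NATS queue group -- illustrative only.
import random

class QueueGroup:
    def __init__(self, members):
        # member name -> list collecting that worker's messages
        self.members = members

    def deliver(self, msg):
        # nats-server picks one subscriber in the group per message
        worker = random.choice(list(self.members))
        self.members[worker].append(msg)

group = QueueGroup({"worker-1": [], "worker-2": [], "worker-3": []})
for i in range(9):
    group.deliver(f"job-{i}")

total = sum(len(msgs) for msgs in group.members.values())
print(total)  # 9 -- every job was handled exactly once across the group
```

Contrast this with plain pub/sub, where all three workers would each receive all nine jobs.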

    Overcoming Real-World Challenges

    In this section, we will discuss real-world challenges that developers may encounter when working with NATS and explore strategies to overcome them.

    A. Scalability and High Availability in NATS

    As applications grow and message traffic increases, scalability, and high availability become crucial considerations. NATS offers various techniques and features to address these challenges, including clustering, load balancing, and fault tolerance mechanisms.

    B. Securing NATS Communication

    Security is paramount in any messaging system, and NATS provides several mechanisms to secure communication. These include authentication, encryption, access control, and secure network configurations.

    C. Monitoring and Debugging Techniques

    Efficiently monitoring and troubleshooting a NATS deployment is essential for maintaining system health. NATS provides tools and techniques to monitor message traffic, track performance metrics, and identify and resolve potential issues in real time.

    Recovery Scenarios in NATS 

This section is intended to help in scenarios when NATS services are not usable, such as node failure, a node becoming unreachable, the service being down, or an entire region going down.

    Summary

    In this article, we have embarked on a journey from being a NATS novice to mastering its intricacies. We have explored the importance and applicability of NATS in the software development landscape. Through a comprehensive exploration of NATS’ definition, architecture, key features, and use cases, we have built a strong foundation in NATS. We have also examined advanced messaging patterns and discussed strategies to overcome real-world challenges in scalability, security, and monitoring. Furthermore, we have delved into the Recovery scenarios, which might come in handy when things don’t behave as expected. Armed with this knowledge, developers can confidently utilize NATS to unlock its full potential in their projects.

  • Unveiling the Magic of Kubernetes: Exploring Pod Priority, Priority Classes, and Pod Preemption

Introduction:

Generally, during the deployment of a manifest, we observe that some pods get scheduled successfully while a few critical pods encounter scheduling issues. The critical pods must therefore be scheduled ahead of the other pods. While exploring, we discovered a built-in solution for this: Pod Priority and Priority Classes. In this blog, we’ll talk about Priority Classes and Pod Priority and how we can implement them in our use case.

    Pod Priority:

    It is used to prioritize one pod over another based on its importance. Pod Priority is particularly useful when critical pods cannot be scheduled due to limited resources.

    Priority Classes:

This Kubernetes object defines the priority of pods. Priority is set with an integer value; a higher value gives the pod higher priority.

    Understanding Priority Values:

    Priority Classes in Kubernetes are associated with priority values that range from 0 to 1000000000, with a higher value indicating greater importance.

    These values act as a guide for the scheduler when allocating resources. 

    Pod Preemption:

    It is already enabled when we create a priority class. The purpose of Pod Preemption is to evict lower-priority pods in order to make room for higher-priority pods to be scheduled.
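The decision logic can be sketched as a toy Python model. This is a simplification of the real scheduler – each pod costs one unit of node capacity here, and the pod names anticipate the example scenario below:

```python
# Toy sketch of pod preemption (illustrative, not the real scheduler):
# when a node is full, lower-priority pods are evicted to make room
# for a higher-priority pod.
def schedule(node, capacity, pod, priority):
    # node: dict of running pod name -> priority; each pod costs 1 unit
    if len(node) < capacity:
        node[pod] = priority
        return []  # scheduled without preemption
    # candidate victims are strictly lower in priority than the incoming pod
    victims = [p for p, prio in node.items() if prio < priority]
    if not victims:
        return None  # incoming pod stays pending
    victim = min(victims, key=node.get)  # evict the lowest-priority pod
    del node[victim]
    node[pod] = priority
    return [victim]

node = {"shopping-cart-pod": 100000, "product-rec-pod": 500000}
evicted = schedule(node, capacity=2, pod="checkout-pod", priority=1000000)
print(evicted)  # ['shopping-cart-pod'] -- the low-priority pod is preempted
```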

    Example Scenario: The Enchanted Shop

    Let’s dive into a scenario featuring “The Enchanted Shop,” a Kubernetes cluster hosting an online store. The shop has three pods, each with a distinct role and priority:

    Priority Class:

    • Create High priority class: 
    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: high-priority
    value: 1000000

    • Create Medium priority class:
    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: medium-priority
    value: 500000

    • Create Low priority class:
    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: low-priority
    value: 100000

    Pods:

    • Checkout Pod (High Priority): This pod is responsible for processing customer orders and must receive top priority.

    Create the Checkout Pod with a high-priority class:

    apiVersion: v1
    kind: Pod
    metadata:
      name: checkout-pod
      labels:
        app: checkout
    spec:
      priorityClassName: high-priority
      containers:
      - name: checkout-container
        image: nginx:checkout

    • Product Recommendations Pod (Medium Priority):

    This pod provides personalized product recommendations to customers and holds moderate importance.

    Create the Product Recommendations Pod with a medium priority class:

    apiVersion: v1
    kind: Pod
    metadata:
      name: product-rec-pod
      labels:
        app: product-recommendations
    spec:
      priorityClassName: medium-priority
      containers:
      - name: product-rec-container
        image: nginx:store

    • Shopping Cart Pod (Low Priority):

    This pod manages customers’ shopping carts and has a lower priority compared to the others.

    Create the Shopping Cart Pod with a low-priority class:

    apiVersion: v1
    kind: Pod
    metadata:
      name: shopping-cart-pod
      labels:
        app: shopping-cart
    spec:
      priorityClassName: low-priority
      containers:
      - name: shopping-cart-container
        image: nginx:cart

    With these pods and their respective priority classes, Kubernetes will allocate resources based on their importance, ensuring smooth operation even during peak loads.

    Commands to Witness the Magic:

    • Verify Priority Classes:

    kubectl get priorityclasses

    Note: Kubernetes includes two predefined Priority Classes: system-cluster-critical and system-node-critical. These classes are specifically designed to prioritize the scheduling of critical components, ensuring they are always scheduled first.

    • Check Pod Priority:

    Conclusion:

In Kubernetes, you have the flexibility to define how your pods are scheduled. This ensures that your critical pods receive priority over lower-priority pods during the scheduling process. To dig deeper into the concepts of Pod Priority, Priority Classes, and Pod Preemption, refer to the following links.

  • How to deploy GitHub Actions Self-Hosted Runners on Kubernetes

GitHub Actions jobs are run in the cloud by default; however, sometimes we want to run jobs in our own customized/private environment where we have full control. That is where a self-hosted runner comes in.

    To get a basic understanding of running self-hosted runners on the Kubernetes cluster, this blog is perfect for you. 

    We’ll be focusing on running GitHub Actions on a self-hosted runner on Kubernetes. 

    An example use case would be to create an automation in GitHub Actions to execute MySQL queries on MySQL Database running in a private network (i.e., MySQL DB, which is not accessible publicly).

A self-hosted runner normally requires the provisioning and configuration of a virtual machine instance; here, we are running it on Kubernetes instead. For running a self-hosted runner on a Kubernetes cluster, the actions-runner-controller makes that possible.

    This blog aims to try out self-hosted runners on Kubernetes and covers:

    1. Deploying MySQL Database on minikube, which is accessible only within Kubernetes Cluster.
    2. Deploying self-hosted action runners on the minikube.
    3. Running GitHub Action on minikube to execute MySQL queries on MySQL Database.

    Steps for completing this tutorial:

    Create a GitHub repository

    1. Create a private repository on GitHub. I am creating it with the name velotio/action-runner-poc.

    Setup a Kubernetes cluster using minikube

    1. Install Docker.
    2. Install Minikube.
    3. Install Helm 
    4. Install kubectl

    Install cert-manager on a Kubernetes cluster

    • By default, actions-runner-controller uses cert-manager for certificate management of admission webhook, so we have to make sure cert-manager is installed on Kubernetes before we install actions-runner-controller. 
    • Run the below helm commands to install cert-manager on minikube.
• Verify installation using “kubectl --namespace cert-manager get all”. If everything is okay, you will see output as below:

    Setting Up Authentication for Hosted Runners‍

    There are two ways for actions-runner-controller to authenticate with the GitHub API (only 1 can be configured at a time, however):

    1. Using a GitHub App (not supported for enterprise-level runners due to lack of support from GitHub.)
    2. Using a PAT (personal access token)

    To keep this blog simple, we are going with PAT.

To authenticate actions-runner-controller with the GitHub API, we can use a PAT, with which actions-runner-controller registers the self-hosted runners.

• Go to account > Settings > Developer settings > Personal access tokens. Click on “Generate new token”. Under scopes, select “Full control of private repositories”.
• Click on the “Generate token” button.
    • Copy the generated token and run the below commands to create a Kubernetes secret, which will be used by action-runner-controller deployment.
    export GITHUB_TOKEN=XXXxxxXXXxxxxXYAVNa 

    kubectl create ns actions-runner-system

    Create secret

kubectl create secret generic controller-manager -n actions-runner-system \
--from-literal=github_token=${GITHUB_TOKEN}

    Install action runner controller on the Kubernetes cluster

    • Run the below helm commands
    helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
    helm repo update
helm upgrade --install --namespace actions-runner-system \
--create-namespace --wait actions-runner-controller \
actions-runner-controller/actions-runner-controller \
--set syncPeriod=1m

    • Verify that the action-runner-controller installed properly using below command
    kubectl --namespace actions-runner-system get all


    Create a Repository Runner

    • Create a RunnerDeployment Kubernetes object, which will create a self-hosted runner named k8s-action-runner for the GitHub repository velotio/action-runner-poc
• Please update the repo name from “velotio/action-runner-poc” to “<Your-repo-name>”.
    • To create the RunnerDeployment object, create the file runner.yaml as follows:
    apiVersion: actions.summerwind.dev/v1alpha1
    kind: RunnerDeployment
    metadata:
     name: k8s-action-runner
     namespace: actions-runner-system
    spec:
     replicas: 2
     template:
       spec:
         repository: velotio/action-runner-poc

    • To create, run this command:
    kubectl create -f runner.yaml

    Check that the pod is running using the below command:

    kubectl get pod -n actions-runner-system | grep -i "k8s-action-runner"

• If everything goes well, you should see two action runners on Kubernetes, and the same are registered on GitHub. Check under Settings > Actions > Runners of your repository.
    • Check the pod with kubectl get po -n actions-runner-system

    Install a MySQL Database on the Kubernetes cluster

    • Create PV and PVC for MySQL Database. 
    • Create mysql-pv.yaml with the below content.
    apiVersion: v1
    kind: PersistentVolume
    metadata:
     name: mysql-pv-volume
     labels:
       type: local
    spec:
     capacity:
       storage: 2Gi
     accessModes:
       - ReadWriteOnce
     hostPath:
       path: "/mnt/data"
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
     name: mysql-pv-claim
    spec:
     accessModes:
       - ReadWriteOnce
     resources:
       requests:
         storage: 2Gi

    • Create mysql namespace
    kubectl create ns mysql

    • Now apply mysql-pv.yaml to create PV and PVC 
    kubectl create -f mysql-pv.yaml -n mysql

    Create the file mysql-svc-deploy.yaml and add the below content to it.

    Here, we have used MYSQL_ROOT_PASSWORD as “password”.

    apiVersion: v1
    kind: Service
    metadata:
     name: mysql
    spec:
     ports:
       - port: 3306
     selector:
       app: mysql
     clusterIP: None
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
     name: mysql
    spec:
     selector:
       matchLabels:
         app: mysql
     strategy:
       type: Recreate
     template:
       metadata:
         labels:
           app: mysql
       spec:
         containers:
           - image: mysql:5.6
             name: mysql
             env:
                 # Use secret in real usage
               - name: MYSQL_ROOT_PASSWORD
                 value: password
             ports:
               - containerPort: 3306
                 name: mysql
             volumeMounts:
               - name: mysql-persistent-storage
                 mountPath: /var/lib/mysql
         volumes:
           - name: mysql-persistent-storage
             persistentVolumeClaim:
               claimName: mysql-pv-claim

    • Create the service and deployment
    kubectl create -f mysql-svc-deploy.yaml -n mysql

    • Verify that the MySQL database is running
    kubectl get po -n mysql

    Create a GitHub repository secret to store MySQL password

    We will use the MySQL password in the GitHub Actions workflow file, and as a good practice, we should not keep it in plain text. So we will store the MySQL password in GitHub secrets and reference the secret in our workflow file.

    • Create a secret in the GitHub repository, name it “MYSQL_PASS”, and set its value to “password”.

    Create a GitHub workflow file

    • GitHub workflows are written in YAML syntax. Each workflow gets a separate YAML file, stored in the .github/workflows/ directory. So, create a .github/workflows/ directory in your repository and create a file .github/workflows/mysql_workflow.yaml as follows:
    ---
    name: Example 1
    on:
     push:
       branches: [ main ]
    jobs:
     build:
       name: Build-job
       runs-on: self-hosted
       steps:
       - name: Checkout
         uses: actions/checkout@v2
     
       - name: MySQLQuery
         env:
           PASS: ${{ secrets.MYSQL_PASS }}
         run: |
           docker run -v ${GITHUB_WORKSPACE}:/var/lib/docker --rm mysql:5.6 sh -c "mysql -u root -p$PASS -hmysql.mysql.svc.cluster.local </var/lib/docker/test.sql"

    • The docker run command in the mysql_workflow.yaml file refers to a .sql file, test.sql. So, create a test.sql file in your repository as follows:
    use mysql;
    CREATE TABLE IF NOT EXISTS Persons (
       PersonID int,
       LastName varchar(255),
       FirstName varchar(255),
       Address varchar(255),
       City varchar(255)
    );
     
    SHOW TABLES;

    • In test.sql, we run MySQL queries that create a table and list the existing tables.
    • Push changes to your repository main branch.
    • If everything is fine, you will be able to see that the GitHub action is getting executed in a self-hosted runner pod. You can check it under the “Actions” tab of your repository.
    • You can check the workflow logs to see the output of the SHOW TABLES command used in the test.sql file and check whether the Persons table was created.

  • How to Setup HashiCorp Vault HA Cluster with Integrated Storage (Raft)

    As businesses move their data to the public cloud, one of the most pressing issues is how to keep it safe from unauthorized access.

    Using a tool like HashiCorp Vault gives you greater control over your sensitive credentials and fulfills cloud security regulations.

    In this blog, we’ll walk you through HashiCorp Vault High Availability Setup.

    Hashicorp Vault

    Hashicorp Vault is an open-source tool that provides a secure, reliable way to store and distribute sensitive information like API keys, access tokens, passwords, etc. Vault provides high-level policy management, secret leasing, audit logging, and automatic revocation to protect this information using UI, CLI, or HTTP API.

    High Availability

    Vault can run in a High Availability mode to protect against outages by running multiple Vault servers. When running in HA mode, Vault servers have two additional states, i.e., active and standby. Within a Vault cluster, only a single instance will be active, handling all requests, and all standby instances redirect requests to the active instance.

    Integrated Storage Raft

    The Integrated Storage backend is used to maintain Vault’s data. Unlike other storage backends, Integrated Storage does not operate from a single source of data. Instead, all the nodes in a Vault cluster will have a replicated copy of Vault’s data. Data gets replicated across all the nodes via the Raft Consensus Algorithm.

    Raft is officially supported by Hashicorp.

    Architecture

    Prerequisites

    This setup requires Vault, sudo access on the machines, and the configuration below to create the cluster.

    • Install Vault v1.6.3+ent or later on all nodes in the Vault cluster 

    In this example, we have three CentOS VMs provisioned using VMware.

    Setup

    1. Verify the Vault version on all the nodes using the below command (in this case, we have three nodes: node1, node2, and node3):

    vault --version

    2. Configure SSL certificates

    Note: Vault should always be used with TLS in production to provide secure communication between clients and the Vault server. It requires a certificate file and key file on each Vault host.

    We can generate SSL certs for the Vault cluster on the master node and copy them to the other nodes in the cluster.

    Refer to: https://developer.hashicorp.com/vault/tutorials/secrets-management/pki-engine#scenario-introduction for generating SSL certs.

    • Copy tls.crt, tls.key, and tls_ca.pem to /etc/vault.d/ssl/
    • Change ownership to `vault`
    [user@node1 ~]$ cd /etc/vault.d/ssl/           
    [user@node1 ssl]$ sudo chown vault. tls*

    • Copy tls* from /etc/vault.d/ssl/ to the same location on the other nodes

    3. Configure the enterprise license. Copy the license to all nodes:

    cp /root/vault.hclic /etc/vault.d/vault.hclic
    chown root:vault /etc/vault.d/vault.hclic
    chmod 0640 /etc/vault.d/vault.hclic

    4. Create the storage directory for raft storage on all nodes:

    sudo mkdir --parents /opt/raft
    sudo chown --recursive vault:vault /opt/raft

    5. Set firewall rules on all nodes:

    sudo firewall-cmd --permanent --add-port=8200/tcp
    sudo firewall-cmd --permanent --add-port=8201/tcp
    sudo firewall-cmd --reload

    6. Create the Vault configuration file on all nodes:

    ### Node 1 ###
    [user@node1 vault.d]$ cat vault.hcl
    storage "raft" {
        path = "/opt/raft"
        node_id = "node1"
        retry_join 
        {
            leader_api_addr = "https://node2.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
        retry_join 
        {
            leader_api_addr = "https://node3.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
    }
    
    listener "tcp" {
       address = "0.0.0.0:8200"
       tls_disable = false
       tls_cert_file = "/etc/vault.d/ssl/tls.crt"
       tls_key_file = "/etc/vault.d/ssl/tls.key"
       tls_client_ca_file = "/etc/vault.d/ssl/tls_ca.pem"
       tls_cipher_suites = "TLS_TEST_128_GCM_SHA256,
                            TLS_TEST_128_GCM_SHA256,
                            TLS_TEST20_POLY1305,
                            TLS_TEST_256_GCM_SHA384,
                            TLS_TEST20_POLY1305,
                            TLS_TEST_256_GCM_SHA384"
    }
    api_addr = "https://node1.int.us-west-1-dev.central.example.com:8200"
    cluster_addr = "https://node1.int.us-west-1-dev.central.example.com:8201"
    disable_mlock = true
    ui = true
    log_level = "trace"
    disable_cache = true
    cluster_name = "POC"
    
    # Enterprise license_path
    # This will be required for enterprise as of v1.8
    license_path = "/etc/vault.d/vault.hclic"

    ### Node 2 ###
    [user@node2 vault.d]$ cat vault.hcl
    storage "raft" {
        path = "/opt/raft"
        node_id = "node2"
        retry_join 
        {
            leader_api_addr = "https://node1.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
        retry_join 
        {
            leader_api_addr = "https://node3.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        } 
    }
    
    listener "tcp" {
       address = "0.0.0.0:8200"
       tls_disable = false
       tls_cert_file = "/etc/vault.d/ssl/tls.crt"
       tls_key_file = "/etc/vault.d/ssl/tls.key"
       tls_client_ca_file = "/etc/vault.d/ssl/tls_ca.pem"
       tls_cipher_suites = "TLS_TEST_128_GCM_SHA256,
                            TLS_TEST_128_GCM_SHA256,
                            TLS_TEST20_POLY1305,
                            TLS_TEST_256_GCM_SHA384,
                            TLS_TEST20_POLY1305,
                            TLS_TEST_256_GCM_SHA384"
    }
    api_addr = "https://node2.int.us-west-1-dev.central.example.com:8200"
    cluster_addr = "https://node2.int.us-west-1-dev.central.example.com:8201"
    disable_mlock = true
    ui = true
    log_level = "trace"
    disable_cache = true
    cluster_name = "POC"
    
    # Enterprise license_path
    # This will be required for enterprise as of v1.8
    license_path = "/etc/vault.d/vault.hclic"

    ### Node 3 ###
    [user@node3 ~]$ cat /etc/vault.d/vault.hcl
    storage "raft" {
        path = "/opt/raft"
        node_id = "node3"
        retry_join 
        {
            leader_api_addr = "https://node1.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
        retry_join 
        {
            leader_api_addr = "https://node2.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
    }
    
    listener "tcp" {
       address = "0.0.0.0:8200"
       tls_disable = false
       tls_cert_file = "/etc/vault.d/ssl/tls.crt"
       tls_key_file = "/etc/vault.d/ssl/tls.key"
       tls_client_ca_file = "/etc/vault.d/ssl/tls_ca.pem"
       tls_cipher_suites = "TLS_TEST_128_GCM_SHA256,
                            TLS_TEST_128_GCM_SHA256,
                            TLS_TEST20_POLY1305,
                            TLS_TEST_256_GCM_SHA384,
                            TLS_TEST20_POLY1305,
                            TLS_TEST_256_GCM_SHA384"
    }
    api_addr = "https://node3.int.us-west-1-dev.central.example.com:8200"
    cluster_addr = "https://node3.int.us-west-1-dev.central.example.com:8201"
    disable_mlock = true
    ui = true
    log_level = "trace"
    disable_cache = true
    cluster_name = "POC"
    
    # Enterprise license_path
    # This will be required for enterprise as of v1.8
    license_path = "/etc/vault.d/vault.hclic"

    7. Set environment variables on all nodes:

    export VAULT_ADDR=https://$(hostname):8200
    export VAULT_CACERT=/etc/vault.d/ssl/tls_ca.pem
    export CA_CERT=`cat /etc/vault.d/ssl/tls_ca.pem`

    8. Start Vault as a service on all nodes:

    You can view the systemd unit file, then enable and start the service:

    cat /etc/systemd/system/vault.service
    systemctl enable vault.service
    systemctl start vault.service
    systemctl status vault.service

    9. Check Vault status on all nodes:

    vault status

    10. Initialize Vault with the following command on Vault node 1 only. Store the unseal keys securely.

    [user@node1 vault.d]$ vault operator init -key-shares=1 -key-threshold=1
    Unseal Key 1: HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
    Initial Root Token: hvs.j4qTq1IZP9nscILMtN2p9GE0
    Vault initialized with 1 key shares and a key threshold of 1.
    Please securely distribute the key shares printed above. 
    When the Vault is re-sealed, restarted, or stopped, you must supply at least 1 of these keys to unseal it
    before it can start servicing requests.
    Vault does not store the generated root key. 
    Without at least 1 keys to reconstruct the root key, Vault will remain permanently sealed!
    It is possible to generate new unseal keys, provided you have a
    quorum of existing unseal keys shares. See "vault operator rekey" for more information.

    11. Set the Vault token environment variable so the vault CLI can authenticate to the server. Use the following command, replacing <initial-root-token> with the value generated in the previous step.

    export VAULT_TOKEN=<initial-root-token>
    echo "export VAULT_TOKEN=$VAULT_TOKEN" >> /root/.bash_profile
    ### Repeat this step for the other 2 servers.

    12. Unseal Vault1 using the unseal key generated in step 10. Notice the Unseal Progress key-value change as you present each key. After meeting the key threshold, the status of the key value for Sealed should change from true to false.

    [user@node1 vault.d]$ vault operator unseal HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
    Key                         Value
    ---                         -----
    Seal Type                   shamir
    Initialized                 true
    Sealed                      false
    Total Shares                1
    Threshold                   1
    Version                     1.11.0
    Build Date                  2022-06-17T15:48:44Z
    Storage Type                raft
    Cluster Name                POC
    Cluster ID                  109658fe-36bd-7d28-bf92-f095c77e860c
    HA Enabled                  true
    HA Cluster                  https://node1.int.us-west-1-dev.central.example.com:8201
    HA Mode                     active
    Active Since                2022-06-29T12:50:46.992698336Z
    Raft Committed Index        36
    Raft Applied Index          36

    13. Unseal Vault2 (Use the same unseal key generated in step 10 for Vault1):

    [user@node2 vault.d]$ vault operator unseal HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
    Key                Value
    ---                -----
    Seal Type          shamir
    Initialized        true
    Sealed             true
    Total Shares       1
    Threshold          1
    Unseal Progress    0/1
    Unseal Nonce       n/a
    Version            1.11.0
    Build Date         2022-06-17T15:48:44Z
    Storage Type       raft
    HA Enabled         true
    
    [user@node2 vault.d]$ vault status
    Key                   Value
    ---                   -----
    Seal Type             shamir
    Initialized           true
    Sealed                true
    Total Shares          1
    Threshold             1
    Version               1.11.0
    Build Date            2022-06-17T15:48:44Z
    Storage Type          raft
    Cluster Name          POC
    Cluster ID            109658fe-36bd-7d28-bf92-f095c77e860c
    HA Enabled            true
    HA Cluster            https://node1.int.us-west-1-dev.central.example.com:8201
    HA Mode               standby
    Active Node Address   https://node1.int.us-west-1-dev.central.example.com:8200
    Raft Committed Index  37
    Raft Applied Index    37

    14. Unseal Vault3 (Use the same unseal key generated in step 10 for Vault1):

    [user@node3 ~]$ vault operator unseal HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
    Key                Value
    ---                -----
    Seal Type          shamir
    Initialized        true
    Sealed             true
    Total Shares       1
    Threshold          1
    Unseal Progress    0/1
    Unseal Nonce       n/a
    Version            1.11.0
    Build Date         2022-06-17T15:48:44Z
    Storage Type       raft
    HA Enabled         true
    
    [user@node3 ~]$ vault status
    Key                       Value
    ---                       -----
    Seal Type                 shamir
    Initialized               true
    Sealed                    false
    Total Shares              1
    Threshold                 1
    Version                   1.11.0
    Build Date                2022-06-17T15:48:44Z
    Storage Type              raft
    Cluster Name              POC
    Cluster ID                109658fe-36bd-7d28-bf92-f095c77e860c
    HA Enabled                true
    HA Cluster                https://node1.int.us-west-1-dev.central.example.com:8201
    HA Mode                   standby
    Active Node Address       https://node1.int.us-west-1-dev.central.example.com:8200
    Raft Committed Index      39
    Raft Applied Index        39

    15. Check the cluster’s raft status with the following command:

    [user@node3 ~]$ vault operator raft list-peers
    Node      Address                                            State       Voter
    ----      -------                                            -----       -----
    node1    node1.int.us-west-1-dev.central.example.com:8201    leader      true
    node2    node2.int.us-west-1-dev.central.example.com:8201    follower    true
    node3    node3.int.us-west-1-dev.central.example.com:8201    follower    true

    16. Currently, node1 is the active node. We can experiment to see what happens if node1 steps down from its active node duty.

    In the terminal where VAULT_ADDR is set to: https://node1.int.us-west-1-dev.central.example.com, execute the step-down command.

    $ vault operator step-down # equivalent of stopping the node or stopping the systemctl service
    Success! Stepped down: https://node2.int.us-west-1-dev.central.example.com:8200

    In the terminal where VAULT_ADDR is set to https://node2.int.us-west-1-dev.central.example.com:8200, examine the raft peer set.

    [user@node1 ~]$ vault operator raft list-peers
    Node      Address                                            State       Voter
    ----      -------                                            -----       -----
    node1    node1.int.us-west-1-dev.central.example.com:8201    follower    true
    node2    node2.int.us-west-1-dev.central.example.com:8201    leader      true
    node3    node3.int.us-west-1-dev.central.example.com:8201    follower    true

    Conclusion 

    Vault servers are now operational in High Availability mode. We can test this by writing a secret from either the active or a standby Vault instance and seeing it succeed, which exercises request forwarding. We can also shut down the active Vault instance (sudo systemctl stop vault) to simulate a system failure and watch a standby instance assume leadership.

  • How to Avoid Screwing Up CI/CD: Best Practices for DevOps Team

    Basic Fundamentals (One-line Definitions):

    CI/CD is defined as continuous integration, continuous delivery, and/or continuous deployment. 

    Continuous Integration: 

    Continuous integration is defined as a practice where a developer’s changes are merged back to the main branch as soon as possible to avoid facing integration challenges.

    Continuous Delivery:

    Continuous delivery is the ability to get all types of changes deployed to production or delivered to the customer in a safe, quick, and sustainable way.

    An oversimplified CI/CD pipeline

    Why CI/CD?

    • Avoid integration hell

    In most modern application development scenarios, multiple developers work on different features simultaneously. However, if all the source code is to be merged on the same day, the result can be a manual, tedious process of resolving conflicts between branches, as well as a lot of rework.  

    Continuous integration (CI) is the process of merging code changes frequently (daily, or even multiple times a day) to a shared branch (aka master or trunk branch). The CI process makes it easier and quicker to identify bugs, saving a lot of developer time and effort.

    • Faster time to market

    Less time is spent on solving integration problems and reworking, allowing faster time to market for products.

    • Have a better and more reliable code

    The changes are small and thus easier to test. Each change goes through a rigorous cycle of unit tests, integration/regression tests, and performance tests before being pushed to prod, ensuring better code quality.

    • Lower costs 

    As we have a faster time to market and fewer integration problems, a lot of developer time and development cycles are saved, leading to a lower cost of development.

    Enough theory; now let’s dive into “How do I get started?”

    Basic Overview of CI/CD

    Decide on your branching strategy

    A good branching strategy should have the following characteristics:

    • Defines a clear development process from initial commit to production deployment
    • Enables parallel development
    • Optimizes developer productivity
    • Enables faster time to market for products and services
    • Facilitates integration with all DevOps practices and tools, such as different version control systems

    Types of branching strategies (please refer to the references for more details):

    • Git flow – Ideal when handling multiple versions of the production code and for enterprise customers who have to adhere to release plans and workflows 
    • Trunk-based development – Ideal for simpler workflows and if automated testing is available, leading to a faster development time
    • Other branching strategies that you can read about are GitHub flow, GitLab flow, and Forking flow.

    Build or compile your code 

    The next step is to build/compile your code, and if it is interpreted code, go ahead and package it.

    Build best practices:

    • Build once – Build a single artifact and promote it across environments; rebuilding the artifact separately for each environment is inadvisable.
    • Exact versions of third-party dependencies should be used.
    • Libraries used for debugging, etc., should be removed from the product package.
    • Have a feedback loop so that the team is made aware of the status of the build step.
    • Make sure your builds are versioned correctly using semver 2.0 (https://semver.org/).
    • Commit early, commit often.
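To make the SemVer point concrete, here is a minimal Python sketch that compares two MAJOR.MINOR.PATCH strings numerically rather than lexicographically; it deliberately ignores the pre-release and build-metadata parts that full SemVer 2.0 also defines:

```python
def parse_semver(version):
    """Split a MAJOR.MINOR.PATCH string into a tuple of ints."""
    major, minor, patch = version.split(".")
    return (int(major), int(minor), int(patch))

def is_newer(candidate, current):
    """True if candidate is a later release than current (numeric compare)."""
    return parse_semver(candidate) > parse_semver(current)

# "1.10.0" sorts after "1.9.3" numerically, though not as a plain string.
print(is_newer("1.10.0", "1.9.3"))  # -> True
```

Tuple comparison is what keeps artifact versions sortable in a registry; a plain string sort would put "1.10.0" before "1.9.3".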

    Select tool for stitching the pipeline together

    • You can choose from GitHub actions, Jenkins, circleci, GitLab, etc.
    • Tool selection will not affect the quality of your CI/CD pipeline, but maintenance overhead is higher for self-hosted services like Jenkins deployed on-prem as opposed to managed CI/CD services.

    Tools and strategy for SAST

    Instead of just DevOps, we should think in terms of DevSecOps. To make the code more secure and reliable, we can introduce a step for SAST (static application security testing).

    SAST, or static analysis, is a testing procedure that analyzes source code to find security vulnerabilities. SAST scans the application code before the code is compiled. It’s also known as white-box testing, and it helps shift towards a security-first mindset as the code is scanned right at the start of SDLC.

    Problems SAST solves:

    • SAST tools give developers real-time feedback as they code, helping them fix issues before they pass the code to the next phase of the SDLC. 
    • This prevents security-related issues from being considered an afterthought. 
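As a toy illustration of what static analysis does (not a real SAST tool), the sketch below walks Python source with the standard ast module and reports the line numbers of eval() calls, a classic injection risk that real scanners also flag:

```python
import ast

def find_eval_calls(source):
    """Statically scan Python source and return line numbers of eval() calls."""
    tree = ast.parse(source)  # no code is executed; we only inspect the syntax tree
    lines = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "eval"):
            lines.append(node.lineno)
    return lines

sample = "x = 1\ny = eval(input())\n"
print(find_eval_calls(sample))  # -> [2]
```

The key property, shared with real SAST tools, is that the code under test is never run: the finding comes purely from inspecting the source before compilation.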

    Deployment strategies

    How will you deploy your code with zero downtime so that the customer has the best experience? Try and implement one of the strategies below automatically via CI/CD. This will help in keeping the blast radius to the minimum in case something goes wrong. 

    • Ramped (also known as rolling update or incremental): The new version is slowly rolled out to replace the older version of the product.
    • Blue/Green: The new version is released alongside the older version, then the traffic is switched to the newer version.
    • Canary: The new version is released to a selected group of users before doing a full rollout. This can be achieved by feature flagging as well. For more information, read about tools like LaunchDarkly (https://launchdarkly.com/) and Unleash (https://github.com/Unleash/unleash).
    • A/B testing: The new version is released to a subset of users under specific conditions.
    • Shadow: The new version receives real-world traffic alongside the older version and doesn’t impact the response.
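The canary strategy above is commonly implemented by routing a fixed fraction of users to the new version. Here is a hedged sketch of one way to do that with deterministic hashing; the bucketing scheme is an illustrative choice, not a standard:

```python
import hashlib

def in_canary(user_id, percent):
    """Deterministically bucket a user into the canary group.

    Hashing the user id means the same user always gets the same answer,
    so their experience is stable across requests during the rollout.
    """
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0-99
    return bucket < percent

# The same user is routed consistently on every request.
print(in_canary("user-42", 10) == in_canary("user-42", 10))  # -> True
```

Raising `percent` gradually (10, 25, 50, 100) widens the canary without ever flipping a user back and forth between versions.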

    Config and Secret Management

    According to the 12-factor app, application configs should be exposed to the application through environment variables. However, it does not impose restrictions on where these configurations are stored and sourced from.

    A few things to keep in mind while storing configs:

    • Versioning of configs always helps, but storing secrets in VCS is strongly discouraged.
    • For an enterprise, it is beneficial to use a cloud-agnostic solution.

    Solution:

    • Store your configuration secrets outside of the version control system.
    • You can use AWS secret manager, Vault, and even S3 for storing your configs, e.g.: S3 with KMS, etc. There are other services available as well, so choose the one which suits your use case the best.
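Following the 12-factor guideline above, here is a minimal Python sketch of sourcing configuration from environment variables; the variable names are invented for the example, and in practice the secret value would be injected by the platform from a secret store rather than set in code:

```python
import os

def load_config():
    """Build app config from environment variables, per the 12-factor guideline."""
    return {
        # Non-secret setting with a sensible default.
        "db_host": os.environ.get("DB_HOST", "localhost"),
        # Secret: required, so a missing value fails fast at startup.
        "db_password": os.environ["DB_PASSWORD"],
    }

# For illustration only; a real platform injects this from a secret store.
os.environ["DB_PASSWORD"] = "example-only"
print(load_config()["db_host"])
```

Failing fast on a missing secret is deliberate: a clear crash at startup is easier to diagnose than a connection error deep inside the application.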

    Automate versioning and release notes generation

    All the releases should be tagged in the version control system. Versions can be automatically updated by looking at the git commit history and searching for keywords.

    There are many modules available for release notes generation. Try and automate these as well as a part of your CI/CD process. If this is done, you can successfully eliminate human intervention from the release process.
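Conceptually, the commit-keyword search works roughly like the sketch below; the "major:"/"feat:" keyword convention here is an illustrative assumption, not necessarily what any particular tool or action uses:

```python
def bump_version(version, commit_messages):
    """Bump a MAJOR.MINOR.PATCH version based on keywords in commit messages."""
    major, minor, patch = (int(part) for part in version.split("."))
    text = " ".join(commit_messages)
    if "major:" in text:          # breaking change -> bump major, reset the rest
        return f"{major + 1}.0.0"
    if "feat:" in text:           # new feature -> bump minor, reset patch
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"  # default: patch release

print(bump_version("1.4.2", ["fix: null check", "feat: add export"]))  # -> 1.5.0
```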

    Example from a GitHub Actions workflow:

    - name: Automated Version Bump
      id: version-bump
      uses: 'phips28/gh-action-bump-version@v9.0.16'
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      with:
        commit-message: 'CI: Bump version to v{{version}}'

    Have a rollback strategy

    If a regression, performance, or smoke test fails after deployment to an environment, feedback should be given and the version should be rolled back automatically as part of the CI/CD process. This ensures the environment stays up and also reduces the MTTR (mean time to recovery) and MTTD (mean time to detection) in case there is a production outage due to a code deployment.

    GitOps tools like Argo CD and Flux make this easy, but even if you are not using any GitOps tool, this can be managed using scripts or whatever tool you use for deployment.
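Stripped of any particular tool, the automated rollback loop described above looks roughly like this sketch; the deploy, smoke-test, and rollback callables are stand-ins for your real deployment and test steps:

```python
def deploy_with_rollback(deploy, smoke_test, rollback):
    """Deploy, run smoke tests, and roll back automatically on failure."""
    deploy()
    if smoke_test():
        return "released"
    rollback()  # automatic: no human in the loop, which keeps MTTR low
    return "rolled back"

# Stand-in steps for illustration only.
result = deploy_with_rollback(
    deploy=lambda: None,
    smoke_test=lambda: False,   # simulate a failing smoke test
    rollback=lambda: None,
)
print(result)  # -> rolled back
```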

    Include db changes as a part of your CI/CD

    Databases are often created manually and frequently evolve through manual changes, informal processes, and even testing in production. Manual changes often lack documentation and are harder to review, test, and coordinate with software releases. This makes the system more fragile with a higher risk of failure.

    The correct way to do this is to include the database in source control and CI/CD pipeline. This lets the team document each change, follow the code review process, test it thoroughly before release, make rollbacks easier, and coordinate with software releases. 

    For a more enterprise or structured solution, we could use a tool such as Liquibase, Alembic, or Flyway.

    How it should ideally be done:

    • We can have a migration-based strategy where, for each DB change, an additional migration script is added and executed as a part of CI/CD.
    • Things to keep in mind are that the CI/CD process should be the same across all the environments. Also, the amount of data on prod and other environments might vary drastically, so batching and limits should be used so that we don’t end up using all the memory of our database server.
    • As far as possible, DB migrations should be backward compatible. This makes it easier for rollbacks. This is the reason some companies only allow additive changes as a part of db migration scripts. 
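As a hedged sketch of the batching point above, the following backfills a column in small batches against an in-memory SQLite database, so no single statement touches the whole table at once; the table name and batch size are invented for the example:

```python
import sqlite3

def backfill_in_batches(conn, batch_size=2):
    """Backfill NULL city values in small batches to bound memory and lock time."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE persons SET city = 'unknown' "
            "WHERE rowid IN ("
            "  SELECT rowid FROM persons WHERE city IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # commit per batch so each chunk of work is durable
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (name TEXT, city TEXT)")
conn.executemany("INSERT INTO persons VALUES (?, NULL)", [("a",), ("b",), ("c",)])
print(backfill_in_batches(conn))  # -> 3
```

On a production database the same shape applies: a LIMIT-ed subquery picks the next chunk, and the loop ends when an iteration updates zero rows.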

    Real-world scenarios

    • Gated approach 

    It is not always possible to have a fully automated CI/CD pipeline because the team may have just started the development of a product and might not have automated testing yet.

    So, in cases like these, we have manual gates that can be approved by the responsible teams. For example, we will deploy to the development environment and then wait for testers to test the code and approve the manual gate, then the pipeline can go forward.

    Most of the tools support these kinds of manual gates. Make sure that you are not holding any build resources during this step; otherwise, you will end up blocking resources for the other pipelines.

    Example:

    https://www.jenkins.io/doc/pipeline/steps/pipeline-input-step/#input-wait-for-interactive-input

    def LABEL_ID = "yourappname-${UUID.randomUUID().toString()}"
    def BRANCH_NAME = "<Your branch name>"
    def GIT_URL = "<Your git url>"
    // Start Agent
    node(LABEL_ID) {
        stage('Checkout') {
            doCheckout(BRANCH_NAME, GIT_URL)
        }
        stage('Build') {
            ...
        }
        stage('Tests') {
            ...
        }    
    }
    // Kill Agent
    // Input Step
    timeout(time: 15, unit: "MINUTES") {
        input message: 'Do you want to approve the deploy in production?', ok: 'Yes'
    }
    // Start Agent Again
    node(LABEL_ID) {
        doCheckout(BRANCH_NAME, GIT_URL) 
        stage('Deploy') {
            ...
        }
    }
    def doCheckout(branchName, gitUrl){
        checkout([$class: 'GitSCM',
            branches: [[name: branchName]],
            doGenerateSubmoduleConfigurations: false,
            extensions:[[$class: 'CloneOption', noTags: true, reference: '', shallow: true]],
            userRemoteConfigs: [[credentialsId: '<Your credentials id>', url: gitUrl]]])
    }

    Observability of releases 

    Whenever we are debugging the root cause of issues in production, we might need the information below. As the system gets more complex with multiple upstreams and downstream, it becomes imperative that we have this information, all in one place, for efficient debugging and support by the operations team.

    • When was the last deployment? What version was deployed?
    • The deployment history: which version was deployed when, along with the code changes that went in.

    Below are the two approaches organizations generally follow to achieve this:

    • Have a release workflow that is tracked using a Change request or Service request on Jira or any other tracking tool.
    • For GitOps applications using tools like Argo CD and Flux, all this information is available as part of the version control system and can be derived from there.

    DORA metrics 

    The DevOps maturity of a team is measured mainly based on the four metrics defined below, and CI/CD helps in improving all of them. So, teams and organizations should try to achieve Elite status on the DORA metrics.

    • Deployment Frequency: How often an organization successfully releases to production
    • Lead Time for Changes: The amount of time a commit takes to get into production
    • Change Failure Rate: The percentage of deployments causing a failure in production
    • Time to Restore Service: How long an organization takes to recover from a failure in production
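    The last two of these metrics are simple arithmetic over deployment records. The sketch below (in Go, with a hypothetical `Deployment` record type invented for illustration, not taken from any DORA tooling) shows one way to compute them:

    ```go
    package main

    import (
    	"fmt"
    	"time"
    )

    // Deployment is a hypothetical record of one production deployment.
    type Deployment struct {
    	CommitTime time.Time // when the change was committed
    	DeployTime time.Time // when the change reached production
    	Failed     bool      // did this deployment cause a production failure?
    }

    // leadTime returns the time a commit took to get into production.
    func leadTime(d Deployment) time.Duration {
    	return d.DeployTime.Sub(d.CommitTime)
    }

    // changeFailureRate returns the fraction of deployments that failed in production.
    func changeFailureRate(ds []Deployment) float64 {
    	if len(ds) == 0 {
    		return 0
    	}
    	failed := 0
    	for _, d := range ds {
    		if d.Failed {
    			failed++
    		}
    	}
    	return float64(failed) / float64(len(ds))
    }

    func main() {
    	t0 := time.Date(2023, 1, 1, 10, 0, 0, 0, time.UTC)
    	ds := []Deployment{
    		{CommitTime: t0, DeployTime: t0.Add(2 * time.Hour), Failed: false},
    		{CommitTime: t0, DeployTime: t0.Add(4 * time.Hour), Failed: true},
    	}
    	fmt.Println(leadTime(ds[0]))       // 2h0m0s
    	fmt.Println(changeFailureRate(ds)) // 0.5
    }
    ```

    Deployment Frequency and Time to Restore Service follow the same pattern: counting deploy events per period and measuring failure-to-recovery durations.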

    Conclusion 

    CI/CD forms an integral part of DevOps and SRE practices, and if done correctly, it can have a huge impact on a team’s and organization’s productivity.

    So, try and implement the above principles and get one step closer to having a highly productive team and a better product.

  • Getting Started With Kubernetes Operators (Golang Based) – Part 3

    Introduction

    In the first part, getting started with Kubernetes operators (Helm based), and the second part, getting started with Kubernetes operators (Ansible based), of this introduction to Kubernetes operators blog series, we learned various concepts related to Kubernetes operators and created a Helm based operator and an Ansible based operator, respectively. In this final part, we will build a Golang based operator. In the case of the Helm based operator, we executed a Helm chart whenever changes were made to the custom object type of our application; similarly, in the case of the Ansible based operator, we executed an Ansible role. In the case of a Golang based operator, we write the code for the action we need to perform (the reconcile logic) whenever the state of our custom object changes. This makes Golang based operators the most powerful and flexible of the three types, and at the same time the most complex to build.

    What Will We Build?

    The database server we deployed as part of our book store app in the previous blogs didn’t have a persistent volume attached to it, so we would lose data if the pod restarted. To avoid this, we will attach a persistent volume on the host (the K8s worker nodes) and run our database as a StatefulSet rather than a Deployment. We will also add a feature to expand the persistent volume associated with the MongoDB pod.

    Building the Operator

    1. Set up the project:  

    operator-sdk new bookstore-operator --dep-manager=dep

    INFO[0000] Generating api version blog.velotio.com/v1alpha1 for kind BookStore. 
    INFO[0000] Created pkg/apis/blog/group.go               
    INFO[0001] Created pkg/apis/blog/v1alpha1/bookstore_types.go 
    INFO[0001] Created pkg/apis/addtoscheme_blog_v1alpha1.go 
    INFO[0001] Created pkg/apis/blog/v1alpha1/register.go   
    INFO[0001] Created pkg/apis/blog/v1alpha1/doc.go        
    INFO[0001] Created deploy/crds/blog.velotio.com_v1alpha1_bookstore_cr.yaml 
    INFO[0009] Created deploy/crds/blog.velotio.com_bookstores_crd.yaml 
    INFO[0009] Running deepcopy code-generation for Custom Resource group versions: [blog:[v1alpha1], ] 
    INFO[0010] Code-generation complete.                    
    INFO[0010] Running OpenAPI code-generation for Custom Resource group versions: [blog:[v1alpha1], ] 
    INFO[0011] Created deploy/crds/blog.velotio.com_bookstores_crd.yaml 
    INFO[0011] Code-generation complete.                    
    INFO[0011] API generation complete.

    The above command creates the bookstore-operator folder in our $GOPATH/src. Here, we have set --dep-manager to dep, which signifies that we want to use dep for managing dependencies; by default, go modules are used. As we have seen earlier, the operator-sdk creates all the necessary folder structure for us inside the bookstore-operator folder.

    2. Add the custom resource definition

    operator-sdk add api --api-version=blog.velotio.com/v1alpha1 --kind=BookStore

    The above command creates the CRD and CR for the BookStore type. It also creates the Golang structs (pkg/apis/blog/v1alpha1/bookstore_types.go) for the BookStore type, registers the custom type (pkg/apis/blog/v1alpha1/register.go) with the scheme, and generates deep-copy methods. Here we can see that all the generic tasks are done by the operator framework itself, allowing us to focus on building the object and the controller. We will update the spec of the BookStore type to include two custom types, BookApp and BookDB.

    type BookStoreSpec struct {
    	BookApp BookApp `json:"bookApp,omitempty"`
    	BookDB  BookDB  `json:"bookDB,omitempty"`
    }

    type BookApp struct {
    	Repository      string             `json:"repository,omitempty"`
    	Tag             string             `json:"tag,omitempty"`
    	ImagePullPolicy corev1.PullPolicy  `json:"imagePullPolicy,omitempty"`
    	Replicas        int32              `json:"replicas,omitempty"`
    	Port            int32              `json:"port,omitempty"`
    	TargetPort      int                `json:"targetPort,omitempty"`
    	ServiceType     corev1.ServiceType `json:"serviceType,omitempty"`
    }

    type BookDB struct {
    	Repository      string            `json:"repository,omitempty"`
    	Tag             string            `json:"tag,omitempty"`
    	ImagePullPolicy corev1.PullPolicy `json:"imagePullPolicy,omitempty"`
    	Replicas        int32             `json:"replicas,omitempty"`
    	Port            int32             `json:"port,omitempty"`
    	DBSize          resource.Quantity `json:"dbSize,omitempty"`
    }

    Let’s also update the BookStore CR (blog.velotio.com_v1alpha1_bookstore_cr.yaml)

    apiVersion: blog.velotio.com/v1alpha1
    kind: BookStore
    metadata:
      name: example-bookstore
    spec:
      bookApp: 
        repository: "akash125/pyapp"
        tag: latest
        imagePullPolicy: "IfNotPresent"
        replicas: 1
        port: 80
        targetPort: 3000
        serviceType: "LoadBalancer"
      bookDB:
        repository: "mongo"
        tag: latest
        imagePullPolicy: "IfNotPresent"
        replicas: 1
        port: 27017
        dbSize: 2Gi

    3. Add the bookstore controller

    operator-sdk add controller --api-version=blog.velotio.com/v1alpha1 --kind=BookStore

    INFO[0000] Generating controller version blog.velotio.com/v1alpha1 for kind BookStore. 
    INFO[0000] Created pkg/controller/bookstore/bookstore_controller.go 
    INFO[0000] Created pkg/controller/add_bookstore.go      
    INFO[0000] Controller generation complete.

    The above command adds the bookstore controller (pkg/controller/bookstore/bookstore_controller.go) to the project and also adds it to the manager.

    If we take a look at the add function in the bookstore_controller.go file, we can see that a new controller is created here and added to the manager, so that the manager can start the controller when it (the manager) comes up. The add(mgr manager.Manager, r reconcile.Reconciler) function is called by the public function Add(mgr manager.Manager), which also creates a new reconciler object and passes it to add, where the controller is associated with the reconciler. In the add function, we also set the type of object (BookStore) which the controller will watch.

    // Watch for changes to primary resource BookStore
    	err = c.Watch(&source.Kind{Type: &blogv1alpha1.BookStore{}}, &handler.EnqueueRequestForObject{})
    	if err != nil {
    		return err
    	}

    This ensures that for any events related to any object of BookStore type, a reconcile request (a namespace/name key) is sent to the Reconcile method associated with the reconciler object (ReconcileBookStore) here.
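    To make the shape of that request concrete, here is a tiny stand-in for the namespace/name key (mirroring types.NamespacedName from apimachinery, stubbed here so it runs standalone):

    ```go
    package main

    import "fmt"

    // NamespacedName is a stand-in for apimachinery's types.NamespacedName:
    // the only information a reconcile request carries about the object.
    type NamespacedName struct {
    	Namespace string
    	Name      string
    }

    // String renders the key in the familiar "namespace/name" form.
    func (n NamespacedName) String() string {
    	return n.Namespace + "/" + n.Name
    }

    func main() {
    	// Any create/update/delete of the object enqueues the same key;
    	// the reconciler must fetch the object to learn its current state.
    	req := NamespacedName{Namespace: "default", Name: "example-bookstore"}
    	fmt.Println(req) // default/example-bookstore
    }
    ```

    Note that the request does not say *what* happened; the reconciler is expected to compare the actual state against the desired state on every call.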

    4. Build the reconcile logic

    The reconcile logic is implemented inside the Reconcile method of the reconciler object of the custom type which implements the reconcile loop.

    As a part of our reconcile logic, we will do the following:

    1. Create the bookstore app deployment if it doesn’t exist.
    2. Create the bookstore app service if it doesn’t exist.
    3. Create the Mongodb statefulset if it doesn’t exist.
    4. Create the Mongodb service if it doesn’t exist.
    5. Ensure deployments and services match their desired configurations like the replica count, image tag, service port, size of the PV associated with the Mongodb statefulset etc.

    There are three possible events that can happen to the BookStore object:

    1. The object was created: Whenever an object of kind BookStore is created, we create all the K8s resources mentioned above.
    2. The object was updated: When the object gets updated, we update all the K8s resources associated with it.
    3. The object was deleted: When the object gets deleted, we don’t need to do anything: while creating the K8s objects, we set the `BookStore` object as their owner, which ensures that all the K8s objects associated with it get deleted automatically when we delete the BookStore object.

    On receiving the reconcile request, the first step is to look up the object.

    func (r *ReconcileBookStore) Reconcile(request reconcile.Request) (reconcile.Result, error) {
    	reqLogger := log.WithValues("Request.Namespace", request.Namespace, "Request.Name", request.Name)
    	reqLogger.Info("Reconciling BookStore")
    
    	// Fetch the BookStore instance
    	bookstore := &blogv1alpha1.BookStore{}
    	err := r.client.Get(context.TODO(), request.NamespacedName, bookstore)

    If the object is not found, we assume that it was deleted; we don’t requeue the request and consider the reconcile successful.

    If any error occurs while reconciling, we return the error; whenever we return a non-nil error value, the controller requeues the request.
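    This not-found/requeue convention can be sketched with stand-in types (Result and errNotFound below are simplified stubs for illustration, not the controller-runtime API):

    ```go
    package main

    import "fmt"

    // Result is a stand-in for reconcile.Result; the controller requeues
    // whenever a non-nil error is returned.
    type Result struct{ Requeue bool }

    // errNotFound simulates the "object no longer exists" case that
    // errors.IsNotFound reports for the real client.
    var errNotFound = fmt.Errorf("not found")

    // reconcileOutcome mirrors the error-handling convention described above:
    // a missing object ends the reconcile successfully (it was deleted),
    // any other error is returned so the controller requeues the request.
    func reconcileOutcome(getErr error) (Result, error) {
    	if getErr != nil {
    		if getErr == errNotFound {
    			// Object deleted: owner references clean up children, nothing to do.
    			return Result{}, nil
    		}
    		// Transient failure: returning the error makes the controller retry.
    		return Result{}, getErr
    	}
    	return Result{}, nil
    }

    func main() {
    	_, err := reconcileOutcome(errNotFound)
    	fmt.Println(err == nil) // true: deletion is not requeued
    	_, err = reconcileOutcome(fmt.Errorf("apiserver timeout"))
    	fmt.Println(err != nil) // true: other errors are requeued
    }
    ```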

    In the reconcile logic, we call the BookStore method, which creates or updates all the K8s objects associated with the BookStore object, based on whether the object was created or updated.

    func (r *ReconcileBookStore) BookStore(bookstore *blogv1alpha1.BookStore) error {
    	reqLogger := log.WithValues("Namespace", bookstore.Namespace)

    	// MongoDB service: create if absent, update if the spec drifted.
    	mongoDBSvc := getmongoDBSvc(bookstore)
    	msvc := &corev1.Service{}
    	err := r.client.Get(context.TODO(), types.NamespacedName{Name: "mongodb-service", Namespace: bookstore.Namespace}, msvc)
    	if err != nil {
    		if errors.IsNotFound(err) {
    			controllerutil.SetControllerReference(bookstore, mongoDBSvc, r.scheme)
    			err = r.client.Create(context.TODO(), mongoDBSvc)
    			if err != nil { return err }
    		} else { return err }
    	} else if !reflect.DeepEqual(mongoDBSvc.Spec, msvc.Spec) {
    		mongoDBSvc.ObjectMeta = msvc.ObjectMeta
    		controllerutil.SetControllerReference(bookstore, mongoDBSvc, r.scheme)
    		err = r.client.Update(context.TODO(), mongoDBSvc)
    		if err != nil { return err }
    		reqLogger.Info("mongodb-service updated")
    	}

    	// MongoDB statefulset: create if absent, update if the spec drifted.
    	mongoDBSS := getMongoDBStatefulsets(bookstore)
    	mss := &appsv1.StatefulSet{}
    	err = r.client.Get(context.TODO(), types.NamespacedName{Name: "mongodb", Namespace: bookstore.Namespace}, mss)
    	if err != nil {
    		if errors.IsNotFound(err) {
    			reqLogger.Info("mongodb statefulset not found, will be created")
    			controllerutil.SetControllerReference(bookstore, mongoDBSS, r.scheme)
    			err = r.client.Create(context.TODO(), mongoDBSS)
    			if err != nil { return err }
    		} else {
    			reqLogger.Info("failed to get mongodb statefulset")
    			return err
    		}
    	} else if !reflect.DeepEqual(mongoDBSS.Spec, mss.Spec) {
    		r.UpdateVolume(bookstore)
    		mongoDBSS.ObjectMeta = mss.ObjectMeta
    		mongoDBSS.Spec.VolumeClaimTemplates = mss.Spec.VolumeClaimTemplates
    		controllerutil.SetControllerReference(bookstore, mongoDBSS, r.scheme)
    		err = r.client.Update(context.TODO(), mongoDBSS)
    		if err != nil { return err }
    		reqLogger.Info("mongodb statefulset updated")
    	}

    	// Bookstore app service.
    	bookStoreSvc := getBookStoreAppSvc(bookstore)
    	bsvc := &corev1.Service{}
    	err = r.client.Get(context.TODO(), types.NamespacedName{Name: "bookstore-svc", Namespace: bookstore.Namespace}, bsvc)
    	if err != nil {
    		if errors.IsNotFound(err) {
    			controllerutil.SetControllerReference(bookstore, bookStoreSvc, r.scheme)
    			err = r.client.Create(context.TODO(), bookStoreSvc)
    			if err != nil { return err }
    		} else {
    			reqLogger.Info("failed to get bookstore service")
    			return err
    		}
    	} else if !reflect.DeepEqual(bookStoreSvc.Spec, bsvc.Spec) {
    		bookStoreSvc.ObjectMeta = bsvc.ObjectMeta
    		bookStoreSvc.Spec.ClusterIP = bsvc.Spec.ClusterIP
    		controllerutil.SetControllerReference(bookstore, bookStoreSvc, r.scheme)
    		err = r.client.Update(context.TODO(), bookStoreSvc)
    		if err != nil { return err }
    		reqLogger.Info("bookstore service updated")
    	}

    	// Bookstore app deployment.
    	bookStoreDep := getBookStoreDeploy(bookstore)
    	bsdep := &appsv1.Deployment{}
    	err = r.client.Get(context.TODO(), types.NamespacedName{Name: "bookstore", Namespace: bookstore.Namespace}, bsdep)
    	if err != nil {
    		if errors.IsNotFound(err) {
    			controllerutil.SetControllerReference(bookstore, bookStoreDep, r.scheme)
    			err = r.client.Create(context.TODO(), bookStoreDep)
    			if err != nil { return err }
    		} else {
    			reqLogger.Info("failed to get bookstore deployment")
    			return err
    		}
    	} else if !reflect.DeepEqual(bookStoreDep.Spec, bsdep.Spec) {
    		bookStoreDep.ObjectMeta = bsdep.ObjectMeta
    		controllerutil.SetControllerReference(bookstore, bookStoreDep, r.scheme)
    		err = r.client.Update(context.TODO(), bookStoreDep)
    		if err != nil { return err }
    		reqLogger.Info("bookstore deployment updated")
    	}

    	r.client.Status().Update(context.TODO(), bookstore)
    	return nil
    }

    The implementation of the above method is a bit hacky but gives an idea of the flow. In the above function, we can see that we are setting the BookStore object as the owner of all the resources via controllerutil.SetControllerReference(bookstore, bookStoreDep, r.scheme), as discussed earlier. If we look at the owner references for these objects, we would see something like this:

    ownerReferences:
      - apiVersion: blog.velotio.com/v1alpha1
        blockOwnerDeletion: true
        controller: true
        kind: BookStore
        name: example-bookstore
        uid: 0ef42889-deb4-11e9-ba56-42010a800256
      resourceVersion: "20295281"

    5. Deploy the operator and verify that it works

    The approach to deploying and verifying the bookstore application is similar to what we did in the previous two blogs, the only difference being that MongoDB is now deployed as a StatefulSet; even if we restart the pod, we will see that the information we stored is still available.

    6. Verify volume expansion

    For updating the volume associated with the MongoDB instance, we first need to update the size of the volume we specified while creating the bookstore object. In the example above, I had set it to 2Gi; let’s update it to 3Gi and update the bookstore object.

    Once the bookstore object is updated, if we describe the MongoDB PVC, we will see that it still has a 2Gi PV, but under its conditions we will see something like this:

    Conditions:
      Type                      Status  LastProbeTime                     LastTransitionTime                Reason  Message
      ----                      ------  -----------------                 ------------------                ------  -------
      FileSystemResizePending   True    Mon, 01 Jan 0001 00:00:00 +0000   Mon, 30 Sep 2019 15:07:01 +0530           Waiting for user to (re-)start a pod to finish file system resize of volume on node.

    It is clear from the message that we need to restart the pod for the volume resize to take effect. Once we delete the pod, it will get restarted and the PVC will get updated to reflect the expanded volume size.

    The complete code is available here.

    Conclusion

    Golang based operators are built mostly for stateful applications like databases. Such an operator can automate complex operational tasks, allowing us to run applications with ease. At the same time, building and maintaining one can be quite complex, and we should build one only when we are fully convinced that our requirements can’t be met by any other type of operator. Operators are an interesting and emerging area in Kubernetes, and I hope this blog series on getting started with them helps readers learn the basics.

  • Setting Up A Robust Authentication Environment For OpenSSH Using QR Code PAM

    Do you like WhatsApp Web authentication? WhatsApp Web has always fascinated me with the simplicity of its QR-code based authentication. Though there are similar authentication UIs available, I always wondered whether a remote secure shell (SSH) could be authenticated with a QR code with this kind of simplicity while keeping the auth process secure. In this guide, we will see how to write and implement a bare-bones PAM module for OpenSSH on a Linux-based system.

    “OpenSSH is the premier connectivity tool for remote login with the SSH protocol. It encrypts all traffic to eliminate eavesdropping, connection hijacking, and other attacks. In addition, OpenSSH provides a large suite of secure tunneling capabilities, several authentication methods, and sophisticated configuration options.”

    openssh.com

    Meet PAM!

    PAM, short for “Pluggable Authentication Module,” is middleware that abstracts authentication features on Linux and UNIX-like operating systems. PAM has been around for more than two decades. The authentication process could be cumbersome, with each service authenticating users against a different set of hardware and software, such as username-password, a fingerprint module, face recognition, two-factor authentication, LDAP, etc. But the underlying process remains the same, i.e., users must be authenticated as who they say they are. This is where PAM comes into the picture: it provides an API to the application layer along with built-in functions to implement and extend PAM capabilities.

    Source: Redhat

    Understand how OpenSSH interacts with PAM

    The Linux host’s OpenSSH (the sshd daemon) begins by reading the configuration defined in /etc/pam.conf or, alternatively, in the /etc/pam.d configuration files. The config files are usually defined per service name, with various realms (auth, account, session, password). The “auth” realm is what takes care of authenticating users as who they say they are. A typical sshd PAM service file on Ubuntu can be seen below, and you can relate it to your own flavor of Linux:

    @include common-auth
    account    required     pam_nologin.so
    @include common-account
    session [success=ok ignore=ignore module_unknown=ignore default=bad]        pam_selinux.so close
    session    required     pam_loginuid.so
    session    optional     pam_keyinit.so force revoke
    @include common-session
    session    optional     pam_motd.so  motd=/run/motd.dynamic
    session    optional     pam_motd.so noupdate
    session    optional     pam_mail.so standard noenv # [1]
    session    required     pam_limits.so
    session    required     pam_env.so # [1]
    session    required     pam_env.so user_readenv=1 envfile=/etc/default/locale
    session [success=ok ignore=ignore module_unknown=ignore default=bad]        pam_selinux.so open
    @include common-password

    The common-auth file has an “auth” realm with the pam_unix.so PAM module, which is responsible for authenticating the user with a password. Our goal is to write a PAM module that replaces pam_unix.so with our own version.

    When OpenSSH makes calls to the PAM module, the very first function it looks for is “pam_sm_authenticate,” along with some other mandatory functions such as pam_sm_setcred. Thus, we will be implementing the pam_sm_authenticate function, which will be the entry point to our shared object library. The module should return PAM_SUCCESS (0) as the return code for successful authentication.

    Application Architecture

    The project architecture has four main applications. The backend is hosted on an AWS cloud with minimal and low-cost infrastructure resources.

    1. PAM Module: Provides QR-Code auth prompt to client SSH Login

    2. Android Mobile App: Authenticates SSH login by scanning a QR code

    3. QR Auth Server API: Backend application to which our Android App connects and communicates and shares authentication payload along with some other meta information

    4. WebSocket Server (API Gateway WebSocket, and NodeJS) App: The PAM module and the server-side app share the auth message payload in real time

    When a user connects to the remote server via SSH, the PAM module is triggered, offering a QR code for authentication. Information is exchanged over the API Gateway WebSocket, which in turn saves temporary auth data in DynamoDB. A user then uses an Android mobile app (written in react-native) to scan the QR code.

    Upon scanning, the app connects to the API Gateway. The API call is first authenticated by AWS Cognito to avoid any intrusion. The request is then proxied to the Lambda function, which authenticates the input payload by comparing it with the information available in DynamoDB. Upon successful authentication, the Lambda function makes a call to the API Gateway WebSocket to inform the PAM module to authenticate the user.

    Framework and Toolchains

    PAM modules are shared object libraries that must be written in C (although other languages can be used to compile and link, or to make cross-language calls, e.g., python-pam or pam_exec). Below are the framework and toolset I am using for this project:

    1. gcc, make, automake, autoreconf, libpam (GNU dev tools on Ubuntu OS)

    2. libqrencode, libwebsockets, libpam, libssl, libcrypto (C libraries)

    3. NodeJS, express (for server-side app)

    4. API gateway and API Gateway webSocket, AWS Lambda (AWS Cloud Services for hosting serverless server side app)

    5. Serverless framework (for easily deploying infrastructure)

    6. react-native, react-native-qrcode-scanner (for Android mobile app)

    7. AWS Cognito (for authentication)

    8. AWS Amplify Library

    This guide assumes you have a basic understanding of the Linux OS, C programming language, pointers, and gcc code compilation. For the backend APIs, I prefer to use NodeJS as a primary programming language, but you may opt for the language of your choice for designing HTTP APIs.

    Authentication with QR Code PAM Module

    When the module initializes, we first want to generate a random string with the help of the “/dev/urandom” character device. The byte string obtained from this device contains non-printable characters, so we encode it with Base64. Let’s call this string the auth verification string.

    void get_random_string(char *random_str, int length)
    {
       FILE *fp = fopen("/dev/urandom", "r");
       if (!fp) {
           perror("Unable to open urandom device");
           exit(EXIT_FAILURE);
       }
       fread(random_str, length, 1, fp);
       fclose(fp);
    }

    char random_string[11];

    // get a random string
    get_random_string(random_string, 10);
    // convert the random string to Base64 because the input comes from /dev/urandom and may contain binary chars
    const int encoded_length = Base64encode_len(10);
    base64_string = (char *)malloc(encoded_length + 1);
    Base64encode(base64_string, random_string, 10);
    base64_string[encoded_length] = '\0';
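    For comparison, the same idea expressed in Go: crypto/rand draws from the kernel CSPRNG (the same pool /dev/urandom exposes), and Base64 makes the bytes printable. This is an illustrative sketch, not part of the PAM module’s C code:

    ```go
    package main

    import (
    	"crypto/rand"
    	"encoding/base64"
    	"fmt"
    )

    // authVerificationString returns n random bytes encoded as Base64,
    // so the result contains only printable characters.
    func authVerificationString(n int) (string, error) {
    	buf := make([]byte, n)
    	if _, err := rand.Read(buf); err != nil { // crypto/rand uses the OS CSPRNG
    		return "", err
    	}
    	return base64.StdEncoding.EncodeToString(buf), nil
    }

    func main() {
    	s, err := authVerificationString(10)
    	if err != nil {
    		panic(err)
    	}
    	// 10 bytes encode to 16 Base64 characters (including padding).
    	fmt.Println(len(s)) // 16
    }
    ```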

    We then initiate a WebSocket connection with the help of the libwebsockets library and connect to our API Gateway WebSocket endpoint. Once the connection is established, we inform the server that a user may try to authenticate with the auth verification string. The API Gateway WebSocket returns a unique connection ID to our PAM module.

    static void connect_client(struct lws_sorted_usec_list *sul)
    {
       struct vhd_minimal_client_echo *vhd =
           lws_container_of(sul, struct vhd_minimal_client_echo, sul);
       struct lws_client_connect_info i;
       char host[128];
       lws_snprintf(host, sizeof(host), "%s:%u", *vhd->ads, *vhd->port);
       memset(&i, 0, sizeof(i));
       i.context = vhd->context;
      //i.port = *vhd->port;
       i.port = *vhd->port;
       i.address = *vhd->ads;
       i.path = *vhd->url;
       i.host = host;
       i.origin = host;
       i.ssl_connection = LCCSCF_USE_SSL | LCCSCF_ALLOW_SELFSIGNED | LCCSCF_SKIP_SERVER_CERT_HOSTNAME_CHECK | LCCSCF_PIPELINE;
      //i.ssl_connection = 0;
       if ((*vhd->options) & 2)
           i.ssl_connection |= LCCSCF_USE_SSL;
       i.vhost = vhd->vhost;
       i.iface = *vhd->iface;
      //i.protocol = ;
       i.pwsi = &vhd->client_wsi;
      //lwsl_user("connecting to %s:%d/%s\n", i.address, i.port, i.path);
       log_message(LOG_INFO,ws_applogic.pamh,"About to create connection %s",host);
      //return !lws_client_connect_via_info(&i);
       if (!lws_client_connect_via_info(&i))
           lws_sul_schedule(vhd->context, 0, &vhd->sul,
                    connect_client, 10 * LWS_US_PER_SEC);
    }

    Upon receiving the connection ID from the server, the PAM module converts the connection ID to a SHA1 hash string and finally composes a unique string for generating the QR code. This string consists of three parts separated by colons (:), i.e., “qrauth:BASE64(AUTH_VERIFY_STRING):SHA1(CONNECTION_ID)”.

    For example, let’s say the random Base64 encoded string is “UX6t4PcS5doEeA==” and the connection ID is “KZlfidYvBcwCFFw=”. Then the final encoded string is “qrauth:UX6t4PcS5doEeA==:2fc58b0cc3b13c3f2db49a5b4660ad47c873b81a”.
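    Composing this payload is mechanical; here is a standalone Go sketch of the same format (illustrative, not the module’s actual C implementation):

    ```go
    package main

    import (
    	"crypto/sha1"
    	"encoding/hex"
    	"fmt"
    )

    // qrPayload composes the string that gets encoded into the QR code:
    // "qrauth:<Base64 auth verification string>:<SHA1 hex of connection ID>".
    func qrPayload(authKey, connectionID string) string {
    	sum := sha1.Sum([]byte(connectionID))
    	return fmt.Sprintf("qrauth:%s:%s", authKey, hex.EncodeToString(sum[:]))
    }

    func main() {
    	p := qrPayload("UX6t4PcS5doEeA==", "KZlfidYvBcwCFFw=")
    	fmt.Println(p) // qrauth:UX6t4PcS5doEeA==:<40 hex chars>
    }
    ```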

    This string is then encoded into a UTF-8 QR code with the help of the libqrencode library, and the authentication screen is prompted by the PAM module.

    char *con_id=strstr(msg,ws_com_strings[READ_WS_CONNECTION_ID]);
               int length = strlen(ws_com_strings[READ_WS_CONNECTION_ID]);
              
               if(!con_id){
                   pam_login_status=PAM_AUTH_ERR;
                   interrupted=1;
                   return;
               }
               con_id+=length;
               log_message(LOG_DEBUG,ws_applogic.pamh,"strstr is %s",con_id);
               string_crypt(ws_applogic.sha_code_hex, con_id);
               sprintf(temp_text,"qrauth:%s:%s",ws_applogic.authkey,ws_applogic.sha_code_hex);
               char *qr_encoded_text=get_qrcode_string(temp_text);
               ws_applogic.qr_encoded_text=qr_encoded_text;
               conv_info(ws_applogic.pamh,"\nSSH Auth via QR Code\n\n");
               conv_info(ws_applogic.pamh, ws_applogic.qr_encoded_text);
               log_message(LOG_INFO,ws_applogic.pamh,"Use Mobile App to Scan \n %s",ws_applogic.qr_encoded_text);
               log_message(LOG_INFO,ws_applogic.pamh,"%s",temp_text);
               ws_applogic.current_action=READ_WS_AUTH_VERIFIED;
               sprintf(temp_text,ws_com_strings[SEND_WS_EXPECT_AUTH],ws_applogic.authkey,ws_applogic.username);
               websocket_write_back(wsi,temp_text,-1);
           conv_read(ws_applogic.pamh,"\n\nUse Mobile SSH QR Auth App to Authenticate SSH Login and Press Enter\n\n",PAM_PROMPT_ECHO_ON);

    API Gateway WebSocket App

    We used the Serverless Framework for easily creating and deploying our infrastructure resources. With the serverless CLI, we use the aws-nodejs template (serverless create --template aws-nodejs). You can find a detailed guide on Serverless, API Gateway WebSocket, and DynamoDB here. Below is the template YAML definition. Note that the DynamoDB resource has TTL set on the expires_at property. This field holds a UNIX epoch timestamp.

    What this means is that any record we store is automatically deleted at the epoch time set. We plan to keep each record for only 5 minutes, which also means the user must authenticate within 5 minutes of making the authentication request to the remote SSH server.
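    The TTL arithmetic itself is trivial: the record’s expires_at is the write-time epoch plus 300 seconds. A Go sketch of the calculation (the actual handler shown later is NodeJS):

    ```go
    package main

    import (
    	"fmt"
    	"time"
    )

    // expiresAt returns the UNIX epoch timestamp that DynamoDB's TTL acts on:
    // the write time plus a 5-minute authentication window.
    func expiresAt(now time.Time) int64 {
    	return now.Unix() + 300
    }

    func main() {
    	now := time.Unix(1600000000, 0)
    	fmt.Println(expiresAt(now)) // 1600000300
    }
    ```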

    service: ssh-qrapp-websocket
    frameworkVersion: '2'
    useDotenv: true
    provider:
     name: aws
     runtime: nodejs12.x
     lambdaHashingVersion: 20201221
     websocketsApiName: ssh-qrapp-websocket
     websocketsApiRouteSelectionExpression: $request.body.action
     region: ap-south-1
     iam:
       role:
         statements:
           - Effect: Allow
             Action:
               - "dynamodb:query"
               - "dynamodb:GetItem"
               - "dynamodb:PutItem"
             Resource:
               - Fn::GetAtt: [ SSHAuthDB, Arn ]
     environment:
      REGION: ${env:REGION}
      DYNAMODB_TABLE: SSHAuthDB
      WEBSOCKET_ENDPOINT: ${env:WEBSOCKET_ENDPOINT}
      NODE_ENV: ${env:NODE_ENV}
    package:
     patterns:
       - '!node_modules/**'
       - handler.js
       - '!package.json'
       - '!package-lock.json'
    plugins:
     - serverless-dotenv-plugin
    layers:
     sshQRAPPLibs:
       path: layer
       compatibleRuntimes:
         - nodejs12.x
    functions:
     connectionHandler:
       handler: handler.connectHandler
       timeout: 60
       memorySize: 256
       layers:
         - {Ref: SshQRAPPLibsLambdaLayer}
       events:
         - websocket:
            route: $connect
            routeResponseSelectionExpression: $default
     disconnectHandler:
       handler: handler.disconnectHandler
       memorySize: 256
       timeout: 60
       layers:
         - {Ref: SshQRAPPLibsLambdaLayer}
       events:
         - websocket: $disconnect
     defaultHandler:
       handler: handler.defaultHandler
       memorySize: 256
       timeout: 60
       layers:
         - {Ref: SshQRAPPLibsLambdaLayer}
       events:
         - websocket: $default
     customQueryHandler:
       handler: handler.queryHandler
       memorySize: 256
       timeout: 60
       layers:
         - {Ref: SshQRAPPLibsLambdaLayer}
       events:
         - websocket:
            route: expectauth
            routeResponseSelectionExpression: $default
         - websocket:
            route: getconid
            routeResponseSelectionExpression: $default
         - websocket:
            route: verifyauth
            routeResponseSelectionExpression: $default
    resources:
     Resources:
       SSHAuthDB:
         Type: AWS::DynamoDB::Table
         Properties:
           TableName: ${env:DYNAMODB_TABLE}
           AttributeDefinitions:
             - AttributeName: authkey
               AttributeType: S
           KeySchema:
             - AttributeName: authkey
               KeyType: HASH
           TimeToLiveSpecification:
             AttributeName: expires_at
             Enabled: true
           ProvisionedThroughput:
             ReadCapacityUnits: 2
             WriteCapacityUnits: 2

    The API Gateway WebSocket has three custom events. These events come as an argument to the Lambda function in “event.body.action”; API Gateway WebSocket refers to these as route selection expressions. The custom events are:

    • The “expectauth” event is sent by the PAM module to the WebSocket, informing it that a client has asked for authentication and the mobile application may try to authenticate by scanning the QR code. During this event, the WebSocket handler stores the connection ID along with the auth verification string, which acts as the primary key of our DynamoDB table.
    • The “getconid” event is sent to retrieve the current connection ID so that the PAM module can generate a SHA1 sum and provide the QR code prompt.
    • The “verifyauth” event is sent by the PAM module to confirm and verify authentication. During this event, even the WebSocket server expects random challenge response text. WebSocket server retrieves data payload from DynamoDB with auth verification string as primary key, and tries to find the key “authVerified” marked as “true” (more on this later).
    The Lambda handler below implements these three routes:

    // Assumes the AWS SDK v2 DynamoDB DocumentClient:
    // const { DynamoDB } = require("aws-sdk");
    queryHandler: async (event, context) => {
       const payload = JSON.parse(event.body);
       const documentClient = new DynamoDB.DocumentClient({
         region : process.env.REGION
       });
       try {
         switch(payload.action){
           case 'expectauth':
            
             const expires_at = parseInt(new Date().getTime() / 1000) + 300;
      
             await documentClient.put({
               TableName : process.env.DYNAMODB_TABLE,
               Item: {
                 authkey : payload.authkey,
                 connectionId : event.requestContext.connectionId,
                 username : payload.username,
                 expires_at : expires_at,
                 authVerified: false
               }
             }).promise();
             return {
               statusCode: 200,
               body : "OK"
             };
           case 'getconid':
             return {
               statusCode: 200,
               body: `connectionid:${event.requestContext.connectionId}`
             };
           case 'verifyauth':
             const data = await documentClient.get({
               TableName : process.env.DYNAMODB_TABLE,
               Key : {
                 authkey : payload.authkey
               }
             }).promise();
             if(!("Item" in data)){
               throw "Failed to query data";
             }
             if(data.Item.authVerified === true){
               return {
                 statusCode: 200,
                 body: `authverified:${payload.challengeText}`
               }
             }
             throw "auth verification failed";
         }
       } catch (error) {
         console.log(error);
       }
       return {
         statusCode: 200,
         body : "ok"
       };
     }

    Android App: SSH QR Code Auth


    The Android app consists of two parts: app login, and scanning the QR code for authentication. The AWS Cognito and Amplify libraries ease the process of building a secure login. By simply wrapping your react-native app with the “withAuthenticator” component, you get a ready-to-use login screen. We then use the react-native-qrcode-scanner component to scan the QR code.

    This component returns the decoded string on a successful scan. The application logic then splits the string and checks its validity. If the decoded string is a valid application string, an API call is made to the server with the appropriate payload.
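The decoded string is expected to take the colon-separated form “qrauth:&lt;authcode&gt;:&lt;shacode&gt;”. A minimal, self-contained sketch of that validation (the function name is mine, not from the app source):

```javascript
// Validate and split a decoded QR string of the form "qrauth:<authcode>:<shacode>".
function parseQrString(data) {
  const parts = data.split(":");
  if (parts.length < 3) throw new Error("invalid qr code");
  const [appstring, authcode, shacode] = parts;
  // Only strings stamped with our app identifier are accepted.
  if (appstring !== "qrauth") throw new Error("Not a valid app qr code");
  return { authcode, shacode };
}

console.log(parseQrString("qrauth:RANDOM_AUTH_STRING:0a1b2c"));
// { authcode: 'RANDOM_AUTH_STRING', shacode: '0a1b2c' }
```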

    render(){
       return (
         <View style={styles.container}>
           {this.state.authQRCode ?
           <AuthQRCode
            hideAuthQRCode = {this.hideAuthQRCode}
            qrScanData = {this.qrScanData}
           />
           :
           <View style={{marginVertical: 10}}>
           <Button title="Auth SSH Login" onPress={this.showAuthQRCode} />
           <View style={{margin:10}} />
           <Button title="Sign Out" onPress={this.signout} />
           </View>
          
           }
         </View>
       );
     }
         // Excerpt from the QR-scan success handler: e.data is the decoded QR string.
         const scanCode = e.data.split(':');
         if(scanCode.length < 3){
           throw "invalid qr code";
         }
         const [appstring,authcode,shacode] = scanCode;
         if(appstring !== "qrauth"){
           throw "Not a valid app qr code";
         }
         const authsession = await Auth.currentSession();
         const jwtToken = authsession.getIdToken().jwtToken;
         const response = await axios({
           url : "https://API_GATEWAY_URL/v1/app/sshqrauth/qrauth",
           method : "post",
           headers : {
             Authorization : jwtToken,
             'Content-Type' : 'application/json'
           },
           responseType: "json",
           data : {
             authcode,
             shacode
           }
         });
         if(response.data.status === 200){
           rescanQRCode=false;
           setTimeout(this.hideAuthQRCode, 1000);
         }

    This guide does not cover how to deploy react-native Android applications. You may refer to the official react-native guide to deploy your application to the Android mobile device.

    QR Auth API

    The QR Auth API is built using the Serverless Framework with the aws-nodejs template. It uses the API Gateway HTTP API, with AWS Cognito authorizing incoming requests. The serverless YAML definition is shown below.

    service: ssh-qrauth-server
    frameworkVersion: '2 || 3'
    useDotenv: true
    provider:
     name: aws
     runtime: nodejs12.x
     lambdaHashingVersion: 20201221
     deploymentBucket:
       name: ${env:DEPLOYMENT_BUCKET_NAME}
     httpApi:
       authorizers:
         cognitoJWTAuth:
           identitySource: $request.header.Authorization
           issuerUrl: ${env:COGNITO_ISSUER}
           audience:
             - ${env:COGNITO_AUDIENCE}
     region: ap-south-1
     iam:
       role:
         statements:
         - Effect: "Allow"
           Action:
             - "dynamodb:Query"
             - "dynamodb:PutItem"
             - "dynamodb:GetItem"
           Resource:
             - ${env:DYNAMO_DB_ARN}
         - Effect: "Allow"
           Action:
             - "execute-api:Invoke"
             - "execute-api:ManageConnections"
           Resource:
             - ${env:API_GATEWAY_WEBSOCKET_API_ARN}/*
     environment:
       REGION: ${env:REGION}
       COGNITO_ISSUER: ${env:COGNITO_ISSUER}
       DYNAMODB_TABLE: ${env:DYNAMODB_TABLE}
       COGNITO_AUDIENCE: ${env:COGNITO_AUDIENCE}
       POOLID: ${env:POOLID}
       COGNITOIDP: ${env:COGNITOIDP}
       WEBSOCKET_ENDPOINT: ${env:WEBSOCKET_ENDPOINT}
    package:
     patterns:
       - '!node_modules/**'
       - handler.js
       - '!package.json'
       - '!package-lock.json'
       - '!.env'
       - '!test.http'
    plugins:
     - serverless-deployment-bucket
     - serverless-dotenv-plugin
    layers:
     qrauthLibs:
       path: layer
       compatibleRuntimes:
         - nodejs12.x
    functions:
     sshauthqrcode:
       handler: handler.authqrcode
       memorySize: 256
       timeout: 30
       layers:
         - {Ref: QrauthLibsLambdaLayer}
       events:
         - httpApi:
             path: /v1/app/sshqrauth/qrauth
             method: post
             authorizer:
               name: cognitoJWTAuth

    Once API Gateway authenticates the incoming request, control is handed over to the serverless-express router. At this stage, we verify that the payload contains the auth verification string scanned by the Android mobile app. This string must be present in the DynamoDB table. Upon retrieving the record keyed by the auth verification string, we read its connection ID property and compute its SHA1 hash. If the hash matches the hash in the request payload, we set the record's “authVerified” attribute to “true” and inform the PAM module via the API Gateway WebSocket API. The PAM module then takes care of further validation via the challenge-response text.

    The entire authentication flow is depicted in a flow diagram, and the overall architecture is shown in the cover post of this blog.


    Compiling and Installing PAM module

    Unlike typical C programs, PAM modules are shared libraries. The compiled code may be loaded at an arbitrary address in memory, so the module must be compiled as position-independent code. With gcc, we pass the -fPIC option while compiling, and the -shared flag while linking to generate the shared object binary.

    gcc -I$PWD -fPIC -c $(ls *.c)
    gcc -shared -o pam_qrapp_auth.so $(ls *.o) -lpam -lqrencode -lssl -lcrypto -lpthread -lwebsockets

    To ease this process of compiling and validating libraries, I prefer to use the autoconf tool. The entire project, along with the autoconf scripts, is available in my GitHub repository.

    Once the shared object file (pam_qrapp_auth.so) is generated, copy it to the “/usr/lib64/security/” directory and run the ldconfig command to inform the OS that a new shared library is available. In /etc/pam.d/sshd, remove common-auth (if applicable) or any “auth” line that directly or indirectly uses the pam_unix.so module, since pam_unix.so enforces password or private-key authentication. Then add our module to the auth realm (“auth required pam_qrapp_auth.so”). Depending on your Linux flavor, your /etc/pam.d/sshd file may look similar to the below:

    auth       required     pam_qrapp_auth.so
    account    required     pam_nologin.so
    @include common-account
    session [success=ok ignore=ignore module_unknown=ignore default=bad]        pam_selinux.so close
    session    required     pam_loginuid.so
    session    optional     pam_keyinit.so force revoke
    @include common-session
    session    optional     pam_motd.so  motd=/run/motd.dynamic
    session    optional     pam_motd.so noupdate
    session    optional     pam_mail.so standard noenv # [1]
    session    required     pam_limits.so
    session    required     pam_env.so # [1]
    session    required     pam_env.so user_readenv=1 envfile=/etc/default/locale
    session [success=ok ignore=ignore module_unknown=ignore default=bad]        pam_selinux.so open
    @include common-password

    Finally, we need to configure the sshd daemon to allow challenge-response authentication. Open /etc/ssh/sshd_config and add “ChallengeResponseAuthentication yes” if the directive is missing, commented out, or set to “no.” Reload the sshd service with “systemctl reload sshd.” Voila, we are done here.
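Collected into a short script, the installation steps above look roughly like this (a sketch: the paths and the sed expression assume a typical Linux layout, and the /etc/pam.d/sshd edits described earlier still have to be made by hand):

```shell
set -e

# Install the PAM module and refresh the shared-library cache.
sudo cp pam_qrapp_auth.so /usr/lib64/security/
sudo ldconfig

# Enable challenge-response authentication for sshd: rewrite the directive if
# it exists (possibly commented out), append it otherwise, then reload sshd.
if grep -q '^#\?ChallengeResponseAuthentication' /etc/ssh/sshd_config; then
  sudo sed -i 's/^#\?ChallengeResponseAuthentication.*/ChallengeResponseAuthentication yes/' /etc/ssh/sshd_config
else
  echo 'ChallengeResponseAuthentication yes' | sudo tee -a /etc/ssh/sshd_config
fi
sudo systemctl reload sshd
```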

    Conclusion

    This guide was a barebones tutorial and is not meant for production use. The PAM module has certain gaps: for example, it should prompt for a password change when the password has expired, deny login when the account is locked, and handle similar security-related cases. Also, the Android mobile app should be bound to the SSH username, so that only the AWS Cognito user associated with that username can authenticate.

    One known limitation of this PAM module is that we always have to hit Enter after scanning the QR code with the Android mobile app. This limitation stems from how OpenSSH itself is implemented: the OpenSSH server withholds all informational text unless user input is required. In our case, the informational text is the UTF-8 QR code itself.

    However, no such input is actually needed from the interactive device, because the authentication event arrives at the PAM module via the WebSocket. But if we did not ask the user to explicitly press Enter after scanning, the QR code would never be displayed; the input here is a dummy. This is a known issue with OpenSSH's handling of PAM_TEXT_INFO. Find more about the issue here.

    References

    Pluggable authentication module

    An introduction to Pluggable Authentication Modules (PAM) in Linux

    Custom PAM for SSHD in C

    google-authenticator-libpam

    PAM_TEXT_INFO and PAM_ERROR_MSG conversation not honoured during PAM authentication