Making Your Terminal More Productive With Z-Shell (ZSH)
When working with servers or command-line-based applications, we spend most of our time on the command line. A good-looking and productive terminal is better in many aspects than a GUI (Graphical User Interface) environment since the command line takes less time for most use cases. Today, we’ll look at some of the features that make a terminal cool and productive.

You can use the following steps on Ubuntu 20.04. If you are using a different operating system, your commands will likely differ. If you’re using Windows, you can choose between Cygwin, WSL, and Git Bash.

Prerequisites

Let’s upgrade the system and install some basic tools needed.
```
sudo apt update && sudo apt upgrade
sudo apt install build-essential curl wget git
```
Z-Shell (ZSH)

Zsh is an extended Bourne shell with many improvements, including some features of Bash and other shells.

Let’s install Z-Shell:
```
sudo apt install zsh
```
Make it our default shell for our terminal:
```
chsh -s $(which zsh)
```
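For a regular user, chsh only accepts shells that are listed in /etc/shells, so it can help to check that first. The sketch below is one way to do it; shell_is_listed is a hypothetical helper name, not part of zsh or chsh:

```shell
# shell_is_listed is a hypothetical helper: it succeeds when the given
# path appears as a full line in a shells file (default: /etc/shells).
shell_is_listed() {
    grep -qxF -- "$1" "${2:-/etc/shells}"
}

zsh_path=$(command -v zsh || true)
if [ -n "$zsh_path" ] && shell_is_listed "$zsh_path" 2>/dev/null; then
    echo "zsh found at $zsh_path and listed in /etc/shells"
else
    echo "zsh is not installed or not listed in /etc/shells"
fi
```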
Now log out and back in (or restart the system) and open the terminal again to be welcomed by ZSH. Unlike other shells like Bash, ZSH requires some initial configuration, so it asks for some configuration options the first time we start it and saves them in a file called .zshrc in the home directory (/home/user, where user is the current system user).

For now, we’ll skip the manual work and get a head start with the default configuration. Press 2, and ZSH will populate the .zshrc file with some default options. We can change these later.

The initial configuration setup can be run again at any time with autoload -Uz zsh-newuser-install && zsh-newuser-install -f.

Oh-My-ZSH

Oh-My-ZSH is a community-driven, open-source framework to manage your ZSH configuration. It comes with many plugins and helpers. It can be installed with one single command as below.

Installation
```
sh -c "$(wget -O- https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
```
It takes a backup of our existing .zshrc in a file called .zshrc.pre-oh-my-zsh, so whenever you uninstall it, the backup is restored automatically.

Font

A good terminal needs a good font. We’ll use the Terminess Nerd Font to make our terminal look awesome, which can be downloaded here. Once downloaded, extract the font files and move them to ~/.local/share/fonts to make them available for the current user, or to /usr/share/fonts to make them available for all users.
```
unzip Terminess.zip
mkdir -p ~/.local/share/fonts
mv *.ttf ~/.local/share/fonts
fc-cache -f
```
Once the font is installed, it will look like:

Among all the things Oh-My-ZSH provides, two are community favorites: plugins and themes.

Theme

My go-to ZSH theme is powerlevel10k because it’s flexible, provides everything out of the box, and is easy to install with one command as shown below:
```
git clone --depth=1 https://github.com/romkatv/powerlevel10k.git ${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/themes/powerlevel10k
```
To use this theme, set ZSH_THEME="powerlevel10k/powerlevel10k" in .zshrc.
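
The edit can also be scripted. The sketch below assumes the theme is selected through a single ZSH_THEME= line, as in the .zshrc that Oh-My-ZSH generates; set_zsh_theme is a hypothetical helper name:

```shell
# set_zsh_theme is a hypothetical helper: it sets ZSH_THEME in the given
# zshrc file (default: ~/.zshrc), replacing an existing assignment if present.
set_zsh_theme() {
    theme=$1
    zshrc=${2:-$HOME/.zshrc}
    if grep -q '^ZSH_THEME=' "$zshrc" 2>/dev/null; then
        sed "s|^ZSH_THEME=.*|ZSH_THEME=\"$theme\"|" "$zshrc" > "$zshrc.tmp" &&
            mv "$zshrc.tmp" "$zshrc"
    else
        printf 'ZSH_THEME="%s"\n' "$theme" >> "$zshrc"
    fi
}

# Usage (edits your real ~/.zshrc, so it is left commented out here):
# set_zsh_theme powerlevel10k/powerlevel10k
```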

Close the terminal and start it again. Powerlevel10k will welcome you with the initial setup; go through it with the options you want. You can run this setup again by executing the command below:
```
p10k configure
```
Tools and plugins we can’t live without

Plugins are enabled by adding them to the plugins array in the .zshrc file. For every plugin you want to use from the list below, add its name to that array, for example plugins=(git zsh-syntax-highlighting).

ZSH-Syntax-Highlighting

This enables the highlighting of commands as you type and helps you catch syntax errors before you execute them. For example, a valid command such as “ls” is shown in green, while a mistyped “lss” is shown in red.

Execute the command below to install it:
```
git clone https://github.com/zsh-users/zsh-syntax-highlighting.git ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-syntax-highlighting
```
ZSH Autosuggestions

This suggests commands as you type, based on your history.

Install it by cloning the git repo:
```
git clone https://github.com/zsh-users/zsh-autosuggestions ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-autosuggestions
```
ZSH Completions

For some extra Zsh completion scripts, execute the command below:
```
git clone https://github.com/zsh-users/zsh-completions ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-completions
```
autojump

It’s a faster way of navigating the file system; it works by maintaining a database of directories you visit the most. More details can be found here.
```
sudo apt install autojump 
```
If you can’t install autojump, or prefer not to, you can use the z plugin as an alternative.

Internal Plugins

Some plugins come bundled with Oh-My-ZSH, and they can be included directly in the .zshrc file without any installation.

copyfile

It copies the content of a file to the clipboard.
```
copyfile test.txt
```
copypath

It copies the absolute path of the current directory to the clipboard.

copybuffer

This plugin copies the command that is currently typed in the command prompt to the clipboard. It works with the keyboard shortcut CTRL + o.

sudo

Sometimes, we forget to prefix a command with sudo, but that can be done in just a second with this plugin. When you hit the ESC key twice, it will prefix the command you’ve typed in the terminal with sudo.

web-search

This adds some aliases for searching with Google, Wikipedia, etc. For example, if you want to web-search with Google, you can execute the below command:
```
google oh my zsh
```
Doing so will open this search in Google:

More details can be found here.

Remember, you have to add each of these plugins to the .zshrc file as well. So, in the end, this is how the plugins array in the .zshrc file should look:
```
plugins=(
        zsh-autosuggestions
        zsh-syntax-highlighting
        zsh-completions
        autojump
        copyfile
        copypath
        copybuffer
        history
        dirhistory
        sudo
        web-search
        git
) 
```
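A custom plugin can only be loaded if its directory exists, so a quick check after editing the array can catch a missed git clone. A minimal sketch, where missing_custom_plugins is a hypothetical helper and the three names are the custom plugins cloned earlier:

```shell
# missing_custom_plugins is a hypothetical helper: given the custom
# directory and a list of plugin names, it prints the names that have
# no directory under <custom>/plugins.
missing_custom_plugins() {
    custom_dir=$1
    shift
    for plugin in "$@"; do
        [ -d "$custom_dir/plugins/$plugin" ] || echo "$plugin"
    done
}

missing_custom_plugins "${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}" \
    zsh-autosuggestions zsh-syntax-highlighting zsh-completions
```

No output means all three plugin directories are in place.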
You can add more plugins, like docker, heroku, kubectl, npm, jsontools, etc., if you’re a developer. There are plugins for system admins as well or for anything else you need. You can explore them here.

Enhancd

Enhancd is a next-gen way to navigate the file system from the CLI. It works with a fuzzy finder, so we’ll install fzf for this purpose.
```
sudo apt install fzf
```
Enhancd can be installed with zplug plugin manager for Zsh, so first we’ll install zplug with the below command:
```
curl -sL --proto-redir -all,https https://raw.githubusercontent.com/zplug/installer/master/installer.zsh | zsh
```
Append the following to .zshrc. The enhancd declaration has to come before zplug load:
```
source ~/.zplug/init.zsh
zplug "b4b4r07/enhancd", use:init.sh
zplug load
```
Now close your terminal, open it again, and use zplug to install enhancd:
```
zplug install
```
Aliases

As a developer, I need to execute git commands many times a day. Typing each command in full every time is too cumbersome, so we can use aliases for them. Aliases need to be added to .zshrc, and here’s how we can add them.
```
alias gs='git status'
alias ga='git add .'
alias gf='git fetch'
alias gr='git rebase'
alias gp='git push'
alias gd='git diff'
alias gc='git commit'
alias gh='git checkout'
alias gst='git stash'
alias gl='git log --oneline --graph'
```
You can add these anywhere in the .zshrc file.
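
If you add aliases from a setup script rather than by hand, it helps to avoid appending the same line twice. A minimal sketch, where add_alias_once is a hypothetical helper name:

```shell
# add_alias_once is a hypothetical helper: it appends an alias line to a
# file (default: ~/.zshrc) only if that exact line is not already there.
add_alias_once() {
    alias_line=$1
    rc_file=${2:-$HOME/.zshrc}
    touch "$rc_file"
    grep -qxF -- "$alias_line" "$rc_file" || printf '%s\n' "$alias_line" >> "$rc_file"
}

# Usage (edits your real ~/.zshrc, so it is left commented out here):
# add_alias_once "alias gs='git status'"
```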

Colorls

Another tool that makes you say wow is Colorls. This tool colorizes the output of the ls command. This is how it looks once you install it:

It’s a Ruby gem, so below is how you can install both Ruby and Colorls:
```
sudo apt install ruby ruby-dev ruby-colorize
sudo gem install colorls
```
Now, restart your terminal and execute the command colorls in your terminal to see the magic!

Bonus – We can add some aliases as well if we want the same output of Colorls when we execute the command ls. Note that we’re adding another alias for ls to make it available as well.
```
# "command ls" bypasses the ls alias below, so cl still runs the original ls
alias cl='command ls'
alias ls='colorls'
alias la='colorls -a'
alias ll='colorls -l'
alias lla='colorls -la'
```
These are the tools and plugins I can’t live without now. Let me know if I’ve missed anything.

Automation

Do you want to repeat this whole process when, let’s say, you’ve bought a new laptop and want the same setup?

If your answer is no, you can automate all of this, and that’s why I’ve created Project Automator. This project does a lot more than just setting up a terminal. It works with Arch Linux as of now, but you can take the parts you need and make it work with almost any *nix system you like.

Explaining how it works is beyond the scope of this article, so I’ll have to leave you guys here to explore it on your own.

Conclusion

We need to perform many tasks on our systems, and using a GUI (Graphical User Interface) tool for a task can consume a lot of our time, especially if we repeat the same task on a daily basis, like converting a media stream, setting up tools on a system, etc.

Using a command-line tool can save you a lot of time and you can automate repetitive tasks with scripting. It can be a great tool for your arsenal.
November 19, 2025

A Practical Guide to HashiCorp Consul – Part 1

This is part 1 of a 2-part series, A Practical Guide to HashiCorp Consul. This part focuses on understanding the problems that Consul solves and how it solves them. The second part focuses on a practical application of Consul in a real-life example. Let’s get started.

How about setting up a discoverable, configurable, and secure service mesh using a single tool?

What if we tell you this tool is platform-agnostic and cloud-ready?

And that it comes as a single binary download?

All this is true. The tool we are talking about is HashiCorp Consul.

Consul provides service discovery, health checks, load balancing, a service graph, identity enforcement via TLS, and distributed service configuration management.

Let’s learn about Consul in detail below and see how it solves these complex challenges and makes the life of a distributed system operator easier.

Introduction

Microservices and other distributed systems can enable faster, simpler software development. But there’s a trade-off resulting in greater operational complexity around inter-service communication, configuration management, and network segmentation.

Monolithic Application (representational) – with different subsystems A, B, C and D

Distributed Application (representational) – with different services A, B, C and D

HashiCorp Consul is an open source tool that solves these new complexities by providing service discovery, health checks, load balancing, a service graph, mutual TLS identity enforcement, and a configuration key-value store. These Consul features make it an ideal control plane for a service mesh.

HashiCorp Consul supports Service Discovery, Service Configuration, and Service Segmentation

HashiCorp announced Consul in April 2014, and it has since gained good community acceptance.

This guide is aimed at discussing some of these crucial problems and exploring the various solutions provided by HashiCorp Consul to tackle these problems.

Let’s run through the topics that we are going to cover in this guide. The topics are written to be self-contained, so you can jump directly to a specific topic if you want to.

Brief Background on Monolithic vs. Service-oriented Architectures (SOA)

Looking at traditional architectures of application delivery, what we find is a classic monolith. When we talk about a monolith, we mean a single application deployment.

Even though it is a single application, it typically has multiple sub-components.

One of the examples that HashiCorp’s CTO Armon Dadgar gave in his introductory video for Consul was a desktop banking application. It has a discrete set of sub-components: for example, authentication (say subsystem A), account management (subsystem B), fund transfer (subsystem C), and foreign exchange (subsystem D).

Now, although these are independent functions (subsystem A handles authentication, subsystem C handles fund transfers), we deploy them as a single, monolithic app.

Over the last few years, we have seen a trend away from this kind of architecture. There are several reasons for this shift.

The challenge with a monolith is this: suppose there is a bug in one of the subsystems, say system A, related to authentication.

Representational bug in Subsystem A in our monolithic application

We can’t just fix it in system A and update it in production.

Representational bug fix in Subsystem A in our monolithic application

We have to update system A and redeploy the whole application, which means redeploying subsystems B, C, and D as well.

Bug fix in one subsystem results in redeployment of the whole monolithic application

This whole redeployment is not ideal. Instead, we would like to deploy individual services.

That means delivering the same monolithic app as a set of individual, discrete services.

Dividing monolithic application into individual services

So, if there is a bug fix in one of our services:

Representational bug in one of the services, in this case Service A of our SOA application

and we fix that bug:

Representational bug fix in Service A of our SOA application

We can do the redeployment of that service without coordinating the deployment with other services. What we are essentially talking about is one form of microservices.

Bug fix will result in redeployment of only Service A within our whole application

This gives a big boost to our development agility. We don’t need to coordinate our development efforts across different development teams or even systems. We have the freedom to develop and deploy independently: one service might ship weekly and another quarterly. This is a big advantage for the development teams.

But, as you know, there is never such a thing as a “free” lunch.

The development efficiency we have gained introduces its own set of operational challenges. Let’s look at some of those.

Service discovery in a monolith, its challenges in a distributed system, and Consul’s solution

Monolithic applications

Assume two services in a single application want to talk to one another. One way is to expose a method, make it public, and allow other services to call it. Since a monolithic application is a single app, the services expose public functions, and communication simply means function calls across services.

Subsystems talk to each other via function calls within our monolithic application

As this is a function call within a process, it happens in memory. Thus, it’s fast, and we need not worry about how our data is moved or whether it is secure.

Distributed Systems

In the distributed world, service A is no longer delivered as part of the same application as service B. So, how does service A find service B if it wants to talk to B?

Service A tries to find Service B to establish communication

Service A might not even be on the same machine as service B, so there is a network in play. It is not as fast: latency is measured in milliseconds, compared to the nanoseconds of a simple function call.

Challenges

As we already know by now, two services in a distributed system have to discover one another to interact. One of the traditional ways of solving this is by using load balancers.

A load balancer sits between services to allow them to talk to each other

Load balancers would sit in front of each service with a static IP known to all other services.

A load balancer between two services allows two way traffic

This makes it possible to add multiple instances of the same service behind the load balancer, which directs the traffic accordingly. The load balancer IP is static and hard-coded within all other services, so services can skip discovery.

Load balancers allow communication between multiple instances of the same service

The challenge now is to maintain a set of load balancers for each individual service. And we can safely assume there was originally a load balancer for the whole application as well. The cost and effort of maintaining these load balancers have increased.

With load balancers in front of the services, each one is a single point of failure. Even when we have multiple instances of a service behind the load balancer, if the load balancer is down, our service is down, no matter how many instances of that service are running.

Load balancers also increase the latency of inter-service communication. If service A wishes to talk to service B, a request from A first has to go through the load balancer of service B and only then reaches B. The response from B has to go through the same drill.

Maintaining an entry for each service instance on an application-wide load balancer

And by nature, load balancers are manually managed in most cases. If we add another instance of a service, it will not be readily available. We need to register that instance with the load balancer to make it accessible to the world. This means manual effort and time.

Consul’s Solutions

Consul’s solution to the service discovery problem in distributed systems is a central service registry.

Consul maintains a central registry which contains an entry for every upstream service. When a service instance starts, it is registered in the central registry. The registry is populated with all the upstream instances of the service.

Consul’s Service Registry helps Service A find Service B and establish communication

When service A wants to talk to service B, it discovers B by querying the registry for the upstream service instances of B. So, instead of talking to a load balancer, the service can talk directly to the desired destination service instance.

Consul also provides health checks on these service instances. If one of the service instances, or the service itself, is unhealthy or fails its health check, the registry knows about it and avoids returning that instance’s address. The work that a load balancer would do is handled by the registry in this case.

Also, if there are multiple instances of the same service, Consul sends the traffic randomly to different instances, thus leveling the load among them.
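
To make the idea concrete, here is a toy registry in plain shell. It is only a sketch of the lookup logic described above (filter by health, then pick at random), not how Consul is implemented; the file format, service names, addresses, and helper names are all made up for illustration:

```shell
# Toy registry, one "service address status" entry per line.
# All names and addresses are made up for illustration.
registry=$(mktemp)
cat > "$registry" <<'EOF'
service-b 10.0.0.11:8080 passing
service-b 10.0.0.12:8080 critical
service-b 10.0.0.13:8080 passing
service-c 10.0.0.21:9090 passing
EOF

# Print the addresses of all healthy instances of a service.
healthy_instances() {
    awk -v svc="$1" '$1 == svc && $3 == "passing" { print $2 }' "$registry"
}

# Pick one healthy instance at random (shuf is part of GNU coreutils).
pick_instance() {
    healthy_instances "$1" | shuf -n 1
}

healthy_instances service-b   # prints the two passing addresses
pick_instance service-b       # prints one of them
```

The unhealthy instance 10.0.0.12:8080 is never returned, which is the part of a load balancer’s job that the registry takes over.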

Consul has handled our challenges of failure detection and load distribution across multiple instances of services without the need to deploy a centralized load balancer.

The traditional problem of slow and manually managed load balancers is taken care of here. Consul programmatically manages the registry, which gets updated when any new service registers itself and becomes available to receive traffic.

This helps with scaling the services with ease.

Configuration Management in a monolith, its challenges in a distributed environment, and Consul’s solution

Monolithic Applications

When we look at the configuration of a monolithic application, it tends to be somewhere along the lines of giant YAML, XML, or JSON files. That configuration is supposed to configure the entire application.

Single configuration file shared across different parts of our monolithic application

Given a single file, all of the subsystems in our monolithic application consume the configuration from the same file, thus creating a consistent view across all our subsystems or services.

If we wish to change the state of the application using a configuration update, it is easily available to all the subsystems. The new configuration is simultaneously consumed by all the components of our application.

Distributed Systems

Unlike a monolith, distributed services do not have a common view of the configuration. The configuration is now distributed, and every individual service needs to be configured separately.

A copy of application configuration is distributed across different services

Challenges in Distributed Systems

The configuration has to be spread across different services. Maintaining consistency between the configuration of different services after each update is a challenge.
Moreover, the challenge grows when we expect the configuration to be updated dynamically.

Consul’s Solutions

Consul’s solution for configuration management in a distributed environment is the central key-value store.

Consul’s KV store allows seamless configuration mapping on each service

Consul solves this challenge in a unique way. Instead of spreading the configuration across the distributed services as configuration pieces, it pushes the whole configuration to all the services and configures them dynamically on the distributed system.

Let’s take an example of state change in configuration. The changed state is pushed across all the services in real-time. The configuration is consistently present with all the services.
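
The effect of a single central store can be illustrated with a toy stand-in written in plain shell. This is a sketch only: kv_put and kv_get are hypothetical helpers backed by a temporary directory, the key name is made up, and the services here re-read the key rather than being pushed an update:

```shell
# Toy central store: one file per key inside a temporary directory.
# kv_put and kv_get are hypothetical helpers, not Consul commands.
kv_dir=$(mktemp -d)

kv_path() { printf '%s/%s' "$kv_dir" "$(printf '%s' "$1" | tr '/' '_')"; }
kv_put()  { printf '%s\n' "$2" > "$(kv_path "$1")"; }
kv_get()  { cat "$(kv_path "$1")"; }

# Two "services" read the same key, so they always agree on its value.
kv_put app/maintenance_mode false
echo "service A sees: $(kv_get app/maintenance_mode)"
echo "service B sees: $(kv_get app/maintenance_mode)"

# One central update is visible to both on their next read.
kv_put app/maintenance_mode true
echo "service A sees: $(kv_get app/maintenance_mode)"
echo "service B sees: $(kv_get app/maintenance_mode)"
```

With a running Consul agent, the corresponding CLI operations would be consul kv put and consul kv get.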

Network segmentation in a monolith, its challenges in distributed systems, and Consul’s solutions

Monolithic Applications

When we look at our classic monolithic architecture, the network is typically divided into three different zones.

The first zone in our network is publicly accessible. It carries the traffic coming to our application via the internet and reaching our load balancers.

The second zone carries the traffic from our load balancers to our application. It is mostly an internal network zone without direct public access.

The third zone is the closed network zone, primarily designated for data. This is considered to be an isolated zone.

Different network zones in typical application

Only the load balancers zone can reach into the application zone and only the application zone can reach into the data zone. It is a straightforward zoning system, simple to implement and manage.

Distributed Systems

The pattern changes drastically for distributed services.

Complex pattern of network traffic and routes across different services

Within our application network zone, there are multiple services which talk to each other within this network, making it a complicated traffic pattern.

Challenges

The primary challenge is that the traffic does not follow any sequential flow, unlike in a monolithic architecture, where the flow was defined from load balancers to the application and from the application to data.
Depending on the access pattern we want to support, the traffic might come from different endpoints and reach different services.

Client essentially talks to each service within the application directly or indirectly

Having multiple services and the ability to support multiple endpoints allows us to deploy multiple service consumers and providers.
Due to the nature of the system, security is our next challenge. Services should be able to identify that the traffic they are receiving is from a verified and trusted entity in the network.

SOA demands control over trusted and untrusted sources of traffic

Controlling the flow of traffic and segmenting the network into groups or chunks becomes a bigger issue. It is also vital to have strict rules that partition the network based on who should be allowed to talk to whom.

Consul’s Solutions

Consul’s solution to the overall network segmentation challenge in distributed systems is to implement a service graph and mutual TLS.

Service-level policy enforcement to define traffic pattern and segmentation using Consul

Consul solves the problem of network segmentation by centrally managing the definition around who can talk to whom. Consul has a dedicated feature for this called Consul Connect.

Consul Connect enrolls the policies of inter-service communication that we desire and implements them as part of the service graph. So, a policy might say service A can talk to service B, but B cannot talk to C, for example.
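
The example policy can be modeled as a small rule table keyed on service names. The sketch below is a toy in plain shell, not Consul’s implementation: the table format and the is_allowed helper are hypothetical, and denying anything without a matching rule is an assumption of this sketch:

```shell
# Toy policy table, one "source destination action" rule per line.
policies=$(mktemp)
cat > "$policies" <<'EOF'
service-a service-b allow
service-b service-c deny
EOF

# is_allowed is a hypothetical helper: it succeeds only when an explicit
# allow rule matches; anything without a matching rule is denied here.
is_allowed() {
    awk -v src="$1" -v dst="$2" '
        $1 == src && $2 == dst { allowed = ($3 == "allow") }
        END { exit !allowed }
    ' "$policies"
}

while read -r src dst; do
    if is_allowed "$src" "$dst"; then
        echo "$src -> $dst: allowed"
    else
        echo "$src -> $dst: denied"
    fi
done <<'EOF'
service-a service-b
service-b service-c
service-c service-a
EOF
```

Because the rules name services rather than addresses, adding more instances of service-b needs no change to the table.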

The bigger benefit of this is that it is not IP-restricted; rather, it works at the service level. This makes it scalable. The policy is enforced on all instances of a service, and there is no hard-bound firewall rule specific to a service’s IP, which makes us independent of the scale of our distributed network.

Consul Connect also handles service identity using the popular TLS protocol. It distributes the TLS certificate associated with a service.

These certificates help other services securely identify each other. TLS is also helpful in making the communication between the services secure. This makes for a trusted network implementation.

Consul enforces TLS using an agent-based proxy attached to each service instance. This proxy acts as a sidecar. Using a proxy, in this case, saves us from making any change to the code of the original service.

This allows for the higher-level benefit of enforcing encryption of data in transit. Moreover, it helps with fulfilling the compliance requirements of laws around privacy and user identity.

Basic Architecture of Consul

Consul is a distributed and highly available system.

Consul is shipped as a single binary download for all popular platforms. The executable can run as a client as well as a server.

Each node that provides services to Consul runs a Consul agent. Each of these agents talks to one or more Consul servers.

The Consul agent is responsible for health-checking the services on the node as well as the node itself. It is not responsible for service discovery or maintaining key/value data.

Consul data and config storage and replication happen via Consul servers.

Consul can run with a single server, but HashiCorp recommends running a set of 3 to 5 servers to avoid failures. As all the data is stored on the Consul server side, with a single server a failure could cause data loss.

In a multi-server cluster, the servers elect a leader among themselves. HashiCorp also recommends having a cluster of servers per datacenter.
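
The 3 to 5 recommendation is easier to see with a little arithmetic. The sketch below assumes a majority-based election, as in the Raft protocol that shows up in the agent log later in this guide; quorum_size and tolerated_failures are hypothetical helper names:

```shell
# quorum_size and tolerated_failures are hypothetical helper names.
# A majority of servers (the quorum) must be reachable to elect a leader.
quorum_size() { echo $(( $1 / 2 + 1 )); }
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

for servers in 1 3 5; do
    echo "servers=$servers quorum=$(quorum_size "$servers") tolerated_failures=$(tolerated_failures "$servers")"
done
```

Under that assumption, a single server tolerates no failure, three servers tolerate one, and five tolerate two.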

During the discovery process, any service in search for other service can query the Consul servers or even Consul agents. The Consul agents forward the queries to Consul servers automatically.

Consul Agent sits on a node and talks to other agents on the network synchronizing all service-level information

If the query is cross-datacenter, the queries are forwarded by the Consul server to the remote Consul servers. The results from remote Consul servers are returned to the original Consul server.

Getting Started with Consul

This section is dedicated to closely looking at Consul as a tool, with some hands-on experience.

Download and Install

As discussed above, Consul ships as a single binary that can be downloaded from HashiCorp’s website or from the releases section of Consul’s GitHub repo.

This binary can play the role of both the Consul server and the Consul client agent. One can run it in either configuration.

You can download Consul from here – Download Consul page.

Various download options for Consul on different operating systems

We will download Consul on the command line using the link from the download page.

$ wget https://releases.hashicorp.com/consul/1.4.3/consul_1.4.3_linux_amd64.zip -O consul.zip

--2019-03-10 00:14:07--  https://releases.hashicorp.com/consul/1.4.3/consul_1.4.3_linux_amd64.zip
Resolving releases.hashicorp.com (releases.hashicorp.com)... 151.101.37.183, 2a04:4e42:9::439
Connecting to releases.hashicorp.com (releases.hashicorp.com)|151.101.37.183|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 34777003 (33M) [application/zip]
Saving to: ‘consul.zip’

consul.zip             100%[============================>]  33.17M  4.46MB/s    in 9.2s    

2019-03-10 00:14:17 (3.60 MB/s) - ‘consul.zip’ saved [34777003/34777003]
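
Before unzipping, it is worth comparing the archive against a published SHA-256 digest. The sketch below assumes a checksum is offered next to the download and that sha256sum from GNU coreutils is available; verify_sha256 is a hypothetical helper name, and the expected digest is deliberately left as a placeholder:

```shell
# verify_sha256 is a hypothetical helper: it succeeds only when the file's
# SHA-256 digest equals the expected hex string.
verify_sha256() {
    actual=$(sha256sum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}

# Usage sketch: the expected digest below is a placeholder, not the real one.
# verify_sha256 consul.zip "<digest from the release's checksum file>" &&
#     echo "checksum OK"
```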

Unzip the downloaded zip file.

$ unzip consul.zip

Archive:  consul.zip
  inflating: consul

Add the directory that contains the consul binary to PATH.

$ export PATH="$PATH:/path/to/consul"
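
If the export goes into a shell startup file, it is handy to avoid adding the same directory over and over. A minimal sketch, where path_has_dir is a hypothetical helper and $HOME/bin is only an assumed location for the unzipped binary:

```shell
# path_has_dir is a hypothetical helper: it succeeds when the given
# directory is already one of the entries of PATH.
path_has_dir() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

consul_dir="$HOME/bin"   # assumed location of the unzipped consul binary
path_has_dir "$consul_dir" || export PATH="$PATH:$consul_dir"
```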

Use Consul

Once you unzip the compressed file and put the binary under your PATH, you can run it like this.

$ consul agent -dev

==> Starting Consul agent...
==> Consul agent running!
           Version: 'v1.4.2'
           Node ID: 'ef46ebb7-3496-346f-f67a-30117cfec0ad'
         Node name: 'devcube'
        Datacenter: 'dc1' (Segment: '<all>')
            Server: true (Bootstrap: false)
       Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, gRPC: 8502, DNS: 8600)
      Cluster Addr: 127.0.0.1 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false

==> Log data will now stream in as it occurs:

    2019/03/04 00:38:01 [DEBUG] agent: Using random ID "ef46ebb7-3496-346f-f67a-30117cfec0ad" as node ID
    2019/03/04 00:38:01 [INFO] raft: Initial configuration (index=1): [{Suffrage:Voter ID:ef46ebb7-3496-346f-f67a-30117cfec0ad Address:127.0.0.1:8300}]
    2019/03/04 00:38:01 [INFO] raft: Node at 127.0.0.1:8300 [Follower] entering Follower state (Leader: "")
    2019/03/04 00:38:01 [INFO] serf: EventMemberJoin: devcube.dc1 127.0.0.1
    2019/03/04 00:38:01 [INFO] serf: EventMemberJoin: devcube 127.0.0.1
    2019/03/04 00:38:01 [INFO] consul: Adding LAN server devcube (Addr: tcp/127.0.0.1:8300) (DC: dc1)
    2019/03/04 00:38:01 [INFO] consul: Handled member-join event for server "devcube.dc1" in area "wan"
    2019/03/04 00:38:01 [DEBUG] agent/proxy: managed Connect proxy manager started
    2019/03/04 00:38:01 [WARN] raft: Heartbeat timeout from "" reached, starting election
    2019/03/04 00:38:01 [INFO] raft: Node at 127.0.0.1:8300 [Candidate] entering Candidate state in term 2
    2019/03/04 00:38:01 [DEBUG] raft: Votes needed: 1
    2019/03/04 00:38:01 [DEBUG] raft: Vote granted from ef46ebb7-3496-346f-f67a-30117cfec0ad in term 2. Tally: 1
    2019/03/04 00:38:01 [INFO] raft: Election won. Tally: 1
    2019/03/04 00:38:01 [INFO] raft: Node at 127.0.0.1:8300 [Leader] entering Leader state
    2019/03/04 00:38:01 [INFO] consul: cluster leadership acquired
    2019/03/04 00:38:01 [INFO] consul: New leader elected: devcube
    2019/03/04 00:38:01 [INFO] agent: Started DNS server 127.0.0.1:8600 (tcp)
    2019/03/04 00:38:01 [INFO] agent: Started DNS server 127.0.0.1:8600 (udp)
    2019/03/04 00:38:01 [INFO] agent: Started HTTP server on 127.0.0.1:8500 (tcp)
    2019/03/04 00:38:01 [INFO] agent: Started gRPC server on 127.0.0.1:8502 (tcp)
    2019/03/04 00:38:01 [INFO] agent: started state syncer
    2019/03/04 00:38:01 [INFO] connect: initialized primary datacenter CA with provider "consul"
    2019/03/04 00:38:01 [DEBUG] consul: Skipping self join check for "devcube" since the cluster is too small
    2019/03/04 00:38:01 [INFO] consul: member 'devcube' joined, marking health alive
    2019/03/04 00:38:01 [DEBUG] agent: Skipping remote check "serfHealth" since it is managed automatically
    2019/03/04 00:38:01 [INFO] agent: Synced node info
    2019/03/04 00:38:01 [DEBUG] agent: Node info in sync
    2019/03/04 00:38:01 [DEBUG] agent: Skipping remote check "serfHealth" since it is managed automatically
    2019/03/04 00:38:01 [DEBUG] agent: Node info in sync
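
The Client Addr line in the startup output tells us where the agent’s interfaces listen. As a small sketch, the ports can be pulled out of that line with sed; the banner string is copied from the output above, and the commented commands assume the dev agent is still running and that curl and dig are installed:

```shell
# The banner line is copied from the agent's startup output above.
banner='       Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, gRPC: 8502, DNS: 8600)'

# Extract the HTTP and DNS ports from the banner.
http_port=$(printf '%s\n' "$banner" | sed -n 's/.*(HTTP: \([0-9][0-9]*\),.*/\1/p')
dns_port=$(printf '%s\n' "$banner" | sed -n 's/.*DNS: \([0-9][0-9]*\)).*/\1/p')
echo "HTTP API: http://127.0.0.1:$http_port  DNS: 127.0.0.1:$dns_port"

# With the agent still running, these would query it:
# curl -s "http://127.0.0.1:$http_port/v1/catalog/nodes"
# dig @127.0.0.1 -p "$dns_port" consul.service.consul
```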

This will start the agent in development mode.

Consul Members

While the above command is running, you can check for all the members in Consul’s network.

$ consul members

Node     Address         Status  Type    Build  Protocol  DC   Segment
devcube  127.0.0.1:8301  alive   server  1.4.0  2         dc1  <all>

Since we have only one node running, it is treated as a server by default. You can designate an agent as a server by passing the -server flag on the command line or by setting the server parameter in Consul’s configuration.

The output of the above command is based on the gossip protocol and is eventually consistent.

Consul HTTP API

For a strongly consistent view of Consul’s agent network, we can use the HTTP API that Consul provides out of the box.

$ curl localhost:8500/v1/catalog/nodes

[
    {
        "ID": "ef46ebb7-3496-346f-f67a-30117cfec0ad",
        "Node": "devcube",
        "Address": "127.0.0.1",
        "Datacenter": "dc1",
        "TaggedAddresses": {
            "lan": "127.0.0.1",
            "wan": "127.0.0.1"
        },
        "Meta": {
            "consul-network-segment": ""
        },
        "CreateIndex": 9,
        "ModifyIndex": 10
    }
]
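
The same query can be made from application code. The following is a minimal Python sketch, assuming a dev agent on the default HTTP port; the helper names node_summary and fetch_nodes are made up for illustration and are not part of any Consul client library. The sample data is trimmed from the response shown above.

```python
import json
from urllib.request import urlopen

# Assumption: a local dev agent is serving the HTTP API on the default port.
CONSUL_HTTP = "http://localhost:8500"


def node_summary(catalog_nodes):
    """Map node name -> address from a decoded /v1/catalog/nodes response."""
    return {entry["Node"]: entry["Address"] for entry in catalog_nodes}


def fetch_nodes(base_url=CONSUL_HTTP):
    """Fetch and decode the node catalog (needs a running agent)."""
    with urlopen(f"{base_url}/v1/catalog/nodes", timeout=5) as response:
        return json.load(response)


# Sample trimmed from the response above, so the sketch runs without an agent.
sample = json.loads(
    '[{"Node": "devcube", "Address": "127.0.0.1", "Datacenter": "dc1"}]'
)
summary = node_summary(sample)
print(summary)
```

With an agent running, node_summary(fetch_nodes()) would produce the same mapping from live data.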

Consul DNS Interface

Consul also provides a DNS interface to query nodes. It serves DNS on port 8600 by default; the port is configurable.

$ dig @127.0.0.1 -p 8600 devcube.node.consul

; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @127.0.0.1 -p 8600 devcube.node.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42215
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;devcube.node.consul.		IN	A

;; ANSWER SECTION:
devcube.node.consul.	0	IN	A	127.0.0.1

;; ADDITIONAL SECTION:
devcube.node.consul.	0	IN	TXT	"consul-network-segment="

;; Query time: 19 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Mon Mar 04 00:45:44 IST 2019
;; MSG SIZE  rcvd: 100

Registering a service on Consul can be achieved either by writing a service definition or by sending a request over an appropriate HTTP API.

Consul Service Definition

A service definition is one of the most popular ways of registering a service. Let’s take a look at an example.

To host our service definitions, we will add a configuration directory, conventionally named consul.d. The ‘.d’ suffix indicates that the directory holds a set of configuration files, rather than a single config file named consul.

$ mkdir ./consul.d

Write the service definition for a fictitious Django web application running on port 80 on localhost.

$ echo '{"service": {"name": "web", "tags": ["django"], "port": 80}}' \
    > ./consul.d/web.json
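
If service definitions are generated by a script rather than typed by hand, the same file can be produced programmatically. This is only a sketch: the helper names are hypothetical, and a temporary directory stands in for ./consul.d so that running it has no side effects.

```python
import json
import tempfile
from pathlib import Path


def service_definition(name, port, tags=()):
    """Build the same structure as the web.json definition above."""
    return {"service": {"name": name, "tags": list(tags), "port": port}}


def write_definition(config_dir, definition):
    """Write <config_dir>/<service name>.json and return its path."""
    path = Path(config_dir) / f"{definition['service']['name']}.json"
    path.write_text(json.dumps(definition, indent=2) + "\n")
    return path


# A temporary directory stands in for ./consul.d in this sketch.
with tempfile.TemporaryDirectory() as config_dir:
    written = write_definition(
        config_dir, service_definition("web", 80, ["django"])
    )
    loaded = json.loads(written.read_text())
    print(written.name, loaded)
```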

To make our consul agent aware of this service definition, we can supply the configuration directory to it.

$ consul agent -dev -config-dir=./consul.d

==> Starting Consul agent...
==> Consul agent running!
           Version: 'v1.4.2'
           Node ID: '810f4804-dbce-03b1-056a-a81269ca90c1'
         Node name: 'devcube'
        Datacenter: 'dc1' (Segment: '<all>')
            Server: true (Bootstrap: false)
       Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, gRPC: 8502, DNS: 8600)
      Cluster Addr: 127.0.0.1 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false

==> Log data will now stream in as it occurs:

    2019/03/04 00:55:28 [DEBUG] agent: Using random ID "810f4804-dbce-03b1-056a-a81269ca90c1" as node ID
    2019/03/04 00:55:28 [INFO] raft: Initial configuration (index=1): [{Suffrage:Voter ID:810f4804-dbce-03b1-056a-a81269ca90c1 Address:127.0.0.1:8300}]
    2019/03/04 00:55:28 [INFO] raft: Node at 127.0.0.1:8300 [Follower] entering Follower state (Leader: "")
    2019/03/04 00:55:28 [INFO] serf: EventMemberJoin: devcube.dc1 127.0.0.1
    2019/03/04 00:55:28 [INFO] serf: EventMemberJoin: devcube 127.0.0.1
    2019/03/04 00:55:28 [INFO] consul: Adding LAN server devcube (Addr: tcp/127.0.0.1:8300) (DC: dc1)
    2019/03/04 00:55:28 [DEBUG] agent/proxy: managed Connect proxy manager started
    2019/03/04 00:55:28 [INFO] consul: Handled member-join event for server "devcube.dc1" in area "wan"
    2019/03/04 00:55:28 [INFO] agent: Started DNS server 127.0.0.1:8600 (udp)
    2019/03/04 00:55:28 [INFO] agent: Started DNS server 127.0.0.1:8600 (tcp)
    2019/03/04 00:55:28 [INFO] agent: Started HTTP server on 127.0.0.1:8500 (tcp)
    2019/03/04 00:55:28 [INFO] agent: started state syncer
    2019/03/04 00:55:28 [INFO] agent: Started gRPC server on 127.0.0.1:8502 (tcp)
    2019/03/04 00:55:28 [WARN] raft: Heartbeat timeout from "" reached, starting election
    2019/03/04 00:55:28 [INFO] raft: Node at 127.0.0.1:8300 [Candidate] entering Candidate state in term 2
    2019/03/04 00:55:28 [DEBUG] raft: Votes needed: 1
    2019/03/04 00:55:28 [DEBUG] raft: Vote granted from 810f4804-dbce-03b1-056a-a81269ca90c1 in term 2. Tally: 1
    2019/03/04 00:55:28 [INFO] raft: Election won. Tally: 1
    2019/03/04 00:55:28 [INFO] raft: Node at 127.0.0.1:8300 [Leader] entering Leader state
    2019/03/04 00:55:28 [INFO] consul: cluster leadership acquired
    2019/03/04 00:55:28 [INFO] consul: New leader elected: devcube
    2019/03/04 00:55:28 [INFO] connect: initialized primary datacenter CA with provider "consul"
    2019/03/04 00:55:28 [DEBUG] consul: Skipping self join check for "devcube" since the cluster is too small
    2019/03/04 00:55:28 [INFO] consul: member 'devcube' joined, marking health alive
    2019/03/04 00:55:28 [DEBUG] agent: Skipping remote check "serfHealth" since it is managed automatically
    2019/03/04 00:55:28 [INFO] agent: Synced service "web"
    2019/03/04 00:55:28 [DEBUG] agent: Node info in sync
    2019/03/04 00:55:29 [DEBUG] agent: Skipping remote check "serfHealth" since it is managed automatically
    2019/03/04 00:55:29 [DEBUG] agent: Service "web" in sync
    2019/03/04 00:55:29 [DEBUG] agent: Node info in sync

The relevant entries in this log are the sync statements for the “web” service. The Consul agent has accepted our config and synced it across all nodes; in this case, a single node.

Consul DNS Service Query

We can query the service over DNS, just as we did with the node:

$ dig @127.0.0.1 -p 8600 web.service.consul

; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @127.0.0.1 -p 8600 web.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51488
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;web.service.consul.		IN	A

;; ANSWER SECTION:
web.service.consul.	0	IN	A	127.0.0.1

;; ADDITIONAL SECTION:
web.service.consul.	0	IN	TXT	"consul-network-segment="

;; Query time: 0 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Mon Mar 04 00:59:32 IST 2019
;; MSG SIZE  rcvd: 99

We can also query DNS for SRV records, which give us more detail about the service, such as its port and node.

$ dig @127.0.0.1 -p 8600 web.service.consul SRV

; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @127.0.0.1 -p 8600 web.service.consul SRV
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 712
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 3
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;web.service.consul.		IN	SRV

;; ANSWER SECTION:
web.service.consul.	0	IN	SRV	1 1 80 devcube.node.dc1.consul.

;; ADDITIONAL SECTION:
devcube.node.dc1.consul. 0	IN	A	127.0.0.1
devcube.node.dc1.consul. 0	IN	TXT	"consul-network-segment="

;; Query time: 0 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Mon Mar 04 00:59:43 IST 2019
;; MSG SIZE  rcvd: 142
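
An SRV answer carries the priority, weight, port, and target node in one line. As a rough illustration of how a client might pull these apart, here is a small Python sketch that parses the answer line from the dig output above. The SrvRecord type and parse_srv_answer function are invented for this example; a real application would more likely use a DNS library than parse dig output.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    name: str
    ttl: int
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_answer(line):
    """Parse one SRV line from dig's ANSWER SECTION into its fields."""
    (name, ttl, record_class, record_type,
     priority, weight, port, target) = line.split()
    if (record_class, record_type) != ("IN", "SRV"):
        raise ValueError(f"not an IN SRV record: {line!r}")
    return SrvRecord(name.rstrip("."), int(ttl), int(priority),
                     int(weight), int(port), target.rstrip("."))


# The answer line from the dig output above.
answer = "web.service.consul.\t0\tIN\tSRV\t1 1 80 devcube.node.dc1.consul."
record = parse_srv_answer(answer)
print(f"{record.name} -> {record.target}:{record.port}")
```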

You can also use the tag that we supplied in the service definition to query for services carrying that tag:

$ dig @127.0.0.1 -p 8600 django.web.service.consul

; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @127.0.0.1 -p 8600 django.web.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12278
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;django.web.service.consul.	IN	A

;; ANSWER SECTION:
django.web.service.consul. 0	IN	A	127.0.0.1

;; ADDITIONAL SECTION:
django.web.service.consul. 0	IN	TXT	"consul-network-segment="

;; Query time: 0 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Mon Mar 04 01:01:17 IST 2019
;; MSG SIZE  rcvd: 106
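
The node, service, and tag lookups above all follow one naming pattern. The sketch below builds those names in Python; the function names are hypothetical, and the default domain "consul" and the optional datacenter label are taken from the names that appear in the dig output in this article.

```python
def node_dns_name(node, datacenter=None, domain="consul"):
    """Build <node>.node[.<datacenter>].<domain>."""
    parts = [node, "node"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)


def service_dns_name(service, tag=None, datacenter=None, domain="consul"):
    """Build [<tag>.]<service>.service[.<datacenter>].<domain>."""
    parts = [tag] if tag else []
    parts += [service, "service"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)


print(node_dns_name("devcube"))
print(service_dns_name("web"))
print(service_dns_name("web", tag="django"))
```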

Consul Service Catalog Over HTTP API

Services can similarly be queried using the HTTP API:

$ curl http://localhost:8500/v1/catalog/service/web

[
    {
        "ID": "810f4804-dbce-03b1-056a-a81269ca90c1",
        "Node": "devcube",
        "Address": "127.0.0.1",
        "Datacenter": "dc1",
        "TaggedAddresses": {
            "lan": "127.0.0.1",
            "wan": "127.0.0.1"
        },
        "NodeMeta": {
            "consul-network-segment": ""
        },
        "ServiceKind": "",
        "ServiceID": "web",
        "ServiceName": "web",
        "ServiceTags": [
            "django"
        ],
        "ServiceAddress": "",
        "ServiceWeights": {
            "Passing": 1,
            "Warning": 1
        },
        "ServiceMeta": {},
        "ServicePort": 80,
        "ServiceEnableTagOverride": false,
        "ServiceProxyDestination": "",
        "ServiceProxy": {},
        "ServiceConnect": {},
        "CreateIndex": 10,
        "ModifyIndex": 10
    }
]
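
A client usually needs only a host and a port from this response. In the output above, ServiceAddress is empty, in which case the node’s Address is the one to use. The following Python sketch shows that fallback; service_endpoints is a hypothetical helper, and the sample is trimmed from the response above.

```python
def service_endpoints(catalog_entries):
    """Return (host, port) pairs from a /v1/catalog/service/<name> response.

    An empty ServiceAddress means the service is reachable at the
    address of the node it is registered on.
    """
    endpoints = []
    for entry in catalog_entries:
        host = entry.get("ServiceAddress") or entry["Address"]
        endpoints.append((host, entry["ServicePort"]))
    return endpoints


# Trimmed from the response shown above.
sample = [{
    "Node": "devcube",
    "Address": "127.0.0.1",
    "ServiceName": "web",
    "ServiceAddress": "",
    "ServicePort": 80,
}]
endpoints = service_endpoints(sample)
print(endpoints)
```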

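We can also pass the passing query parameter to filter services by health-check status. Note that Consul documents this parameter for the health endpoint (/v1/health/service/web?passing); on the catalog endpoint used here, the response is the same as the unfiltered listing: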

$ curl http://localhost:8500/v1/catalog/service/web?passing

[
    {
        "ID": "810f4804-dbce-03b1-056a-a81269ca90c1",
        "Node": "devcube",
        "Address": "127.0.0.1",
        "Datacenter": "dc1",
        "TaggedAddresses": {
            "lan": "127.0.0.1",
            "wan": "127.0.0.1"
        },
        "NodeMeta": {
            "consul-network-segment": ""
        },
        "ServiceKind": "",
        "ServiceID": "web",
        "ServiceName": "web",
        "ServiceTags": [
            "django"
        ],
        "ServiceAddress": "",
        "ServiceWeights": {
            "Passing": 1,
            "Warning": 1
        },
        "ServiceMeta": {},
        "ServicePort": 80,
        "ServiceEnableTagOverride": false,
        "ServiceProxyDestination": "",
        "ServiceProxy": {},
        "ServiceConnect": {},
        "CreateIndex": 10,
        "ModifyIndex": 10
    }
]

Update Consul Service Definition

Updating the service definition on a running Consul agent is simple.

There are three ways to achieve this: send a SIGHUP signal to the process, run consul reload (which internally sends SIGHUP on the node), or call the HTTP API dedicated to service definition updates, which internally reloads the agent configuration.

$ ps aux | grep [c]onsul

pranav   21289  2.4  0.3 177012 54924 pts/2    Sl+  00:55   0:22 consul agent -dev -config-dir=./consul.d

Send SIGHUP to 21289

$ kill -SIGHUP 21289

Or reload Consul

$ consul reload

Configuration reload triggered

You should see this in your Consul log.

...
    2019/03/04 01:10:46 [INFO] agent: Caught signal:  hangup
    2019/03/04 01:10:46 [INFO] agent: Reloading configuration...
    2019/03/04 01:10:46 [DEBUG] agent: removed service "web"
    2019/03/04 01:10:46 [INFO] agent: Synced service "web"
    2019/03/04 01:10:46 [DEBUG] agent: Node info in sync
...
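
To make the signal-based reload less abstract, here is a small Python sketch of the general pattern: a process installs a SIGHUP handler and re-reads its configuration when signaled. It is not Consul’s implementation. The script signals itself rather than a real agent, the handler only counts reloads, and it assumes a Unix-like system where SIGHUP exists.

```python
import os
import signal
import time

reload_count = 0


def handle_hangup(signum, frame):
    """Stand-in for an agent re-reading its configuration directory."""
    global reload_count
    reload_count += 1


def send_reload(pid):
    """Equivalent of `kill -SIGHUP <pid>` from the shell."""
    os.kill(pid, signal.SIGHUP)


signal.signal(signal.SIGHUP, handle_hangup)
send_reload(os.getpid())  # signal ourselves instead of a real Consul agent

# Give the interpreter a moment to run the handler before checking.
deadline = time.monotonic() + 1.0
while reload_count == 0 and time.monotonic() < deadline:
    time.sleep(0.01)
print("reloads handled:", reload_count)
```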

Consul Web UI

Consul provides a beautiful web user interface out of the box. You can access it on port 8500; in this case, at http://localhost:8500. Let’s look at some of the screens.

The home page of the Consul UI is the services view, which shows all the relevant information about the Consul agent and the web service check.

Exploring defined services on Consul Web UI

Going into further details on a given service, we get a service dashboard with all the nodes and their health for that service.

Exploring node-level information for each service on Consul Web UI

On each individual node, we can look at the health-checks, services, and sessions.

Exploring node-specific health-check information, services information, and sessions information on Consul Web UI

Overall, Consul Web UI is really impressive and a great companion for the command line tools that Consul provides.

How is Consul Different From Zookeeper, doozerd, and etcd?

Consul has first-class support for service discovery, health checks, key-value storage, and multiple data centers.

Zookeeper, doozerd, and etcd are primarily key-value stores. Achieving anything beyond key-value storage with them requires additional tools, libraries, and custom development.

All these tools, including Consul, use server nodes that require a quorum of nodes to operate and are strongly consistent.
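
For a sense of what the quorum requirement means in numbers, a majority quorum is more than half of the server nodes. The short Python sketch below computes it; the function names are made up for illustration. For a single server the quorum is 1, which matches the “Votes needed: 1” line in the dev agent log earlier.

```python
def quorum_size(servers):
    """Smallest majority of a cluster with the given number of servers."""
    if servers < 1:
        raise ValueError("a cluster needs at least one server")
    return servers // 2 + 1


def failure_tolerance(servers):
    """How many servers can be lost while a majority still remains."""
    return servers - quorum_size(servers)


for n in (1, 3, 5):
    print(n, quorum_size(n), failure_tolerance(n))
```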

More or less, they all have similar semantics for key/value store management.

These semantics are attractive for building service discovery systems. Consul has out-of-the-box support for service discovery, which the other systems lack.

A service discovery system also requires a way to perform health checks, since it is important to check a service’s health before allowing others to discover it. Some systems use heartbeats with periodic updates and a TTL. The work for these health checks grows with scale and requires fixed infrastructure. The failure detection window is at least as long as the TTL.
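
A toy model helps show why TTL-based checks can report stale health. In the Python sketch below (the function name and the numbers are illustrative assumptions, not taken from any particular system), a service that stops sending heartbeats is still reported as passing until its TTL runs out.

```python
def ttl_check_status(last_heartbeat, now, ttl):
    """Status of a TTL check: passing until the TTL has elapsed."""
    return "passing" if now - last_heartbeat <= ttl else "critical"


# Illustrative timeline in seconds: the last heartbeat arrives at t=100
# and the service crashes right after it, with a TTL of 30 seconds.
LAST_HEARTBEAT, TTL = 100, 30
for now in (101, 120, 131):
    print(now, ttl_check_status(LAST_HEARTBEAT, now, TTL))
```

Until the TTL expires, other services discovering this one would still be handed the failed instance.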

Unlike Zookeeper, Consul has client agents sitting on each node in the cluster, talking to each other in a gossip pool. This allows the clients to be thin, gives better health-checking ability, reduces client-side complexity, and eases debugging.

Also, Consul provides native HTTP and DNS interfaces to perform system-wide, node-wide, or service-wide operations. With the other systems, these have to be developed around the exposed primitives.

Consul’s website gives a good commentary on comparisons between Consul and other tools.

Open Source Tools Around HashiCorp Consul

HashiCorp and the community have built several tools around Consul.

These tools are built and maintained by the HashiCorp developers:

Consul Template (3.3k stars) – Generic template rendering and notifications with Consul. Template rendering, notifier, and supervisor for @hashicorp Consul and Vault data. It provides a convenient way to populate values from Consul into the file system using the consul-template daemon.

Envconsul (1.2k stars) – Read and set environmental variables for processes from Consul. Envconsul provides a convenient way to launch a subprocess with environment variables populated from HashiCorp Consul and Vault.

Consul Replicate (360 stars) – Consul cross-DC KV replication daemon. This project provides a convenient way to replicate values from one Consul datacenter to another using the consul-replicate daemon.

Consul Migrate – Data migration tool to handle Consul upgrades to 0.5.1+.

The community around Consul has also built several tools to help with registering services and managing service configuration. Some of the popular and well-maintained ones are:

Confd (5.9k stars) – Manage local application configuration files using templates and data from etcd or consul.

Fabio (5.4k stars) – Fabio is a fast, modern, zero-conf load balancing HTTP(S) and TCP router for deploying applications managed by consul. Register your services in consul, provide a health check and fabio will start routing traffic to them. No configuration required.

Registrator (3.9k stars) – Service registry bridge for Docker with pluggable adapters. Registrator automatically registers and deregisters services for any Docker container by inspecting containers as they come online.

Hashi-UI (871 stars) – A modern user interface for HashiCorp Consul & Nomad.

Git2consul (594 stars) – Mirrors the contents of a git repository into Consul KVs. git2consul takes one or many git repositories and mirrors them into Consul KVs. The goal is for organizations of any size to use git as the backing store, audit trail, and access control mechanism for configuration changes and Consul as the delivery mechanism.

Spring-cloud-consul (503 stars) – This project provides Consul integrations for Spring Boot apps through autoconfiguration and binding to the Spring Environment and other Spring programming model idioms. With a few simple annotations, you can quickly enable and configure the common patterns inside your application and build large distributed systems with Consul based components.

Crypt (453 stars) – Store and retrieve encrypted configs from etcd or consul.

Mesos-Consul (344 stars) – Mesos to Consul bridge for service discovery. Mesos-consul automatically registers/deregisters services run as Mesos tasks.

Consul-cli (228 stars) – Command line interface to Consul HTTP API.

Conclusion

Distributed systems are not easy to build and set up. Maintaining them and keeping them running is another piece of work altogether. HashiCorp Consul makes life easier for engineers facing such challenges.

As we went through different aspects of Consul, we learned how straightforward it becomes to develop and deploy applications with a distributed or microservices architecture.

Robust, production-ready code (most crucial), well-written and detailed documentation, user friendliness, and a solid community make adopting HashiCorp Consul and introducing it into our technology stack straightforward.

We hope it was an informative ride on the journey of Consul. Our journey has not yet ended; this was just the first half. We will meet you again in the second part of this article, which walks through a practical example close to real-life applications. Find the Practical Guide To HashiCorp Consul – Part 2 here.

Let us know what else you would like to hear from us, or if you have any questions on the topic; we will be more than happy to answer them.

November 19, 2025

A Practical Guide To HashiCorp Consul – Part 2

This is part 2 of a 2-part series, A Practical Guide to HashiCorp Consul. The previous part focused primarily on understanding the problems that Consul solves and how it solves them. This part focuses on a practical application of Consul in a real-life example. Let’s get started.

With most of the theory covered in the previous part, let’s move on to a practical example with Consul.

What are we Building?

We are going to build a Django web application that stores its persistent data in MongoDB. We will containerize both using Docker, and build and run them using Docker Compose.

To show how our web app would scale in this context, we are going to run two instances of the Django app. Also, to make this even more interesting, we will run MongoDB as a Replica Set with one primary node and two secondary nodes.

Since we have two instances of the Django app, we need a way to balance load between them, so we are going to use Fabio, a Consul-aware load balancer, to reach the Django app instances.

This example will roughly help us simulate a real-world practical application.

Example Application nodes and services deployed on them

The complete source code for this application is open-sourced and is available on GitHub – pranavcode/consul-demo.

Note: The architecture we are discussing here is not tied to any of the technologies used to build the app or data layers. This example could just as well be built using a combination of Ruby on Rails and Postgres, or Node.js and MongoDB, or Laravel and MySQL.

How Does Consul Come into the Picture?

We are deploying both the app layer and the data layer as Docker containers. They are going to be built as services and will talk to each other over HTTP.

Thus, we will use Consul for service discovery. This will allow the Django servers to find the MongoDB primary node. For this example, we are going to resolve services via Consul’s DNS interface.

Consul will also help us with the auto-configuration of Fabio as the load balancer to reach instances of our Django app.

We are also using the health-check feature of Consul to monitor the health of each of our instances in the whole infrastructure.

Consul provides a beautiful user interface, as part of its Web UI, out of the box to show all the services on a single dashboard. We will use it to see how our services are laid out.

Let’s begin.

Setup: MongoDB, Django, Consul, Fabio, and Dockerization

We will keep this as simple and minimal as possible to the extent it fulfills our need for a demonstration.

MongoDB

The MongoDB setup we are targeting is a MongoDB Replica Set with one primary node and two secondary nodes.

The primary node will manage all the write operations and the oplog to maintain the sequence of writes, and replicate the data across secondaries. We are also configuring the secondaries for read operations. You can learn more about MongoDB Replica Sets in the official MongoDB documentation.

We will call our replica set ‘consuldemo’.

We will run MongoDB on the standard port 27017 and supply the name of the replica set on the command line using the parameter ‘--replSet’.

As the documentation describes, MongoDB also allows configuring the replica set name via the configuration file, using the replication parameter as below:

replication:
    replSetName: "consuldemo"

In our case, the replica set configuration that we will apply on one of the MongoDB nodes, once all the nodes are up and running, is given below:

var config = {
    _id: "consuldemo",
    version: 1,
    members: [{
        _id: 0,
        host: "mongo_1:27017",
    }, {
        _id: 1,
        host: "mongo_2:27017",
    }, {
        _id: 2,
        host: "mongo_3:27017",
    }],
    settings: { 
        chainingAllowed: true 
    }
};
rs.initiate(config, { force: true });
rs.slaveOk();
db.getMongo().setReadPref("nearest");

This configuration will be applied to one of the pre-defined nodes and MongoDB will decide which node will be primary and secondary.
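For readers who prefer to drive this step from Python rather than the Mongo shell, the sketch below builds an equivalent configuration document. It is only an illustration: the function names are hypothetical, and the second function assumes PyMongo is installed, that a `mongod` is reachable on the seed host, and that a PyMongo version supporting `directConnection` is in use.

```python
def build_replica_set_config(name, hosts, port=27017):
    """Return a replica set config document equivalent to the shell version."""
    return {
        "_id": name,
        "version": 1,
        "members": [
            {"_id": index, "host": f"{host}:{port}"}
            for index, host in enumerate(hosts)
        ],
        "settings": {"chainingAllowed": True},
    }


def initiate_replica_set(seed_host, config):
    """Send replSetInitiate to one node (needs pymongo and a running mongod)."""
    # Imported lazily so the config builder above works without pymongo.
    from pymongo import MongoClient

    client = MongoClient(seed_host, 27017, directConnection=True)
    return client.admin.command("replSetInitiate", config)


config = build_replica_set_config("consuldemo", ["mongo_1", "mongo_2", "mongo_3"])
print(config["members"])
```

Only `build_replica_set_config` runs here; `initiate_replica_set` is shown for completeness and would be called once, against a single node.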

Note: We are not forcing the set creation with any pre-defined designations on who becomes primary and secondary to allow the dynamism in service discovery. Normally, the nodes would be defined for a specific role.

We are allowing reads from secondaries, with ‘nearest’ as the Read Preference.

We will start MongoDB on all nodes with the following command:

mongod --bind_ip_all --port 27017 --dbpath /data/db --replSet "consuldemo"

This gives us a MongoDB Replica Set with one primary instance and two secondary instances, running and ready to accept connections.

We will discuss containerizing the MongoDB service later in this article.

Django

We will create a simple Django project that represents a blog application and containerize it with Docker.

Building the Django app from scratch is beyond the scope of this tutorial; we recommend referring to Django’s official documentation to get started with a Django project. We will still go through some important aspects.

As we need our Django app to talk to MongoDB, we will use Djongo, a MongoDB connector for the Django ORM. We will set up our Django settings to use Djongo and connect to our MongoDB. Djongo is pretty straightforward to configure.

For a local MongoDB installation, it only takes two settings:

...

DATABASES = {
    'default': {
        'ENGINE': 'djongo',
        'NAME': 'db',
    }
}

...

In our case, as we will need to access MongoDB over another container, our config would look like this:

...

DATABASES = {
    'default': {
        'ENGINE': 'djongo',
        'NAME': 'db',
        'HOST': 'mongodb://mongo-primary.service.consul',
        'PORT': 27017,
    }
}

...

Details:

ENGINE: The database connector to use for Django ORM.
NAME: Name of the database.
HOST: Host address that has MongoDB running on it.
PORT: Port on which MongoDB listens for requests.

Djongo internally talks to PyMongo and uses MongoClient for executing queries on Mongo. We could also use other MongoDB connectors available for Django, such as django-mongodb-engine, or pymongo directly, based on our needs.

Note: We are currently reading and writing via Django to a single MongoDB host, the primary one, but we can configure Djongo to also talk to secondary hosts for read-only operations. That is not in the scope of our discussion. You can refer to Djongo’s official documentation to achieve exactly this.

Continuing our Django app building process, we need to define our models. As we are building a blog-like application, our models would look like this:

from djongo import models

class Blog(models.Model):
    name = models.CharField(max_length=100)
    tagline = models.TextField()

    class Meta:
        abstract = True

class Entry(models.Model):
    blog = models.EmbeddedModelField(
        model_container=Blog,
    )
    
    headline = models.CharField(max_length=255)

We can run a local MongoDB instance and create migrations for these models. We also register the Entry model with Django Admin, like so:

from django.contrib import admin
from .models import Entry

admin.site.register(Entry)

We can play with the Entry model’s CRUD operations via Django Admin for this example.

Also, to verify the Django-MongoDB connectivity, we will create a custom View and Template that display information about the MongoDB setup and the currently connected MongoDB host.

Our Django views look like this:

from django.shortcuts import render
from pymongo import MongoClient

def home(request):
    client = MongoClient("mongo-primary.service.consul")
    replica_set = client.admin.command('ismaster')

    return render(request, 'home.html', { 
        'mongo_hosts': replica_set['hosts'],
        'mongo_primary_host': replica_set['primary'],
        'mongo_connected_host': replica_set['me'],
        'mongo_is_primary': replica_set['ismaster'],
        'mongo_is_secondary': replica_set['secondary'],
    })
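The view above reads the `ismaster` and `secondary` fields of the response directly. As a small, hedged sketch of how those two flags map to a node role, the hypothetical helper below takes a plain dictionary shaped like that response, so it can be tried without a running MongoDB.

```python
def describe_mongo_role(is_master_response):
    """Map the 'ismaster' / 'secondary' flags of the response to a role name."""
    if is_master_response.get("ismaster"):
        return "primary"
    if is_master_response.get("secondary"):
        return "secondary"
    # Neither flag is set, e.g. the replica set has not been initiated yet.
    return "unknown"


# Sample dictionaries standing in for real server responses.
print(describe_mongo_role({"ismaster": True, "secondary": False}))
print(describe_mongo_role({"ismaster": False, "secondary": True}))
```

Using `.get()` here avoids a `KeyError` when a field is absent from the response, which the direct indexing in the view would raise.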

Our URLs or routes configuration for the app looks like this:

from django.urls import path
from tweetapp import views

urlpatterns = [
    path('', views.home, name='home'),
]

And for the project – the app URLs are included like so:

from django.contrib import admin
from django.urls import path, include
from django.conf import settings
from django.conf.urls.static import static

urlpatterns = [
    path('admin/', admin.site.urls),
    path('web', include('tweetapp.urls')),
] + static(settings.STATIC_URL, document_root=settings.STATIC_ROOT)

Our Django template, ‘templates/home.html’ looks like this:

<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">
    <link href="https://fonts.googleapis.com/css?family=Armata" rel="stylesheet">

    <title>Django-Mongo-Consul</title>
</head>
<body class="bg-dark text-white p-5" style="font-family: Armata">
    <div class="p-4 border">
        <div class="m-2">
            <b>Django Database Connection</b>
        </div>
        <table class="table table-dark">
            <thead>
                <tr>
                    <th scope="col">#</th>
                    <th scope="col">Property</th>
                    <th scope="col">Value</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>1</td>
                    <td>Mongo Hosts</td>
                    <td>{% for host in mongo_hosts %}{{ host }}<br/>{% endfor %}</td>
                </tr>
                <tr>
                    <td>2</td>
                    <td>Mongo Primary Address</td>
                    <td>{{ mongo_primary_host }}</td>
                </tr>
                <tr>
                    <td>3</td>
                    <td>Mongo Connected Address</td>
                    <td>{{ mongo_connected_host }}</td>
                </tr>
                <tr>
                    <td>4</td>
                    <td>Mongo - Is Primary?</td>
                    <td>{{ mongo_is_primary }}</td>
                </tr>
                <tr>
                    <td>5</td>
                    <td>Mongo - Is Secondary?</td>
                    <td>{{ mongo_is_secondary }}</td>
                </tr>
            </tbody>
        </table>
    </div>
    
    <script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1" crossorigin="anonymous"></script>
    <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM" crossorigin="anonymous"></script>
</body>
</html>

To run the app we need to migrate the database first using the command below:

python ./manage.py migrate

We also collect all the static assets into the static directory:

python ./manage.py collectstatic --noinput

Now run the Django app with Gunicorn, a WSGI HTTP server, as given below:

gunicorn --bind 0.0.0.0:8000 --access-logfile - tweeter.wsgi:application

This gives us a basic blog-like Django app that connects to a MongoDB backend.

We will discuss containerizing this Django web application later in this article.

Consul

We place a Consul agent alongside every service as part of our Consul setup.

The Consul agent is responsible for service discovery, by registering the service with the Consul cluster, and it also monitors the health of every service instance.

Consul on nodes running MongoDB Replica Set

We will discuss the Consul setup in the context of the MongoDB Replica Set first, as it solves an interesting problem. At any given point in time, a MongoDB instance can be either a Primary or a Secondary.

The Consul agent registering and monitoring our MongoDB instance within a Replica Set has a unique mechanism: it dynamically registers and deregisters the MongoDB service as a Primary instance or a Secondary instance, based on the role the Replica Set has designated for it.

We achieve this dynamism by writing a shell script that runs at a regular interval and toggles the Consul service definition between MongoDB Primary and MongoDB Secondary on the instance node’s Consul agent.

The service definitions for MongoDB services are stored as JSON files in Consul’s config directory, ‘/etc/consul.d’.

Service definition for MongoDB Primary instance:

{
    "service": {
        "name": "mongo-primary",
        "port": 27017,
        "tags": [
            "nosql",
            "database"
        ],
        "check": {
            "id": "mongo_primary_status",
            "name": "Mongo Primary Status",
            "args": ["/etc/consul.d/check_scripts/mongo_primary_check.sh"],
            "interval": "30s",
            "timeout": "20s"
        }
    }
}

If you look closely, the service definition allows us to get a DNS entry specific to MongoDB Primary, rather than a generic MongoDB instance. This allows us to send the database writes to a specific MongoDB instance. In the case of Replica Set, the writes are maintained by MongoDB Primary.

Thus, we are able to achieve both service discovery and health monitoring for the Primary instance of MongoDB.

Similarly, with a slight change, the service definition for a MongoDB Secondary instance looks like this:

{
    "service": {
        "name": "mongo-secondary",
        "port": 27017,
        "tags": [
            "nosql",
            "database"
        ],
        "check": {
            "id": "mongo_secondary_status",
            "name": "Mongo Secondary Status",
            "args": ["/etc/consul.d/check_scripts/mongo_secondary_check.sh"],
            "interval": "30s",
            "timeout": "20s"
        }
    }
}
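Since the two definitions differ only in the role name, they could also be generated from one template. The sketch below is one hypothetical way to do that in Python; the function name is ours, and the field values simply mirror the JSON shown above.

```python
import json


def mongo_service_definition(role):
    """Build the Consul service definition for a 'primary' or 'secondary' node."""
    if role not in ("primary", "secondary"):
        raise ValueError(f"unsupported role: {role}")
    return {
        "service": {
            "name": f"mongo-{role}",
            "port": 27017,
            "tags": ["nosql", "database"],
            "check": {
                "id": f"mongo_{role}_status",
                "name": f"Mongo {role.capitalize()} Status",
                "args": [f"/etc/consul.d/check_scripts/mongo_{role}_check.sh"],
                "interval": "30s",
                "timeout": "20s",
            },
        }
    }


# Print the definition that would be written to mongo_secondary.json.
print(json.dumps(mongo_service_definition("secondary"), indent=4))
```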

Given all this context, can you think of the way we can dynamically switch these service definitions?

We can identify whether a given MongoDB instance is primary or not by running the command `db.isMaster()` in the MongoDB shell.

The check can be drafted as a shell script:

#!/bin/bash

mongo_primary=$(mongo --quiet --eval 'JSON.stringify(db.isMaster())' | jq -r .ismaster 2> /dev/null)
if [[ $mongo_primary == false ]]; then
    exit 1
fi

echo "Mongo primary healthy and reachable"

Similarly, the non-master or non-primary instances of MongoDB can also be checked against the same command, by checking a `secondary` value:

#!/bin/bash

mongo_secondary=$(mongo --quiet --eval 'JSON.stringify(db.isMaster())' | jq -r .secondary 2> /dev/null)
if [[ $mongo_secondary == false ]]; then
    exit 1
fi

echo "Mongo secondary healthy and reachable"

Note: We are using jq – a lightweight and flexible command-line JSON processor – to process the JSON encoded output of MongoDB shell commands.
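The same decision can be sketched in Python, which makes the exit-code convention explicit. Consul treats a script check's exit code 0 as passing, 1 as warning, and any other code as critical, so the shell scripts above, which exit with 1, are reported as warnings. The hypothetical helper below returns 2 instead and also treats unparsable output as a failure; both choices are our assumptions, not the original scripts' behavior.

```python
import json

# Exit codes as interpreted by Consul script checks.
PASSING, WARNING, CRITICAL = 0, 1, 2


def check_status(is_master_json, field):
    """Return a Consul exit code from the JSON output of db.isMaster().

    `field` is 'ismaster' for the primary check, 'secondary' for the other.
    """
    try:
        value = json.loads(is_master_json).get(field)
    except (json.JSONDecodeError, AttributeError, TypeError):
        # Empty or malformed output, e.g. mongod is not reachable.
        return CRITICAL
    return PASSING if value is True else CRITICAL


sample = '{"ismaster": true, "secondary": false}'
print(check_status(sample, "ismaster"), check_status(sample, "secondary"))
```

A real check script would pass the output of the `mongo --quiet --eval ...` command to this function and call `sys.exit()` with the result.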

One way of writing a script that does this dynamic switch looks like this:

#!/bin/bash

# Wait until Mongo starts
while [[ $(ps aux | grep [m]ongod | wc -l) -ne 1 ]]; do
    sleep 5
done

REGISTER_MASTER=0
REGISTER_SECONDARY=0

mongo_primary=$(mongo --quiet --eval 'JSON.stringify(db.isMaster())' | jq -r .ismaster 2> /dev/null)
mongo_secondary=$(mongo --quiet --eval 'JSON.stringify(db.isMaster())' | jq -r .secondary 2> /dev/null)  

if [[ $mongo_primary == false && $mongo_secondary == true ]]; then

  # Deregister as Mongo Master
  if [[ -a /etc/consul.d/check_scripts/mongo_primary_check.sh && -a /etc/consul.d/mongo_primary.json ]]; then
    rm -f /etc/consul.d/check_scripts/mongo_primary_check.sh
    rm -f /etc/consul.d/mongo_primary.json

    REGISTER_MASTER=1
  fi

  # Register as Mongo Secondary
  if [[ ! -a /etc/consul.d/check_scripts/mongo_secondary_check.sh && ! -a /etc/consul.d/mongo_secondary.json ]]; then
    cp -u /opt/checks/check_scripts/mongo_secondary_check.sh /etc/consul.d/check_scripts/
    cp -u /opt/checks/mongo_secondary.json /etc/consul.d/

    REGISTER_SECONDARY=1
  fi

else

  # Register as Mongo Master
  if [[ ! -a /etc/consul.d/check_scripts/mongo_primary_check.sh && ! -a /etc/consul.d/mongo_primary.json ]]; then
    cp -u /opt/checks/check_scripts/mongo_primary_check.sh /etc/consul.d/check_scripts/
    cp -u /opt/checks/mongo_primary.json /etc/consul.d/

    REGISTER_MASTER=2
  fi

  # Deregister as Mongo Secondary
  if [[ -a /etc/consul.d/check_scripts/mongo_secondary_check.sh && -a /etc/consul.d/mongo_secondary.json ]]; then
    rm -f /etc/consul.d/check_scripts/mongo_secondary_check.sh
    rm -f /etc/consul.d/mongo_secondary.json

    REGISTER_SECONDARY=2
  fi

fi

# Reload Consul if any service definition was added or removed
if [[ $REGISTER_MASTER -ne 0 || $REGISTER_SECONDARY -ne 0 ]]; then
  consul reload
fi

Note: This is an example script, but we can be more creative and optimize the script further.
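To make the toggling logic easier to follow, here is a minimal Python sketch of just the decision step, with no file system or Consul access. The function name and the use of a set of role names to represent the currently installed definitions are assumptions made for illustration. Like the shell script, it treats any node that is not clearly a secondary as a primary.

```python
def plan_toggle(is_primary, is_secondary, registered):
    """Decide which role definitions to add and remove.

    `registered` is the set of roles ('primary', 'secondary') whose service
    definitions are currently present in the Consul config directory.
    Returns (to_add, to_remove, reload_needed).
    """
    if not is_primary and is_secondary:
        wanted, other = "secondary", "primary"
    else:
        wanted, other = "primary", "secondary"
    to_add = {wanted} - registered      # definition we need but do not have
    to_remove = {other} & registered    # stale definition for the old role
    return to_add, to_remove, bool(to_add or to_remove)


# A node that was registered as primary has just been voted secondary.
print(plan_toggle(False, True, {"primary"}))
```

In this sketch a reload is requested whenever anything was added or removed, so a node that only gains a definition also triggers `consul reload`.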

Once we are done with our service definitions, we can run the Consul agent on each MongoDB node. To run an agent, we will use the following command:

consul agent -bind 33.10.0.3 \
    -advertise 33.10.0.3 \
    -join consul_server \
    -node mongo_1 \
    -dns-port 53 \
    -data-dir /data \
    -config-dir /etc/consul.d \
    -enable-local-script-checks

Here, ‘consul_server’ is the host running the Consul server. Similarly, we can run such agents on each of the other MongoDB instance nodes.
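Because only the node name and IP address change from node to node, the command line can be assembled programmatically. The sketch below is a hypothetical helper that builds the same argument list as above; the address used for `mongo_2` is an assumed value chosen for illustration.

```python
import shlex


def consul_agent_args(node, ip, server="consul_server"):
    """Build the argument list for a Consul client agent on one node."""
    return [
        "consul", "agent",
        "-bind", ip,
        "-advertise", ip,
        "-join", server,
        "-node", node,
        "-dns-port", "53",
        "-data-dir", "/data",
        "-config-dir", "/etc/consul.d",
        "-enable-local-script-checks",
    ]


# 33.10.0.4 is an assumed address for the second MongoDB node.
print(shlex.join(consul_agent_args("mongo_2", "33.10.0.4")))
```

The resulting list can be passed directly to `subprocess.run()` if the agent is launched from Python.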

Note: If we have multiple MongoDB instances running on the same host, the service definitions would change to reflect the different ports used by each instance, so that each MongoDB instance can be uniquely identified, discovered and monitored.

Consul on nodes running Django App

For the Django application, the Consul setup is very simple. We only need to monitor the port on which Gunicorn is listening for requests.

The Consul service definition would look like this:

{
    "service": {
        "name": "web",
        "port": 8000,
        "tags": [
            "web",
            "application",
            "urlprefix-/web"
        ],
        "check": {
            "id": "web_app_status",
            "name": "Web Application Status",
            "tcp": "localhost:8000",
            "interval": "30s",
            "timeout": "20s"
        }
    }
}
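The `tcp` check in this definition passes when a TCP connection to the given address can be established. A rough Python equivalent, written here only to illustrate the idea and not taken from Consul itself, looks like this:

```python
import socket


def tcp_check(host, port, timeout=20.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# Demonstrate against a throwaway local listener instead of a real Gunicorn.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
print(tcp_check("127.0.0.1", port, timeout=1.0))
listener.close()
```

For our Django app, the equivalent call would be `tcp_check("localhost", 8000)`.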

Once we have the Consul service definition for the Django app in place, we can run the Consul agent on the node where the Django app is running as a service. To run the Consul agent, we use the following command:

consul agent -bind 33.10.0.10 \
    -advertise 33.10.0.10 \
    -join consul_server \
    -node web_1 \
    -dns-port 53 \
    -data-dir /data \
    -config-dir /etc/consul.d \
    -enable-local-script-checks

Consul Server

We are running the Consul cluster with a dedicated Consul server node. The Consul server node can also host, discover and monitor services running on it, exactly as we did in the sections above for MongoDB and the Django app.

To run Consul in server mode and allow agents to connect to it, we use the following command on the node where we want to run our Consul server:

consul agent -server \
    -bind 33.10.0.2 \
    -advertise 33.10.0.2 \
    -node consul_server \
    -client 0.0.0.0 \
    -dns-port 53 \
    -data-dir /data \
    -ui -bootstrap

There are no services on our Consul server node for now, so there are no service definitions associated with this Consul agent configuration.

Fabio

We are relying on Fabio’s ability to configure itself automatically and its awareness of Consul.

This makes our task of load-balancing the traffic to our Django app instances very easy.

To allow Fabio to auto-detect services via Consul, one option is to add or update a tag in the service definition with a prefix and a service identifier, `urlprefix-/<service>`. Our Consul service definition for the Django app would now look like this:

{
    "service": {
        "name": "web",
        "port": 8000,
        "tags": [
            "web",
            "application",
            "urlprefix-/web"
        ],
        "check": {
            "id": "web_app_status",
            "name": "Web Application Status",
            "tcp": "localhost:8000",
            "interval": "30s",
            "timeout": "20s"
        }
    }
}

In our case, the Django app is the only service that needs load-balancing, so this change to the Consul service definition completes the Fabio setup.
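To make the tag convention concrete, the sketch below shows how route prefixes could be pulled out of a service's tag list. It is a simplified, hypothetical illustration of the idea and not Fabio's actual implementation.

```python
def fabio_routes(tags, prefix="urlprefix-"):
    """Extract route prefixes from Consul service tags that use the convention."""
    return [tag[len(prefix):] for tag in tags if tag.startswith(prefix)]


# Tags taken from the Django app's service definition.
print(fabio_routes(["web", "application", "urlprefix-/web"]))
```

Tags without the prefix, such as `web` and `application`, are ignored for routing purposes.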

Dockerization

Our whole app is going to be deployed as a set of Docker containers. Let’s talk about how we are achieving it in the context of Consul.

Dockerizing MongoDB Replica Set along with Consul Agent

We need to run a Consul agent, as described above, alongside MongoDB in the same Docker container, so we will need a custom ENTRYPOINT on the container that allows running two processes.

Note: This can also be achieved using Docker container-level checks in Consul. You are then free to run a Consul agent on the host and check the service running in the Docker container; the check essentially execs into the container to monitor the service.

To achieve this, we will use a tool similar to Foreman, a manager for Procfile-based applications that starts and supervises the processes declared in a Procfile.

To be precise, we will use Goreman, a Go port of Foreman. It takes its configuration in the form of a Heroku-style Procfile that lists the processes to keep alive on the host.

In our case, the Procfile looks like this:

# Mongo
mongo: /opt/mongo.sh

# Consul Client Agent
consul: /opt/consul.sh

# Consul Client Health Checks
consul_check: while true; do /opt/checks_toggler.sh && sleep 10; done

The `consul_check` entry at the end of the Procfile maintains the dynamism between the Primary and Secondary MongoDB node checks, based on which role each node is voted into within the MongoDB Replica Set.

The shell scripts that are executed by the respective keys on the Procfile are as defined previously in this discussion.
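A Procfile is simply a list of `name: command` lines, with `#` marking comments. The sketch below is a small hypothetical parser that illustrates the format; Goreman has its own parser, and this is not its code.

```python
def parse_procfile(text):
    """Parse Procfile text into a {process_name: command} dictionary."""
    processes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only; commands may contain colons too.
        name, _, command = line.partition(":")
        processes[name.strip()] = command.strip()
    return processes


procfile = """
# Mongo
mongo: /opt/mongo.sh

# Consul Client Agent
consul: /opt/consul.sh

# Consul Client Health Checks
consul_check: while true; do /opt/checks_toggler.sh && sleep 10; done
"""

for name, command in parse_procfile(procfile).items():
    print(name, "->", command)
```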

Our Dockerfile, with some additional tools for debugging and diagnostics, would look like:

FROM ubuntu:18.04

RUN apt-get update && \
    apt-get install -y \
    bash curl nano net-tools zip unzip \
    jq dnsutils iputils-ping

# Install MongoDB
RUN apt-get install -y mongodb

RUN mkdir -p /data/db
VOLUME /data/db

# Setup Consul and Goreman
ADD https://releases.hashicorp.com/consul/1.4.4/consul_1.4.4_linux_amd64.zip /tmp/consul.zip
RUN cd /bin && unzip /tmp/consul.zip && chmod +x /bin/consul && rm /tmp/consul.zip

ADD https://github.com/mattn/goreman/releases/download/v0.0.10/goreman_linux_amd64.zip /tmp/goreman.zip
RUN cd /bin && unzip /tmp/goreman.zip && chmod +x /bin/goreman && rm /tmp/goreman.zip

RUN mkdir -p /etc/consul.d/check_scripts
ADD ./config/mongod /etc

RUN mkdir -p /etc/checks
ADD ./config/checks /opt/checks

ADD checks_toggler.sh /opt
ADD mongo.sh /opt
ADD consul.sh /opt

ADD Procfile /root/Procfile

EXPOSE 27017

# Launch both MongoDB server and Consul
ENTRYPOINT [ "goreman" ]
CMD [ "-f", "/root/Procfile", "start" ]

Note: We have used a bare Ubuntu 18.04 image here for our purposes, but you can use the official MongoDB image and adapt it to run Consul alongside MongoDB, or even do Consul checks at the Docker container level as mentioned in the official documentation.

Dockerizing Django Web Application along with Consul Agent

We also need to run a Consul agent alongside our Django app in the same Docker container, as we did with the MongoDB container. The Procfile for this container looks like this:

Similarly, the Dockerfile for the Django web application follows the same pattern as the one for our MongoDB containers.

FROM python:3.7
RUN apt-get update && \
    apt-get install -y \
    bash curl nano net-tools zip unzip \
    jq dnsutils iputils-ping
# Python Environment Setup
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
# Setup Consul and Goreman
RUN mkdir -p /data/db /etc/consul.d
ADD https://releases.hashicorp.com/consul/1.4.4/consul_1.4.4_linux_amd64.zip /tmp/consul.zip
RUN cd /bin && unzip /tmp/consul.zip && chmod +x /bin/consul && rm /tmp/consul.zip
ADD https://github.com/mattn/goreman/releases/download/v0.0.10/goreman_linux_amd64.zip /tmp/goreman.zip
RUN cd /bin && unzip /tmp/goreman.zip && chmod +x /bin/goreman && rm /tmp/goreman.zip
ADD ./consul /etc/consul.d
ADD Procfile /root/Procfile
# Install pipenv
RUN pip3 install --upgrade pip
RUN pip3 install pipenv
# Setting workdir
ADD consul.sh /opt
ADD . /web
WORKDIR /web/tweeter
# Exposing appropriate ports
EXPOSE 8000/tcp
# Install dependencies
RUN pipenv install --system --deploy --ignore-pipfile
# Migrates the database, uploads staticfiles, run API server and background tasks
ENTRYPOINT [ "goreman" ]
CMD [ "-f", "/root/Procfile", "start" ]

Dockerizing Consul Server

We follow the same approach for the Consul server node and run it through a custom startup script. This is not a requirement, but it keeps the different Consul run files consistent.

Also, we are using the Ubuntu 18.04 image for the demonstration. You can just as well use Consul's official image, which accepts all the custom parameters mentioned here.

FROM ubuntu:18.04
RUN apt-get update && \
    apt-get install -y \
    bash curl nano net-tools zip unzip \
    jq dnsutils iputils-ping
ADD https://releases.hashicorp.com/consul/1.4.4/consul_1.4.4_linux_amd64.zip /tmp/consul.zip
RUN cd /bin && unzip /tmp/consul.zip && chmod +x /bin/consul && rm /tmp/consul.zip
# Consul ports
EXPOSE 8300 8301 8302 8400 8500
ADD consul_server.sh /opt
RUN mkdir -p /data
VOLUME /data
CMD ["/opt/consul_server.sh"]

Docker Compose

We use Docker Compose to run all our containers in a defined, repeatable way.

Our Compose file captures everything described above: the Consul server, the Fabio load balancer, the three MongoDB nodes, and the two Django instances, all attached to a single network with static IP addresses.

The Docker Compose file looks like this:

version: "3.6"
services:
  consul_server:
    build:
      context: consul_server
      dockerfile: Dockerfile
    image: consul_server
    ports:
      - 8300:8300
      - 8301:8301
      - 8302:8302
      - 8400:8400
      - 8500:8500
    environment:
      - NODE=consul_server
      - PRIVATE_IP_ADDRESS=33.10.0.2
    networks:
      consul_network:
        ipv4_address: 33.10.0.2
  load_balancer:
    image: fabiolb/fabio
    ports:
      - 9998:9998
      - 9999:9999
    command: -registry.consul.addr="33.10.0.2:8500"
    networks:
      consul_network:
        ipv4_address: 33.10.0.100
  mongo_1:
    build:
      context: mongo
      dockerfile: Dockerfile
    image: mongo_consul
    dns:
      - 127.0.0.1
      - 8.8.8.8
      - 8.8.4.4
    environment:
      - NODE=mongo_1
      - MONGO_PORT=27017
      - PRIMARY_MONGO=33.10.0.3
      - PRIVATE_IP_ADDRESS=33.10.0.3
    restart: always
    ports:
      - 27017:27017
      - 28017:28017
    depends_on:
      - consul_server
      - mongo_2
      - mongo_3
    networks:
      consul_network:
        ipv4_address: 33.10.0.3
  mongo_2:
    build:
      context: mongo
      dockerfile: Dockerfile
    image: mongo_consul
    dns:
      - 127.0.0.1
      - 8.8.8.8
      - 8.8.4.4
    environment:
      - NODE=mongo_2
      - MONGO_PORT=27017
      - PRIMARY_MONGO=33.10.0.3
      - PRIVATE_IP_ADDRESS=33.10.0.4
    restart: always
    ports:
      - 27018:27017
      - 28018:28017
    depends_on:
      - consul_server
    networks:
      consul_network:
        ipv4_address: 33.10.0.4
  mongo_3:
    build:
      context: mongo
      dockerfile: Dockerfile
    image: mongo_consul
    dns:
      - 127.0.0.1
      - 8.8.8.8
      - 8.8.4.4
    environment:
      - NODE=mongo_3
      - MONGO_PORT=27017
      - PRIMARY_MONGO=33.10.0.3
      - PRIVATE_IP_ADDRESS=33.10.0.5
    restart: always
    ports:
      - 27019:27017
      - 28019:28017
    depends_on:
      - consul_server
    networks:
      consul_network:
        ipv4_address: 33.10.0.5
  web_1:
    build:
      context: django
      dockerfile: Dockerfile
    image: web_consul
    ports:
      - 8080:8000
    environment:
      - NODE=web_1
      - PRIMARY=1
      - LOAD_BALANCER=33.10.0.100
      - PRIVATE_IP_ADDRESS=33.10.0.10
    dns:
      - 127.0.0.1
      - 8.8.8.8
      - 8.8.4.4
    depends_on:
      - consul_server
      - mongo_1
    volumes:
      - ./django:/web
    cap_add:
      - NET_ADMIN
    networks:
      consul_network:
        ipv4_address: 33.10.0.10
  web_2:
    build:
      context: django
      dockerfile: Dockerfile
    image: web_consul
    ports:
      - 8081:8000
    environment:
      - NODE=web_2
      - LOAD_BALANCER=33.10.0.100
      - PRIVATE_IP_ADDRESS=33.10.0.11
    dns:
      - 127.0.0.1
      - 8.8.8.8
      - 8.8.4.4
    depends_on:
      - consul_server
      - mongo_1
    volumes:
      - ./django:/web
    cap_add:
      - NET_ADMIN
    networks:
      consul_network:
        ipv4_address: 33.10.0.11
networks:
  consul_network:
    driver: bridge
    ipam:
      config:
        - subnet: 33.10.0.0/16
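
Because every service is pinned to a static `ipv4_address`, a quick sanity check that each address falls inside the network's subnet can catch typos before the containers start. The sketch below is only an illustration: the addresses and subnet are copied by hand from the Compose file above, and the helper name `outside_subnet` is hypothetical.

```python
import ipaddress
from typing import Dict, List

# Subnet and static addresses copied by hand from the Compose file.
SUBNET = ipaddress.ip_network("33.10.0.0/16")
STATIC_ADDRESSES = {
    "consul_server": "33.10.0.2",
    "load_balancer": "33.10.0.100",
    "mongo_1": "33.10.0.3",
    "mongo_2": "33.10.0.4",
    "mongo_3": "33.10.0.5",
    "web_1": "33.10.0.10",
    "web_2": "33.10.0.11",
}


def outside_subnet(addresses: Dict[str, str], subnet=SUBNET) -> List[str]:
    """Return the names of services whose address is not in the subnet."""
    return sorted(
        name
        for name, address in addresses.items()
        if ipaddress.ip_address(address) not in subnet
    )


if __name__ == "__main__":
    # An empty list means every static address fits the subnet.
    print(outside_subnet(STATIC_ADDRESSES))
```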

That brings us to the end of the whole environment setup. We can now run Docker Compose to build and run the containers.

Service Discovery using Consul

When all the services are up and running, the Consul Web UI gives us a quick overview of our overall setup.

Consul Web UI showing the set of services we are running and their current state

The Django app can discover the MongoDB service through Consul's DNS interface.

root@82857c424b15:/web/tweeter# dig @127.0.0.1 mongo-primary.service.consul
; <<>> DiG 9.10.3-P4-Debian <<>> @127.0.0.1 mongo-primary.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8369
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;mongo-primary.service.consul.	IN	A
;; ANSWER SECTION:
mongo-primary.service.consul. 0	IN	A	33.10.0.3
;; ADDITIONAL SECTION:
mongo-primary.service.consul. 0	IN	TXT	"consul-network-segment="
;; Query time: 139 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Apr 01 11:50:45 UTC 2019
;; MSG SIZE  rcvd: 109

The Django app can now connect to the MongoDB Primary instance and start writing data to it.
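
As a minimal sketch of how the app side might consume this DNS record, the snippet below resolves the service name and builds a MongoDB connection URI. The function name `mongo_uri` and its injectable `resolve` parameter are illustrative assumptions, not code from the demo project. The default resolver only works where DNS queries reach the local Consul agent, as the `dns` entries in the Compose file arrange for the containers.

```python
import socket
from typing import Callable

# Service name taken from the dig query above.
MONGO_PRIMARY_DNS = "mongo-primary.service.consul"


def mongo_uri(
    port: int = 27017,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> str:
    """Resolve the MongoDB Primary through DNS and return a connection URI."""
    host = resolve(MONGO_PRIMARY_DNS)
    return f"mongodb://{host}:{port}/"


if __name__ == "__main__":
    # Stub resolver so the example runs without a Consul agent.
    print(mongo_uri(resolve=lambda name: "33.10.0.3"))
```

Passing the resolver in as a parameter keeps the function easy to exercise outside the Consul network.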

We can use the Fabio load-balancer to reach a Django app instance. Fabio auto-discovers the instances through the Consul registry using specialized service tags, and the app renders a page with the database connection information we have been discussing.

Our load-balancer is sitting at ‘33.10.0.100’ and ‘/web’ is configured to be routed to one of our Django application instances running behind the load-balancer.

Fabio auto-detecting the Django Web Application end-points

As you can see from the auto-detection and configuration of Fabio load-balancer from its UI above, it has weighted the Django Web Application end-points equally. This will help balance the request or traffic load on the Django application instances.

When we visit our Fabio URL ‘33.10.0.100:9999’ and use the source route ‘/web’, we are routed to one of the Django instances. So, visiting ‘33.10.0.100:9999/web’ gives us the following output.

Django Web Application renders the MongoDB connection status on the home page

We can restrict Fabio to load-balance only the Django app instances by adding the required tags only to the Consul service definitions of the Django app services.
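
To make the tag-based selection concrete: the Django service definition shown later in this guide carries the tag `urlprefix-/web`, while the MongoDB services carry no such tag. The sketch below illustrates the filtering idea only; it is not Fabio's actual implementation, and the helper name `routes_from_tags` is hypothetical.

```python
from typing import Iterable, List

# Tag prefix used to announce a route, as in "urlprefix-/web".
URLPREFIX = "urlprefix-"


def routes_from_tags(tags: Iterable[str]) -> List[str]:
    """Return the route prefixes announced through urlprefix- tags."""
    return [tag[len(URLPREFIX):] for tag in tags if tag.startswith(URLPREFIX)]


if __name__ == "__main__":
    # Tags from the Django service definition: one route is announced.
    print(routes_from_tags(["web", "application", "urlprefix-/web"]))
    # A service without the tag (example tags are made up) announces none.
    print(routes_from_tags(["mongo", "primary"]))
```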

Discovering the MongoDB Primary instance this way lets the Django app run its database migrations and complete its deployment.

One can explore Consul Web UI to see all the instances of Django web application services.

Django Web Application services as seen on Consul’s Web UI

Similarly, see how MongoDB Replica Set instances are laid out.

MongoDB Replica Set Primary service as seen on Consul’s Web UI

MongoDB Replica Set Secondary services as seen on Consul’s Web UI

Let’s see how Consul helps with health-checking services and discovering only the alive services.

We will stop the current MongoDB Replica Set Primary (‘mongo_2’) container, to see what happens.

MongoDB Primary service being swapped with one of the MongoDB Secondary instances

MongoDB Secondary instance set is now left with only one service instance

Consul has started failing the health check for the previous MongoDB Primary service. The MongoDB Replica Set has also detected that the node is down and that a new Primary needs to be elected, which automatically gives us a new MongoDB Primary (‘mongo_3’).

Our checks toggler has kicked in and swapped the check on ‘mongo_3’ from the MongoDB Secondary check to the MongoDB Primary check.

When we take a look at the view from the Django app, we see it is now connected to a new MongoDB Primary service (‘mongo_3’).

Switching of the MongoDB Primary is also reflected in the Django Web Application

Let’s see how this plays out when we bring back the stopped MongoDB instance.

Failing MongoDB Primary service instance is now cleared out from service instances as it is now healthy MongoDB Secondary service instance

Previously failed MongoDB Primary service instance is now re-adopted as MongoDB Secondary service instance as it has become healthy again

Similarly, if we stop one of the Django application service instances, Fabio detects only the remaining healthy instance and routes traffic only to it.

Fabio is able to auto-configure itself using Consul’s service registry and detecting alive service instances

This is how one can use Consul’s service discovery capability to discover, monitor and health-check services.
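
The "only healthy instances" behaviour can be sketched in a few lines. The function below filters entries shaped like the responses of Consul's `/v1/health/service/<name>` HTTP endpoint and keeps the instances whose checks all pass. The sample entries are made-up illustrations, not output captured from this setup, and the helper name `healthy_endpoints` is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def healthy_endpoints(entries: List[Dict[str, Any]]) -> List[Tuple[str, int]]:
    """Keep (address, port) pairs for instances whose checks all pass."""
    endpoints = []
    for entry in entries:
        checks = entry.get("Checks", [])
        if all(check.get("Status") == "passing" for check in checks):
            service = entry["Service"]
            # Fall back to the node address when the service has none set.
            address = service.get("Address") or entry["Node"]["Address"]
            endpoints.append((address, service["Port"]))
    return endpoints


# Made-up sample: one passing and one failing Django instance.
SAMPLE = [
    {
        "Node": {"Address": "33.10.0.10"},
        "Service": {"Address": "", "Port": 8000},
        "Checks": [{"Status": "passing"}],
    },
    {
        "Node": {"Address": "33.10.0.11"},
        "Service": {"Address": "", "Port": 8000},
        "Checks": [{"Status": "critical"}],
    },
]

if __name__ == "__main__":
    print(healthy_endpoints(SAMPLE))
```

In practice the same endpoint can filter on the server side, so a client rarely needs to do this by hand; the sketch just shows what the filter amounts to.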

Service Configuration using Consul

Currently, we configure the Django application instances in one of two ways: through environment variables that Docker Compose sets within the containers and that the Django project settings read, or by hard-coding the configuration parameters directly.

We can use Consul’s Key/Value store to share configuration across both instances of the Django app.

We can use Consul’s HTTP interface to store key/value pairs and retrieve them within the app using the open-source Python client for Consul, called python-consul. You may also use any other Python library that can interact with Consul’s KV store.

Let’s begin by looking at how we can set a key/value pair in Consul using its HTTP interface.

# Flag to run Django app in debug mode
curl -X PUT -d 'True' consul_server:8500/v1/kv/web/debug

# Dynamic entries into Django app configuration 
# to denote allowed set of hosts
curl -X PUT -d 'localhost, 33.10.0.100' consul_server:8500/v1/kv/web/allowed_hosts

# Dynamic entries into Django app configuration
# to denote installed apps
curl -X PUT -d 'tweetapp' consul_server:8500/v1/kv/web/installed_apps
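
For readers who prefer to script this from Python rather than curl, here is a rough standard-library equivalent of the first command. It only builds the request; the helper name `kv_put_request` is hypothetical, and the `http://` scheme is an assumption since the curl commands above leave it implicit.

```python
import urllib.request


def kv_put_request(base_url: str, key: str, value: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request for Consul's /v1/kv endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/v1/kv/{key}",
        data=value.encode("utf-8"),
        method="PUT",
    )


request = kv_put_request("http://consul_server:8500", "web/debug", "True")
# urllib.request.urlopen(request) would send it; that call is left out so
# the sketch runs without a Consul server.
```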

Once the values are in the KV store, the Django app instances can read them and configure themselves accordingly.

Let’s install python-consul and add it as a project dependency.

$ pipenv shell
Launching subshell in virtual environment…
 . /home/pranav/.local/share/virtualenvs/tweeter-PYSn2zRU/bin/activate

$  . /home/pranav/.local/share/virtualenvs/tweeter-PYSn2zRU/bin/activate

(tweeter) $ pipenv install python-consul
Installing python-consul…
Adding python-consul to Pipfile's [packages]…
✔ Installation Succeeded 
Locking [dev-packages] dependencies…
Locking [packages] dependencies…
✔ Success! 
Updated Pipfile.lock (9590cc)!
Installing dependencies from Pipfile.lock (9590cc)…
  🐍   ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 14/14 — 00:00:20

We will need to connect our app to Consul using python-consul.

import consul

consul_client = consul.Consul(
    host='consul_server',
    port=8500,
)

We can then read the values with the ‘python-consul’ library and configure our Django app accordingly.

# Set DEBUG flag using Consul KV store.
# python-consul returns values as bytes, so decode before comparing;
# the raw bytes b'False' would otherwise be truthy.
index, data = consul_client.kv.get('web/debug')
DEBUG = data['Value'].decode('utf-8') == 'True'

# Set ALLOWED_HOSTS dynamically using Consul KV store.
# The stored value is a comma-separated string such as 'localhost, 33.10.0.100'.
index, hosts = consul_client.kv.get('web/allowed_hosts')
ALLOWED_HOSTS = [
    host.strip() for host in hosts['Value'].decode('utf-8').split(',')
]

# Set INSTALLED_APPS dynamically using Consul KV store
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]

index, apps = consul_client.kv.get('web/installed_apps')
INSTALLED_APPS.append(apps['Value'].decode('utf-8'))
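
One thing to watch for when reading settings this way: `kv.get` hands back `None` instead of an entry when the key does not exist, and the stored value arrives as bytes. A small set of helpers can absorb both cases. The sketch below is our own illustration (the names `kv_text`, `kv_bool` and `kv_list` are hypothetical, not part of python-consul) and takes the entry returned by `kv.get` as its input.

```python
from typing import Any, Dict, Iterable, List, Optional

Entry = Optional[Dict[str, Any]]


def kv_text(entry: Entry, default: str = "") -> str:
    """Decode an entry's 'Value' bytes, or return default if it is missing."""
    if entry is None or entry.get("Value") is None:
        return default
    return entry["Value"].decode("utf-8")


def kv_bool(entry: Entry, default: bool = False) -> bool:
    """Interpret values such as 'True', 'true' or '1' as True."""
    return kv_text(entry, str(default)).strip().lower() in {"1", "true", "yes"}


def kv_list(entry: Entry, default: Iterable[str] = ()) -> List[str]:
    """Split a comma-separated value into a list of trimmed items."""
    items = [item.strip() for item in kv_text(entry).split(",") if item.strip()]
    return items or list(default)


if __name__ == "__main__":
    print(kv_bool({"Value": b"False"}))                     # stored flag
    print(kv_list({"Value": b"localhost, 33.10.0.100"}))    # stored hosts
    print(kv_list(None, default=["localhost"]))             # missing key
```

With helpers like these, a settings line could read `DEBUG = kv_bool(data)` and fall back to a safe default when the key has not been set yet.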

These key/value pairs from Consul’s KV store can also be viewed and updated from its Web UI.

Consul KV store as seen on Consul Web UI with Django app configuration parameters

The code used as part of this guide for Consul’s service configuration section is available on ‘service-configuration’ branch of pranavcode/consul-demo project.

That is how one can use Consul’s KV store and configure individual services in their architecture with ease.

Service Segmentation using Consul

As part of Consul’s Service Segmentation we are going to look at Consul Connect intentions and data center distribution.

Connect provides service-to-service connection authorization and encryption using mutual TLS.

To use Connect, you need to enable it in the server configuration. Connect needs to be enabled across the Consul cluster for it to function properly.

{
    "connect": {
        "enabled": true
    }
}

In our context, to have the communication identified and secured with TLS, we define an upstream sidecar service with a proxy on the Django app for its communication with the MongoDB Primary instance.

{
    "service": {
        "name": "web",
        "port": 8000,
        "tags": [
            "web",
            "application",
            "urlprefix-/web"
        ],
        "connect": {
            "sidecar_service": {
                "proxy": {
                    "upstreams": [{
                        "destination_name": "mongo-primary",
                        "local_bind_port": 5501
                    }]
                }
            }
        },
        "check": {
            "id": "web_app_status",
            "name": "Web Application Status",
            "tcp": "localhost:8000",
            "interval": "30s",
            "timeout": "20s"
        }
    }
}
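
With an upstream like this in place, the application talks to a local port and the sidecar proxy carries the traffic to the destination service. As a small sketch of what that means for the app, the snippet below reads a trimmed copy of the service definition and derives the local address for each upstream. The helper name `upstream_addresses` is hypothetical, and binding on `127.0.0.1` is an assumption about the proxy's default bind address.

```python
import json
from typing import Dict

# Trimmed copy of the service definition above (tags and check omitted).
SERVICE_DEFINITION = """
{
    "service": {
        "name": "web",
        "port": 8000,
        "connect": {
            "sidecar_service": {
                "proxy": {
                    "upstreams": [{
                        "destination_name": "mongo-primary",
                        "local_bind_port": 5501
                    }]
                }
            }
        }
    }
}
"""


def upstream_addresses(definition: str) -> Dict[str, str]:
    """Map each upstream destination to the local address the app should use."""
    service = json.loads(definition)["service"]
    upstreams = service["connect"]["sidecar_service"]["proxy"]["upstreams"]
    return {
        upstream["destination_name"]: f"127.0.0.1:{upstream['local_bind_port']}"
        for upstream in upstreams
    }


if __name__ == "__main__":
    print(upstream_addresses(SERVICE_DEFINITION))
```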

Along with the Connect configuration of the sidecar proxy, we also need to run the Connect proxy for the Django app. This can be done by running the following command.

$ consul connect proxy -sidecar-for web

We can add Consul Connect intentions to create a service graph across all the services and define traffic patterns.

Intentions for service graph can also be managed from Consul Web UI.

Define access control for services via Connect and service connection restrictions

This defines the service connection restrictions to allow or deny them to talk via Connect.
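
For completeness, here is a hedged sketch of creating an intention programmatically. It assumes the `/v1/connect/intentions` endpoint of the Consul 1.4 HTTP API; newer Consul releases manage intentions differently, so check the documentation for your version. The "allow web to reach mongo-primary" rule is our illustrative assumption rather than a rule taken from the demo project, and the helper name `intention_request` is hypothetical. The request is built but not sent.

```python
import json
import urllib.request


def intention_request(
    base_url: str, source: str, destination: str, action: str
) -> urllib.request.Request:
    """Build (but do not send) a request that creates a Connect intention."""
    if action not in ("allow", "deny"):
        raise ValueError("action must be 'allow' or 'deny'")
    payload = {
        "SourceName": source,
        "DestinationName": destination,
        "SourceType": "consul",
        "Action": action,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/connect/intentions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Illustrative rule: let the Django service reach the MongoDB Primary.
request = intention_request(
    "http://consul_server:8500", "web", "mongo-primary", "allow"
)
```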

We have also configured the Consul agents to declare which datacenter they belong to, so that they are reachable through one or more Consul servers in that datacenter.

The code used as part of this guide for Consul’s service segmentation section is available on ‘service-segmentation’ branch of velotiotech/consul-demo project.

That is how one can use Consul’s service segmentation feature and configure service level connection access control.

Conclusion

The ability to seamlessly control the service mesh that Consul provides makes an operator's life much easier. We hope you have learned how Consul can be used for service discovery, configuration, and segmentation through this practical implementation.

As usual, we hope it was an informative ride on the journey of Consul. This was the final piece of this two-part series. This part tries to cover most aspects of the Consul architecture and how it fits into your current project. In case you missed the first part, find it here.

We will continue our endeavors with different technologies and bring you the most valuable information we possibly can in every interaction. Let us know what you would like to hear more about from us. If you have any questions around the topic, we will be more than happy to answer them.

November 19, 2025

Creating a Smarter, Safer World: Developing a VoIP-Enabled Audio/Video Call Mobile App for Smart Buildings
In today’s fast-evolving world, technology has redefined how we interact with our environments. From smart homes to automated workplaces, the demand for integrated systems that offer safety, convenience, and enhanced user experience is higher than ever. As part of our ongoing mission to design and build next-gen products, we recently took on a fascinating project: developing a mobile app (Android and iOS) for audio and video calls that works seamlessly with voice entry panels supporting VoIP (Voice over Internet Protocol).

The Client’s Vision: Safety, Accessibility and Innovation

The client approached us and requested an app that could serve as a comprehensive communication hub between residents, visitors, and building security systems. They wanted a mobile application that would:
- Enable smooth audio and video communication between users and VoIP-enabled voice entry panels.
- Improve building security by allowing residents to verify visitors before granting access.
- Increase convenience by integrating this communication into a single, easy-to-use app.
- Allow the mobile device to provide keyless entry.
- Be available on both iOS and Android devices.

In essence, the app would become the digital key to a smarter, safer living space.

Why VoIP? The Backbone of the Future

Voice over Internet Protocol (VoIP) has revolutionized modern communication by allowing voice and multimedia sessions over the internet. Its use in smart buildings, particularly in voice entry panels, provides several key benefits:
- Real-time communication: VoIP ensures fast, reliable, real-time communication between users and security systems.
- Cost-effectiveness: Unlike traditional landlines, VoIP uses existing internet infrastructure, lowering costs and providing more flexibility for building owners.
- Scalability: VoIP systems can easily be scaled up to accommodate new users and features as a building’s needs evolve.
By leveraging VoIP, we were able to design an app that offers superior connectivity, reliability and security.

Key Features of Our Audio/Video Call App

Our team focused on creating a solution that would offer a seamless user experience while enhancing security and control over building access. Below are the key features of the app we developed:
- Two-Way Audio/Video Communication
The app allows residents to receive calls from visitors at building entry panels, enabling real-time audio and video communication. This allows users to verify visitors’ identities visually and audibly, adding an extra layer of security.
- Remote Door Access
After verifying a visitor, the user can grant or deny access to the building via the app. This feature is particularly useful for those who may not be home but still need to authorize entry, such as in the case of package deliveries or housekeepers.
- VoIP-Based Calling
Since the app is built around VoIP technology, users benefit from high-quality calls that bypass traditional phone networks. The app integrates with voice entry panels that support VoIP, ensuring clear communication even over long distances.
- BLE Functionality: Turning Your Mobile Device into a Key
One of this app’s most groundbreaking features is the integration of Bluetooth Low Energy (BLE) technology, which allows the mobile device to function as a digital key. BLE is known for its ability to facilitate short-range communication with minimal power consumption, making it an ideal solution for mobile access control.

Instead of using traditional keycards, fobs, or physical keys, employees and residents can use their smartphones to open gates and doors. BLE communicates with the gate system within proximity, enabling automatic or one-tap entry. This improves convenience and reduces the need for physical keys, which can be lost or stolen.
- Cloud-Based Validation: Security in the Digital Age
To ensure that only authorized individuals can gain access, the app integrates with highly secure cloud servers for communication and validation. Every time a user attempts to open a gate or door using the app, the system verifies their credentials in real-time through cloud communication. This means that building managers or security teams can update or revoke access permissions instantly, without needing to reissue physical keys or cards.

Cloud-based validation offers multiple security layers, including encryption and authentication, ensuring that access control is both flexible and safe. It also allows for seamless monitoring and reporting, as every access attempt is logged in the system, providing valuable data for building management and security audits.

Cloud connectivity also lets users manage communication logs, call history, and security footage from anywhere in the world, which is critical for building managers who want a real-time overview of all entry points.
- Streamlined Access for Employees and Residents
This app is designed with both employees and residents in mind. In office settings, employees can easily move through secured areas, using the app to access parking garages, secure floors, and meeting rooms. For residents of gated communities or apartment buildings, the app provides a hassle-free way to enter gates and shared spaces like gyms or pools without fumbling with keys or keycards, while still verifying visitors’ identities visually and audibly.
- Customizable Alerts and Notifications
Users can set up custom notifications for different types of events, such as when a delivery arrives or a family member enters the building. These alerts ensure that users are always aware of what’s happening around their living space.
- Scalability
This app can be scaled for use in various environments, from small residential buildings to large corporate complexes.

Enhancing the Future of Smart Buildings

This app’s development is part of a broader trend towards creating intelligent living environments where technology and security work hand-in-hand. Smart buildings equipped with VoIP-enabled voice entry panels, paired with mobile applications like ours, not only enhance both security and user experience but also offer a glimpse into the future: one where buildings are not just physical spaces, but adaptive, interactive ecosystems designed around the needs of their occupants.

We’re proud to contribute to this vision of safer, smarter, and more livable buildings. This project is a step forward in reshaping how residents interact with their living spaces, making life more convenient while ensuring peace of mind.

The Road Ahead

As we continue to refine and improve our app, we are excited about the potential to incorporate even more innovative features, such as early media access, AI-driven facial recognition, and advanced analytics for building managers, as well as further integrations with home automation systems such as emergency lighting, apartment intercoms, and fire detection and alarm systems, including smoke, heat, and carbon monoxide detectors.

Our commitment remains strong: to create digital solutions which empower our clients and make the world a better, safer place to live in.

Stay tuned for more updates as we continue to push the boundaries of what smart building technology can achieve!

November 19, 2025
The Data Lake Revolution: Unleashing the Power of Delta Lake
Once upon a time, in the vast and ever-expanding world of data storage and processing, a new hero emerged. Its name? Delta Lake. This unsung champion was about to revolutionize the way organizations handled their data, and its journey was nothing short of remarkable.

The Need for a Data Savior

In this world, data was king, and it resided in various formats within the mystical realm of data lakes. Two popular formats, Parquet and Hive, had served their purposes well, but they harbored limitations that often left data warriors frustrated.

Enterprises faced a conundrum: they needed to make changes, updates, or even deletions to individual records within these data lakes. But it wasn’t as simple as it sounded. Modifying schemas was a perilous endeavor that could potentially disrupt the entire data kingdom.

Why? Because these traditional table formats lacked a vital attribute: ACID transactions. Without these safeguards, every change was a leap of faith.

The Rise of Delta Lake

Amidst this data turmoil, a new contender emerged: Delta Lake. It was more than just a format; it was a game-changer.

Delta Lake brought with it the power of ACID transactions. Every data operation within the kingdom was now imbued with atomicity, consistency, isolation, and durability. It was as if Delta Lake had handed data warriors an enchanted sword, making them invincible in the face of chaos.

But that was just the beginning of Delta Lake’s enchantment.

The Secrets of Delta Lake

Delta Lake was no ordinary table format; it was a storage layer that transcended the limits of imagination. It integrated seamlessly with Spark APIs, offering features that left data sorcerers in awe.
- Time Travel: Delta Lake allowed users to peer into the past, accessing previous versions of data. The transaction log became a portal to different eras of data history.
- Schema Evolution: It had the power to validate and evolve schemas as data changed. A shapeshifter of sorts, it embraced change effortlessly.
- Change Data Feed: With this feature, it tracked data changes at the granular level. Data sorcerers could now decipher the intricate dance of inserts, updates, and deletions.
- Data Skipping with Z-ordering: Delta Lake mastered the art of optimizing data retrieval. It skipped irrelevant files, ensuring that data requests were as swift as a summer breeze.
- DML Operations: It wielded the power of SQL-like data manipulation language (DML) operations. Updates, deletes, and merges were but a wave of its hand.
Delta Lake’s Allies

Delta Lake didn’t stand alone; it forged alliances with various data processing tools and platforms. Apache Spark, Apache Flink, Presto, Trino, Hive, DBT, and many others joined its cause. They formed a coalition to champion the cause of efficient data processing.

In the vast landscape of data management, Delta Lake stands as a beacon of innovation, offering a plethora of features that elevate your data handling capabilities to new heights. In this exhilarating adventure, we’ll explore the key features of Delta Lake and how they triumph over the limitations of traditional file formats, all while embracing the ACID properties.

ACID Properties: A Solid Foundation

In the realm of data, ACID isn’t just a chemical term; it’s a set of properties that ensure the reliability and integrity of your data operations. Let’s break down how Delta Lake excels in this regard.

A for Atomicity: All or Nothing

Imagine a tightrope walker teetering in the middle of their performance: either they make it to the other side, or they don’t. Atomicity operates on the same principle: either all changes happen, or none at all. With plain Spark writes, this principle often takes a tumble. When an overwrite fails midway, the old data may already have been removed while the new data is lost in the abyss. Delta Lake, however, comes to the rescue. It creates a transaction log, recording all changes made along with their versions. In case of a failure, data loss is averted, and your system remains consistent.

C for Consistency: The Guardians of Validity

Consistency is the gatekeeper of data validity. It ensures that your data remains rock-solid and valid at all times. Spark sometimes falters here. Picture this: your Spark job fails, leaving your system with invalid data remnants. Consistency crumbles. Delta Lake, on the other hand, is your data’s staunch guardian. With its transaction log, it guarantees that even in the face of job failure, data integrity is preserved.

I for Isolation: Transactions in Solitude

Isolation is akin to individual bubbles, where multiple transactions occur in isolation, without interfering with one another. Spark might struggle with this concept. If two Spark jobs manipulate the same dataset concurrently, chaos can ensue. One job overwrites the dataset while the other is still using it—no isolation, no guarantees. Delta Lake, however, introduces order into the chaos. Through its versioning system and transaction log, it ensures that transactions proceed in isolation, mitigating conflicts and ensuring the data’s integrity.
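One common way to get this kind of isolation is optimistic concurrency: each writer remembers the table version it read and may only commit directly on top of that version. The sketch below is a toy, in-memory model of the idea in plain Python, not Delta Lake’s actual implementation; `ToyLog`, `CommitConflict`, and the file names are hypothetical.

```python
class CommitConflict(Exception):
    """Raised when another writer has already committed a newer version."""


class ToyLog:
    def __init__(self):
        self.commits = []  # list index == version number

    def latest_version(self):
        return len(self.commits) - 1  # -1 means the log is empty

    def commit(self, read_version, actions):
        # A writer may only commit on top of the version it read.
        if read_version != self.latest_version():
            raise CommitConflict(
                f"read version {read_version}, log is at {self.latest_version()}"
            )
        self.commits.append(list(actions))
        return self.latest_version()


log = ToyLog()
v0 = log.commit(-1, ["add part-0.parquet"])

# Two writers both read version 0 and prepare their changes independently.
writer_a_read = writer_b_read = log.latest_version()
v1 = log.commit(writer_a_read, ["add part-1.parquet"])

try:
    log.commit(writer_b_read, ["remove part-0.parquet"])
    conflict = False
except CommitConflict:
    conflict = True  # writer B must re-read the table before trying again
```

In this toy model the second writer simply fails; a real system would typically re-read the latest version and retry, or fail only when the two changes genuinely conflict.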

D for Durability: Unyielding in the Face of Failure

Durability means that once changes are made, they are etched in stone, impervious to system failures. Spark’s Achilles’ heel lies in its vulnerability to data loss during job failures. Delta Lake, however, boasts a different tale. It secures your data with unwavering determination. Every change is logged, and even in the event of job failure, data remains intact—a testament to true durability.

Time Travel: Rewriting the Past

Now, let’s embark on a fascinating journey through time. Delta Lake introduces a feature that can only be described as “time travel.” With this feature, you can revisit previous versions of your data, just like rewinding a movie. All of this magical history is stored in the transaction log, encapsulated within the mystical “_delta_log” folder. When you write data to a Delta table, it’s not just the present that’s captured; the past versions are meticulously preserved, waiting for your beck and call.
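To make the idea of a log-backed history concrete, here is a minimal sketch in plain Python. It assumes a simplified log in which every commit is a list of "add" and "remove" file actions; the `snapshot` helper and the file names are hypothetical and only illustrate how replaying commits up to a version yields the table as of that version.

```python
def snapshot(log, version):
    """Return the set of live data files as of `version` by replaying commits."""
    files = set()
    for actions in log[: version + 1]:
        for op, path in actions:
            if op == "add":
                files.add(path)
            elif op == "remove":
                files.discard(path)
    return files


# Each list entry is one commit; the list index is the version number.
log = [
    [("add", "part-0.parquet")],                                # version 0
    [("add", "part-1.parquet")],                                # version 1
    [("remove", "part-0.parquet"), ("add", "part-2.parquet")],  # version 2
]

v0_files = snapshot(log, 0)
latest_files = snapshot(log, len(log) - 1)
```

Because old data files are only marked as removed in the log rather than erased from history, an earlier version can still be reconstructed by stopping the replay early.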

In conclusion, Delta Lake emerges as the hero of the data world, rewriting the rules of traditional file formats and conquering the challenges of the ACID properties. With its robust transaction log, versioning system, and the ability to traverse time, Delta Lake opens up a new dimension in data management. So, if you’re on a quest for data reliability, integrity, and a touch of magic, Delta Lake is your trusted guide through this thrilling journey beyond convention.

More Features of Delta Lake Are:
- UPSERT
- Schema Evolution
- Change Data Feed
- Data Skipping with Z-ordering
- DML Operations
The Quest for Delta Lake

Setting up Delta Lake was like embarking on a quest. Data adventurers ventured into the cloud (AWS, GCP, or Azure) or even their local domains. They armed themselves with the delta-spark spells and summoned the JARs of delta-core, delta-contribs, and delta-storage, tailored to their Spark versions.

Requirements:
- Python
- Delta-spark
- Delta jars
You can configure the package name in the Spark session so that it is downloaded at run time. I am using Spark version 3.3. We will need delta-core, delta-contribs, and delta-storage. You can download them from here: https://github.com/delta-io/delta/releases/

To use and configure various cloud storage options, there are separate .jars you can use: https://docs.delta.io/latest/delta-storage.html. Here, you can find .jars for AWS, GCS, and Azure to configure and use their storage services.

Run this command to install delta-spark first:

pip install delta-spark

(If you are using Dataproc or EMR, you can install this as a startup action while creating the cluster. If you are using a serverless environment like Glue or Dataproc batches, you can build a Docker image or pass the .whl file for this package.)

The same applies to the .jar files. If the environment is serverless, download the .jar, store it in cloud storage such as S3 or GCS, and use that path while running the job. If it is a cluster like Dataproc or EMR, you can download the .jar onto the cluster.

You can also download these .jars at run time while creating the Spark session.

Now, create the Spark session, and you are ready to play with Delta tables.

Environment Setup

How do you add the Delta Lake dependencies to your environment?
1. You can directly add them while initializing the Spark session for Delta Lake by passing the specific version, and these packages or dependencies will be downloaded during run time.
2. You can place the required .jar files in your cluster and provide the reference while initializing the Spark session.
3. You can download the .jar files and store them in cloud storage, and you can pass them as a run time argument if you don’t want to download the dependencies on your cluster.
```
# Initialize Spark Session
import pyspark
from delta import *
from pyspark.sql.types import *
from delta.tables import *
from pyspark.sql.functions import *

# Option 1: download the Delta package at run time
builder = pyspark.sql.SparkSession.builder.appName("My App") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
    .config("spark.jars.packages", "io.delta:delta-core_2.12:2.2.0")

# Option 2: or, if the jar is already there, use this builder instead
builder = pyspark.sql.SparkSession.builder.appName("My App") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")

spark = builder.getOrCreate()
```
You have to add the following properties to use Delta in Spark:
- spark.sql.extensions
- spark.sql.catalog.spark_catalog
You can see these values in the above code snippet. If you want to read and write data in cloud storage such as S3, GCS, or Blob storage, you have to set some more configs in the Spark session as well. Here, I am providing examples for AWS and GCS only.

The next thing that will come to your mind: how will you be able to read or write the data into cloud storage?

For different cloud storages, there are certain .jar files available that are used to connect and to do IO operations on the cloud storage. See the examples below.

You can use the above approach to make these .jar files available to Spark sessions, either by downloading them at run time or by storing them on the cluster itself.

AWS

from pyspark.sql import SparkSession

spark_jars_packages = "com.amazonaws:aws-java-sdk:1.12.246,org.apache.hadoop:hadoop-aws:3.2.2,io.delta:delta-core_2.12:2.2.0"

builder = SparkSession.builder.appName("delta") \
    .config("spark.jars.packages", spark_jars_packages) \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
    .config("spark.hadoop.fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider") \
    .config("spark.hadoop.fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem") \
    .config("spark.hadoop.fs.AbstractFileSystem.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem") \
    .config("spark.delta.logStore.class", "org.apache.spark.sql.delta.storage.S3SingleDriverLogStore")

spark = builder.getOrCreate()

GCS

# project_id and credential (the parsed service account key JSON) must be defined beforehand
spark_session = SparkSession.builder.appName("delta").getOrCreate()

spark_session.conf.set("fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
spark_session.conf.set("spark.hadoop.fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
spark_session.conf.set("fs.gs.auth.service.account.enable", "true")
spark_session.conf.set("fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
spark_session.conf.set("fs.gs.project.id", project_id)
spark_session.conf.set("fs.gs.auth.service.account.email", credential["client_email"])
spark_session.conf.set("fs.gs.auth.service.account.private.key.id", credential["private_key_id"])
spark_session.conf.set("fs.gs.auth.service.account.private.key", credential["private_key"])

Write into Delta Tables: In the following example, we use only the local file system for writing data into and reading data from Delta Lake tables.

Data Set Used: https://media.githubusercontent.com/media/datablist/sample-csv-files/main/files/organizations/organizations-100.zip

For reference, I have downloaded this file to my local machine and unzipped the data:
```
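df = spark.read.option("header", "true").csv("organizations-100.csv")
# Assumption: partition_keys is a list of column names; "country" is only an illustrative choice
partition_keys = ["country"]
df.write.mode('overwrite').format("delta").partitionBy(partition_keys).save("./Documents/DE/Delta/test-db/organisatuons")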
```
The two save modes you will use most often when writing data into Delta tables from any source are Append and Overwrite.

For now, we have enabled the Delta catalog to store all metadata-related information. We can also use the Hive metastore to store the metadata and to run SQL queries directly over the Delta tables. You can use a cloud storage path as well.

Read data from the Delta tables:
```
delta_df = spark.read.format("delta").load("./Documents/DE/Delta/test-db/organisatuons")
delta_df.show()
```

After writing the data into the Delta table, the table folder contains a _delta_log directory with a log file, which keeps track of metadata, partitions, and data files.

Option 2: Create Delta Table and insert data using Spark SQL.
```
spark.sql("CREATE TABLE orgs_data(index String, c_name String, organization_id String, name String, website String, country String, description String, founded String, industry String, num_of_employees String, remarks String) USING DELTA")
```
Insert the data:
```
df.write.mode('append').format("delta").option("mergeSchema", "true").saveAsTable("orgs_data")
spark.sql("SELECT * FROM orgs_data").show()
spark.sql("DESCRIBE TABLE orgs_data").show()
```
This way, we can read the Delta table, and you can use SQL as well if you have enabled Hive support.

Schema Enforcement: Safeguarding Your Data

In the realm of data management, maintaining the integrity of your dataset is paramount. Delta Lake, with its schema enforcement capabilities, ensures that your data is not just welcomed with open arms but also closely scrutinized for compatibility. Let’s dive into the meticulous checks Delta Lake performs when validating incoming data against the existing schema:

Column Presence: Delta Lake checks that every column in your DataFrame matches the columns in the target Delta table. If there’s a single mismatch, it won’t let the data in and, instead, will raise a flag in the form of an exception.

Data Types Harmony: Data types are the secret language of your dataset. Delta Lake insists that the data types in your incoming DataFrame align harmoniously with those in the target Delta table. Any discord in data types will result in a raised exception.

Name Consistency: In the world of data, names matter. Delta Lake meticulously examines that the column names in your incoming DataFrame are an exact match to those in the target Delta table. No aliases allowed. Any discrepancies will lead to, you guessed it, an exception.

This meticulous schema validation guarantees that your incoming data seamlessly integrates with the target Delta table. If any aspect of your data doesn’t meet these strict criteria, it won’t find a home in the Delta Lake, and you’ll be greeted by an error message and a raised exception.
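The checks above can be pictured with a small, self-contained sketch. It models a schema as a plain dictionary of column name to type name and is only an illustration of the rules described here, not Delta Lake’s validation code; `check_schema` and the sample columns are hypothetical.

```python
def check_schema(target, incoming):
    """Return a list of problems; an empty list means the write is accepted."""
    problems = []
    for name, dtype in incoming.items():
        if name not in target:
            # Column presence / name consistency: unknown names are rejected.
            problems.append(f"unexpected column: {name}")
        elif target[name] != dtype:
            # Data type harmony: types must line up with the target table.
            problems.append(f"type mismatch for {name}: {dtype} != {target[name]}")
    return problems


target_schema = {"organization_id": "string", "founded": "int"}

good = {"organization_id": "string", "founded": "int"}
bad = {"organization_id": "string", "founded": "string", "remarks": "string"}

good_result = check_schema(target_schema, good)
bad_result = check_schema(target_schema, bad)
```

In the sketch the problems are collected in a list so they are easy to inspect; the behavior described above corresponds to raising an exception as soon as that list is non-empty.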

Schema Evolution: Adapting to Changing Data

In the dynamic landscape of data, change is the only constant. Delta Lake’s schema evolution comes to the rescue when you need to adapt your table’s schema to accommodate incoming data. This powerful feature offers two distinct approaches:

Overwrite Schema: You can choose to boldly overwrite the existing schema with the schema of your incoming data. This is an excellent option when your data’s structure undergoes significant changes. Just set the “overwriteSchema” option to true, and voila, your table is reborn with the new schema.

Merge Schema: In some cases, you might want to embrace the new while preserving the old. Delta Lake’s “Merge Schema” property lets you merge the incoming data’s schema with the existing one. This means that if an extra column appears in your data, it elegantly melds into the target table without throwing any schema-related tantrums.

Should you find the need to tweak column names or data types to better align with the incoming data, Delta Lake’s got you covered. The schema evolution capabilities ensure your dataset stays in tune with the ever-changing data landscape. It’s a smooth transition, no hiccups, and no surprises, just data management at its finest.
```
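from pyspark.sql.functions import col

# Parentheses let the chained calls span several lines in Python
(spark.read.table(...)
  .withColumn("birthDate", col("birthDate").cast("date"))
  .write
  .format("delta")
  .mode("overwrite")
  .option("overwriteSchema", "true")
  .saveAsTable(...))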
```
The above code will overwrite the existing delta table with the new schema along with the new data.

Delta Lake has support for automatic schema evolution. For instance, if you have added two more columns in the Delta Lake tables and still tried to access the existing table, you will be able to read the data without any error.

There is another way as well. For example, if you have three columns in a Delta table but the incoming data has four columns, you can set spark.databricks.delta.schema.autoMerge.enabled to true. This can be done for the entire cluster as well.
```
spark.sql("DESCRIBE TABLE orgs_data").show()
```
Let’s add one more column and try to access the data again:

spark.sql("ALTER TABLE orgs_data ADD COLUMNS (exta_col String)")

spark.sql("DESCRIBE TABLE orgs_data").show()

As you can see, the column has been added without impacting the existing data. You can still read the data smoothly and seamlessly. The newly created column is filled with null for the existing rows.

What happens if we receive the extra column in an incoming CSV that we want to append to the existing delta table? You have to set up one config here for that:
```
input_df = spark.read.format('csv').option('header', 'true').load("../Desktop/Data-Engineering/data-samples/input-data/organizations-11111.csv")
input_df.printSchema()
input_df.write.mode('append').format("delta").option("mergeSchema", "true").saveAsTable("orgs_data")
spark.sql("SELECT * FROM orgs_data").show()
spark.sql("DESCRIBE TABLE orgs_data").show()
```
You have to add the option mergeSchema=true while appending the data. It merges the schema of the incoming data, which contains some extra columns, into the schema of the table.

The first figure shows the schema of incoming data, and in the previous one, we have already seen the schema of our delta tables.

Here, we can see that the new column that was coming in the incoming data is merged with the existing schema of the table. Now, the delta table has the latest updated schema.
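As a rough mental model of what merging a schema means, the following plain-Python sketch appends unseen columns to the table schema and fills them with None for rows that never had them. It is a simplification under stated assumptions (schemas as dictionaries, rows as dictionaries), and `merge_schema` and `align` are hypothetical helpers, not Delta Lake functions.

```python
def merge_schema(target, incoming):
    """Return target's columns plus any new columns found in incoming."""
    merged = dict(target)
    for name, dtype in incoming.items():
        if name not in merged:
            merged[name] = dtype  # new column is appended at the end
        elif merged[name] != dtype:
            raise TypeError(f"cannot merge {name}: {merged[name]} vs {dtype}")
    return merged


def align(rows, schema):
    """Fill columns a row does not have with None (null)."""
    return [{name: row.get(name) for name in schema} for row in rows]


table_schema = {"organization_id": "string", "name": "string"}
table_rows = [{"organization_id": "A1", "name": "Acme"}]

incoming_schema = {"organization_id": "string", "name": "string", "remarks": "string"}
incoming_rows = [{"organization_id": "B2", "name": "Bolt", "remarks": "new"}]

table_schema = merge_schema(table_schema, incoming_schema)
table_rows = align(table_rows + incoming_rows, table_schema)
```

Existing rows keep their values and simply gain a null in the new column, which matches what the appended table looks like when it is read back.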

Time Travel

Basically, Delta Lake keeps track of all the changes in _delta_log by creating a log file. By using this, we can fetch the data of the previous version by specifying the version number.
```
df = spark.read.format("delta").option("versionAsOf", 0).load("orgs_data")
df.show()
```
Here, we can see the first version of the data, where we have not added any columns. The Delta table maintains the delta log, which contains the information for each commit, so we can fetch the data as it was up to a particular commit.

Upsert, Delete, and Merge

Unlocking the Power of Upsert with Delta Lake

In the exhilarating realm of data management, upserting shines as a vital operation, allowing you to seamlessly merge new data with your existing dataset. It’s the magic wand that updates, inserts, or even deletes records based on their status in the incoming data. However, for this enchanting process to work its wonders, you need a key—a primary key, to be precise. This key acts as the linchpin for merging data, much like a conductor orchestrating a symphony.

A Missing Piece: Copy on Write and Merge on Read

Now, before we delve into the mystical world of upserting with Delta Lake, it’s worth noting that Delta Lake dances to its own tune. Unlike some other table formats like Hudi and Iceberg, Delta Lake doesn’t rely on the concepts of Copy on Write and Merge on Read. These techniques are used elsewhere to speed up data operations.

Two Paths to Merge: SQL and Spark API

To harness the power of upserting in Delta Lake, you have two pathways at your disposal: SQL and Spark API. The choice largely depends on your Delta version. In the latest Delta version, 2.2.0, you can seamlessly execute merge operations using Spark API. It’s a breeze. However, if you’re working with an earlier Delta version, say 1.0.0, then Spark SQL is your trusty steed for upserts and merges. Remember, using the right Delta version is crucial, or you might find yourself grappling with the cryptic “Method not found” error, which can turn into a debugging labyrinth.

In the snippet below, we showcase the elegance of upserting using Spark SQL, a technique that ensures your data management journey is smooth and error-free:
```
-- Insert new data and update existing data based on the specified key.
-- Note: the WHEN NOT MATCHED BY SOURCE clause needs a newer Delta Lake release than 2.2.0.
MERGE INTO targetTable AS target
USING sourceTable AS source
ON target.id = source.id
WHEN MATCHED THEN
  UPDATE SET *
WHEN NOT MATCHED THEN
  INSERT *
WHEN NOT MATCHED BY SOURCE THEN
  DELETE;
```
```
today_data_df = spark.read.format('csv').option('header', 'true').load("../Desktop/Data-Engineering/data-samples/input-data/organizations-11111.csv")
today_data_df.show()


spark.sql("select * from orgs_data where organization_id = 'FAB0d41d5b5ddd'").show()


# Reading the existing Delta table (orgs_data is a table name, so use forName rather than forPath)
deltaTable = DeltaTable.forName(spark, "orgs_data")

today_data_df.createOrReplaceTempView("incoming_data")
```
Here, we load the incoming data and show what is inside. We also query the existing row with the same primary key in the Delta table so that we can compare it after upserting or merging the data.
```
spark.sql(
"""
MERGE INTO orgs_data
USING incoming_data
ON orgs_data.organization_id = incoming_data.organization_id
WHEN MATCHED THEN
  UPDATE SET
    organization_id = incoming_data.organization_id,
    name = incoming_data.name
"""
)

spark.sql("select * from orgs_data where organization_id = 'FAB0d41d5b5ddd'").show()
```
```
# deltaTable and today_data_df are the DeltaTable and DataFrame created earlier
(deltaTable.alias("oldData")
   .merge(
       today_data_df.alias("newData"),
       "oldData.organization_id = newData.organization_id")
   .whenMatchedUpdateAll()
   .whenNotMatchedInsertAll()
   .execute())
```
This is an example of how you can do an upsert using the Spark API. The merge operation can create lots of small files. You can control the number of small files by setting the following properties in the Spark session:

spark.databricks.delta.merge.repartitionBeforeWrite.enabled true

spark.sql.shuffle.partitions 10

This is how merge operations work. Merge supports one-to-one mapping: at most one source row should try to update any given row in the target Delta table. If multiple source rows try to update the same target row, the merge fails. For update operations, Delta Lake matches the data on the basis of the key in the merge condition.
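The one-to-one rule is easier to see in a small model. The sketch below reimplements the upsert semantics on plain Python lists of dictionaries, purely for illustration; `merge_upsert` and the sample rows are hypothetical, and the real merge of course runs inside Spark.

```python
from collections import Counter


def merge_upsert(target, source, key):
    """Update matching target rows and insert unmatched source rows."""
    target_keys = {row[key] for row in target}
    # Count how many source rows match each existing target key.
    matches = Counter(row[key] for row in source if row[key] in target_keys)
    ambiguous = sorted(k for k, n in matches.items() if n > 1)
    if ambiguous:
        raise ValueError(f"multiple source rows match target key(s): {ambiguous}")

    updates = {row[key]: row for row in source if row[key] in target_keys}
    merged = [{**row, **updates.get(row[key], {})} for row in target]
    merged += [dict(row) for row in source if row[key] not in target_keys]
    return merged


target = [
    {"organization_id": "A1", "name": "Old name"},
    {"organization_id": "B2", "name": "Unchanged"},
]
source = [
    {"organization_id": "A1", "name": "New name"},
    {"organization_id": "C3", "name": "Inserted"},
]

result = merge_upsert(target, source, "organization_id")

# A second source row for the same target key makes the merge ambiguous.
try:
    merge_upsert(target, source + [{"organization_id": "A1", "name": "Clash"}], "organization_id")
    rejected = False
except ValueError:
    rejected = True
```

If your incoming data can contain several rows per key, deduplicate it first (for example, keep the latest row per key) before running the merge.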

Change Data Feed

Change Data Feed is another useful feature of Delta Lake. It tracks and maintains the row-level history of all records in the Delta table after each upsert or insert. You can enable it at the beginning, while setting up the Spark session, or through Spark SQL by enabling “change events” for all the data.

Now, you can see the whole journey of each record in the Delta table, from its insertion to deletion. It introduces one more extra column, _change_type, which contains the type of operations that have been performed on that particular row.

To enable this, you can set these configurations:
```
spark.sql("set spark.databricks.delta.properties.defaults.enableChangeDataFeed = true;") 
```
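Alternatively, it can be enabled for an individual table through the delta.enableChangeDataFeed table property. To read the recorded changes, pass the readChangeFeed option when reading the Delta table.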
```
## Stream Data Generation

data = [{"Category": 'A', "ID": 1, "Value": 121.44, "Truth": True, "Year": 2022},
        {"Category": 'B', "ID": 2, "Value": 300.01, "Truth": False, "Year": 2020},
        {"Category": 'C', "ID": 3, "Value": 10.99, "Truth": None, "Year": 2022},
        {"Category": 'E', "ID": 5, "Value": 33.87, "Truth": True, "Year": 2022}
        ]

df = spark.createDataFrame(data)

df.show()

df.write.mode('overwrite').format("delta").partitionBy("Year").save("silver_table")
```
```
deltaTable = DeltaTable.forPath(spark, "silver_table")
                                 
deltaTable.delete(condition = "ID == 1")

delta_df = spark.read.format("delta").option("readChangeFeed", "true").option("startingVersion", 0).load("silver_table")
delta_df.show()
```
Now, after deleting something, you will be able to see the changes, like what is deleted and what is updated. If you are doing upserts on the same Delta table after enabling the change data feed, you will be able to see the update as well, and if you insert anything, you will be able to see what is inserted in your Delta table.

If we overwrite the complete Delta table, all past records are marked as deleted.

If you want to record each data change, you have to enable this before creating the table so that you can see the data changes for each version. If you have already created a table, you won’t be able to see the changes for versions written before you enabled the change data feed, but you will be able to see the changes in all versions that come after this configuration.
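To get a feel for what the change rows look like, here is a self-contained sketch that labels row-level changes with a _change_type column using the values insert, update_preimage, update_postimage, and delete. Note the assumption: this toy derives the changes by diffing two snapshots, which is only a convenient way to illustrate the output shape and is not how Delta Lake records them; `diff_versions` and the sample rows are hypothetical.

```python
def diff_versions(before, after, key):
    """Label row-level changes between two snapshots with a _change_type."""
    old = {row[key]: row for row in before}
    new = {row[key]: row for row in after}
    changes = []
    for k, row in old.items():
        if k not in new:
            changes.append({**row, "_change_type": "delete"})
        elif new[k] != row:
            # An update is reported as the row before and the row after.
            changes.append({**row, "_change_type": "update_preimage"})
            changes.append({**new[k], "_change_type": "update_postimage"})
    for k, row in new.items():
        if k not in old:
            changes.append({**row, "_change_type": "insert"})
    return changes


before = [
    {"ID": 1, "Category": "A", "Value": 121.44},
    {"ID": 2, "Category": "B", "Value": 300.01},
]
after = [
    {"ID": 2, "Category": "B", "Value": 310.0},
    {"ID": 5, "Category": "E", "Value": 33.87},
]

changes = diff_versions(before, after, "ID")
change_types = [row["_change_type"] for row in changes]
```

Here ID 1 shows up as a delete, ID 2 as a pre-image and post-image pair, and ID 5 as an insert.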

Data Skipping with Z-ordering

Data skipping is a technique in Delta Lake for tables whose records are spread over many files: a query reads only the files that can contain the required information and skips all the others. This makes it faster to read data from Delta tables.
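A minimal way to picture data skipping is per-file minimum and maximum values for a column: a file whose range cannot contain the requested value never needs to be opened. The sketch below assumes such statistics are available as a plain dictionary; `files_to_read`, the file names, and the numbers are hypothetical.

```python
def files_to_read(file_stats, column, value):
    """Keep only files whose [min, max] range for `column` can contain `value`."""
    return [
        path
        for path, stats in file_stats.items()
        if stats[column][0] <= value <= stats[column][1]
    ]


# Per-file (min, max) statistics for the Year column.
file_stats = {
    "part-0.parquet": {"Year": (2018, 2019)},
    "part-1.parquet": {"Year": (2020, 2021)},
    "part-2.parquet": {"Year": (2022, 2022)},
}

selected = files_to_read(file_stats, "Year", 2020)
```

The narrower and less overlapping the per-file ranges are, the more files can be skipped, which is what the clustering described next is meant to improve.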

Z-ordering is a technique used to colocate related information in the same set of data files. If you know a column that is used frequently in the filters of your select statements and that has high cardinality, you can Z-order by that column. This reduces the number of files that have to be read. You can specify multiple columns for Z-ordering by separating them with commas.

Let’s suppose you have two tables, a and b, and there is one column that is most frequently used in queries. Z-ordering by that column increases the number of files that can be skipped when running those queries. A normal sort order works linearly, whereas Z-ordering works multi-dimensionally.
```
OPTIMIZE events
WHERE date >= current_timestamp() - INTERVAL 1 day
ZORDER BY (eventType)
```
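The multi-dimensional ordering can be illustrated with the textbook Z-order (Morton) curve, which interleaves the bits of two values into a single sort key. This is a generic sketch of the curve itself, offered as an assumption about the general idea rather than a description of Delta Lake’s internals; `z_value` is a hypothetical helper.

```python
def z_value(x, y, bits=8):
    """Interleave the lowest `bits` bits of x and y into one Z-order key."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return z


points = [(x, y) for x in range(2) for y in range(2)]
ordered = sorted(points, key=lambda p: z_value(*p))
```

Sorting by this key keeps points that are close in both dimensions close together in the output, whereas a plain sort on (x, y) only clusters well on the first column.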
DML Operations

Delta Lake can run SQL DML operations directly on the data lake, including update, delete, and merge.

Integrations and Ecosystem Supported in Delta Lake

Read Delta Tables

Unlock the Delta Tables: Tools That Bring Data to Life

Reading data from Delta tables is like diving into a treasure trove of information, and there’s more than one way to unlock its secrets. Beyond the standard Spark API, we have a squad of powerful allies ready to assist: SQL query engines like Athena and Trino. But they’re not just passive onlookers; they bring their own magic to the table, empowering you to perform data manipulation language (DML) operations that can reshape your data universe.

Athena: Unleash the SQL Sorcery

Imagine Athena as the Oracle of data. With SQL as its spellbook, it delves deep into your Delta tables, fetching insights with precision and grace. But here’s the twist: Athena isn’t just for querying; like a skilled blacksmith, it can help you hammer your data into a new shape, creating a masterpiece.

Trino: The Shape-Shifting Wizard

Trino, on the other hand, is the shape-shifter of the data realm. It glides through Delta tables, allowing you to perform an array of DML operations that can transform your data into new, dazzling forms. Think of it as a master craftsman who can sculpt your data, creating entirely new narratives and visualizations.

So, when it comes to Delta tables, these tools are not just readers; they are your co-creators. They enable you to not only glimpse the data’s beauty but also mold it into whatever shape serves your purpose. With Athena and Trino at your side, the possibilities are as boundless as your imagination.

Read Delta Tables Using Spark APIs
```
from delta.tables import *
delta_df = DeltaTable.forPath(spark, "./Documents/DE/Delta/test-db/organisatuons")
delta_df.toDF().show()
```
Steps to Set Up Delta Lake with S3 on EC2 Or EMR and Access Data through Athena

Data Set Used – We have generated around 100 GB of dummy data and written it into the Delta tables.

Step 1 – Set up a Spark session with AWS cloud storage and Delta support. Here, we have used an EC2 instance with Spark 3.3 and Delta version 2.1.1, and we are setting up the Spark config for Delta and S3.
```
from pyspark.sql import SparkSession

# Placeholder credentials; replace with your own.
AWS_ACCESS_KEY_ID = "XXXXXXXXXXXXXXXXXXXXXX"
AWS_SECRET_ACCESS_KEY = "XXXXXXXXXXXXXXXXXXXXXX+XXXXXXXXXXXXXXXXXXXXXX"

spark_jars_packages = "com.amazonaws:aws-java-sdk:1.12.246,org.apache.hadoop:hadoop-aws:3.2.2,io.delta:delta-core_2.12:2.1.1"

spark = SparkSession.builder.appName('delta') \
   .config("spark.jars.packages", spark_jars_packages) \
   .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
   .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
   .config('spark.hadoop.fs.s3a.aws.credentials.provider', 'org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider') \
   .config("spark.hadoop.fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem") \
   .config("spark.hadoop.fs.AbstractFileSystem.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem") \
   .config("spark.delta.logStore.class", "org.apache.spark.sql.delta.storage.S3SingleDriverLogStore") \
   .config("spark.driver.memory", "20g") \
   .config("spark.memory.offHeap.enabled", "true") \
   .config("spark.memory.offHeap.size", "8g") \
   .getOrCreate()

spark.sparkContext._jsc.hadoopConfiguration().set("fs.s3a.access.key", AWS_ACCESS_KEY_ID)
spark.sparkContext._jsc.hadoopConfiguration().set("fs.s3a.secret.key", AWS_SECRET_ACCESS_KEY)
```
Spark Version – You can use any Spark version; in our case, Spark 3.3.1 came with the pip install. Just make sure the Spark version is compatible with the Delta Lake version you are using; otherwise, most of the features won’t work.
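
A quick guard against a mismatch is to compare the two versions before building the session. The sketch below is only an illustration: the helper names `minor` and `is_compatible` are hypothetical, and the small table is an assumption copied from the Delta Lake release notes, so check it against the official compatibility matrix before relying on it.

```python
# Minimal sketch: check a Delta Lake / Spark pairing against a hand-maintained table.
# The table is an assumption taken from the Delta Lake release notes;
# verify it against the official compatibility matrix.
DELTA_TO_SPARK = {
    "2.0": "3.2",
    "2.1": "3.3",
    "2.2": "3.3",
    "2.3": "3.3",
    "2.4": "3.4",
}


def minor(version):
    """Reduce '3.3.1' to '3.3' so that patch releases compare equal."""
    return ".".join(version.split(".")[:2])


def is_compatible(delta_version, spark_version):
    """Return True when the pairing appears in the table above."""
    expected = DELTA_TO_SPARK.get(minor(delta_version))
    return expected is not None and expected == minor(spark_version)


# The versions used in this walkthrough, followed by a pairing not in the table.
print(is_compatible("2.1.1", "3.3.1"))
print(is_compatible("2.1.1", "3.2.0"))
```

In a real job, the Spark version can be read from `pyspark.__version__` instead of being typed by hand.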

Step 2 – Here, we are creating a Delta table with an S3 path. We can write the data directly into an S3 bucket as a Delta table, but it is better to create the table first and then write into it to make sure the schema is correct.

If the table already exists, set the Delta location path so that Spark SQL queries can use it; otherwise, create the Delta table at the S3 path.
```
# If table is already there
delta_path = "s3a://abhishek-test-01012023/delta-lake-sample-data/"
spark.conf.set('table.location', delta_path)

# Creating new delta table on s3 location
spark.sql("CREATE TABLE delta.`s3://abhishek-test-01012023/delta-lake-sample-data/`(id INT, first_name String, "
         "last_name String, address String, pincocde INT, net_income INT, source_of_income String, state String, "
         "email_id String, description String, population INT, population_1 String, population_2 String, "
         "population_3 String, population_4 String, population_5 String, population_6 String, population_7 String, "
         "date String) USING DELTA PARTITIONED BY (date)")
```
Step 3 – Generate the dummy data and write it into the S3 bucket as a Delta table. The script we used is linked after the snippet; feel free to look it over. An example of the write is given below:
```
df.write.format("delta").mode("append").partitionBy("date").save("s3a://abhishek-test-01012023/delta-lake-sample-data/")
```
https://github.com/velotio-tech/delta-lake-iceberg-poc/blob/0396cdbf96230609695a907fdbe8c240042fce9e/delta-data-writer.py#L83

The linked script contains the code that generates the dummy data.

Step 4 – Here, we print the row count and select some data from the Delta table we have just written.

Run the SQL queries to check the table data, then upsert the incoming data from the S3 bucket:
```
spark.sql("select count(*) from delta.`s3://abhishek-test-01012023/delta-lake-sample-data/` group by id having count("
         "*) > 1").show()

spark.sql("select count(*) from delta.`s3://abhishek-test-01012023/delta-lake-sample-data/`").show()

#############################################################################
# Upsert
#############################################################################

# upserts the starting five records. We will read first five record and will do some changes in some columns and

input_df = spark.read.csv("s3a://abhishek-test-01012023/incoming_data/delta/4e0ae9f5-8c9d-435a-a434-febff1effbc3.csv",inferSchema=True,header=True)
input_df.printSchema()
input_df.createOrReplaceTempView("incoming")
spark.sql("MERGE INTO delta.`s3://abhishek-test-01012023/delta-lake-sample-data/` t USING (SELECT * FROM incoming) s ON t.id = s.id WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT *")
```
This is the output of the select statement:

This is the schema of the incoming data we are planning to merge into the existing Delta table:

After upsert, let’s see the data for the particular data partition:

spark.sql("select * from delta.`s3://abhishek-test-01012023/delta-lake-sample-data/` where id = 1 and date = 20221206")
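
To make the row-level effect of the upsert concrete, here is a plain-Python illustration of the `WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT *` semantics. It is not how Delta Lake executes a merge, and the function name `upsert` and the sample rows are made up for this example. It assumes `id` is unique in the target, which is what the duplicate-count query in Step 4 checks.

```python
# Plain-Python mirror of the MERGE outcome: matched rows are replaced by the
# source row, unmatched source rows are inserted. Illustration only.
from collections import Counter


def upsert(target_rows, source_rows, key="id"):
    """Return the target rows after applying the source rows as an upsert on `key`."""
    # Delta raises an error when several source rows match the same target row.
    # This sketch is stricter and refuses any duplicate key in the source.
    counts = Counter(row[key] for row in source_rows)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate keys in source: {duplicates}")

    merged = {row[key]: dict(row) for row in target_rows}
    for row in source_rows:
        # An existing key is overwritten; a new key is added.
        merged[row[key]] = dict(row)
    return [merged[k] for k in sorted(merged)]


target = [{"id": 1, "state": "MH"}, {"id": 2, "state": "KA"}]
incoming = [{"id": 2, "state": "GJ"}, {"id": 3, "state": "DL"}]
result = upsert(target, incoming)
```

After the call, `result` holds three rows: `id` 1 is untouched, `id` 2 carries the incoming values, and `id` 3 is new.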

Access Delta table using Hive or any other external metastore:

For that, we have to create a link between them. To create this link, go to the Spark code and generate the manifest file on the S3 path where we have already written the data.
```
spark.sql("GENERATE symlink_format_manifest FOR TABLE delta.`s3a://abhishek-test-01012023/delta-lake-sample-data/`")
```
This will create the manifest folder. Now go to Athena and run this query:
```
CREATE EXTERNAL TABLE delta_db.delta_table(id INT, first_name String, last_name String, address String, pincocde INT, net_income INT, source_of_income String, state String, email_id String, description String, population INT, population_1 String, population_2 String, population_3 String, population_4 String, population_5 String, population_6 String, population_7 String) 
PARTITIONED BY (date String)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://abhishek-test-01012023/delta-lake-sample-data/_symlink_format_manifest/'
```
```
MSCK REPAIR TABLE delta_db.delta_table
```
Once the partitions are loaded, you will be able to query the data from Athena.
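
It can help to picture what Athena actually reads here. The symlink manifest is a set of text files, one per partition, each listing the Parquet data files that make up the current snapshot of the table. That is why the `LOCATION` points at `_symlink_format_manifest/` rather than at the data itself. The sketch below mimics that layout in plain Python. The file names are invented, the function `build_manifests` is hypothetical, and the real files are written by Delta's `GENERATE` command, not by code like this. It assumes a partitioned table.

```python
# Sketch of the manifest layout: one text file per partition, each listing the
# data files of the current table snapshot. File names are made up.
from collections import defaultdict

TABLE_ROOT = "s3://abhishek-test-01012023/delta-lake-sample-data"


def build_manifests(active_files):
    """Map each manifest path to the text it would contain.

    `active_files` holds paths relative to the table root, for example
    'date=20221206/part-0000.snappy.parquet'.
    """
    by_partition = defaultdict(list)
    for rel_path in active_files:
        partition_dir = rel_path.rpartition("/")[0]
        by_partition[partition_dir].append(f"{TABLE_ROOT}/{rel_path}")

    manifests = {}
    for partition_dir, paths in by_partition.items():
        manifest_path = f"{TABLE_ROOT}/_symlink_format_manifest/{partition_dir}/manifest"
        manifests[manifest_path] = "\n".join(sorted(paths))
    return manifests


manifests = build_manifests([
    "date=20221206/part-0000.snappy.parquet",
    "date=20221206/part-0001.snappy.parquet",
    "date=20221207/part-0000.snappy.parquet",
])
```

Because a manifest describes one snapshot, it goes stale after every write to the table. Either rerun the `GENERATE` command after each write, or set the table property `delta.compatibility.symlinkFormatManifest.enabled` to `true` so that Delta keeps the manifest up to date.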

Conclusion: A New Dawn

In a world where data continued to grow in volume and complexity, Delta Lake stood as a beacon of hope. It empowered organizations to manage their data lakes with unprecedented efficiency and extract insights with unwavering confidence.

The adoption of Delta Lake marked a new dawn in the realm of data. Whether dealing with structured or semi-structured data, it was the answer to the prayers of data warriors. As the sun set on traditional formats, Delta Lake emerged as the hero they had been waiting for—a hero who had not only revolutionized data storage and processing but also transformed the way stories were told in the world of data.

And so, the legend of Delta Lake continued to unfold, inspiring data adventurers across the land to embark on their own quests, armed with the power of ACID transactions, time travel, and the promise of a brighter, data-driven future.
November 19, 2025
How Cross-Functional Data Collaboration Fuels AI Forecasting in Manufacturing
Factories have always been noisy places. Not just the machines—the data too. Every department speaks a different dialect: engineering in blueprints, quality in percentages, finance in margins. Somewhere in the middle, meaning flickers for a second and disappears again.

Now and then, though, someone manages to connect those fragments. And suddenly, forecasting stops being merely about predicting the future. It becomes more about teams listening to each other better.

What the Big Idea Really Is

Cross-functional data collaboration sounds grand, but in practice, it is a modest habit: people sharing what they know, even when their worlds don’t perfectly align. When that happens often enough, AI forecasting begins to find its footing.
1. Context changes everything. Data is mute until someone explains it. When quality and operations and finance look at the same trend together, the AI model stops predicting blindly and starts learning the logic behind the numbers.
2. Shared language breeds trust. The real progress often happens in small arguments—what counts as a defect, what counts as a delay, and so on. Agreement builds a kind of quiet reliability into the system.
3. Speed follows understanding. Once the translation work is done, decisions move faster without anyone forcing it. Collaboration shrinks the distance between seeing and acting.
4. Learning runs both ways. AI models learn from data, yes, but teams learn from the models too—what the system notices, what it misses, what it exaggerates.
What Research and Experience Suggest

Manufacturing research increasingly confirms what we experience in practice: AI forecasting only becomes truly powerful when data flows freely within the organization—across functions, not just within silos. A literature review in Applied Sciences highlights that machine learning in manufacturing gains its strategic power when production, quality, and system data are treated as interconnected (e.g., the “Four-Know” framework: what, why, when, how).

Academic work on decentralized manufacturing also shows that sharing insights between units, for example through knowledge distillation, improves model performance in under-informed parts of the organization. Meanwhile, quality-prediction research demonstrates that explainability techniques can prune irrelevant features from forecasting models, improving accuracy and interpretability.

These findings support the idea that cross-functional collaboration is not optional – it’s a pre-condition for success. When AI models are built on data that reflects operations, engineering, procurement, and quality together, forecasting becomes more accurate, more explainable, and more trusted. That kind of collaboration isn’t just technical: it’s deeply human.

A Small Illustration to Put This Big Idea into Perspective

Consider a mid-sized electronics manufacturer. Suddenly, the production team notices a subtle decline in yield. The quality department begins reporting higher-than-usual defect rates. Procurement, however, insists that their suppliers are stable—no obvious issues. On the surface, dashboards look fine. But when a cross-functional data task force (operations, quality, procurement, and data engineers) digs deeper, they uncover a misalignment: data from different functions is recorded in inconsistent formats and timestamps, leading to distorted aggregations.

Together, they re-align the data – standardizing units, recalibrating timestamps, and merging datasets. Over the next few months, their AI forecasting system begins to issue early warning signals. A similar pattern to the previous yield decline appears in the forecast, but this time the team acts before the issue reaches the shop floor. The early alert isn’t magic; it’s the result of shared understanding—of functions working with the same underlying reality.
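
As a rough sketch of what that re-alignment can look like, the snippet below converts timestamps to UTC and temperatures to Celsius before comparing records from two functions. Everything specific in it is an assumption made for illustration: the record fields, the Fahrenheit and Celsius units, and the UTC offsets are not taken from the case described above.

```python
# Illustrative only: bring two functions' records onto the same clock and units.
# Field names, units, and offsets are assumptions for this example.
from datetime import datetime, timezone


def to_utc(timestamp):
    """Parse an ISO 8601 timestamp with an explicit offset and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {timestamp}")
    return parsed.astimezone(timezone.utc)


def fahrenheit_to_celsius(value):
    return (value - 32.0) * 5.0 / 9.0


def normalize(record):
    """Return a copy of `record` with a UTC timestamp and a Celsius temperature."""
    if "temp_c" in record:
        temp_c = record["temp_c"]
    else:
        temp_c = fahrenheit_to_celsius(record["temp_f"])
    return {"line": record["line"], "ts_utc": to_utc(record["ts"]), "temp_c": temp_c}


# Quality logs in local time and Fahrenheit, operations logs in UTC and Celsius.
quality = {"line": "A", "ts": "2025-01-10T14:30:00+05:30", "temp_f": 212.0}
operations = {"line": "A", "ts": "2025-01-10T09:00:00+00:00", "temp_c": 100.0}

# Once normalized, the two records describe the same moment and the same value.
same_reading = normalize(quality) == normalize(operations)
```

Before normalization, these two records look five and a half hours and 112 units apart; afterwards they can be joined on line and timestamp.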

Why This Matters for Manufacturers
- Agility grows quietly. Teams that talk regularly move faster, even without new tools.
- Transparency replaces suspicion. When everyone helped build the model, no one treats it like a black box.
- Improvement loops back. Forecasts feed design; design refines process; process refines data.
- Resilience hides in plain sight. Shop floors that promote cross-functional data sharing tend to respond better, not louder.
Finding Rhythm in the Rough Edges

Collaboration can be tiring. Ownership blurs. Tools overlap. Meetings multiply. There’s a temptation to smooth the friction away. But the friction is the signal. It shows that people are actually engaging, not just aligning through slide decks. In data collaboration, as in music, a little dissonance means the system is alive.

How R Systems Helps Manufacturers Achieve More Reliable Forecasting

At R Systems, we’ve learned that collaboration cannot be installed; it has to be engineered gently. We help manufacturers connect their data across functions—through clean integration layers, unified data models, and agentic AI systems that learn from how people already work.

Our AI solutions don’t replace collaboration; they make it easier to sustain. The systems adapt to team rhythms, not the other way around. Forecasting then becomes a shared act between human intuition and machine precision.

A Closing Thought – Not by Any Means a Final Word

If perfect data is the goal, collaboration will always feel messy. But if learning is the goal, the mess starts to look like progress.

We tend to think of AI forecasting as only about predicting the trends of tomorrow. In fact, it’s not just that. It’s largely about helping manufacturers see the facts of today clearly. When cross-functional teams start collaborating and sharing data, the fog around what’s next begins to clear. Talk to our experts today.
November 12, 2025
How AI Knowledge Engines Accelerate Data-to-Decision Cycles in Manufacturing
Data runs through every corner of manufacturing, from shop-floor sensors to ERP systems, maintenance logs, and supplier feeds. Yet the journey from data to decision often drags, with insights stuck in silos and actions coming too late. That’s where an AI knowledge engine changes the game. It connects raw data with human expertise, reasons across sources, and turns information into timely, actionable insights. For data and AI professionals and business leaders, it’s a way to move quickly from spotting a signal to making a confident decision.

From Manufacturing Analytics to Decision Acceleration

Traditional manufacturing analytics has focused on reporting and visualization with dashboards, KPIs, and trend charts that describe what’s happening. But analytics alone doesn’t drive improvement. The real value lies in faster, smarter decisions: when to intervene, how to optimize, and where to allocate resources.

An AI knowledge engine transforms that process. It absorbs structured and unstructured data from machine sensors, operator logs, quality reports, and supply chain updates and weaves them into a coherent model. Instead of isolated insights, you get connected intelligence.

Imagine a scenario where one sensor shows a vibration anomaly while another logs a temperature spike in a different module. The engine correlates these signals, links them to past maintenance records, and infers a likely root cause, say a misaligned component creating downstream defects. The system then recommends a precise corrective action before the problem impacts production.

That’s not just analytics. That’s decision acceleration.

What Exactly Is an AI Knowledge Engine?

The term may sound abstract, but the concept is straightforward. An AI knowledge engine combines data integration, semantic understanding, and reasoning to deliver contextual insights. It doesn’t just retrieve data: it learns, connects, and infers.

Key components include:
- Data ingestion – capturing both structured data (sensor readings, ERP data) and unstructured data (notes, images, reports).
- Semantic modeling – using ontologies and knowledge graphs to represent how components, processes, and events relate to each other.
- Reasoning and inference – applying AI algorithms to detect patterns, diagnose root causes, and recommend actions.
- Continuous learning – refining models as new data and feedback come in.
As some sources describe it, knowledge engines encode not just data but also expertise, i.e., the judgment, intuition, and decision-making logic of seasoned engineers, into systems that can reason autonomously.

In manufacturing, that means moving from descriptive (“what happened”) to prescriptive (“what should we do”) and eventually proactive (“what will happen next”).
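
The semantic-modeling and reasoning components can be pictured with a toy example. The sketch below stores a few facts as a tiny graph and walks it from an anomalous sensor to the context an engineer would want. It is a deliberately small illustration, not a real knowledge engine: every entity name, relation name, and function here is invented.

```python
# Toy knowledge graph: (subject, relation) -> list of objects.
# All entities and relations are invented for illustration.
GRAPH = {
    ("vibration_sensor_7", "monitors"): ["spindle_bearing"],
    ("spindle_bearing", "part_of"): ["cnc_mill_3"],
    ("spindle_bearing", "supplied_in"): ["batch_2291"],
    ("spindle_bearing", "past_failure_mode"): ["wear", "misalignment"],
}


def related(subject, relation):
    """Return the objects linked to `subject` by `relation` (empty list if none)."""
    return GRAPH.get((subject, relation), [])


def explain_anomaly(sensor):
    """Walk from a sensor to the components it monitors and collect their context."""
    findings = []
    for component in related(sensor, "monitors"):
        findings.append({
            "component": component,
            "machines": related(component, "part_of"),
            "batches": related(component, "supplied_in"),
            "candidate_causes": related(component, "past_failure_mode"),
        })
    return findings


findings = explain_anomaly("vibration_sensor_7")
```

A production system would add learned models and far richer ontologies on top, but the principle of connecting a signal to related records through explicit relationships is the same.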

Why Manufacturing Needs AI Knowledge Engines

Manufacturing is inherently complex. It’s a web of machines, materials, people, and suppliers. Decisions ripple across this network. Here’s where AI knowledge engines create tangible value:
- Speed: They cut decision latency by automating data synthesis and surfacing insights instantly.
- Integration: They break down silos between production, maintenance, and supply chain systems.
- Scalability: They capture the expertise of senior engineers and make it reusable across plants and teams.
- Predictive power: They detect hidden correlations, for instance, between supplier variability and product quality.
- Resilience: They enable earlier interventions, reducing downtime and waste.
Each of these outcomes ties directly to measurable KPIs like faster cycle times, higher first-pass yield, lower maintenance costs, and improved operational stability.

Making an AI Knowledge Engine Work

Deploying an AI knowledge engine isn’t just about technology. It’s about clarity of purpose and disciplined execution.
1. Start with the decision — Identify which decisions you want to accelerate: maintenance actions, quality interventions, or supplier responses.
2. Map the data sources — Ensure access to both quantitative data and qualitative context, like technician notes or operator feedback.
3. Build the semantic layer — Create a knowledge graph that reflects how your processes and assets relate.
4. Embed into workflows — Integrate the engine’s insights into control systems, dashboards, or maintenance apps so actions follow naturally.
5. Govern continuously — Monitor model accuracy, data quality, and explainability to build trust and avoid bias.
When implemented well, the engine doesn’t replace human decision-makers; rather, it empowers them. It brings context and foresight into everyday operations.

A Quick Use Case

Take a discrete manufacturing plant producing complex assemblies. Sensors monitor vibration, temperature, and throughput. Maintenance logs record interventions. Supplier data tracks material quality.

An AI knowledge engine connects all this. When a sensor drift appears, it correlates the signal with past maintenance records and supplier deliveries. Within seconds, it flags a likely cause, such as a worn bearing from a recent batch, and suggests a replacement schedule. The maintenance team acts before a breakdown occurs, preventing a 12-hour downtime.

That’s data turning into decisions, seamlessly.

The Bottom Line

Manufacturing has long chased the vision of real-time, insight-driven operations. AI knowledge engines make that vision practical. They don’t just analyze data; they understand it in context, reason over it, and translate it into decisions that matter.

For data and AI professionals, they represent the next step in operational intelligence. For business leaders, they unlock a faster, more resilient enterprise. The gap between knowing and doing is finally closing, and it’s powered by knowledge.

How R Systems Helps Manufacturers Get There

At R Systems, we help manufacturers bridge the gap between data and decision through AI-driven knowledge systems, analytics, and domain expertise. Our teams combine manufacturing intelligence, data engineering, and applied AI to design solutions that capture context, automate reasoning, and accelerate insights across operations.

From predictive maintenance and process optimization to digital twins and AI-powered decision systems, we help you move beyond dashboards toward truly intelligent manufacturing.

Turn your data into knowledge. Turn your knowledge into action. Talk to our experts today.
November 12, 2025
Enabling Data-Driven Demand Planning with AI-Powered Momentum Volume Projection
- Unified Forecasting Framework – Developed an AI-powered Momentum Volume Projection model to centralize demand forecasting across all product categories and channels.
- Accuracy & Visibility – Achieved 5–10% forecast error (MAPE) with reliable projections across 1-, 3-, 6-, and 12-month horizons, enhancing business foresight.
- Operational Agility – Reduced manual forecasting efforts, improved inventory control, and accelerated planning cycles across residential, commercial, and smart access products.
- Data-Driven Decisioning – Enabled pricing, promotion, and sales teams with real-time dashboards for proactive, evidence-based actions.
- Scalability & Governance – Deployed a standardized, low-maintenance ML architecture ensuring consistency, model performance, and ease of extension to new product lines.
- Strategic Outcomes – Strengthened planning precision, improved financial alignment, and established a future-ready AI foundation for sustainable growth.
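
For readers unfamiliar with the metric, MAPE (mean absolute percentage error) is the average of the absolute forecast errors, each expressed as a percentage of the actual value. The helper below is a minimal sketch of that definition. The sample numbers are made up and are not the project's data.

```python
# Minimal sketch of MAPE; the sample values further down are invented.
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Each forecast is off by 10% of its actual, so the score is about 10.
score = mape([100.0, 200.0], [110.0, 180.0])
```
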
November 11, 2025
5 Reasons to Embed FinOps Governance-as-Code in Your IaC Workflows
Cloud adoption has changed how businesses build and scale technology. Infrastructure is now software-defined, and teams can spin up compute, storage, and networking in minutes. While this flexibility fuels innovation, it also creates a challenge: costs and compliance can easily get out of hand if not governed properly.

That is where FinOps comes in. Traditionally, FinOps has helped organizations bring financial accountability to cloud operations. But in many cases, the discipline has been reactive. Teams analyze bills after the fact, spot anomalies, and try to course-correct. By then, the costs have already been incurred.

With Infrastructure as Code (IaC), there is a better way. Governance-as-Code embeds FinOps principles directly into the development workflow, making cost and compliance checks proactive instead of reactive.

Why Reactive FinOps Isn’t Enough

Think about how DevOps transformed software delivery. Before CI/CD pipelines, testing and integration were manual, error-prone, and slow. By automating these steps, DevOps brought speed, consistency, and reliability.

FinOps is going through a similar shift. Manual reviews and after-the-fact reporting can’t keep pace with dynamic cloud environments. Developers need to move fast, but without the right guardrails, projects risk overspending, missing compliance requirements, or violating SLAs.

Traditional FinOps provides visibility into these problems. Governance-as-Code prevents them in the first place.

How Governance-as-Code Works

Governance-as-Code integrates FinOps policies into IaC tools and CI/CD pipelines. This means budgets, tagging standards, and cost limits are enforced as part of the deployment process—not afterward.

Here are some practical examples:
- Cost estimation with Infracost: Before a new resource is deployed, Infracost calculates its projected cost. Developers see the impact of their code changes on the cloud bill, enabling smarter choices.
- Policy enforcement with OPA (Open Policy Agent): OPA can enforce rules such as “all resources must have tags for cost center and owner” or “deployments cannot exceed predefined budgets.” If code violates these rules, the pipeline fails, stopping the issue before it reaches production.
- Automated anomaly detection: Pipelines can include automated checks for unusual usage patterns or unexpected cost spikes, catching issues early.
The result: cost governance is built into the development process, without slowing down delivery.
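
The kind of check described above can be sketched in a few lines. The snippet below is a plain-Python stand-in for a pipeline policy gate. It does not use the OPA or Infracost APIs, and the resource shape, tag names, and budget figure are all assumptions made for this example.

```python
# Plain-Python stand-in for a CI policy gate. Not OPA or Infracost code;
# the resource shape, tag names, and budget are assumptions.
REQUIRED_TAGS = {"cost_center", "owner"}
MONTHLY_BUDGET_USD = 500.0


def check_plan(resources):
    """Return a list of violation messages; an empty list means the plan passes."""
    violations = []
    for res in resources:
        missing = REQUIRED_TAGS - set(res.get("tags", {}))
        if missing:
            violations.append(f"{res['name']}: missing tags {sorted(missing)}")
    total = sum(res.get("monthly_cost_usd", 0.0) for res in resources)
    if total > MONTHLY_BUDGET_USD:
        violations.append(
            f"estimated cost {total:.2f} USD exceeds budget {MONTHLY_BUDGET_USD:.2f} USD"
        )
    return violations


plan = [
    {"name": "vm-1", "tags": {"cost_center": "cc-42", "owner": "team-a"}, "monthly_cost_usd": 120.0},
    {"name": "vm-2", "tags": {"owner": "team-b"}, "monthly_cost_usd": 450.0},
]
violations = check_plan(plan)
```

In a pipeline, a non-empty `violations` list would fail the job, for example by exiting with a non-zero status, so the untagged or over-budget change never reaches production.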

Benefits of FinOps Governance-as-Code

1. Proactive Cost Control

Instead of analyzing costs at the end of the month, teams know upfront what their changes will cost. This prevents surprises and helps organizations stay within budget.

2. Stronger Compliance

Tagging, access policies, and cost allocation rules are automatically enforced. This ensures clean data for reporting and audit readiness without adding manual effort.

3. Reduced Waste

Resources without proper governance—like orphaned storage volumes or untagged VMs—are caught at the source. This minimizes waste before it builds up.

4. Faster Delivery

Some worry that governance will slow down developers. The opposite is true. By automating policies, developers spend less time on manual reviews or fixes. They can move faster, with confidence that costs and compliance are under control.

5. Better Collaboration

Governance-as-Code creates a common framework for developers, finance teams, and operations. Instead of working in silos, everyone aligns on shared rules, improving trust and accountability.

From Challenge to Control: A Governance-as-Code Success Story

A global transit software leader faced exactly this challenge. With a growing cloud footprint, they struggled to enforce tagging, track anomalies, and ensure compliance with service-level agreements (SLAs). Manual checks weren’t keeping pace, and margins were under pressure.

Working with R Systems, the company implemented a Governance-as-Code approach to FinOps. Automation enforced 100% tagging compliance, while anomaly detection was built into deployment pipelines. This meant every new resource came with full visibility, cost attribution, and compliance by design.

The impact was clear:
- SLAs were protected through consistent governance.
- Margins were preserved by eliminating waste and avoiding cost surprises.
- Delivery speed improved, since developers no longer had to worry about manual compliance checks.
This shows how proactive FinOps discipline not only reduces risk but also gives teams the freedom to focus on innovation.

Read the full story here: Real-Time Cloud Governance That Safeguards Margins and SLAs – R Systems

The R Systems Approach to FinOps Governance-as-Code

At R Systems, we see Governance-as-Code as a natural next step for FinOps. Our Cloud FinOps practice is built on three pillars:
- Visibility and Control: Real-time insights into cloud spend, with proactive enforcement of budgets, tags, and policies.
- Automation and Scale: Tools like Infracost, OPA, and custom governance frameworks integrated directly into CI/CD pipelines.
- Collaboration and Enablement: Cross-functional alignment between engineering, finance, and business leaders, so FinOps is not a bottleneck but a business enabler.
Whether you’re looking to enforce cost allocation, automate anomaly detection, or scale compliance across multi-cloud environments, R Systems brings the expertise to embed governance smoothly into your workflows.

Explore our Cloud FinOps capabilities to learn more.

Looking Ahead: The Future of Proactive FinOps

Cloud environments are only getting more complex. Multi-cloud strategies, serverless architectures, and AI-driven workloads make governance even more important. Organizations that rely on manual, reactive FinOps will struggle to keep up.

Governance-as-Code offers a way forward. It allows companies to build cost control and compliance into their infrastructure from the start, ensuring agility doesn’t come at the expense of margins.

As many experts note, FinOps is not just about controlling spend—it’s about making the right spend. With Governance-as-Code, those right decisions happen at the speed of deployment.

The Way Forward

If your FinOps strategy is still reactive, it’s time to rethink. Embedding governance into IaC pipelines ensures costs are controlled, compliance is enforced, and developers can innovate without friction.

R Systems can help you make this shift. With deep expertise in Cloud FinOps and proven success in Governance-as-Code, we enable enterprises to protect margins, accelerate delivery, and reinvest savings into growth.

The question is not whether you can use FinOps Governance-as-Code. The question is whether you can afford to run cloud without it.

Let’s move forward together. Start the journey — talk to our Cloud FinOps experts today.
November 6, 2025
Payments Engineering & Intelligence
Our Payments Engineering & Intelligence flyer showcases how we help enterprises modernize and scale payment ecosystems with:
- End-to-end gateway lifecycle management, ensuring uptime, compliance, and API continuity across global acquirers
- AI-driven fraud detection and risk scoring for secure, low-friction transaction experiences
- Automated reconciliation, settlement, and treasury workflows for error-free, real-time financial control
- Modular SDKs, tokenization, and smart routing to accelerate integration and reduce cost of ownership
- Payments Intelligence dashboards and ML pipelines turning transaction data into actionable insights
With this flyer, you will:
- See how enterprises achieve up to 75% faster integration cycles and 40% reduction in operational overhead
- Discover how R Systems helps transform payments from a cost center to a strategic revenue enabler
- Learn how to build resilient, compliant, and intelligent payments architectures ready for scale
November 4, 2025