Tag: DevOps

  • Simplifying MySQL Sharding with ProxySQL: A Step-by-Step Guide

    Introduction:

    ProxySQL is a powerful SQL-aware proxy designed to sit between database servers and client applications, optimizing database traffic with features like load balancing, query routing, and failover. This article focuses on simplifying the setup of ProxySQL, especially for users implementing data-based sharding in a MySQL database.

    What is Sharding?

    Sharding involves partitioning a database into smaller, more manageable pieces called shards based on certain criteria, such as data attributes. ProxySQL supports data-based sharding, allowing users to distribute data across different shards based on specific conditions.
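    To make the idea concrete, here is a toy illustration in plain Python (not ProxySQL code): data-based sharding amounts to picking a shard from an attribute of the row. The shard names and mapping below are invented for this example, mirroring the continent-based layout used later in this guide.

    ```python
    # Toy illustration of data-based (attribute-based) sharding: each row is
    # routed to a shard based on the value of its "continent" attribute.
    SHARD_BY_CONTINENT = {
        "Asia": "shard_1",
        "North America": "shard_1",
        "Africa": "shard_2",
        "South America": "shard_2",
        "Europe": "shard_3",
    }

    def shard_for(row, default="shard_1"):
        """Return the shard that should hold this row (default if no rule matches)."""
        return SHARD_BY_CONTINENT.get(row["continent"], default)

    print(shard_for({"name": "India", "continent": "Asia"}))    # shard_1
    print(shard_for({"name": "Kenya", "continent": "Africa"}))  # shard_2
    ```

    ProxySQL performs essentially this lookup, except the "attribute" is extracted from the query text by regex-based query rules, as shown later.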

    Understanding the Need for ProxySQL:

    ProxySQL is an intermediary layer that enhances database management, monitoring, and optimization. With features like data-based sharding, ProxySQL is an ideal solution for scenarios where databases need to be distributed based on specific data attributes, such as geographic regions.

    Installation & Setup:

    ProxySQL can be installed in two ways: via packages or by running it in a Docker container. For this guide, we will focus on the Docker installation.

    1. Install ProxySQL and MySQL Docker Images:

    To start, pull the necessary Docker images for ProxySQL and MySQL using the following commands:

    docker pull mysql:latest
    docker pull proxysql/proxysql

    2. Create Docker Network:

    Create a Docker network for communication between MySQL containers:

    docker network create multi-tenant-network

    Note: The ProxySQL setup needs to connect to multiple MySQL servers, so we will run several MySQL containers inside a single Docker network.

    Containers within the same Docker network can communicate with each other using their container names or IP addresses.

    You can check the list of all the Docker networks currently present by running the following command:

    docker network ls

    3. Set Up MySQL Containers:

    Now, create three MySQL containers within the network:

    Note: We can create any number of MySQL containers.

    docker run -d --name mysql_host_1 --network=multi-tenant-network -p 3307:3306 -e MYSQL_ROOT_PASSWORD=pass123 mysql:latest 
    docker run -d --name mysql_host_2 --network=multi-tenant-network -p 3308:3306 -e MYSQL_ROOT_PASSWORD=pass123 mysql:latest 
    docker run -d --name mysql_host_3 --network=multi-tenant-network -p 3309:3306 -e MYSQL_ROOT_PASSWORD=pass123 mysql:latest

    Note: Adjust port numbers as necessary. 

    The default MySQL port is 3306, but since all three containers cannot be exposed to the host on the same port, we map them to 3307, 3308, and 3309. Internally, each container still listens on port 3306.

    --network=multi-tenant-network specifies that the container should join the network we created earlier.

    We have also set the root password for each container: the username is “root” and the password is “pass123” for all three.

    After running the above three commands, three MySQL containers will start running inside the network. You can connect to these three hosts using host = localhost or 127.0.0.1 and port = 3307 / 3308 / 3309.

    To ping the port, use the following command:

    for macOS:

    nc -zv 127.0.0.1 3307

    for Windows (PowerShell; note that ping cannot test a TCP port):

    Test-NetConnection 127.0.0.1 -Port 3307

    for Linux: 

    telnet 127.0.0.1 3307


    4. Create Users in MySQL Containers:

    Create “user_shard” and “monitor” users in each MySQL container.

    The “user_shard” user will be used by the proxy to make queries to the DB.

    The “monitor” user will be used by the proxy to monitor the DB.

    Note: To access the MySQL container mysql_host_1, use the command:

    docker exec -it mysql_host_1 mysql -uroot -ppass123

    Use the following commands inside the MySQL container to create the user:

    CREATE USER 'user_shard'@'%' IDENTIFIED BY 'pass123'; 
    GRANT ALL PRIVILEGES ON *.* TO 'user_shard'@'%' WITH GRANT OPTION; 
    FLUSH PRIVILEGES;
    
    CREATE USER monitor@'%' IDENTIFIED BY 'pass123'; 
    GRANT ALL PRIVILEGES ON *.* TO monitor@'%' WITH GRANT OPTION; 
    FLUSH PRIVILEGES;

    Repeat the above steps for mysql_host_2 & mysql_host_3.
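    Rather than typing the same statements into each container by hand, the per-host work can be scripted. The sketch below is a dry run: it only prints one docker exec command per (container, user) pair; pipe its output to sh to actually apply them. Container names and passwords match the earlier steps, and IF NOT EXISTS is added here so re-runs are harmless.

    ```shell
    # Dry run: print the docker exec command for each (container, user) pair.
    # Pipe the output to `sh` to actually create the users.
    cmds=""
    for host in mysql_host_1 mysql_host_2 mysql_host_3; do
      for user in user_shard monitor; do
        cmd="docker exec -i $host mysql -uroot -ppass123 -e \"CREATE USER IF NOT EXISTS '$user'@'%' IDENTIFIED BY 'pass123'; GRANT ALL PRIVILEGES ON *.* TO '$user'@'%' WITH GRANT OPTION; FLUSH PRIVILEGES;\""
        cmds="$cmds$cmd
    "
        echo "$cmd"
      done
    done
    ```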

    If, at any point, you need to drop the user, you can use the following command:

    DROP USER monitor@'%';

    5. Prepare ProxySQL Configuration:

    To prepare the configuration, we will need the IP addresses of the MySQL containers. To find those, we can use the following command:

    docker inspect mysql_host_1
    docker inspect mysql_host_2
    docker inspect mysql_host_3

    Each command prints the details of that MySQL Docker container; look for the “IPAddress” field under your network’s entry. That is the IP address of that particular MySQL container.

    Example:
    mysql_host_1: 172.19.0.2

    mysql_host_2: 172.19.0.3

    mysql_host_3: 172.19.0.4


    Now, create a ProxySQL configuration file named proxysql.cnf. Include details such as IP addresses of MySQL containers, administrative credentials, and MySQL users.

    Below is the content that needs to be added to the proxysql.cnf file:

    datadir="/var/lib/proxysql"
    
    admin_variables=
    {
        admin_credentials="admin:admin;radmin:radmin"
        mysql_ifaces="0.0.0.0:6032"
        refresh_interval=2000
        hash_passwords=false
    }
    
    mysql_variables=
    {
        threads=4
        max_connections=2048
        default_query_delay=0
        default_query_timeout=36000000
        have_compress=true
        poll_timeout=2000
        interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
        default_schema="information_schema"
        stacksize=1048576
        server_version="5.1.30"
        connect_timeout_server=10000
        monitor_history=60000
        monitor_connect_interval=200000
        monitor_ping_interval=200000
        ping_interval_server_msec=10000
        ping_timeout_server=200
        commands_stats=true
        sessions_sort=true
        monitor_username="monitor"
        monitor_password="pass123"
    }
    
    mysql_servers =
    (
        { address="172.19.0.2" , port=3306 , hostgroup=10, max_connections=100 },
        { address="172.19.0.3" , port=3306 , hostgroup=20, max_connections=100 },
        { address="172.19.0.4" , port=3306 , hostgroup=30, max_connections=100 }
    )
    
    
    mysql_users =
    (
        { username = "user_shard" , password = "pass123" , default_hostgroup = 10 , active = 1 },
        { username = "user_shard" , password = "pass123" , default_hostgroup = 20 , active = 1 },
        { username = "user_shard" , password = "pass123" , default_hostgroup = 30 , active = 1 }
    )

    Most of the settings are default; we won’t go into much detail for each setting. 

    admin_variables: These variables are used for ProxySQL’s administrative interface. It allows you to connect to ProxySQL and perform administrative tasks such as configuring runtime settings, managing servers, and monitoring performance.

    Inside mysql_variables, monitor_username and monitor_password specify the user that ProxySQL will use when connecting to the MySQL servers for monitoring. This monitoring user executes queries and gathers statistics about the health and performance of the MySQL servers. This is the user we created in step 4.

    mysql_servers contains all the MySQL servers we want ProxySQL to connect to. Each entry has the IP address of the MySQL container, its port, a hostgroup, and max_connections. mysql_users lists the users we created in step 4.

    6. Run ProxySQL Container:

    Inside the same directory where the proxysql.cnf file is located, run the following command to start ProxySQL:

    docker run -d --rm -p 6032:6032 -p 6033:6033 -p 6080:6080 --name=proxysql --network=multi-tenant-network -v $PWD/proxysql.cnf:/etc/proxysql.cnf proxysql/proxysql

    Here, port 6032 is used for ProxySQL’s administrative interface. It allows you to connect to ProxySQL and perform administrative tasks such as configuring runtime settings, managing servers, and monitoring performance.

    Port 6033 is the default port for ProxySQL’s MySQL protocol interface. It is used for handling MySQL client connections. Our application will use it to access the ProxySQL db and make SQL queries.

    The above command will make ProxySQL run on our Docker with the configuration provided in the proxysql.cnf file.

    Inside ProxySQL Container:

    7. Access ProxySQL Admin Console:

    Now, to access the ProxySQL Docker container, use the following command:

    docker exec -it proxysql bash

    Now, once you’re inside the ProxySQL Docker container, you can access the ProxySQL admin console using the command:

    mysql -u admin -padmin -h 127.0.0.1 -P 6032

    You can run the following queries to get insights into your ProxySQL server:

    i) To get the list of all the connected MySQL servers:

    SELECT * FROM mysql_servers;

    ii) Verify the status of the MySQL backends in the monitor database tables in ProxySQL admin using the following command:

    SHOW TABLES FROM monitor;


    If this returns an empty set, it means that the monitor username and password are not set correctly. You can set them using the following commands:

    UPDATE global_variables SET variable_value='monitor' WHERE variable_name='mysql-monitor_username'; 
    UPDATE global_variables SET variable_value='pass123' WHERE variable_name='mysql-monitor_password';
    LOAD MYSQL VARIABLES TO RUNTIME; 
    SAVE MYSQL VARIABLES TO DISK;

    Note that LOAD MYSQL VARIABLES TO RUNTIME applies the change immediately; restart the proxy Docker container only if the change still does not take effect.

    iii) Check the status of DBs connected to ProxySQL using the following command:

    SELECT * FROM monitor.mysql_server_connect_log ORDER BY time_start_us DESC;

    iv) To get a list of all the ProxySQL global variables, use the following command:

    SELECT * FROM global_variables; 

    v) To get all the queries made on ProxySQL, use the following command:

    SELECT * FROM stats_mysql_query_digest;

    Note: Whenever you change any of these configuration tables, use the commands below to apply the changes:

    Change in variables:

    LOAD MYSQL VARIABLES TO RUNTIME; 
    SAVE MYSQL VARIABLES TO DISK;
    
    Change in mysql_servers:
    LOAD MYSQL SERVERS TO RUNTIME;
    SAVE MYSQL SERVERS TO DISK;
    
    Change in mysql_query_rules:
    LOAD MYSQL QUERY RULES TO RUNTIME;
    SAVE MYSQL QUERY RULES TO DISK;

    A restart of the proxy Docker container is only needed if a change does not take effect after loading it to runtime.

    IMPORTANT:

    To connect to ProxySQL’s admin console, first get into the Docker container using the following command:

    docker exec -it proxysql bash

    Then, to access the ProxySQL admin console, use the following command:

    mysql -u admin -padmin -h 127.0.0.1 -P6032

    To access the ProxySQL MySQL console, we can directly access it using the following command without going inside the Docker ProxySQL container:

    mysql -u user_shard -ppass123 -h 127.0.0.1 -P6033

    To make queries to the database, we make use of ProxySQL’s 6033 port, where MySQL is being accessed.

    8. Define Query Rules:

    We can add custom query rules inside the mysql_query_rules table to redirect queries to specific databases based on defined patterns. Load the rules to runtime and save to disk.

    9. Sharding Example:

    Now, let’s illustrate how to leverage ProxySQL’s data-based sharding capabilities through a practical example. We’ll create three MySQL containers, each containing data from different continents in the “world” database, specifically within the “countries” table.

    Step 1: Create 3 MySQL containers named mysql_host_1, mysql_host_2 & mysql_host_3.

    Inside all containers, create a database named “world” with a table named “countries”.
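    The table definition itself is not shown in the original steps; a minimal schema consistent with the INSERT statements below would be:

    ```sql
    CREATE DATABASE world;
    USE world;

    -- Minimal schema inferred from the INSERT statements that follow.
    CREATE TABLE countries (
        id INT PRIMARY KEY,
        name VARCHAR(64),
        continent VARCHAR(32)
    );
    ```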

    i) Inside mysql_host_1: Insert countries using the following query:

    INSERT INTO `countries` VALUES (1,'India','Asia'),(2,'Japan','Asia'),(3,'China','Asia'),(4,'USA','North America'),(5,'Cuba','North America'),(6,'Honduras','North America');

    ii) Inside mysql_host_2: Insert countries using the following query:

    INSERT INTO `countries` VALUES (1,'Kenya','Africa'),(2,'Ghana','Africa'),(3,'Morocco','Africa'),(4,'Brazil','South America'),(5,'Chile','South America'),(6,'Morocco','South America');

    iii) Inside mysql_host_3: Insert countries using the following query:

    INSERT INTO `countries` VALUES (1,'Italy','Europe'),(2,'Germany','Europe'),(3,'France','Europe');

    Now, we have distinct data sets for Asia & North America in mysql_host_1, Africa & South America in mysql_host_2, and Europe in mysql_host_3.

    Step 2: Define Query Rules for Sharding

    Let’s create custom query rules to redirect queries based on the continent specified in the SQL statement.

    For example, if the query contains the continent “Asia,” we want it to be directed to mysql_host_1.

    -- Query Rule for Asia and North America 

    INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, destination_hostgroup, apply) VALUES (10, 1, 'user_shard', '\s*continent\s*=\s*.*?(Asia|North America).*?\s*', 10, 0);

    -- Query Rule for Africa and South America

    INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, destination_hostgroup, apply) VALUES (20, 1, 'user_shard', '\s*continent\s*=\s*.*?(Africa|South America).*?\s*', 20, 0);

    -- Query Rule for Europe 

    INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, destination_hostgroup, apply) VALUES (30, 1, 'user_shard', '\s*continent\s*=\s*.*?(Europe).*?\s*', 30, 0);

    Step 3: Apply and Save Query Rules

    After adding the query rules, ensure they take effect by running the following commands:

    LOAD MYSQL QUERY RULES TO RUNTIME; 
    SAVE MYSQL QUERY RULES TO DISK;

    Step 4: Test Sharding

    Now, access the MySQL server using the ProxySQL port and execute queries:

    mysql -u user_shard -ppass123 -h 127.0.0.1 -P 6033

    use world;

    -- Example Queries:

    SELECT * FROM countries WHERE id = 1 AND continent = "Asia";

    -- This will return id=1, name=India, continent=Asia

    SELECT * FROM countries WHERE id = 1 AND continent = "Africa";

    -- This will return id=1, name=Kenya, continent=Africa.

    Based on the defined query rules, the queries will be redirected to the specified MySQL host groups. If no rules match, the default host group that’s specified in mysql_users inside proxysql.cnf will be used.

    Conclusion:

    ProxySQL simplifies access to distributed data through effective sharding strategies. Its flexible query rules, combined with regex patterns and host group definitions, offer significant flexibility with relative simplicity.

    By following this step-by-step guide, users can quickly set up ProxySQL and leverage its capabilities to optimize database performance and achieve efficient data distribution.

    References:

    Download and Install ProxySQL – ProxySQL

    How to configure ProxySQL for the first time – ProxySQL

    Admin Variables – ProxySQL

  • A Guide to End-to-End API Test Automation with Postman and GitHub Actions

    Objective

    • The blog intends to provide a step-by-step guide on how to automate API testing using Postman. It also demonstrates how we can create a pipeline for periodically running the test suite.
    • Further, it explains how the report can be stored in a central S3 bucket and how the execution status can be sent back to a designated Slack channel, informing stakeholders and enabling them to obtain detailed information about the quality of the API.

    Introduction to Postman

    • To speed up the API testing process and improve the accuracy of our APIs, we are going to automate the API functional tests using Postman.
    • Postman is a great tool when trying to dissect RESTful APIs.
    • It offers a sleek user interface to create our functional tests to validate our API’s functionality.
    • Furthermore, the collection of tests will be integrated with GitHub Actions to set up a CI/CD platform that will be used to automate this API testing workflow.

    Getting started with Postman

    Setting up the environment

    • Click on the “New” button on the top left corner. 
    • Select “Environment” as the building block.
    • Give the desired name to the environment file.

    Create a collection

    • Click on the “New” button on the top left corner.
    • Select “Collection” as the building block.
    • Give the desired name to the Collection.

    Adding requests to the collection

    • Configure the Requests under test in folders as per requirement.
    • Enter the API endpoint in the URL field.
    • Set the Auth credentials necessary to run the endpoint. 
    • Set the header values, if required.
    • Enter the request body, if applicable.
    • Send the request by clicking on the “Send” button.
    • Verify the response status and response body.

    Creating Tests

    • Click on the “Tests” tab.
    • Write the test scripts in JavaScript using the Postman test API.
    • Run the tests by clicking on the “Send” button and validate the execution of the tests written.
    • Alternatively, the prebuilt snippets given by Postman can also be used to create the tests.
    • In case some test data needs to be created, the “Pre-request Script“ tab can be used.
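    A typical “Tests” tab script looks like the sketch below. In Postman itself, the pm object is provided by the sandbox; the small shim at the top is only there so the snippet can run under plain Node outside Postman (omit it when pasting into the Tests tab), and the response values are invented for illustration.

    ```javascript
    const assert = require("assert");

    // --- shim: stand-in for Postman's sandbox so this runs under plain Node ---
    // (omit everything down to the end-of-shim marker inside Postman)
    const passed = [];
    const pm = {
      response: {
        code: 200,
        json: () => ({ status: "ok" }),
        to: { have: { status: (c) => assert.strictEqual(c, 200) } },
      },
      test: (name, fn) => { fn(); passed.push(name); },
      expect: (v) => ({ to: { eql: (e) => assert.deepStrictEqual(v, e) } }),
    };
    // --- end shim: the Postman test script itself starts here ---

    pm.test("Status code is 200", function () {
      pm.response.to.have.status(200);
    });

    pm.test("Body reports ok", function () {
      const body = pm.response.json();
      pm.expect(body.status).to.eql("ok");
    });
    ```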

    Running the Collection

    • Click on the ellipses beside the collection created. 
    • Select the environment created in Step 1.
    • Click on the “Run Collection” button.
    • Alternatively, the collection and the environment file can be exported and run via the Newman CLI.

    Collaboration

    The original collection and the environment file can be exported and shared with others by clicking on the “Export” button. These collections and environments can be version controlled using a system such as Git.

    • While working in a team, members raise PRs for their changes against the original collection and environment via forking. First, create a fork.
    • Make the necessary changes to the collection and click on “Create a Pull request”.
    • Validate the changes, then approve and merge them into the main collection.

    Integrating with CI/CD

    Creating a pipeline with GitHub Actions

    GitHub Actions is a continuous integration and continuous delivery (CI/CD) platform that allows you to automate your build, test, and deployment pipeline.

    You can create workflows that build and test every pull request to your repository, or deploy merged pull requests to production. To create a pipeline, follow the steps below:

    • Create a .yml file inside the .github/workflows folder at the root level.
    • The same can also be created via GitHub.
    • Configure the necessary actions/steps for the pipeline.

    Workflow File

    • Add a trigger to run the workflow.
    • The schedule event is a GitHub Actions trigger that runs the workflow at a specific time interval using a CRON expression.
    • The push and pull_request events denote the Actions event that triggers the workflow for each push and pull request on the develop branch.
    • The workflow_dispatch tag denotes the ability to run the workflow manually, too, from GitHub Actions.
    • Create a job to run the Postman Collection.
    • Check out the code from the current repository. Also, create a directory to store the results.
    • Install Node.js.
    • Install Newman and the necessary dependencies.
    • Run the collection.
    • Upload the Newman report to the results directory.
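    Assembled from the bullets above, a sketch of the full workflow file (say, .github/workflows/api-tests.yml) could look like this. The branch name, cron schedule, and collection/environment file names are placeholders to adapt:

    ```yaml
    name: API Tests

    on:
      schedule:
        - cron: "0 6 * * *"      # run daily at 06:00 UTC
      push:
        branches: [develop]
      pull_request:
        branches: [develop]
      workflow_dispatch:          # allow manual runs from the Actions tab

    jobs:
      api-tests:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4

          - name: Create results directory
            run: mkdir -p results

          - uses: actions/setup-node@v4
            with:
              node-version: 18

          - name: Install Newman and reporters
            run: npm install -g newman newman-reporter-htmlextra

          - name: Run the Postman collection
            run: >
              newman run collection.json -e environment.json
              -r cli,htmlextra --reporter-htmlextra-export results/report.html

          - name: Upload Newman report
            uses: actions/upload-artifact@v4
            with:
              name: newman-report
              path: results
    ```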

    Generating an Allure report and hosting it on S3

    • Along with the default report that Newman provides, Allure reporting can also be used in order to get a dashboard of the result. 
    • To generate the Allure report, install the Allure dependencies given in the installation step above.
    • Once that is done, add the corresponding steps to your .yml file.
    • Create an S3 bucket that you will use for storing the reports.
    • Create an IAM role for the bucket.
    • Use the aws-actions/configure-aws-credentials@v1 action to configure your AWS credentials.
    • Allure generates two separate folders, eventually combining them to create a dashboard.

    Use the code snippet in the deploy section to upload the contents of the folder onto your S3 bucket.
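    That deploy section might look roughly like the fragment below. The bucket name, region, secret names, and folder paths are placeholders to adapt:

    ```yaml
          - name: Configure AWS credentials
            uses: aws-actions/configure-aws-credentials@v1
            with:
              aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
              aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
              aws-region: us-east-1

          - name: Generate the Allure dashboard
            run: allure generate allure-results --clean -o allure-report

          - name: Upload the report to S3
            run: aws s3 sync allure-report s3://my-report-bucket --delete
    ```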

    • Once done, you should be able to see the Allure dashboard hosted on the Static Website URL for your bucket.

    Send Slack notification with the Status of the job

    • When a job is executed in a CI/CD pipeline, it’s important to keep the team members informed about the status of the job. 
    • A GitHub Actions step can send a notification to a Slack channel with the status of the job.
    • It uses the “notify-slack-action” GitHub Action, which is defined in the “ravsamhq/notify-slack-action” repository.
    • The “if: always()” condition indicates that this step should always be executed, regardless of whether the previous steps in the workflow succeeded or failed.
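    Together, the points above correspond to a step like the following. The webhook secret name and message templates are illustrative; see the ravsamhq/notify-slack-action documentation for the full option list:

    ```yaml
          - name: Notify Slack
            uses: ravsamhq/notify-slack-action@v2
            if: always()
            with:
              status: ${{ job.status }}
              notification_title: "API test run {status_message}"
              message_format: "{emoji} *{workflow}* {status_message} in <{repo_url}|{repo}>"
            env:
              SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
    ```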
  • Surviving & Thriving in the Age of Software Accelerations

    It has almost been a decade since Marc Andreessen made his prescient statement that software is eating the world. Software is not only eating the world but doing so at an accelerating pace. There is no industry that hasn’t been challenged by technology startups with disruptive approaches.

    • Automakers are no longer just manufacturing companies: Tesla is disrupting the industry with their software approach to vehicle development and continuous over-the-air software delivery. Waymo’s autonomous cars have driven millions of miles and self-driving cars are a near-term reality. Uber is transforming the transportation industry into a service, potentially affecting the economics and incentives of almost 3–4% of the world GDP!
    • Social networks and media platforms had a significant and decisive impact on the US election results.
    • Banks and large financial institutions are being attacked by FinTech startups like WealthFront, Venmo, Affirm, Stripe, SoFi, etc. Bitcoin, Ethereum and the broader blockchain revolution can upend the core structure of banks and even sovereign currencies.
    • Traditional retail businesses are under tremendous pressure due to Amazon and other e-commerce vendors. Retail is now a customer ownership, recommendations, and optimization business rather than a brick and mortar one.

    Enterprises need to adopt a new approach to software development and digital innovation. At Velotio, we are helping customers to modernize and transform their business with all of the approaches and best practices listed below.

    Agility

    In this fast-changing world, your business needs to be agile and fast-moving. You need to ship software faster, at a regular cadence, with high quality and be able to scale it globally.

    Agile practices allow companies to rally diverse teams behind a defined process that helps to achieve inclusivity and drives productivity. Agile is about getting cross-functional teams to work in concert in planned short iterations with continuous learning and improvement.

    Generally, teams that work in an Agile methodology will:

    • Conduct regular stand-ups and Scrum/Kanban planning meetings with the optimal use of tools like Jira, PivotalTracker, Rally, etc.
    • Use pair programming and code review practices to ensure better code quality.
    • Use continuous integration and delivery tools like Jenkins or CircleCI.
    • Design processes for all aspects of product management, development, QA, DevOps and SRE.
    • Use Slack, Hipchat or Teams for communication between team members and geographically diverse teams. Integrate all tools with Slack to ensure that it becomes the central hub for notifications and engagement.

    Cloud-Native

    Businesses need software that is purpose-built for the cloud model. What does that mean? Software teams now number in the hundreds or even thousands of engineers. The number of applications and software stacks is growing rapidly in most companies. All companies use various cloud providers, SaaS vendors, and best-of-breed hosted or on-premise software. Essentially, software complexity has increased exponentially, which requires a “cloud-native” approach to manage effectively. The Cloud Native Computing Foundation defines cloud native as a software stack that is:

    1. Containerized: Each part (applications, processes, etc) is packaged in its own container. This facilitates reproducibility, transparency, and resource isolation.
    2. Dynamically orchestrated: Containers are actively scheduled and managed to optimize resource utilization.
    3. Microservices oriented: Applications are segmented into micro services. This significantly increases the overall agility and maintainability of applications.

    You can deep-dive into cloud native with this blog by our CTO, Chirag Jog.

    Cloud native is disrupting the traditional enterprise software vendors. Software is getting decomposed into specialized best of breed components — much like the micro-services architecture. See the Cloud Native landscape below from CNCF.

    DevOps

    Process and toolsets need to change to enable faster development and deployment of software. Enterprises cannot compete without mature DevOps strategies. DevOps is essentially a set of practices, processes, culture, tooling, and automation that focuses on delivering software continuously with high quality.

    DevOps tool chains & process

    As you begin or expand your DevOps journey, a few things to keep in mind:

    • Customize to your needs: There is no single DevOps process or toolchain that suits all needs. Take into account your organization structure, team capabilities, current software process, opportunities for automation and goals while making decisions. For example, your infrastructure team may have automated deployments but the main source of your quality issues could be the lack of code reviews in your development team. So identify the critical pain points and sources of delay to address those first.
    • Automation: Automate everything that can be automated. The less you depend on human intervention, the higher the chances of success.
    • Culture: Align the incentives and goals with your development, ITOps, SecOps, SRE teams. Ensure that they collaborate effectively and ownership in the DevOps pipeline is well established.
    • Small wins: Pick one application or team and implement your DevOps strategy within it. That way you can focus your energies and refine your experiments before applying them broadly. Show success as measured by quantifiable parameters and use that to transform the rest of your teams.
    • Organizational dynamics & integrations: Adoption of new processes and tools will cause some disruptions and you may need to re-skill part of your team or hire externally. Ensure that compliance, SecOps & audit teams are aware of your DevOps journey and get their buy-in.
    • DevOps is a continuous journey: DevOps will never be done. Train your team to learn continuously and refine your DevOps practice to keep achieving your goal: delivering software reliably and quickly.

    Micro-services

    As the amount of software in an enterprise explodes, so does the complexity. The only way to manage this complexity is by splitting your software and teams into smaller, manageable units. Micro-services adoption is primarily about managing this complexity.

    Development teams across the board are choosing micro-services to develop new applications and break down legacy monoliths. Every micro-service can be deployed, upgraded, scaled, monitored, and restarted independently of other services. Micro-services should ideally be managed by an automated system so that teams can easily update live applications without affecting end-users.

    There are companies with hundreds of micro-services in production, which is only possible with mature DevOps, cloud-native, and agile practices.

    Interestingly, serverless platforms like Google Cloud Functions and AWS Lambda are taking the concept of micro-services to the extreme by allowing each function to act as an independent piece of the application. You can read about my thoughts on serverless computing in this blog: Serverless Computing Predictions for 2017.

    Digital Transformation

    Digital transformation involves making strategic changes to business processes, competencies, and models to leverage digital technologies. It is a very broad term, and every consulting vendor twists it in various ways. Let me give a couple of examples to drive home the point that digital transformation is about using technology to improve your business model, gain efficiencies, or build a moat around your business:

    • GE has done an excellent job transforming themselves from a manufacturing company into an IoT/software company with Predix. GE builds airplane engines, medical equipment, oil & gas equipment and much more. Predix is an IoT platform that is being embedded into all of GE’s products. This enabled them to charge airlines on a per-mile basis by taking the ownership of maintenance and quality instead of charging on a one-time basis. This also gives them huge amounts of data that they can leverage to improve the business as a whole. So digital innovation has enabled a business model improvement leading to higher profits.
    • Car companies are exploring models where they can provide autonomous car fleets to cities where they will charge on a per-mile basis. This will convert them into a “service” & “data” company from a pure manufacturing one.
    • Insurance companies need to build digital capabilities to acquire and retain customers. They need to build data capabilities and provide ongoing value with services rather than interact with the customer just once a year.

    You would be better placed to compete in the market if you have automation and digital process in place so that you can build new products and pivot in an agile manner.

    Big Data / Data Science

    Businesses need to deal with increasing amounts of data due to IoT, social media, mobile, and the adoption of software for various processes. And they need to use this data intelligently. Cloud platforms provide the services and solutions to accelerate your data science and machine learning strategies. AWS, Google Cloud, and open-source libraries like TensorFlow, SciPy, Keras, etc. have a broad set of machine learning and big data services that can be leveraged. Companies need to build mature data processing pipelines to aggregate data from various sources and store it for quick and efficient access by various teams. Companies are leveraging these services and libraries to build solutions like:

    • Predictive analytics
    • Cognitive computing
    • Robotic Process Automation
    • Fraud detection
    • Customer churn and segmentation analysis
    • Recommendation engines
    • Forecasting
    • Anomaly detection

    Companies are creating data science teams to build long term capabilities and moats around their business by using their data smartly.

    Re-platforming & App Modernization

    Enterprises want to modernize their legacy, often monolithic apps as they migrate to the cloud. The move can be triggered due to hardware refresh cycles or license renewals or IT cost optimization or adoption of software-focused business models.

    Benefits of modernization to customers and businesses

    Intelligent Applications

    Software is getting more intelligent, and to enable this, businesses need to integrate disparate datasets, distributed teams, and processes. This is best done on a scalable global cloud platform with agile processes. Big data and data science enable the creation of intelligent applications.

    How can smart applications help your business?

    • New intelligent systems of engagement: intelligent apps surface insights to users, enabling them to be more effective and efficient. For example, CRMs and marketing software are getting intelligent and multi-platform, enabling sales and marketing reps to become more productive.
    • Personalisation: E-commerce, social networks, and now B2B software are getting personalized. To improve user experience and reduce churn, your applications should be personalized based on user preferences and traits.
    • Drive efficiencies: IoT is an excellent example where the efficiency of machines can be improved with data and cloud software. Real-time insights can help to optimize processes or can be used for preventive maintenance.
    • Creation of new business models: Traditional and modern industries can use AI to build new business models. For example, what if insurance companies allow you to pay insurance premiums only for the miles driven?

    Security

    Security threats to governments, enterprises, and data have never been greater. As businesses adopt cloud native, DevOps, and microservices practices, their security practices need to evolve.

    In our experience, these are a few features of a mature cloud native security practice:

    • Automated: Systems are updated automatically with the latest fixes. Another approach is immutable infrastructure with the adoption of containers and serverless.
    • Proactive: Automated security processes tend to be proactive. For example, if malware or a vulnerability is found in one environment, automation can fix it in all environments. Mature DevOps & CI/CD processes ensure that fixes can be deployed in hours or days instead of weeks or months.
    • Cloud Platforms: Businesses have realized that the mega-clouds are way more secure than their own data centers can be. Many of these cloud platforms have audit, security and compliance services which should be leveraged.
    • Protecting credentials: Use AWS KMS, Hashicorp Vault or other solutions for protecting keys, passwords and authorizations.
    • Bug bounties: Either set up bug bounties internally or through sites like HackerOne. You want the good guys to work for you, and this is an easy way to do that.

    Conclusion

    As you can see, all of these approaches and best practices are intertwined and need to be implemented in concert to gain the desired results. It is best to start with one project, one group, or one application and build on early wins. Remember that it is a process and you are looking for gradual improvements to achieve your final objectives.

    Please let us know your thoughts and experiences by adding comments to this blog or reaching out to @kalpakshah or RSI. We would love to help your business adopt these best practices and help to build great software together. Drop me a note at kalpak (at) velotio (dot) com.

  • Taking Amazon’s Elastic Kubernetes Service for a Spin

    With the introduction of the Elastic Kubernetes Service at AWS re:Invent last year, AWS finally threw its hat into the ever-booming space of managed Kubernetes services. In this blog post, we will learn the basic concepts of EKS, launch an EKS cluster, and deploy a multi-tier application on it.

    What is Elastic Kubernetes service (EKS)?

    Kubernetes works on a master-slave architecture. The master is also referred to as the control plane. If the master goes down, it brings the entire cluster down with it, so ensuring high availability of the master is absolutely critical, as it can be a single point of failure. Ensuring high availability of the master and managing all the worker nodes along with it is a cumbersome task in itself, so it is desirable for organizations to have a managed Kubernetes cluster and focus on the most important task, which is running their applications rather than managing the cluster. Other cloud providers like Google Cloud and Azure already had their managed Kubernetes services, named GKE and AKS respectively. Now, with EKS, Amazon has also rolled out its managed Kubernetes offering to provide a seamless way to run Kubernetes workloads.

    Key EKS concepts:

    EKS takes full advantage of the fact that it is running on AWS: instead of creating Kubernetes-specific features from scratch, AWS has reused or plugged in existing AWS services to achieve Kubernetes-specific functionality. Here is a brief overview:

    IAM integration: Amazon EKS integrates IAM authentication with Kubernetes RBAC (the role-based access control system native to Kubernetes) with the help of the Heptio Authenticator, a tool that uses AWS IAM credentials to authenticate to a Kubernetes cluster. Here we can directly attach an RBAC role to an IAM entity, which saves the pain of managing another set of credentials at the cluster level.

    Container Interface: AWS has developed an open-source CNI plugin which takes advantage of the fact that multiple network interfaces can be attached to a single EC2 instance, and that these interfaces can have multiple secondary private IPs associated with them. These secondary IPs are used to provide pods running on EKS with real IP addresses from the VPC CIDR pool. This improves latency for inter-pod communication, as traffic flows without any overlay.

    ELB Support: We can use any of the AWS ELB offerings (classic, network, application) to route traffic to services running on the worker nodes.

    Auto scaling:  The number of worker nodes in the cluster can grow and shrink using the EC2 auto scaling service.

    Route 53: With the help of the ExternalDNS project and AWS Route 53, we can manage the DNS entries for the load balancers that get created when we create an ingress object or a service of type LoadBalancer in our EKS cluster. This way, the DNS names are always in sync with the load balancers and we don’t have to give them separate attention.
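    For illustration, this is roughly what wiring ExternalDNS to a service looks like. The annotation is ExternalDNS's standard one; the hostname is a hypothetical placeholder:

    ```yaml
    # Illustrative only: ExternalDNS watches Services of type LoadBalancer and
    # creates matching Route 53 records for the annotated hostname (hypothetical).
    apiVersion: v1
    kind: Service
    metadata:
      name: test-service
      annotations:
        external-dns.alpha.kubernetes.io/hostname: books.example.com
    spec:
      type: LoadBalancer
      ports:
      - port: 80
        targetPort: 3000
      selector:
        app: test-app
    ```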

    Shared responsibility for the cluster: The responsibilities for an EKS cluster are shared between AWS and the customer. AWS takes care of the most critical part, managing the control plane (API server and etcd database), while customers manage the worker nodes. Amazon EKS automatically runs Kubernetes with three masters across three Availability Zones to protect against a single point of failure. Control plane nodes are monitored and replaced if they fail, and are patched and updated automatically. This ensures high availability of the cluster and makes it extremely simple to migrate existing workloads to EKS.

    Prerequisites for launching an EKS cluster:

    1.  IAM role to be assumed by the cluster: Create an IAM role that allows EKS to manage a cluster on your behalf. Choose EKS as the service which will assume this role and add AWS managed policies ‘AmazonEKSClusterPolicy’ and ‘AmazonEKSServicePolicy’ to it.

    2.  VPC for the cluster: We need to create the VPC where our cluster is going to reside, with subnets, internet gateways, and other components configured. We can use an existing VPC if we wish, or create one using the CloudFormation script provided by AWS here, or use the Terraform script available here. The scripts take the CIDR block of the VPC and three subnet CIDR blocks as arguments.

    Launching an EKS cluster:

    1.  Using the web console: With the prerequisites in place, we can now go to the EKS console and launch a cluster. When launching, we need to provide a name for the EKS cluster, choose the Kubernetes version to use, provide the IAM role we created in step one, and choose a VPC. Once we choose a VPC, we also need to select the subnets where we want our worker nodes to be launched (by default, all subnets in the VPC are selected) and provide a security group, which is applied to the elastic network interfaces (ENIs) that EKS creates to allow the control plane to communicate with the worker nodes.

    NOTE: A couple of things to note here: the subnets must be in at least two different Availability Zones, and the security group we provided is later updated when we create the worker node cluster, so it is better not to use this security group with any other entity, or to be completely sure of the changes happening to it.

    2. Using awscli :

    aws eks create-cluster --name eks-blog-cluster --role-arn arn:aws:iam::XXXXXXXXXXXX:role/eks-service-role \
      --resources-vpc-config subnetIds=subnet-0b8da2094908e1b23,subnet-01a46af43b2c5e16c,securityGroupIds=sg-03fa0c02886c183d4

    {
        "cluster": {
            "status": "CREATING",
            "name": "eks-blog-cluster",
            "certificateAuthority": {},
            "roleArn": "arn:aws:iam::XXXXXXXXXXXX:role/eks-service-role",
            "resourcesVpcConfig": {
                "subnetIds": [
                    "subnet-0b8da2094908e1b23",
                    "subnet-01a46af43b2c5e16c"
                ],
                "vpcId": "vpc-0364b5ed9f85e7ce1",
                "securityGroupIds": [
                    "sg-03fa0c02886c183d4"
                ]
            },
            "version": "1.10",
            "arn": "arn:aws:eks:us-east-1:XXXXXXXXXXXX:cluster/eks-blog-cluster",
            "createdAt": 1535269577.147
        }
    }

    In the response, we see that the cluster is in creating state. It will take a few minutes before it is available. We can check the status using the below command:

    aws eks describe-cluster --name=eks-blog-cluster
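    Waiting for the cluster to become ACTIVE can be automated with a small polling helper. This is a hedged sketch: the status command is injected as a parameter so it can be exercised without AWS access; in practice it would be `aws eks describe-cluster --name eks-blog-cluster --query cluster.status --output text`.

    ```shell
    # Poll an injected status command until it reports ACTIVE, or give up.
    wait_for_active() {
      cmd=$1; max_tries=${2:-60}; delay=${3:-30}
      tries=0
      while [ "$tries" -lt "$max_tries" ]; do
        status=$($cmd)                  # run the injected status command
        echo "cluster status: $status"
        [ "$status" = "ACTIVE" ] && return 0
        tries=$((tries + 1))
        sleep "$delay"
      done
      return 1                          # timed out before the cluster became ACTIVE
    }
    ```
    
    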

    Configure kubectl for EKS:

    We know that in Kubernetes we interact with the control plane by making requests to the API server, most commonly via the kubectl command line utility. As our cluster is now ready, we need to install kubectl.

    1.  Install the kubectl binary

    curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl

    Give executable permission to the binary.

    chmod +x ./kubectl

    Move the kubectl binary to a folder in your system’s $PATH.

    sudo cp ./kubectl /bin/kubectl

    As discussed earlier, EKS uses the AWS IAM Authenticator for Kubernetes to allow IAM authentication for your Kubernetes cluster, so we need to download and install it as well.

    2.  Install aws-iam-authenticator

    curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/linux/amd64/aws-iam-authenticator

    Give executable permission to the binary

    chmod +x ./aws-iam-authenticator

    Move the aws-iam-authenticator binary to a folder in your system’s $PATH.

    sudo cp ./aws-iam-authenticator /bin/aws-iam-authenticator

    3.  Create the kubeconfig file

    First create the directory.

    mkdir -p ~/.kube

    Open a config file in the folder created above

    vi ~/.kube/config-eks-blog-cluster

    Paste the below code in the file

    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        server: https://DBFE36D09896EECAB426959C35FFCC47.sk1.us-east-1.eks.amazonaws.com
        certificate-authority-data: "...................."
      name: kubernetes
    contexts:
    - context:
        cluster: kubernetes
        user: aws
      name: aws
    current-context: aws
    preferences: {}
    users:
    - name: aws
      user:
        exec:
          apiVersion: client.authentication.k8s.io/v1alpha1
          command: aws-iam-authenticator
          args:
          - "token"
          - "-i"
          - "eks-blog-cluster"

    Replace the values of server and certificate-authority-data with the values for your cluster and certificate, and also update the cluster name in the args section. You can get these values from the web console as well as by using the command:

    aws eks describe-cluster --name=eks-blog-cluster

    Save and exit.

    Add that file path to your KUBECONFIG environment variable so that kubectl knows where to look for your cluster configuration.

    export KUBECONFIG=$KUBECONFIG:~/.kube/config-eks-blog-cluster
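    One caveat with the export above: if KUBECONFIG was previously unset, the leading colon leaves an empty entry in the path list. A small sketch that appends the separator only when needed:

    ```shell
    # Append the new kubeconfig path, adding a ':' separator only when
    # KUBECONFIG already holds a value.
    export KUBECONFIG="${KUBECONFIG:+$KUBECONFIG:}$HOME/.kube/config-eks-blog-cluster"
    echo "$KUBECONFIG"
    ```
    
    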

    To verify that kubectl is now properly configured:

    kubectl get all
    NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    service/kubernetes   ClusterIP   172.20.0.1   <none>        443/TCP   50m

    Launch and configure worker nodes:

    Now we need to launch worker nodes before we can start deploying apps. We can create the worker node cluster using the CloudFormation script provided by AWS, which is available here, or using the Terraform script available here. The scripts take the following parameters:

    • ClusterName: Name of the Amazon EKS cluster we created earlier.
    • ClusterControlPlaneSecurityGroup: Id of the security group we used in EKS cluster.
    • NodeGroupName: Name for the worker node auto scaling group.
    • NodeAutoScalingGroupMinSize: Minimum number of worker nodes that you always want in your cluster.
    • NodeAutoScalingGroupMaxSize: Maximum number of worker nodes that you want in your cluster.
    • NodeInstanceType: Type of worker node you wish to launch.
    • NodeImageId: AWS provides an Amazon EKS-optimized AMI to be used for worker nodes. Currently, EKS is available in only two AWS regions, Oregon and N. Virginia, and the AMI IDs are ami-02415125ccd555295 and ami-048486555686d18a0 respectively.
    • KeyName: Name of the key you will use to ssh into the worker node.
    • VpcId: Id of the VPC that we created earlier.
    • Subnets: Subnets from the VPC we created earlier.
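    If you drive the CloudFormation script from the CLI, the parameters above can be collected into a parameters file and passed with aws cloudformation create-stack --parameters file://worker-params.json. A sketch using values from this walkthrough where shown earlier; the node group name, sizes, instance type, and key name are hypothetical placeholders:

    ```json
    [
      { "ParameterKey": "ClusterName",                      "ParameterValue": "eks-blog-cluster" },
      { "ParameterKey": "ClusterControlPlaneSecurityGroup", "ParameterValue": "sg-03fa0c02886c183d4" },
      { "ParameterKey": "NodeGroupName",                    "ParameterValue": "eks-blog-workers" },
      { "ParameterKey": "NodeAutoScalingGroupMinSize",      "ParameterValue": "1" },
      { "ParameterKey": "NodeAutoScalingGroupMaxSize",      "ParameterValue": "3" },
      { "ParameterKey": "NodeInstanceType",                 "ParameterValue": "t2.medium" },
      { "ParameterKey": "NodeImageId",                      "ParameterValue": "ami-048486555686d18a0" },
      { "ParameterKey": "KeyName",                          "ParameterValue": "my-ec2-keypair" },
      { "ParameterKey": "VpcId",                            "ParameterValue": "vpc-0364b5ed9f85e7ce1" },
      { "ParameterKey": "Subnets",                          "ParameterValue": "subnet-0b8da2094908e1b23,subnet-01a46af43b2c5e16c" }
    ]
    ```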

    To enable worker nodes to join your cluster, we need to download, edit and apply the AWS authenticator config map.

    Download the config map:

    curl -O https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/aws-auth-cm.yaml

    Open it in an editor

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: aws-auth
      namespace: kube-system
    data:
      mapRoles: |
        - rolearn: <ARN of instance role (not instance profile)>
          username: system:node:{{EC2PrivateDNSName}}
          groups:
            - system:bootstrappers
            - system:nodes

    Edit the value of rolearn with the ARN of the role of your worker nodes. This value is available in the output of the scripts that you ran. Save the change and then apply it:

    kubectl apply -f aws-auth-cm.yaml
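    Editing the placeholder can also be scripted. A self-contained sketch: the relevant lines of aws-auth-cm.yaml are recreated here (trimmed) so it runs standalone, and the role ARN is a hypothetical value; use the NodeInstanceRole ARN from your stack outputs.

    ```shell
    # Recreate the placeholder section of the downloaded file (trimmed) for the sketch
    cat > aws-auth-cm.yaml <<'EOF'
    data:
      mapRoles: |
        - rolearn: <ARN of instance role (not instance profile)>
          username: system:node:{{EC2PrivateDNSName}}
    EOF

    # Hypothetical worker-node role ARN; substitute your own
    NODE_ROLE_ARN="arn:aws:iam::123456789012:role/eks-worker-node-role"
    sed -i.bak "s|<ARN of instance role (not instance profile)>|$NODE_ROLE_ARN|" aws-auth-cm.yaml
    grep rolearn aws-auth-cm.yaml
    ```
    
    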

    Now you can check if the nodes have joined the cluster or not.

    kubectl get nodes
    NAME                         STATUS   ROLES    AGE   VERSION
    ip-10-0-2-171.ec2.internal   Ready    <none>   12s   v1.10.3
    ip-10-0-3-58.ec2.internal    Ready    <none>   14s   v1.10.3

    Deploying an application:

    As our cluster is now completely ready, we can start deploying applications on it. We will deploy a simple books API application which connects to a MongoDB database and allows users to store, list, and delete book information.

    1. MongoDB Deployment YAML

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: mongodb
    spec:
      template:
        metadata:
          labels:
            app: mongodb
        spec:
          containers:
          - name: mongodb
            image: mongo
            ports:
            - name: mongodbport
              containerPort: 27017
              protocol: TCP

    2. Test Application Deployment YAML

    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: test-app
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - name: test-app
            image: akash125/pyapp
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 3000

    3. MongoDB Service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: mongodb-service
    spec:
      ports:
      - port: 27017
        targetPort: 27017
        protocol: TCP
        name: mongodbport
      selector:
        app: mongodb

    4. Test Application Service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: test-service
    spec:
      type: LoadBalancer
      ports:
      - name: test-service
        port: 80
        protocol: TCP
        targetPort: 3000
      selector:
        app: test-app

    Services

    $ kubectl create -f mongodb-service.yaml
    $ kubectl create -f testapp-service.yaml

    Deployments

    $ kubectl create -f mongodb-deployment.yaml
    $ kubectl create -f testapp-deployment.yaml

    $ kubectl get services
    NAME              TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
    kubernetes        ClusterIP      172.20.0.1      <none>          443/TCP        12m
    mongodb-service   ClusterIP      172.20.55.194   <none>          27017/TCP      4m
    test-service      LoadBalancer   172.20.188.77   a7ee4f4c3b0ea   80:31427/TCP   3m

    In the EXTERNAL-IP column of test-service we see the DNS name of a load balancer. We can now access the application from outside the cluster using this DNS name.

    To Store Data:

    curl -X POST -d '{"name":"A Game of Thrones (A Song of Ice and Fire)", "author":"George R.R. Martin","price":343}' http://a7ee4f4c3b0ea11e8b0f912f36098e4d-672471149.us-east-1.elb.amazonaws.com/books
    {"id":"5b8fab49fa142b000108d6aa","name":"A Game of Thrones (A Song of Ice and Fire)","author":"George R.R. Martin","price":343}

    To Get Data:

    curl -X GET http://a7ee4f4c3b0ea11e8b0f912f36098e4d-672471149.us-east-1.elb.amazonaws.com/books
    [{"id":"5b8fab49fa142b000108d6aa","name":"A Game of Thrones (A Song of Ice and Fire)","author":"George R.R. Martin","price":343}]

    We can also put the URL used in the curl commands above directly into a browser; we will get the same response.

    Now our application is deployed on EKS and can be accessed by the users.

    Comparison between GKE, ECS and EKS:

    Cluster creation: Creating a GKE or ECS cluster is much simpler than creating an EKS cluster, with GKE being the simplest of the three.

    Cost: With both GKE and ECS, we pay only for the infrastructure that is visible to us, i.e., servers, volumes, ELBs, etc., and there is no cost for master nodes or other cluster management services. With EKS, there is a charge of $0.20 per hour for the control plane.

    Add-ons: GKE provides the option of using Calico as the network plugin which helps in defining network policies for controlling inter pod communication (by default all pods in k8s can communicate with each other).

    Serverless: An ECS cluster can be created using Fargate, which is a Container-as-a-Service (CaaS) offering from AWS. Similarly, EKS is also expected to support Fargate soon.

    In terms of availability and scalability, all three services are on par with each other.

    Conclusion:

    In this blog post, we learned the basic concepts of EKS, launched our own EKS cluster, and deployed an application. EKS is a much-awaited service from AWS, especially for folks who were already running their Kubernetes workloads on AWS, as they can now easily migrate to EKS and have a fully managed Kubernetes control plane. EKS is expected to be adopted by many organisations in the near future.

  • Machine Learning for your Infrastructure: Anomaly Detection with Elastic + X-Pack

    Introduction

    The world continues to go through digital transformation at an accelerating pace. Modern applications and infrastructure continue to expand, and operational complexity continues to grow. According to a recent ManageEngine Application Performance Monitoring Survey:

    • 28 percent use ad-hoc scripts to detect issues in over 50 percent of their applications.
    • 32 percent learn about application performance issues from end users.
    • 59 percent trust monitoring tools to identify most performance deviations.

    Most enterprises and web-scale companies have instrumentation & monitoring capabilities with an Elasticsearch cluster. They have a high amount of collected data but struggle to use it effectively. This available data can be used to improve performance and uptime, along with root cause analysis and incident prediction.

    IT Operations & Machine Learning

    Here is the main question: how do we make sense of the huge piles of collected data? The first step towards making sense of data is to understand the correlations between the time series. But understanding alone is not enough, since correlation does not imply causation. We need a practical and scalable approach to understand the cause-effect relationships between data sources and events across a complex infrastructure of VMs, containers, networks, micro-services, regions, etc.

    It’s common for a failure in one component to cause problems in another. In such cases, operational historical data can be used to identify the root cause by investigating through a series of intermediate causes and effects. Machine learning is particularly useful for problems where we need to identify “what changed”, since machine learning algorithms can analyze existing data to understand the patterns, making it easier to recognize the cause. This is known as unsupervised learning, where the algorithm learns from experience and identifies similar patterns when they come along again.

    Let’s see how you can setup Elastic + X-Pack to enable anomaly detection for your infrastructure & applications.

    Anomaly Detection using Elastic’s machine learning with X-Pack

    Step I: Setup

    1. Setup Elasticsearch: 

    According to the Elastic documentation, it is recommended to use Oracle JDK version 1.8.0_131. Check that you have the required Java version installed on your system; it should be at least Java 8. Install or upgrade if required.

    • Download elasticsearch tarball and untar it
    $ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.tar.gz
    $ tar -xzvf elasticsearch-5.5.1.tar.gz

    • It will then create a folder named elasticsearch-5.5.1. Go into the folder.
    $ cd elasticsearch-5.5.1

    • Install X-Pack into Elasticsearch
    $ ./bin/elasticsearch-plugin install x-pack

    • Start elasticsearch
    $ bin/elasticsearch

    2. Setup Kibana

    Kibana is an open source analytics and visualization platform designed to work with Elasticsearch.

    • Download kibana tarball and untar it
    $ wget https://artifacts.elastic.co/downloads/kibana/kibana-5.5.1-linux-x86_64.tar.gz
    $ tar -xzf kibana-5.5.1-linux-x86_64.tar.gz

    • It will then create a folder named kibana-5.5.1. Go into the directory.
    $ cd kibana-5.5.1-linux-x86_64

    • Install X-Pack into Kibana
    $ ./bin/kibana-plugin install x-pack

    • Running kibana
    $ ./bin/kibana

    • Navigate to Kibana at http://localhost:5601/
    • Log in as the built-in user elastic with the password changeme.
    • You will see the below screen:
    Kibana: X-Pack Welcome Page


    3. Metricbeat:

    Metricbeat helps in monitoring servers and the services they host by collecting metrics from the operating system and services. We will use it to get CPU utilization metrics of our local system in this blog.

    • Download the Metricbeat tarball and untar it
    $ wget https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-5.5.1-linux-x86_64.tar.gz
    $ tar -xvzf metricbeat-5.5.1-linux-x86_64.tar.gz

    • It will create a folder metricbeat-5.5.1-linux-x86_64. Go to the folder
    $ cd metricbeat-5.5.1-linux-x86_64

    • By default, Metricbeat is configured to send collected data to Elasticsearch running on localhost. If your Elasticsearch is hosted on a remote server, change the IP and authentication credentials in the metricbeat.yml file.
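    For reference, the relevant section of metricbeat.yml looks roughly like this; localhost is the default, and elastic/changeme are the default X-Pack credentials used above:

    ```yaml
    #-------------------------- Elasticsearch output ------------------------------
    output.elasticsearch:
      # Point this at your Elasticsearch host if it is not running locally
      hosts: ["localhost:9200"]
      # Credentials are required once X-Pack security is installed
      username: "elastic"
      password: "changeme"
    ```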
     Metricbeat Config


    • Metricbeat provides the following stats:
      – System load
      – CPU stats
      – IO stats
      – Per filesystem stats
      – Per CPU core stats
      – File system summary stats
      – Memory stats
      – Network stats
      – Per process stats
    • Start Metricbeat as a daemon process:
    $ sudo ./metricbeat -e -c metricbeat.yml &

    Now all the setup is done. Let’s go to Step II to create machine learning jobs.

    Step II: Time Series data

    • Real-time data: We have Metricbeat providing real-time series data, which will be used for unsupervised learning. Follow the steps below to define the index pattern metricbeat-* in Kibana to search against this pattern in Elasticsearch:
      – Go to Management -> Index Patterns  
      – Provide Index name or pattern as metricbeat-*
      – Select Time filter field name as @timestamp
      – Click Create

    You will not be able to create the index pattern if Elasticsearch does not contain any Metricbeat data. Make sure Metricbeat is running and its output is configured to Elasticsearch.

    • Saved historic data: Just to quickly see how machine learning detects anomalies, you can also use data provided by Elastic. Download the sample data by clicking here.
    • Unzip the files in a folder: tar -zxvf server_metrics.tar.gz
    • Download this script. It will be used to upload sample data to elastic.
    • Provide execute permissions to the file: chmod +x upload_server-metrics.sh
    • Run the script.
    • As we created an index pattern for the Metricbeat data, create the index pattern server-metrics* in the same way.

    Step III: Creating Machine Learning jobs

    There are two scenarios in which data is considered anomalous. First, when the behavior of a key indicator changes over time relative to its previous behavior. Second, when the behavior of an entity within a population deviates from the other entities in the population over a single key indicator.

    To detect these anomalies, there are three types of jobs we can create:

    1. Single metric job: This job is used to detect Scenario 1 anomalies over only one key performance indicator.
    2. Multimetric job: A multimetric job also detects Scenario 1 anomalies, but in this type of job we can track more than one performance indicator, such as CPU utilization along with memory utilization.
    3. Advanced job: This kind of job is created to detect Scenario 2 anomalies.

    For simplicity, we are creating following single metric jobs:

    1. Tracking CPU Utilization: Using metric beat data
    2. Tracking total requests made on server: Using sample server data

    Follow the steps below to create the single metric jobs:

    Job1: Tracking CPU Utilization

    Job2: Tracking total requests made on server

    • Go to http://localhost:5601/
    • Go to Machine learning tab on the left panel of Kibana.
    • Click on Create new job
    • Click Create single metric job
    • Select the index we created in Step II, i.e., metricbeat-* and server-metrics* respectively
    • Configure jobs by providing following values:
    1. Aggregation: Here you need to select an aggregation function that will be applied to a particular field of the data we are analyzing.
    2. Field: A drop-down that shows all the fields available in the index pattern.
    3. Bucket span: The interval time for analysis. The aggregation function will be applied to the selected field after every interval specified here.
    • If your data contains many empty buckets (i.e., the data is sparse) and you don’t want to consider this anomalous, check the checkbox named sparse data (if it appears).
    • Click on Use full <index pattern> data to use all available data for analysis.
    Metricbeats Description
    Server Description
    • Click on play symbol
    • Provide job name and description
    • Click on Create Job

    After creating the job, the available data will be analyzed. Click on View results; you will see a chart showing the actual value and the upper and lower bounds of the predicted value. If the actual value lies outside this range, it is considered anomalous. The color of the circles represents the severity level.

    Here we are getting a wide range of predicted values since the job has just started learning. As we get more data, the prediction will get better.
    You can see here that the predictions are pretty good, since there is a lot of data to learn the pattern from.
    • Click on machine learning tab in the left panel. The jobs we created will be listed here.
    • You will see the list of actions for every job you have created.
    • Since we are storing per-minute data for Job1 using Metricbeat, we can feed data to the job in real time. Click on the play button to start the data feed. As we get more and more data, the prediction will improve.
    • You can see details of anomalies by clicking on the Anomaly Viewer.
    Anomaly in metricbeats data
    Server metrics anomalies  

    We have seen how machine learning can be used to find patterns among different statistics along with anomaly detection. After identifying anomalies, we need to find the context of those events. For example, what other factors are contributing to the problem? In such cases, we can troubleshoot by creating multimetric jobs.

  • Automated Containerization and Migration of On-premise Applications to Cloud Platforms

    Containerized applications are becoming more popular with each passing year. Enterprise applications are adopting container technology as they modernize their IT systems. Migrating your applications from VMs or physical machines to containers comes with multiple advantages like optimal resource utilization, faster deployment times, replication, quick cloning, and less lock-in. Various container orchestration platforms like Kubernetes, Google Container Engine (GKE), and Amazon EC2 Container Service (Amazon ECS) help in quick deployment and easy management of your containerized applications. But in order to use these platforms, you need to migrate your legacy applications to containers or rewrite/redeploy your applications from scratch with a containerization approach. Rearchitecting your applications using a containerization approach is preferable, but is that possible for complex legacy applications? Is your deployment team capable of listing every detail about the deployment process of your application? Do you have the patience to author a Dockerfile for each component of your complex application stack?

    Automated migrations!

    Velotio has been helping customers with automated migration of VMs and bare-metal servers to various container platforms. We have developed automation to convert these migrated applications as containers on various container deployment platforms like GKE, Amazon ECS and Kubernetes. In this blog post, we will cover one such migration tool developed at Velotio which will migrate your application running on a VM or physical machine to Google Container Engine (GKE) by running a single command.

    Migration tool details

    We have named our migration tool A2C (Anything to Container). It can migrate applications running on any Unix or Windows operating system.

    The migration tool requires the following information about the server to be migrated:

    • IP of the server
    • SSH User, SSH Key/Password of the application server
    • Configuration file containing data paths for application/database/components (more details below)
    • Required name of your docker image (The docker image that will get created for your application)
    • GKE Container Cluster details

    In order to store persistent data, volumes can be defined in the container definition. Data changes made on a volume path remain persistent even if the container is killed or crashes. Volumes are basically filesystem paths from the host machine on which your container is running, NFS, or cloud storage. Containers mount the filesystem path from your local machine into the container, so data changes are written to the host machine's filesystem instead of the container's filesystem. Our migration tool supports data volumes, which can be defined in the configuration file. It will automatically create disks for the defined volumes and copy data from your application server to these disks in a consistent way.

    The configuration file we have been talking about is basically a YAML file containing filesystem level information about your application server. A sample of this file can be found below:

    includes:
    - /
    volumes:
    - var/log/httpd
    - var/log/mariadb
    - var/www/html
    - var/lib/mysql
    excludes:
    - mnt
    - var/tmp
    - etc/fstab
    - proc
    - tmp

    The configuration file contains 3 sections: includes, volumes and excludes:

    • Includes contains filesystem paths on your application server which you want to add to your container image.
    • Volumes contains filesystem paths on your application server which store your application data. Generally, filesystem paths containing database files, application code, configuration files, and log files are good candidates for volumes.
    • Excludes contains filesystem paths which you don’t want to make part of the container. This may include temporary filesystem paths like /proc and /tmp, as well as NFS-mounted paths. Ideally, you would include everything by giving “/” in the includes section and exclude specifics in the excludes section.
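    A2C’s internal logic is not public, but as a rough illustration of the precedence between the three sections (excludes win over volumes, which win over includes), here is a hypothetical sketch in Python, assuming the config has already been parsed into a dict:

    ```python
    # Hypothetical sketch of how a path might be classified against the
    # includes/volumes/excludes sections; the real A2C logic may differ.
    SAMPLE_CONFIG = {
        "includes": ["/"],
        "volumes": ["var/log/httpd", "var/www/html", "var/lib/mysql"],
        "excludes": ["proc", "tmp", "var/tmp", "etc/fstab", "mnt"],
    }

    def classify(path, config=SAMPLE_CONFIG):
        """Return how a filesystem path would be treated during migration."""
        rel = path.lstrip("/")

        def matches(prefixes):
            return any(rel == p or rel.startswith(p + "/") for p in prefixes)

        if matches(config["excludes"]):
            return "skipped"            # never copied
        if matches(config["volumes"]):
            return "volume"             # copied to a persistent disk
        if any(path.startswith(i) for i in config["includes"]):
            return "image"              # baked into the docker image
        return "skipped"

    print(classify("/var/www/html/index.php"))  # volume
    print(classify("/proc/cpuinfo"))            # skipped
    print(classify("/etc/httpd/httpd.conf"))    # image
    ```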

    The Docker image name given as input to the migration tool is the Docker registry path in which the image will be stored, followed by the name and tag of the image. A Docker registry is like a GitHub for Docker images, where you can store all your images. Different versions of the same image can be stored by giving a version-specific tag to the image. GKE also provides a Docker registry, and since in this demo we are migrating to GKE, we will store our image in the GKE registry.

    The GKE container cluster details given as input to the migration tool contain GKE-specific details like the GKE project name, container cluster name, and region name. A container cluster can be created in GKE to host the container applications. We have a separate set of scripts to perform cluster creation; it can also be done easily through the GKE UI. For now, we will assume that we have a 3-node cluster created in GKE, which we will use to host our application.

    Tasks performed under migration

    Our migration tool (A2C), performs the following set of activities for migrating the application running on a VM or physical machine to GKE Container Cluster:

    1. Install the A2C migration tool with all its dependencies on the target application server

    2. Create a Docker image of the application server, based on the filesystem-level information given in the configuration file

    3. Capture metadata from the application server, like configured services, port usage, network configuration, external services, etc.

    4. Push the Docker image to the GKE container registry

    5. Create a disk in Google Cloud for each volume path defined in the configuration file and prepopulate the disks with data from the application server

    6. Create the deployment spec for the container application in the GKE container cluster, which will open the required ports, configure required services, add multi-container dependencies, attach the prepopulated disks to containers, etc.

    7. Deploy the application, after which your application runs as containers in GKE with the application software in a running state. The new application URLs are given as output.

    8. Configure load balancing and HA for your application.

    Demo

    For demonstration purposes, we will deploy a LAMP stack (Apache + PHP + MySQL) on a CentOS 7 VM and run the migration utility against the VM, which will migrate the application to our GKE cluster. After the migration, we will show our application running on GKE, preconfigured with the same data as on our VM.

    Step 1

    We set up a LAMP stack using Apache, PHP, and MySQL on a CentOS 7 VM in GCP. The PHP application can be used to list, add, delete, or edit user data, which is stored in a MySQL database. We added some data through the application, and the UI shows the following:

    Step 2

    Now we run the A2C migration tool, which will migrate this application stack running on a VM into a container and auto-deploy it to GKE.

    # ./migrate.py -c lamp_data_handler.yml -d "tcp://35.202.201.247:4243" -i migrate-lamp -p glassy-chalice-XXXXX -u root -k ~/mykey -l a2c-host --gcecluster a2c-demo --gcezone us-central1-b 130.211.231.58

    Pushing converter binary to target machine
    Pushing data config to target machine
    Pushing installer script to target machine
    Running converter binary on target machine
    [130.211.231.58] out: creating docker image
    [130.211.231.58] out: image created with id 6dad12ba171eaa8615a9c353e2983f0f9130f3a25128708762228f293e82198d
    [130.211.231.58] out: Collecting metadata for image
    [130.211.231.58] out: Generating metadata for cent7
    [130.211.231.58] out: Building image from metadata
    Pushing the docker image to GCP container registry
    Initiate remote data copy
    Activated service account credentials for: [glassy-chaliceXXXXX@appspot.gserviceaccount.com]
    for volume var/log/httpd
    Creating disk migrate-lamp-0
    Disk Created Successfully
    transferring data from source
    for volume var/log/mariadb
    Creating disk migrate-lamp-1
    Disk Created Successfully
    transferring data from source
    for volume var/www/html
    Creating disk migrate-lamp-2
    Disk Created Successfully
    transferring data from source
    for volume var/lib/mysql
    Creating disk migrate-lamp-3
    Disk Created Successfully
    transferring data from source
    Connecting to GCP cluster for deployment
    Created service file /tmp/gcp-service.yaml
    Created deployment file /tmp/gcp-deployment.yaml

    Deploying to GKE

    $ kubectl get pod
    
    NAME                            READY   STATUS              RESTARTS   AGE
    migrate-lamp-3707510312-6dr5g   0/1     ContainerCreating   0          58s

    $ kubectl get deployment
    
    NAME           DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    migrate-lamp   1         1         1            0           1m

    $ kubectl get service
    
    NAME           CLUSTER-IP     EXTERNAL-IP     PORT(S)                                    AGE
    kubernetes     10.59.240.1    <none>          443/TCP                                    23h
    migrate-lamp   10.59.248.44   35.184.53.100   3306:31494/TCP,80:30909/TCP,22:31448/TCP   53s

    You can access your application using the above connection details!

    Step 3

    Access the LAMP stack on GKE using the IP 35.184.53.100 on the default port 80, as was done on the source machine.

    Here is the Docker image being created in GKE Container Registry:

    We can also see that disks named migrate-lamp-x were created as part of this automated migration.

    A load balancer was also provisioned in GCP as part of the migration process.

    The following service and deployment files were created by our migration tool to deploy the application on GKE:

    # cat /tmp/gcp-service.yaml
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: migrate-lamp
      name: migrate-lamp
    spec:
      ports:
      - name: migrate-lamp-3306
        port: 3306
      - name: migrate-lamp-80
        port: 80
      - name: migrate-lamp-22
        port: 22
      selector:
        app: migrate-lamp
      type: LoadBalancer

    # cat /tmp/gcp-deployment.yaml
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      labels:
        app: migrate-lamp
      name: migrate-lamp
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: migrate-lamp
      template:
        metadata:
          labels:
            app: migrate-lamp
        spec:
          containers:
          - image: us.gcr.io/glassy-chalice-129514/migrate-lamp
            name: migrate-lamp
            ports:
            - containerPort: 3306
            - containerPort: 80
            - containerPort: 22
            securityContext:
              privileged: true
            volumeMounts:
            - mountPath: /var/log/httpd
              name: migrate-lamp-var-log-httpd
            - mountPath: /var/www/html
              name: migrate-lamp-var-www-html
            - mountPath: /var/log/mariadb
              name: migrate-lamp-var-log-mariadb
            - mountPath: /var/lib/mysql
              name: migrate-lamp-var-lib-mysql
          volumes:
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-0
            name: migrate-lamp-var-log-httpd
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-2
            name: migrate-lamp-var-www-html
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-1
            name: migrate-lamp-var-log-mariadb
          - gcePersistentDisk:
              fsType: ext4
              pdName: migrate-lamp-3
            name: migrate-lamp-var-lib-mysql

    Conclusion

    Migrations are always hard for IT and development teams. At Velotio, we have been helping customers migrate to cloud and container platforms using streamlined processes and automation. Feel free to reach out to us at contact@rsystems.com to learn more about our cloud and container adoption/migration offerings.

  • A Primer on HTTP Load Balancing in Kubernetes using Ingress on Google Cloud Platform

    Containerized applications and Kubernetes adoption in cloud environments are on the rise. One of the challenges while deploying applications in Kubernetes is exposing these containerized applications to the outside world. This blog explores the different options via which applications can be externally accessed, with a focus on Ingress, a new feature in Kubernetes that provides an external load balancer. It also provides a simple hands-on tutorial on Google Cloud Platform (GCP).

    Ingress is a new feature (currently in beta) of Kubernetes which aspires to be an Application Load Balancer, intending to simplify exposing your applications and services to the outside world. It can be configured to give services externally reachable URLs, load balance traffic, terminate SSL, offer name-based virtual hosting, etc. Before we dive into Ingress, let’s look at some of the alternatives currently available for exposing your applications and their complexities and limitations, and then try to understand Ingress and how it addresses these problems.

    Current ways of exposing applications externally:

    There are several ways to expose your applications externally. Let’s look at each of them:

    EXPOSE Pod:

    You can expose your application directly from your pod by using a port on the node running the pod, mapping that port to a port exposed by your container, and using the combination HOST-IP:HOST-PORT to access your application externally. This is similar to what you would do when running Docker containers directly, without Kubernetes. In Kubernetes you can use the hostPort setting in the pod specification, which does the same thing. Another approach is to set hostNetwork: true in the pod specification to use the host’s network interface from your pod.
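    For instance, a minimal pod spec using hostPort might look like the following (the pod name, image, and ports are illustrative):

    ```yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: web
    spec:
      containers:
      - name: web
        image: nginx
        ports:
        - containerPort: 80
          hostPort: 8080   # port 8080 on the node maps to container port 80
    ```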

    Limitations:

    • In both scenarios, you should take extra care to avoid port conflicts on the host, and possibly some issues with packet routing and name resolution.
    • This limits you to running only one replica of the pod per cluster node, as the host port you use is unique and can be bound by only one pod.

    EXPOSE Service:

    Kubernetes services primarily work to interconnect different pods which constitute an application. You can scale the pods of your application very easily using services. Services are not primarily intended for external access, but there are some accepted ways to expose services to the external world.

    Basically, services provide a routing, balancing and discovery mechanism for the pod’s endpoints. Services target pods using selectors, and can map container ports to service ports. A service exposes one or more ports, although usually, you will find that only one is defined.

    A service can be exposed using one of the following ServiceType choices:

    • ClusterIP: Exposes the service on a cluster-internal IP. Choosing this value makes the service only reachable from within the cluster. This is the default ServiceType.
    • NodePort: Exposes the service on each Node’s IP at a static port (the NodePort). A ClusterIP service, to which the NodePort service will route, is automatically created. You’ll be able to contact the NodePort service from outside the cluster by requesting <NodeIP>:<NodePort>. Here the NodePort remains fixed, and the NodeIP can be any node IP of your Kubernetes cluster.
    • LoadBalancer: Exposes the service externally using a cloud provider’s load balancer (eg. AWS ELB). NodePort and ClusterIP services, to which the external load balancer will route, are automatically created.
    • ExternalName: Maps the service to the contents of the externalName field (e.g. foo.bar.example.com) by returning a CNAME record with its value. No proxying of any kind is set up. This requires version 1.7 or higher of kube-dns.
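    As an illustration, a NodePort service manifest (the service name, app label, and ports are hypothetical) could look like:

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: my-app
    spec:
      type: NodePort
      selector:
        app: my-app
      ports:
      - port: 80          # service port inside the cluster
        targetPort: 8080  # container port on the pods
        # nodePort is auto-assigned from 30000-32767 unless set explicitly
    ```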

    Limitations:

    • If we choose NodePort to expose our services, Kubernetes will generate ports corresponding to the ports of your pods in the range 30000-32767. You will need to add an external proxy layer that uses DNAT to expose more friendly ports. The external proxy layer will also have to take care of load balancing so that you leverage the power of your pod replicas. Also, it would not be easy to add TLS or simple host-header routing rules to the external service.
    • ClusterIP and ExternalName, while similarly easy to use, have the limitation that we cannot add any routing or load balancing rules.
    • Choosing LoadBalancer is probably the easiest of all methods to get your service exposed to the internet. The problem is that there is no standard way of telling a Kubernetes service about the elements that a balancer requires; again, TLS and host headers are left out. Another limitation is the reliance on an external load balancer (AWS’s ELB, GCP’s Cloud Load Balancer, etc.).

    Endpoints

    Endpoints are usually created automatically by services, unless you are using headless services and adding the endpoints manually. An endpoint is a host:port tuple registered with Kubernetes, and in the service context it is used to route traffic. The service tracks the endpoints as pods that match the selector are created, deleted, and modified. Individually, endpoints are not useful for exposing services, since they are to some extent ephemeral objects.
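    For completeness, this is roughly what manually managed endpoints look like alongside a headless, selector-less service; the names and the IP are hypothetical:

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: external-db
    spec:
      clusterIP: None      # headless: no selector, no virtual IP
      ports:
      - port: 3306
    ---
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: external-db    # must match the service name
    subsets:
    - addresses:
      - ip: 10.0.0.5       # some endpoint outside the pod network
      ports:
      - port: 3306
    ```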

    Summary

    If you can rely on your cloud provider to correctly implement the LoadBalancer for their API, to keep up-to-date with Kubernetes releases, and you are happy with their management interfaces for DNS and certificates, then setting up your services as type LoadBalancer is quite acceptable.

    On the other hand, if you want to manage load balancing systems manually and set up port mappings yourself, NodePort is a low-complexity solution. If you are directly using Endpoints to expose external traffic, perhaps you already know what you are doing (but consider that you might have made a mistake, there could be another option).

    Given that none of these elements has been originally designed to expose services to the internet, their functionality may seem limited for this purpose.

    Understanding Ingress

    Traditionally, you would create a LoadBalancer service for each public application you want to expose. Ingress gives you a way to route requests to services based on the request host or path, centralizing a number of services into a single entrypoint.

    Ingress is split up into two main pieces. The first is the Ingress resource, which defines how you want requests routed to the backing services, and the second is the Ingress controller, which does the routing and also keeps track of changes at the service level.

    Ingress Resources

    The Ingress resource is a set of rules that map to Kubernetes services. Ingress resources are defined purely within Kubernetes as an object that other entities can watch and respond to.

    Ingress supports defining the following rules in its beta stage:

    • host header: Forward traffic based on domain names.
    • paths: Looks for a match at the beginning of the path.
    • TLS: If the Ingress specifies TLS, HTTPS is served using a certificate configured through a secret.

    When an Ingress includes no host-header rules, requests that match none of the rules are mapped to its default backend service. You will usually use this to send a 404 page for sites/paths which are not routed to the other services. Ingress tries to match requests to rules and forwards them to backends, which are composed of a service and a port.
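    Putting these rule types together, an Ingress with a host rule, a path rule, and a default backend might look like the following sketch (the service names and host are hypothetical):

    ```yaml
    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: example-ingress
    spec:
      backend:                 # default backend: catches unmatched requests
        serviceName: default-404
        servicePort: 80
      rules:
      - host: foo.example.com  # host-header rule
        http:
          paths:
          - path: /app         # path rule: matches the beginning of the path
            backend:
              serviceName: foo-service
              servicePort: 80
    ```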

    Ingress Controllers

    The Ingress controller is the entity which grants (or removes) access based on changes in the services, pods, and Ingress resources. The Ingress controller gets the state-change data by directly calling the Kubernetes API.

    Ingress controllers are applications that watch Ingresses in the cluster and configure a balancer to apply those rules. You can configure any of the third-party balancers like HAProxy, NGINX, Vulcand, or Traefik to create your version of the Ingress controller. The Ingress controller should track changes in Ingress resources, services, and pods, and update the balancer’s configuration accordingly.

    Ingress controllers will usually track and communicate with endpoints behind services instead of using services directly. This way some network plumbing is avoided, and we can also manage the balancing strategy from the balancer. Some of the open source implementations of Ingress Controllers can be found here.

    Now, let’s do an exercise of setting up an HTTP load balancer using Ingress on Google Cloud Platform (GCP), which has already integrated the Ingress feature into its Container Engine (GKE) service.

    Ingress-based HTTP Load Balancer in Google Cloud Platform

    The tutorial assumes that you have your GCP account set up and a default project created. We will first create a container cluster, followed by deployment of an nginx server service and an echoserver service. Then we will set up an Ingress resource for both services, which will configure the HTTP load balancer provided by GCP.

    Basic Setup

    Get your project ID by going to the “Project info” section in your GCP dashboard. Start the Cloud Shell terminal, set your project id and the compute/zone in which you want to create your cluster.

    $ gcloud config set project glassy-chalice-129514
    $ gcloud config set compute/zone us-east1-d
    # Create a 3 node cluster with name “loadbalancedcluster”
    $ gcloud container clusters create loadbalancedcluster

    Fetch the cluster credentials for the kubectl tool:

    $ gcloud container clusters get-credentials loadbalancedcluster --zone us-east1-d --project glassy-chalice-129514

    Step 1: Deploy an nginx server and echoserver service

    $ kubectl run nginx --image=nginx --port=80
    $ kubectl run echoserver --image=gcr.io/google_containers/echoserver:1.4 --port=8080
    $ kubectl get deployments
    NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    echoserver   1         1         1            1           15s
    nginx        1         1         1            1           26m

    Step 2: Expose your nginx and echoserver deployment as a service internally

    Create a Service resource to make the nginx and echoserver deployment reachable within your container cluster:

    $ kubectl expose deployment nginx --target-port=80  --type=NodePort
    $ kubectl expose deployment echoserver --target-port=8080 --type=NodePort

    When you create a Service of type NodePort with this command, Container Engine makes your Service available on a randomly-selected high port number (e.g. 30746) on all the nodes in your cluster. Verify the Service was created and a node port was allocated:

    $ kubectl get service nginx
    NAME      CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
    nginx     10.47.245.54   <nodes>       80:30746/TCP   20s
    $ kubectl get service echoserver
    NAME         CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
    echoserver   10.47.251.9   <nodes>       8080:32301/TCP   33s

    In the output above, the node port for the nginx Service is 30746 and for the echoserver Service is 32301. Also, note that there is no external IP allocated for these Services. Since Container Engine nodes are not externally accessible by default, creating these Services does not make your application accessible from the Internet. To make your HTTP(S) web server application publicly accessible, you need to create an Ingress resource.

    Step 3: Create an Ingress resource

    On Container Engine, Ingress is implemented using Cloud Load Balancing. When you create an Ingress in your cluster, Container Engine creates an HTTP(S) load balancer and configures it to route traffic to your application. Container Engine has an internally defined Ingress controller, which takes the Ingress resource as input for setting up proxy rules and talks to the Kubernetes API to get the service-related information.

    The following config file defines an Ingress resource that directs traffic to your nginx and echoserver services:

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: fanout-ingress
    spec:
      rules:
      - http:
          paths:
          - path: /
            backend:
              serviceName: nginx
              servicePort: 80
          - path: /echo
            backend:
              serviceName: echoserver
              servicePort: 8080

    To deploy this Ingress resource, save it as basic-ingress.yaml and run in the Cloud Shell:

    $ kubectl apply -f basic-ingress.yaml

    Step 4: Access your application

    Find out the external IP address of the load balancer serving your application by running:

    $ kubectl get ingress fanout-ingress
    NAME             HOSTS     ADDRESS          PORTS     AGE
    fanout-ingress   *         130.211.36.168   80        36s

    Use http://<external-ip-address> and http://<external-ip-address>/echo to access nginx and the echoserver.

    Summary

    Ingresses are simple, very easy to deploy, and really fun to play with. However, Ingress is currently in beta and lacks some features, which may restrict it from production use. Stay tuned for updates on the Kubernetes Ingress page and the GitHub repo.

  • A Practical Guide to Deploying Multi-tier Applications on Google Container Engine (GKE)

    Introduction

    All modern-era programmers can attest that containerization affords more flexibility and allows us to build truly cloud-native applications. Containers provide portability: the ability to easily move applications across environments. However, complex applications comprise many (tens or hundreds of) containers. Managing such applications is a real challenge, and that’s where container orchestration and scheduling platforms like Kubernetes, Mesosphere, Docker Swarm, etc. come into the picture.
    Kubernetes, backed by Google, is leading the pack, given that Red Hat, Microsoft, and now Amazon are putting their weight behind it.

    Kubernetes can run on any cloud or bare-metal infrastructure. Setting up and managing Kubernetes can be a challenge, but Google provides an easy way to use Kubernetes through the Google Container Engine (GKE) service.

    What is GKE?

    Google Container Engine is a management and orchestration system for containers. In short, it is hosted Kubernetes. The goal of GKE is to increase the productivity of DevOps and development teams by hiding the complexity of setting up the Kubernetes cluster, the overlay network, and so on.

    Why GKE? What are the things that GKE does for the user?

    • GKE abstracts away the complexity of managing a highly available Kubernetes cluster.
    • GKE takes care of the overlay network.
    • GKE provides built-in authentication.
    • GKE provides built-in auto-scaling.
    • GKE provides easy integration with Google storage services.

    In this blog, we will see how to create your own Kubernetes cluster in GKE and how to deploy a multi-tier application on it. The blog assumes you have a basic understanding of Kubernetes and have used it before. It also assumes you have created an account with Google Cloud Platform. If you are not familiar with Kubernetes, this guide from Deis is a good place to start.

    Google provides a command-line interface, gcloud, to interact with all Google Cloud Platform products and services. gcloud can be used in scripts to automate tasks or directly from the command line. Follow this guide to install the gcloud tool.

    Now let’s begin! The first step is to create the cluster.

    Basic Steps to create cluster

    In this section, I will explain how to create a GKE cluster. We will use the command-line tool to set up the cluster.

    Set the zone in which you want to deploy the cluster

    $ gcloud config set compute/zone us-west1-a

    Create the cluster using the following command:

    $ gcloud container --project <project-name> \
        clusters create <cluster-name> \
        --machine-type n1-standard-2 \
        --image-type "COS" --disk-size "50" \
        --num-nodes 2 --network default \
        --enable-cloud-logging --no-enable-cloud-monitoring

    Let’s try to understand what each of these parameters means:

    --project: Project name.

    --machine-type: Type of the machine, like n1-standard-2 or n1-standard-4.

    --image-type: OS image. "COS" is the Container-Optimized OS from Google: more info here.

    --disk-size: Disk size of each instance.

    --num-nodes: Number of nodes in the cluster.

    --network: Network to use for the cluster. In this case, we are using the default network.

    Apart from the above options, you can also use the following to provide specific requirements while creating the cluster:

    --scopes: Scopes enable containers to directly access Google services without needing credentials. You can specify a comma-separated list of scope APIs. For example:

    • Compute: Lets you view and manage your Google Compute Engine resources.
    • Logging.write: Submit log data to Stackdriver.

    You can find all the scopes that Google supports here.

    --additional-zones: Specify additional zones for high availability, e.g. --additional-zones us-east1-b,us-east1-d. Here Kubernetes will create the cluster across 3 zones (the one specified at the beginning plus the 2 additional ones).

    --enable-autoscaling: Enables the autoscaling option. If you specify this option, you also have to specify the minimum and maximum number of nodes, e.g. --enable-autoscaling --min-nodes=15 --max-nodes=50. You can read more about how auto-scaling works here.

    Next, fetch the credentials of the created cluster. This step updates the kubeconfig file so that kubectl points to the required cluster.

    $ gcloud container clusters get-credentials my-first-cluster --project project-name

    Now your first Kubernetes cluster is ready. Let’s check the cluster information and health.

    $ kubectl get nodes
    NAME                                            STATUS    AGE   VERSION
    gke-first-cluster-default-pool-d344484d-vnj1    Ready     2h    v1.6.4
    gke-first-cluster-default-pool-d344484d-kdd7    Ready     2h    v1.6.4
    gke-first-cluster-default-pool-d344484d-ytre2   Ready     2h    v1.6.4

    Now that the cluster is created, let’s see how to deploy a multi-tier application on it. We will use a simple Python Flask app which greets the user and stores and retrieves employee data.

    Application Deployment

    I have created a simple Python Flask application to deploy on the Kubernetes cluster created using GKE. You can go through the source code here. If you check the source code, you will find the following directory structure:

    TryGKE/
    ├── Dockerfile
    ├── mysql-deployment.yaml
    ├── mysql-service.yaml
    ├── src
    │   ├── app.py
    │   └── requirements.txt
    ├── testapp-deployment.yaml
    └── testapp-service.yaml

    Here, I have written a Dockerfile for the Python Flask application in order to build our own image to deploy. For MySQL, we won’t build an image of our own; we will use the latest MySQL image from the public Docker repository.
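    The actual Dockerfile lives in the TryGKE repository; as a hedged sketch, a minimal Dockerfile for a Flask app laid out like this (the base image and commands are assumptions, with port 5000 taken from the deployment YAML) could be:

    ```dockerfile
    FROM python:2.7
    # Copy the Flask app sources into the image
    COPY src/ /app/
    WORKDIR /app
    # Install Flask and the MySQL client libraries listed in requirements.txt
    RUN pip install -r requirements.txt
    # The app listens on port 5000 (the containerPort in the deployment YAML)
    EXPOSE 5000
    CMD ["python", "app.py"]
    ```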

    Before deploying the application, let’s re-visit some of the important Kubernetes terms:

    Pods:

    A pod is a Docker container, or a group of Docker containers, deployed together on the same host machine. It acts as a single unit of deployment.

    Deployments:

    A Deployment is an entity which manages ReplicaSets and provides declarative updates to pods. It is recommended to use Deployments instead of using ReplicaSets directly. We can use a Deployment to create, remove, and update ReplicaSets. Deployments also have the ability to roll out and roll back changes.

    Services:

    A Service in Kubernetes is an abstraction which connects you to one or more pods. You can connect to a pod using the pod’s IP address, but since pods come and go, their IP addresses change. Services get their own IP and DNS name, and those remain for the entire lifetime of the service.

    Each tier in an application is represented by a Deployment, and each Deployment is described by a YAML file. We have two deployment YAML files: one for MySQL and one for the Python application.

    1. MySQL Deployment YAML

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: mysql
    spec:
      template:
        metadata:
          labels:
            app: mysql
        spec:
          containers:
            - env:
                - name: MYSQL_DATABASE
                  value: admin
                - name: MYSQL_ROOT_PASSWORD
                  value: admin
              image: 'mysql:latest'
              name: mysql
              ports:
                - name: mysqlport
                  containerPort: 3306
                  protocol: TCP

    2. Python Application Deployment YAML

    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: test-app
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - name: test-app
            image: ajaynemade/pymy:latest
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 5000

    Each Service is also represented by a YAML file as follows:

    1. MySQL service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: mysql-service
    spec:
      ports:
      - port: 3306
        targetPort: 3306
        protocol: TCP
        name: http
      selector:
        app: mysql

    2. Python Application service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: test-service
    spec:
      type: LoadBalancer
      ports:
      - name: test-service
        port: 80
        protocol: TCP
        targetPort: 5000
      selector:
        app: test-app

    You will find a ‘kind’ field in each YAML file. It is used to specify whether the given configuration is for deployment, service, pod, etc.

    In the Python app service YAML, I am using type = LoadBalancer. In GKE, there are two types of cloud load balancers available to expose the application to the outside world:

    1. TCP load balancer: This is a TCP Proxy-based load balancer. We will use this in our example.
    2. HTTP(s) load balancer: It can be created using Ingress. For more information, refer to this post that talks about Ingress in detail.

    For the MySQL service, I have not specified any type, so the default type, ClusterIP, is used. This makes sure that the MySQL container is exposed within the cluster, so the Python app can access it.

    If you check app.py, you can see that I have used “mysql-service.default” as the hostname. “mysql-service.default” is the DNS name of the service, and the Python application refers to that DNS name while accessing the MySQL database.
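    As a sketch (the real app.py is in the repo and may differ), the application can build its connection settings from the service DNS name like this; the credentials mirror the MYSQL_DATABASE and MYSQL_ROOT_PASSWORD values in the MySQL deployment YAML above:

    ```python
    # Hypothetical sketch: reach MySQL through the Kubernetes service DNS
    # name rather than a pod IP. "<service-name>.<namespace>" stays stable
    # for the lifetime of the service, while pod IPs do not.
    DB_HOST = "mysql-service.default"   # service "mysql-service" in "default"
    DB_PORT = 3306
    DB_USER = "root"
    DB_NAME = "admin"                   # MYSQL_DATABASE from the deployment
    DB_PASSWORD = "admin"               # MYSQL_ROOT_PASSWORD from the deployment

    def connection_uri():
        """Assemble a MySQL connection URI from the settings above."""
        return "mysql://%s:%s@%s:%d/%s" % (
            DB_USER, DB_PASSWORD, DB_HOST, DB_PORT, DB_NAME)

    print(connection_uri())
    # mysql://root:admin@mysql-service.default:3306/admin
    ```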

    Now, let’s actually setup the components from the configurations. As mentioned above, we will first create services followed by deployments.

    Services:

    $ kubectl create -f mysql-service.yaml
    $ kubectl create -f testapp-service.yaml

    Deployments:

    $ kubectl create -f mysql-deployment.yaml
    $ kubectl create -f testapp-deployment.yaml

    Check the status of the pods and services. Wait until all pods reach the Running state and the Python application service gets an external IP, like below:

    $ kubectl get services
    NAME            CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
    kubernetes      10.55.240.1     <none>        443/TCP        5h
    mysql-service   10.55.240.57    <none>        3306/TCP       1m
    test-service    10.55.246.105   35.185.225.67     80:32546/TCP   11s

    Once you get the external IP, you should be able to make API calls using simple curl requests.

    Eg. To Store Data :

    curl -H "Content-Type: application/x-www-form-urlencoded" -X POST  http://35.185.225.67:80/storedata -d id=1 -d name=NoOne

    Eg. To Get Data :

    curl 35.185.225.67:80/getdata/1

    At this stage, your application is completely deployed and externally accessible.

    Manual scaling of pods

    Scaling your application up or down in Kubernetes is quite straightforward. Let’s scale up the test-app deployment.

    $ kubectl scale deployment test-app --replicas=3

    The deployment configuration for test-app gets updated, and you should see 3 replicas of test-app running. Verify it using:

    $ kubectl get pods

    In the same manner, you can scale down your application by reducing the replica count.

    Cleanup :

    Un-deploying an application from Kubernetes is also quite straightforward. All we have to do is delete the services and the deployments. The only caveat is that deletion of the load balancer is an asynchronous process; you have to wait until it gets deleted.

    $ kubectl delete service mysql-service
    $ kubectl delete service test-service

    The above commands deallocate the load balancer that was created as part of test-service. You can check the status of the load balancer with the following command:

    $ gcloud compute forwarding-rules list

    Once the load balancer is deleted, you can clean-up the deployments as well.

    $ kubectl delete deployments test-app
    $ kubectl delete deployments mysql

    Delete the Cluster:

    $ gcloud container clusters delete my-first-cluster

    Conclusion

    In this blog, we saw how easy it is to deploy, scale & terminate applications on Google Container Engine. Google Container Engine abstracts away all the complexity of Kubernetes and gives us a robust platform to run containerised applications. I am super excited about what the future holds for Kubernetes!

    Check out some of Velotio’s other blogs on Kubernetes.

  • How to Avoid Screwing Up CI/CD: Best Practices for DevOps Team

    Basic Fundamentals (One-line definition) :

    CI/CD is defined as continuous integration, continuous delivery, and/or continuous deployment. 

    Continuous Integration: 

    Continuous integration is defined as a practice where a developer’s changes are merged back to the main branch as soon as possible to avoid facing integration challenges.

    Continuous Delivery:

    Continuous delivery is the ability to get any type of change deployed to production or delivered to the customer in a safe, quick, and sustainable way.

    An oversimplified CI/CD pipeline

    Why CI/CD?

    • Avoid integration hell

    In most modern application development scenarios, multiple developers work on different features simultaneously. However, if all the source code is to be merged on the same day, the result can be a manual, tedious process of resolving conflicts between branches, as well as a lot of rework.  

    Continuous integration (CI) is the process of merging code changes frequently (daily, or even multiple times a day) to a shared branch (aka master or trunk branch). The CI process makes it easier and quicker to identify bugs, saving a lot of developer time and effort.

    • Faster time to market

    Less time is spent on solving integration problems and reworking, allowing faster time to market for products.

    • Have a better and more reliable code

    The changes are small and thus easier to test. Each change goes through a rigorous cycle of unit tests, integration/regression tests, and performance tests before being pushed to prod, ensuring a better quality code.  

    • Lower costs 

    As we have a faster time to market and fewer integration problems,  a lot of developer time and development cycles are saved, leading to a lower cost of development.

    Enough theory now, let’s dive into “How do I get started ?”

    Basic Overview of CI/CD

    Decide on your branching strategy

    A good branching strategy should have the following characteristics:

    • Defines a clear development process from initial commit to production deployment
    • Enables parallel development
    • Optimizes developer productivity
    • Enables faster time to market for products and services
    • Facilitates integration with all DevOps practices and tools, such as different version control systems

    Types of branching strategies (please refer to references for more details) :

    • Git flow – Ideal when handling multiple versions of the production code and for enterprise customers who have to adhere to release plans and workflows 
    • Trunk-based development – Ideal for simpler workflows and if automated testing is available, leading to a faster development time
    • Other branching strategies that you can read about are Github flow, Gitlab flow, and Forking flow.

    Build or compile your code 

    The next step is to build/compile your code, and if it is interpreted code, go ahead and package it.

    Build best practices :

    • Build once – Build a single artifact and promote it through all environments; rebuilding per environment is inadvisable.
    • Exact versions of third-party dependencies should be used.
    • Libraries used for debugging, etc., should be removed from the product package.
    • Have a feedback loop so that the team is made aware of the status of the build step.
    • Make sure your builds are versioned correctly using semver 2.0 (https://semver.org/).
    • Commit early, commit often.
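    The SemVer rule above can be made mechanical. A minimal sketch of validating and bumping a MAJOR.MINOR.PATCH version (ignoring pre-release and build metadata, which SemVer 2.0 also allows):

```python
import re

# Core version only; SemVer 2.0 pre-release/build suffixes are out of scope here.
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")

def bump(version: str, part: str) -> str:
    """Bump the 'major', 'minor', or 'patch' component of a version string."""
    m = SEMVER.match(version)
    if not m:
        raise ValueError(f"not a semver core version: {version}")
    major, minor, patch = map(int, m.groups())
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")

print(bump("1.4.2", "minor"))  # 1.5.0
```

    Note that bumping major or minor resets the lower components to zero, as SemVer requires.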

    Select tool for stitching the pipeline together

    • You can choose from GitHub actions, Jenkins, circleci, GitLab, etc.
    • Tool selection will not affect the quality of your CI/CD pipeline, but self-hosting a tool like Jenkins on-prem involves more maintenance than using a managed CI/CD service.

    Tools and strategy for SAST

    Instead of just DevOps, we should think of DevSecOps. To make the code more secure and reliable, we can introduce a step for SAST (static application security testing).

    SAST, or static analysis, is a testing procedure that analyzes source code to find security vulnerabilities. SAST scans the application code before the code is compiled. It’s also known as white-box testing, and it helps shift towards a security-first mindset as the code is scanned right at the start of SDLC.

    Problems SAST solves:

    • SAST tools give developers real-time feedback as they code, helping them fix issues before they pass the code to the next phase of the SDLC. 
    • This prevents security-related issues from being considered an afterthought. 

    Deployment strategies

    How will you deploy your code with zero downtime so that the customer has the best experience? Try and implement one of the strategies below automatically via CI/CD. This will help in keeping the blast radius to the minimum in case something goes wrong. 

    • Ramped (also known as rolling update or incremental): The new version is slowly rolled out to replace the older version of the product.
    • Blue/Green: The new version is released alongside the older version, then the traffic is switched to the newer version.
    • Canary: The new version is released to a selected group of users before doing a full rollout. This can be achieved with feature flagging as well. For more information, read about tools like LaunchDarkly (https://launchdarkly.com/) and Unleash (https://github.com/Unleash/unleash).
    • A/B testing: The new version is released to a subset of users under specific conditions.
    • Shadow: The new version receives a copy of real-world traffic alongside the older version, without its responses being returned to users.

    Config and Secret Management

    According to the 12-factor app, application configs should be exposed to the application with environment variables. However, it does not have restrictions on where these configurations need to be stored and sourced from.

    A few things to keep in mind while storing configs.

    • Versioning of configs always helps, but storing secrets in VCS is strongly discouraged.
    • For an enterprise, it is beneficial to use a cloud-agnostic solution.

    Solution:

    • Store your configuration secrets outside of the version control system.
    • You can use AWS secret manager, Vault, and even S3 for storing your configs, e.g.: S3 with KMS, etc. There are other services available as well, so choose the one which suits your use case the best.
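    Following the 12-factor rule above, configuration can be sourced from environment variables at startup. A minimal sketch, with invented variable names, that also fails fast when a required secret is absent:

```python
import os

def env_config(prefix: str = "APP_") -> dict:
    """Collect all APP_* environment variables into a config dict."""
    return {k[len(prefix):].lower(): v
            for k, v in os.environ.items() if k.startswith(prefix)}

def require(name: str) -> str:
    """Fail fast at startup if a required variable/secret is missing."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value

# In a real deployment this value would be injected by the platform,
# e.g. from AWS Secrets Manager or Vault, not set in code.
os.environ["APP_DB_HOST"] = "mysql-service.default"
print(env_config()["db_host"])  # mysql-service.default
```

    Failing fast on missing configuration surfaces misconfigured environments at deploy time instead of as runtime errors later.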

    Automate versioning and release notes generation

    All the releases should be tagged in the version control system. Versions can be automatically updated by looking at the git commit history and searching for keywords.

    There are many modules available for release notes generation. Try and automate these as well as a part of your CI/CD process. If this is done, you can successfully eliminate human intervention from the release process.
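    Deriving the bump type from commit history can be sketched as a keyword scan over commit messages. The keywords below are illustrative conventions (Conventional-Commits-style), not the actual configuration of the gh-action-bump-version action shown next:

```python
def bump_from_commits(messages: list[str]) -> str:
    """Pick the largest version bump implied by a batch of commit messages."""
    joined = " ".join(m.lower() for m in messages)
    if "breaking change" in joined or "major:" in joined:
        return "major"
    if "feat:" in joined or "minor:" in joined:
        return "minor"
    return "patch"  # default: every release at least bumps the patch level

print(bump_from_commits(["fix: null check", "feat: add /getdata endpoint"]))  # minor
```

    The same scan can feed release notes generation: group the matched commits under "Features" and "Fixes" headings and you have a changelog with no human intervention.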

    Example from GitHub actions workflow :

    - name: Automated Version Bump
      id: version-bump
      uses: 'phips28/gh-action-bump-version@v9.0.16'
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      with:
        commit-message: 'CI: Bump version to v{{version}}'

    Have a rollback strategy

    If a regression, performance, or smoke test fails after deployment to an environment, feedback should be given and the version should be rolled back automatically as part of the CI/CD process. This keeps the environment up and reduces the MTTR (mean time to recovery) and MTTD (mean time to detection) if a production outage is caused by a code deployment.

    GitOps tools like Argo CD and Flux make this easy, but even if you are not using a GitOps tool, rollback can be managed using scripts or whatever tool you are using for deployment.

    Include db changes as a part of your CI/CD

    Databases are often created manually and frequently evolve through manual changes, informal processes, and even testing in production. Manual changes often lack documentation and are harder to review, test, and coordinate with software releases. This makes the system more fragile with a higher risk of failure.

    The correct way to do this is to include the database in source control and CI/CD pipeline. This lets the team document each change, follow the code review process, test it thoroughly before release, make rollbacks easier, and coordinate with software releases. 

    For a more enterprise or structured solution, we could use a tool such as Liquibase, Alembic, or Flyway.

    How it should ideally be done:

    • We can have a migration-based strategy where, for each DB change, an additional migration script is added and executed as a part of CI/CD.
    • Things to keep in mind: the CI/CD process should be the same across all environments. Also, the amount of data on prod and other environments can vary drastically, so use batching and limits so that we don’t end up consuming all the memory of our database server.
    • As far as possible, DB migrations should be backward compatible, which makes rollbacks easier. This is the reason some companies only allow additive changes as part of DB migration scripts.
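    The batching advice above can be sketched with sqlite3 standing in for the real database; the table and column names are invented for the demo. Instead of one giant UPDATE that locks the table and balloons the transaction log, the backfill runs a few rows at a time:

```python
import sqlite3

def backfill_in_batches(conn, batch_size=2):
    """Backfill a new column a few rows at a time instead of one giant UPDATE."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET region = 'unknown' "
            "WHERE region IS NULL AND id IN "
            "(SELECT id FROM users WHERE region IS NULL LIMIT ?)",
            (batch_size,))
        conn.commit()  # committing per batch keeps transactions (and locks) small
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, region TEXT)")
conn.executemany("INSERT INTO users (region) VALUES (?)", [(None,)] * 5)
print(backfill_in_batches(conn))  # 5
```

    Because the loop is idempotent (it only touches rows where the column is still NULL), a migration interrupted halfway can simply be re-run.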

    Real-world scenarios

    • Gated approach 

    It is not always possible to have a fully automated CI/CD pipeline because the team may have just started the development of a product and might not have automated testing yet.

    So, in cases like these, we add manual gates that must be approved by the responsible teams. For example, we deploy to the development environment, wait for testers to test the code and approve the manual gate, and then the pipeline can proceed.

    Most of the tools support these kinds of gates. Make sure that you are not holding any build resources (such as agents) during this step; otherwise you will end up blocking resources for the other pipelines.

    Example:

    https://www.jenkins.io/doc/pipeline/steps/pipeline-input-step/#input-wait-for-interactive-input

    def LABEL_ID = "yourappname-${UUID.randomUUID().toString()}"
    def BRANCH_NAME = "<Your branch name>"
    def GIT_URL = "<Your git url>"
    // Start Agent
    node(LABEL_ID) {
        stage('Checkout') {
            doCheckout(BRANCH_NAME, GIT_URL)
        }
        stage('Build') {
            ...
        }
        stage('Tests') {
            ...
        }    
    }
    // Kill Agent
    // Input Step
    timeout(time: 15, unit: "MINUTES") {
        input message: 'Do you want to approve the deploy in production?', ok: 'Yes'
    }
    // Start Agent Again
    node(LABEL_ID) {
        doCheckout(BRANCH_NAME, GIT_URL) 
        stage('Deploy') {
            ...
        }
    }
    def doCheckout(branchName, gitUrl){
        checkout([$class: 'GitSCM',
            branches: [[name: branchName]],
            doGenerateSubmoduleConfigurations: false,
            extensions:[[$class: 'CloneOption', noTags: true, reference: '', shallow: true]],
            userRemoteConfigs: [[credentialsId: '<Your credentials id>', url: gitUrl]]])
    }

    Observability of releases 

    Whenever we debug the root cause of production issues, we might need the information below. As the system grows more complex, with multiple upstreams and downstreams, it becomes imperative to have this information in one place for efficient debugging and support by the operations team.

    • When was the last deployment? What version was deployed?
    • The deployment history as to which version was deployed when along with the code changes that went in.

    Below are the two approaches organizations generally follow to achieve this:

    • Have a release workflow that is tracked using a Change request or Service request on Jira or any other tracking tool.
    • For GitOps applications using tools like Argo CD and Flux, all this information is available as part of the version control system and can be derived from there.

    DORA metrics 

    The DevOps maturity of a team is measured mainly on the four metrics defined below, and CI/CD helps improve all of them. So, teams and organizations should aim for Elite status on the DORA metrics.

    • Deployment Frequency: How often an org successfully releases to production
    • Lead Time for Changes: The amount of time a commit takes to get into prod
    • Change Failure Rate: The percentage of deployments causing a failure in prod
    • Time to Restore Service: How long an org takes to recover from a failure in prod
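    Given a log of deployments, all four metrics fall out of simple arithmetic. The record layout below (deployment time, commit time, whether it caused a failure, restore time) is invented for this sketch:

```python
from datetime import datetime, timedelta

# Each record: (deployed_at, commit_at, caused_failure, restored_at or None).
deploys = [
    (datetime(2024, 1, 1), datetime(2023, 12, 31), False, None),
    (datetime(2024, 1, 3), datetime(2024, 1, 2), True, datetime(2024, 1, 3, 2)),
    (datetime(2024, 1, 5), datetime(2024, 1, 4), False, None),
]

def dora(deploys):
    """Compute the four DORA metrics from a chronological deployment log."""
    days = (deploys[-1][0] - deploys[0][0]).days or 1
    failures = [d for d in deploys if d[2]]
    return {
        "deploy_frequency_per_day": len(deploys) / days,
        "lead_time": sum((d[0] - d[1] for d in deploys), timedelta()) / len(deploys),
        "change_failure_rate": len(failures) / len(deploys),
        "time_to_restore": (sum((d[3] - d[0] for d in failures), timedelta())
                            / len(failures)) if failures else timedelta(),
    }

m = dora(deploys)
print(m["change_failure_rate"])  # ≈ 0.33
```

    In practice these records come from the CI/CD tool's deployment history and the incident tracker, which is why keeping releases observable (previous section) matters.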

    Conclusion 

    CI/CD forms an integral part of DevOps and SRE practices, and if done correctly, it can hugely impact the team’s and organization’s productivity.

    So, try and implement the above principles and get one step closer to having a highly productive team and a better product.

  • Cloud Native Applications — The Why, The What & The How

    Cloud-native is an approach to build & run applications that can leverage the advantages of the cloud computing model — On demand computing power & pay-as-you-go pricing model. These applications are built and deployed in a rapid cadence to the cloud platform and offer organizations greater agility, resilience, and portability across clouds.

    This blog explains the importance, the benefits and how to go about building Cloud Native Applications.

    CLOUD NATIVE – The Why?

    Early technology adopters like FANG (Facebook, Amazon, Netflix & Google) share some common themes when it comes to shipping software. They have invested heavily in building capabilities that enable them to release new features regularly (weekly, daily or in some cases even hourly). They have achieved this rapid release cadence while supporting safe and reliable operation of their applications, in turn allowing them to respond more effectively to their customers’ needs.

    They have achieved this level of agility by moving beyond ad-hoc automation and adopting cloud native practices that deliver these predictable capabilities. DevOps, Continuous Delivery, microservices & containers form the 4 main tenets of Cloud Native patterns. All of them share the same overarching goal of making application development and operations teams more efficient through automation.

    At this point though, these techniques have mainly been proven at the aforementioned software-driven companies. Smaller, more agile companies are also realising the value here. However, as per Joe Beda (creator of Kubernetes & CTO at Heptio), there are very few examples of this philosophy being applied outside these technology-centric companies.

    Any team/company shipping products should seriously consider adopting Cloud Native practices if they want to ship software faster while reducing risk and in turn delighting their customers.

    CLOUD NATIVE – The What?

    Cloud Native practices comprise 4 main tenets.

     

    Cloud native — main tenets
    • DevOps is the collaboration between software developers and IT operations with the goal of automating the process of software delivery & infrastructure changes.
    • Continuous Delivery enables applications to be released quickly, reliably & frequently, with less risk.
    • Microservices is an architectural approach to building an application as a collection of small independent services that run on their own and communicate over HTTP APIs.
    • Containers provide light-weight virtualization by dynamically dividing a single server into one or more isolated containers. Containers offer both efficiency & speed compared to standard Virtual Machines (VMs). Containers provide the ability to manage and migrate the application dependencies along with the application, while abstracting away the OS and the underlying cloud platform in many cases.

    The benefits that can be reaped by adopting these methodologies include:

    1. Self-managing infrastructure through automation: The Cloud Native practice goes beyond ad-hoc automation built on top of virtualization platforms; instead, it focuses on orchestration, management and automation of the entire infrastructure, right up to the application tier.
    2. Reliable infrastructure & applications: Cloud Native practice makes it much easier to handle churn, replace failed components and recover from unexpected events & failures.
    3. Deeper insights into complex applications: Cloud Native tooling provides visualization for health management, monitoring and notifications, with audit logs making applications easy to audit & debug.
    4. Security: Enables developers to build security into applications from the start rather than as an afterthought.
    5. More efficient use of resources: Containers are lighter weight than full VMs. Deploying applications in containers leads to increased resource utilization.

    Software teams have grown in size, and the number of applications and tools that a company needs to build has grown 10x over the last few years. Microservices break large, complex applications into smaller pieces so that they can be developed, tested and managed independently. This enables a single microservice to be updated or rolled back without affecting other parts of the application. Also, software teams nowadays are distributed, and microservices enable each team to own a small piece, with service contracts acting as the communication layer.

    CLOUD NATIVE – The How?

    Now, let’s look at the various building blocks of the cloud native stack that help achieve the goals described above. Here, we have grouped tools & solutions by the problem they solve. We start with the infrastructure layer at the bottom, then the tools used to provision the infrastructure, followed by the container runtime environment; above that, we have tools to manage clusters of container environments, and at the very top, the tools and frameworks used to develop the applications.

    1. Infrastructure: At the very bottom, we have the infrastructure layer which provides the compute, storage, network & operating system usually provided by the Cloud (AWS, GCP, Azure, Openstack, VMware).

    2. Provisioning: The provisioning layer consists of automation tools that help in provisioning the infrastructure, managing images and deploying the application. Chef, Puppet & Ansible are the DevOps tools that give the ability to manage their configuration & environments. Spinnaker, Terraform, Cloudformation provide workflows to provision the infrastructure. Twistlock, Clair provide the ability to harden container images.

    3. Runtime: The Runtime provides the environment in which the application runs. It consists of the Container Engines where the application runs along with the associated storage & networking. containerd & rkt are the most widely used Container engines. Flannel, OpenContrail provide the necessary overlay networking for containers to interact with each other and the outside world while Datera, Portworx, AppOrbit etc. provide the necessary persistent storage enabling easy movement of containers across clouds.

    4. Orchestration and Management: Tools like Kubernetes, Docker Swarm and Apache Mesos abstract the management of container clusters, allowing easy scheduling & orchestration of containers across multiple hosts. etcd and Consul provide service registries for discovery, while AVI and Envoy provide proxy, load balancing and related services.

    5. Application Definition & Development: We can build micro-services for applications across multiple languages — Python, Spring/Java, Ruby, Node. Packer, Habitat & Bitnami provide image management for the application to run across all infrastructure — container or otherwise. 
    Jenkins, TravisCI, CircleCI and other build automation servers provide the capability to set up continuous integration and delivery pipelines.

    6. Monitoring, Logging & Auditing: One of the key features of managing Cloud Native Infrastructure is the ability to monitor & audit the applications & underlying infrastructure.

    All modern monitoring platforms like Datadog, Newrelic, AppDynamic support monitoring of containers & microservices.

    Splunk, Elasticsearch & fluentd help in log aggregation, while OpenTracing and Zipkin help in debugging applications.

    7. Culture: Adopting cloud native practices requires a cultural change where teams no longer work in independent silos. End-to-end automation of software delivery pipelines is only possible when there is increased collaboration between development and IT operations teams with shared responsibility.

    When we put all the pieces together we get the complete Cloud Native Landscape as shown below.

    Cloud Native Landscape

    I hope this post gives an idea of why Cloud Native is important and what the main benefits are. As you may have noticed in the above infographic, there are several projects, tools & companies trying to solve similar problems. The next questions on your mind most likely will be: How do I get started? Which tools are right for me? And so on. I will cover these topics and more in my following blog posts. Stay tuned!

    Please let us know what you think by adding comments to this blog or reaching out to chirag_jog or Velotio on Twitter.

    Learn more about what we do at Velotio here and how Velotio can get you started on your cloud native journey here.

    References: