Tag: mysql

  • Simplifying MySQL Sharding with ProxySQL: A Step-by-Step Guide

    Introduction:

    ProxySQL is a powerful SQL-aware proxy designed to sit between database servers and client applications, optimizing database traffic with features like load balancing, query routing, and failover. This article focuses on simplifying the setup of ProxySQL, especially for users implementing data-based sharding in a MySQL database.

    What is Sharding?

    Sharding involves partitioning a database into smaller, more manageable pieces called shards based on certain criteria, such as data attributes. ProxySQL supports data-based sharding, allowing users to distribute data across different shards based on specific conditions.

    Understanding the Need for ProxySQL:

    ProxySQL is an intermediary layer that enhances database management, monitoring, and optimization. With features like data-based sharding, ProxySQL is an ideal solution for scenarios where databases need to be distributed based on specific data attributes, such as geographic regions.

Installation & Setup:

ProxySQL can be installed in two ways: via packages or by running it in a Docker container. For this guide, we will focus on the Docker installation.

    1. Install ProxySQL and MySQL Docker Images:

    To start, pull the necessary Docker images for ProxySQL and MySQL using the following commands:

    docker pull mysql:latest
    docker pull proxysql/proxysql

    2. Create Docker Network:

    Create a Docker network for communication between MySQL containers:

    docker network create multi-tenant-network

Note: The ProxySQL setup needs to connect to multiple MySQL servers, so we will run several MySQL containers inside a single Docker network.

    Containers within the same Docker network can communicate with each other using their container names or IP addresses.

    You can check the list of all the Docker networks currently present by running the following command:

    docker network ls
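
The output will look something like this (the IDs, and any other networks present, will differ on your machine):

NETWORK ID     NAME                   DRIVER    SCOPE
1a2b3c4d5e6f   bridge                 bridge    local
9f8e7d6c5b4a   host                   host      local
0a1b2c3d4e5f   multi-tenant-network   bridge    local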

    3. Set Up MySQL Containers:

    Now, create three MySQL containers within the network:

    Note: We can create any number of MySQL containers.

    docker run -d --name mysql_host_1 --network=multi-tenant-network -p 3307:3306 -e MYSQL_ROOT_PASSWORD=pass123 mysql:latest 
    docker run -d --name mysql_host_2 --network=multi-tenant-network -p 3308:3306 -e MYSQL_ROOT_PASSWORD=pass123 mysql:latest 
    docker run -d --name mysql_host_3 --network=multi-tenant-network -p 3309:3306 -e MYSQL_ROOT_PASSWORD=pass123 mysql:latest

    Note: Adjust port numbers as necessary. 

The default MySQL protocol port is 3306, but since we cannot expose all three MySQL containers on the same host port, we have mapped them to 3307, 3308, and 3309. Internally, all MySQL containers still listen on port 3306.

--network=multi-tenant-network: this flag attaches the container to the specified Docker network.

We have also set the root password of each MySQL container so we can log into it: the username is "root" and the password is "pass123" for all three.

    After running the above three commands, three MySQL containers will start running inside the network. You can connect to these three hosts using host = localhost or 127.0.0.1 and port = 3307 / 3308 / 3309.
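
For example, with a MySQL client installed locally, you can connect to the first container using the root credentials set above:

mysql -uroot -ppass123 -h 127.0.0.1 -P 3307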

To check that a port is reachable, use the following command:

for macOS or Linux:

nc -zv 127.0.0.1 3307

for Windows (PowerShell; ping cannot test a specific TCP port):

Test-NetConnection 127.0.0.1 -Port 3307

as an alternative on Linux:

telnet 127.0.0.1 3307


    4. Create Users in MySQL Containers:

    Create “user_shard” and “monitor” users in each MySQL container.

    The “user_shard” user will be used by the proxy to make queries to the DB.

    The “monitor” user will be used by the proxy to monitor the DB.

    Note: To access the MySQL container mysql_host_1, use the command:

    docker exec -it mysql_host_1 mysql -uroot -ppass123

Run the following statements inside the MySQL container to create the users:

    CREATE USER 'user_shard'@'%' IDENTIFIED BY 'pass123'; 
    GRANT ALL PRIVILEGES ON *.* TO 'user_shard'@'%' WITH GRANT OPTION; 
    FLUSH PRIVILEGES;
    
    CREATE USER monitor@'%' IDENTIFIED BY 'pass123'; 
    GRANT ALL PRIVILEGES ON *.* TO monitor@'%' WITH GRANT OPTION; 
    FLUSH PRIVILEGES;

    Repeat the above steps for mysql_host_2 & mysql_host_3.
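
To verify that the users exist, you can query the standard mysql.user table from inside any of the containers:

SELECT user, host FROM mysql.user WHERE user IN ('user_shard', 'monitor');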

    If, at any point, you need to drop the user, you can use the following command:

DROP USER 'monitor'@'%';

    5. Prepare ProxySQL Configuration:

To prepare the configuration, we will need the IP addresses of the MySQL containers. To find them, we can use the following commands:

docker inspect mysql_host_1
docker inspect mysql_host_2
docker inspect mysql_host_3

Each command prints the details of the corresponding MySQL container. Look for the field named "IPAddress" under your network; that is the IP address of that particular MySQL container.
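
As a shortcut, Docker's --format/-f flag can extract just the IP address:

docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' mysql_host_1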

    Example:
    mysql_host_1: 172.19.0.2

    mysql_host_2: 172.19.0.3

    mysql_host_3: 172.19.0.4


    Now, create a ProxySQL configuration file named proxysql.cnf. Include details such as IP addresses of MySQL containers, administrative credentials, and MySQL users.

    Below is the content that needs to be added to the proxysql.cnf file:

    datadir="/var/lib/proxysql"
    
    admin_variables=
    {
        admin_credentials="admin:admin;radmin:radmin"
        mysql_ifaces="0.0.0.0:6032"
        refresh_interval=2000
        hash_passwords=false
    }
    
    mysql_variables=
    {
        threads=4
        max_connections=2048
        default_query_delay=0
        default_query_timeout=36000000
        have_compress=true
        poll_timeout=2000
        interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
        default_schema="information_schema"
        stacksize=1048576
        server_version="5.1.30"
        connect_timeout_server=10000
        monitor_history=60000
        monitor_connect_interval=200000
        monitor_ping_interval=200000
        ping_interval_server_msec=10000
        ping_timeout_server=200
        commands_stats=true
        sessions_sort=true
        monitor_username="monitor"
        monitor_password="pass123"
    }
    
    mysql_servers =
    (
        { address="172.19.0.2" , port=3306 , hostgroup=10, max_connections=100 },
        { address="172.19.0.3" , port=3306 , hostgroup=20, max_connections=100 },
        { address="172.19.0.4" , port=3306 , hostgroup=30, max_connections=100 }
    )
    
    
    mysql_users =
    (
        { username = "user_shard" , password = "pass123" , default_hostgroup = 10 , active = 1 },
        { username = "user_shard" , password = "pass123" , default_hostgroup = 20 , active = 1 },
        { username = "user_shard" , password = "pass123" , default_hostgroup = 30 , active = 1 }
    )

    Most of the settings are default; we won’t go into much detail for each setting. 

    admin_variables: These variables are used for ProxySQL’s administrative interface. It allows you to connect to ProxySQL and perform administrative tasks such as configuring runtime settings, managing servers, and monitoring performance.

Inside mysql_variables, monitor_username and monitor_password specify the credentials that ProxySQL uses when connecting to the MySQL servers for monitoring purposes. This monitoring user executes queries and gathers statistics about the health and performance of the MySQL servers. This is the user we created during step 4.

mysql_servers contains all the MySQL servers we want connected to ProxySQL. Each entry has the IP address of the MySQL container, the port, the hostgroup, and max_connections. mysql_users contains all the users we created during step 4.

6. Run ProxySQL Container:

    Inside the same directory where the proxysql.cnf file is located, run the following command to start ProxySQL:

    docker run -d --rm -p 6032:6032 -p 6033:6033 -p 6080:6080 --name=proxysql --network=multi-tenant-network -v $PWD/proxysql.cnf:/etc/proxysql.cnf proxysql/proxysql

    Here, port 6032 is used for ProxySQL’s administrative interface. It allows you to connect to ProxySQL and perform administrative tasks such as configuring runtime settings, managing servers, and monitoring performance.

Port 6033 is the default port for ProxySQL's MySQL protocol interface. It handles MySQL client connections; our application will use it to reach the databases through ProxySQL and run SQL queries.

    The above command will make ProxySQL run on our Docker with the configuration provided in the proxysql.cnf file.

    Inside ProxySQL Container:

7. Access ProxySQL Admin Console:

    Now, to access the ProxySQL Docker container, use the following command:

    docker exec -it proxysql bash

    Now, once you’re inside the ProxySQL Docker container, you can access the ProxySQL admin console using the command:

    mysql -u admin -padmin -h 127.0.0.1 -P 6032

    You can run the following queries to get insights into your ProxySQL server:

    i) To get the list of all the connected MySQL servers:

    SELECT * FROM mysql_servers;

    ii) Verify the status of the MySQL backends in the monitor database tables in ProxySQL admin using the following command:

    SHOW TABLES FROM monitor;
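
On a healthy setup, this should list the monitor schema tables; for example (the exact list varies by ProxySQL version):

mysql_server_connect_log
mysql_server_ping_log
mysql_server_read_only_log
mysql_server_replication_lag_log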


If SHOW TABLES FROM monitor returns an empty set, the monitor username and password are not set correctly. You can set them with the following commands:

UPDATE global_variables SET variable_value='monitor' WHERE variable_name='mysql-monitor_username'; 
UPDATE global_variables SET variable_value='pass123' WHERE variable_name='mysql-monitor_password';
LOAD MYSQL VARIABLES TO RUNTIME; 
SAVE MYSQL VARIABLES TO DISK;

Since the variables are loaded to runtime, the change takes effect immediately; no restart of the ProxySQL container is required.

    iii) Check the status of DBs connected to ProxySQL using the following command:

    SELECT * FROM monitor.mysql_server_connect_log ORDER BY time_start_us DESC;

    iv) To get a list of all the ProxySQL global variables, use the following command:

    SELECT * FROM global_variables; 

    v) To get all the queries made on ProxySQL, use the following command:

SELECT * FROM stats_mysql_query_digest;

Note: Whenever we change any configuration rows, use the commands below to load the changes to runtime and persist them to disk:

    Change in variables:

    LOAD MYSQL VARIABLES TO RUNTIME; 
    SAVE MYSQL VARIABLES TO DISK;
    
    Change in mysql_servers:
    LOAD MYSQL SERVERS TO RUNTIME;
    SAVE MYSQL SERVERS TO DISK;
    
    Change in mysql_query_rules:
    LOAD MYSQL QUERY RULES TO RUNTIME;
    SAVE MYSQL QUERY RULES TO DISK;

Loading to runtime applies the changes immediately, and saving to disk persists them across restarts; the ProxySQL container does not need to be restarted.

    IMPORTANT:

    To connect to ProxySQL’s admin console, first get into the Docker container using the following command:

    docker exec -it proxysql bash

    Then, to access the ProxySQL admin console, use the following command:

    mysql -u admin -padmin -h 127.0.0.1 -P6032

    To access the ProxySQL MySQL console, we can directly access it using the following command without going inside the Docker ProxySQL container:

    mysql -u user_shard -ppass123 -h 127.0.0.1 -P6033

To run queries against the databases, we use ProxySQL's port 6033, where the MySQL protocol interface is exposed.

8. Define Query Rules:

    We can add custom query rules inside the mysql_query_rules table to redirect queries to specific databases based on defined patterns. Load the rules to runtime and save to disk.
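
As a quick illustration (a hypothetical rule, separate from the sharding rules defined in the next step), the following would route every SELECT ... FOR UPDATE statement to hostgroup 10:

-- illustrative rule_id; pick one that does not clash with your own rules
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply) VALUES (1, 1, '^SELECT.*FOR UPDATE$', 10, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;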

9. Sharding Example:

    Now, let’s illustrate how to leverage ProxySQL’s data-based sharding capabilities through a practical example. We’ll create three MySQL containers, each containing data from different continents in the “world” database, specifically within the “countries” table.

    Step 1: Create 3 MySQL containers named mysql_host_1, mysql_host_2 & mysql_host_3.

Inside each container, create a database named "world" with a table named "countries", as sketched below.
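
The exact schema is up to you; a minimal sketch that matches the INSERT statements below (column names and sizes are assumptions) would be:

CREATE DATABASE IF NOT EXISTS world;
USE world;
-- assumed minimal schema: id, country name, continent
CREATE TABLE countries (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    continent VARCHAR(50)
);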

    i) Inside mysql_host_1: Insert countries using the following query:

    INSERT INTO `countries` VALUES (1,'India','Asia'),(2,'Japan','Asia'),(3,'China','Asia'),(4,'USA','North America'),(5,'Cuba','North America'),(6,'Honduras','North America');

    ii) Inside mysql_host_2: Insert countries using the following query:

INSERT INTO `countries` VALUES (1,'Kenya','Africa'),(2,'Ghana','Africa'),(3,'Morocco','Africa'),(4,'Brazil','South America'),(5,'Chile','South America'),(6,'Argentina','South America');

    iii) Inside mysql_host_3: Insert countries using the following query:

INSERT INTO `countries` VALUES (1,'Italy','Europe'),(2,'Germany','Europe'),(3,'France','Europe');

Now, we have distinct data sets: Asia & North America in mysql_host_1, Africa & South America in mysql_host_2, and Europe in mysql_host_3.

    Step 2: Define Query Rules for Sharding

    Let’s create custom query rules to redirect queries based on the continent specified in the SQL statement.

    For example, if the query contains the continent “Asia,” we want it to be directed to mysql_host_1.

-- Query Rule for Asia and North America

INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, destination_hostgroup, apply) VALUES (10, 1, 'user_shard', '\s*continent\s*=\s*.*?(Asia|North America).*?\s*', 10, 0);

-- Query Rule for Africa and South America

INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, destination_hostgroup, apply) VALUES (20, 1, 'user_shard', '\s*continent\s*=\s*.*?(Africa|South America).*?\s*', 20, 0);

-- Query Rule for Europe

INSERT INTO mysql_query_rules (rule_id, active, username, match_pattern, destination_hostgroup, apply) VALUES (30, 1, 'user_shard', '\s*continent\s*=\s*.*?(Europe).*?\s*', 30, 0);

    Step 3: Apply and Save Query Rules

    After adding the query rules, ensure they take effect by running the following commands:

    LOAD MYSQL QUERY RULES TO RUNTIME; 
    SAVE MYSQL QUERY RULES TO DISK;

    Step 4: Test Sharding

    Now, access the MySQL server using the ProxySQL port and execute queries:

    mysql -u user_shard -ppass123 -h 127.0.0.1 -P 6033

    use world;

-- Example Queries:

SELECT * FROM countries WHERE id = 1 AND continent = 'Asia';

-- This will return id=1, name=India, continent=Asia

SELECT * FROM countries WHERE id = 1 AND continent = 'Africa';

-- This will return id=1, name=Kenya, continent=Africa

SELECT * FROM countries WHERE id = 1 AND continent = 'Europe';

-- This will return id=1, name=Italy, continent=Europe

    Based on the defined query rules, the queries will be redirected to the specified MySQL host groups. If no rules match, the default host group that’s specified in mysql_users inside proxysql.cnf will be used.
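
To confirm which hostgroup actually served each query, you can check the query digest statistics from the admin console (port 6032); hostgroup is one of the columns of the stats_mysql_query_digest table:

SELECT hostgroup, digest_text, count_star FROM stats_mysql_query_digest ORDER BY count_star DESC;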

    Conclusion:

    ProxySQL simplifies access to distributed data through effective sharding strategies. Its flexible query rules, combined with regex patterns and host group definitions, offer significant flexibility with relative simplicity.

    By following this step-by-step guide, users can quickly set up ProxySQL and leverage its capabilities to optimize database performance and achieve efficient data distribution.


  • How to deploy GitHub Actions Self-Hosted Runners on Kubernetes

GitHub Actions jobs run in the cloud by default; however, sometimes we want to run jobs in our own customized/private environment where we have full control. That is where a self-hosted runner comes in.

If you want a basic understanding of running self-hosted runners on a Kubernetes cluster, this blog is perfect for you.

    We’ll be focusing on running GitHub Actions on a self-hosted runner on Kubernetes. 

An example use case would be creating an automation in GitHub Actions that executes MySQL queries on a MySQL database running in a private network (i.e., a MySQL DB that is not publicly accessible).

A self-hosted runner normally requires provisioning and configuring a virtual machine instance; here, we run it on Kubernetes instead. The actions-runner-controller project makes running self-hosted runners on a Kubernetes cluster possible.

    This blog aims to try out self-hosted runners on Kubernetes and covers:

1. Deploying a MySQL database on minikube, accessible only within the Kubernetes cluster.
2. Deploying self-hosted action runners on minikube.
3. Running a GitHub Action on minikube to execute MySQL queries on the MySQL database.

    Steps for completing this tutorial:

    Create a GitHub repository

    1. Create a private repository on GitHub. I am creating it with the name velotio/action-runner-poc.

Set up a Kubernetes cluster using minikube

1. Install Docker.
2. Install minikube.
3. Install Helm.
4. Install kubectl.

    Install cert-manager on a Kubernetes cluster

• By default, actions-runner-controller uses cert-manager for certificate management of its admission webhook, so make sure cert-manager is installed on the Kubernetes cluster before installing actions-runner-controller.
• Run the helm commands shown below to install cert-manager on minikube.
• Verify the installation using "kubectl --namespace cert-manager get all". If everything is okay, the cert-manager pods, services, and deployments will all be listed and Running.
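
A typical cert-manager installation with Helm looks like the following; this is a sketch assuming the official jetstack chart repository, so check the cert-manager docs for the version and flags you need:

helm repo add jetstack https://charts.jetstack.io
helm repo update
# --set installCRDs=true installs the CRDs along with the chart (flag name may differ in newer chart versions)
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true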

Setting Up Authentication for Hosted Runners

There are two ways for actions-runner-controller to authenticate with the GitHub API (only one can be configured at a time, however):

    1. Using a GitHub App (not supported for enterprise-level runners due to lack of support from GitHub.)
    2. Using a PAT (personal access token)

    To keep this blog simple, we are going with PAT.

actions-runner-controller uses the PAT to authenticate with the GitHub API and register self-hosted runners.

• Go to account > Settings > Developer settings > Personal access tokens. Click on "Generate new token". Under scopes, select "Full control of private repositories".
• Click on the "Generate token" button.
• Copy the generated token and run the commands below to create a Kubernetes secret, which will be used by the actions-runner-controller deployment.
export GITHUB_TOKEN=XXXxxxXXXxxxxXYAVNa 

    kubectl create ns actions-runner-system

Create the secret:

kubectl create secret generic controller-manager -n actions-runner-system \
  --from-literal=github_token=${GITHUB_TOKEN}
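
You can confirm that the secret was created with:

kubectl get secret controller-manager -n actions-runner-system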

Install actions-runner-controller on the Kubernetes cluster

• Run the below helm commands:
helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
helm repo update
helm upgrade --install --namespace actions-runner-system \
  --create-namespace --wait actions-runner-controller \
  actions-runner-controller/actions-runner-controller \
  --set syncPeriod=1m

• Verify that actions-runner-controller installed properly using the below command:
    kubectl --namespace actions-runner-system get all

     

    Create a Repository Runner

• Create a RunnerDeployment Kubernetes object, which will create a self-hosted runner named k8s-action-runner for the GitHub repository velotio/action-runner-poc.
• Update the repository name from "velotio/action-runner-poc" to your own repository name.
• To create the RunnerDeployment object, create the file runner.yaml as follows:
    apiVersion: actions.summerwind.dev/v1alpha1
    kind: RunnerDeployment
    metadata:
     name: k8s-action-runner
     namespace: actions-runner-system
    spec:
     replicas: 2
     template:
       spec:
         repository: velotio/action-runner-poc

    • To create, run this command:
    kubectl create -f runner.yaml

    Check that the pod is running using the below command:

    kubectl get pod -n actions-runner-system | grep -i "k8s-action-runner"

• If everything goes well, you should see two action runners on Kubernetes, and the same runners registered on GitHub. Check under Settings > Actions > Runners in your repository.
    • Check the pod with kubectl get po -n actions-runner-system

    Install a MySQL Database on the Kubernetes cluster

    • Create PV and PVC for MySQL Database. 
    • Create mysql-pv.yaml with the below content.
    apiVersion: v1
    kind: PersistentVolume
    metadata:
     name: mysql-pv-volume
     labels:
       type: local
    spec:
     capacity:
       storage: 2Gi
     accessModes:
       - ReadWriteOnce
     hostPath:
       path: "/mnt/data"
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
     name: mysql-pv-claim
    spec:
     accessModes:
       - ReadWriteOnce
     resources:
       requests:
         storage: 2Gi

    • Create mysql namespace
    kubectl create ns mysql

    • Now apply mysql-pv.yaml to create PV and PVC 
    kubectl create -f mysql-pv.yaml -n mysql

Create the file mysql-svc-deploy.yaml with the below content.

    Here, we have used MYSQL_ROOT_PASSWORD as “password”.

    apiVersion: v1
    kind: Service
    metadata:
     name: mysql
    spec:
     ports:
       - port: 3306
     selector:
       app: mysql
     clusterIP: None
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
     name: mysql
    spec:
     selector:
       matchLabels:
         app: mysql
     strategy:
       type: Recreate
     template:
       metadata:
         labels:
           app: mysql
       spec:
         containers:
           - image: mysql:5.6
             name: mysql
             env:
                 # Use secret in real usage
               - name: MYSQL_ROOT_PASSWORD
                 value: password
             ports:
               - containerPort: 3306
                 name: mysql
             volumeMounts:
               - name: mysql-persistent-storage
                 mountPath: /var/lib/mysql
         volumes:
           - name: mysql-persistent-storage
             persistentVolumeClaim:
               claimName: mysql-pv-claim

    • Create the service and deployment
    kubectl create -f mysql-svc-deploy.yaml -n mysql

    • Verify that the MySQL database is running
    kubectl get po -n mysql
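
Optionally, you can test connectivity from inside the cluster by running a temporary MySQL client pod (a standard kubectl run pattern; the hostname mysql resolves through the headless service created above):

kubectl run -it --rm --image=mysql:5.6 --restart=Never mysql-client -n mysql -- mysql -h mysql -ppassword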

Create a GitHub repository secret to store the MySQL password

We will use the MySQL password in the GitHub Actions workflow file; as good practice, we should not keep it there in plain text. So we will store the MySQL password in GitHub secrets and reference the secret in our workflow file.

• Create a secret in the GitHub repository named "MYSQL_PASS" and set its value to "password".

    Create a GitHub workflow file

• GitHub workflows are written in YAML syntax. Each workflow gets a separate YAML file stored in the .github/workflows/ directory. So, create a .github/workflows/ directory in your repository and create a file .github/workflows/mysql_workflow.yaml as follows:
    ---
    name: Example 1
    on:
     push:
       branches: [ main ]
    jobs:
     build:
       name: Build-job
       runs-on: self-hosted
       steps:
       - name: Checkout
         uses: actions/checkout@v2
     
       - name: MySQLQuery
         env:
           PASS: ${{ secrets.MYSQL_PASS }}
         run: |
           docker run -v ${GITHUB_WORKSPACE}:/var/lib/docker --rm mysql:5.6 sh -c "mysql -u root -p$PASS -hmysql.mysql.svc.cluster.local </var/lib/docker/test.sql"

• If you check the docker run command in the mysql_workflow.yaml file, it refers to a .sql file, i.e., test.sql. So, create a test.sql file in your repository as follows:
    use mysql;
    CREATE TABLE IF NOT EXISTS Persons (
       PersonID int,
       LastName varchar(255),
       FirstName varchar(255),
       Address varchar(255),
       City varchar(255)
    );
     
    SHOW TABLES;

• In test.sql, we run MySQL queries such as creating a table.
• Push the changes to your repository's main branch.
• If everything is fine, you will see the GitHub Action being executed in a self-hosted runner pod. You can check it under the "Actions" tab of your repository.
• You can check the workflow logs to see the output of SHOW TABLES (a command we used in the test.sql file) and verify that the Persons table was created.
