
  • How to Set Up a HashiCorp Vault HA Cluster with Integrated Storage (Raft)

    As businesses move their data to the public cloud, one of the most pressing issues is how to keep it safe from unauthorized access.

    Using a tool like HashiCorp Vault gives you greater control over your sensitive credentials and helps you meet cloud security requirements.

    In this blog, we’ll walk you through a HashiCorp Vault High Availability setup.

    HashiCorp Vault

    HashiCorp Vault is an open-source tool that provides a secure, reliable way to store and distribute sensitive information like API keys, access tokens, and passwords. Vault provides high-level policy management, secret leasing, audit logging, and automatic revocation to protect this information, and it can be used via a UI, CLI, or HTTP API.

    High Availability

    Vault can run in a High Availability mode to protect against outages by running multiple Vault servers. When running in HA mode, Vault servers have two additional states, i.e., active and standby. Within a Vault cluster, only a single instance will be active, handling all requests, and all standby instances redirect requests to the active instance.

    Integrated Storage Raft

    The Integrated Storage backend is used to maintain Vault’s data. Unlike other storage backends, Integrated Storage does not operate from a single source of data. Instead, all the nodes in a Vault cluster will have a replicated copy of Vault’s data. Data gets replicated across all the nodes via the Raft Consensus Algorithm.

    Integrated Storage (Raft) is officially supported by HashiCorp.

    Architecture

    Prerequisites

    This setup requires Vault, sudo access on the machines, and the configuration below to create the cluster.

    • Install Vault v1.6.3+ent or later on all nodes in the Vault cluster 

    In this example, we have 3 CentOS VMs provisioned using VMware.

    Setup

    1. Verify the Vault version on all the nodes using the below command (in this case, we have 3 nodes: node1, node2, node3):

    vault --version

    2. Configure SSL certificates

    Note: Vault should always be used with TLS in production to provide secure communication between clients and the Vault server. It requires a certificate file and key file on each Vault host.

    We can generate the SSL certificates for the Vault cluster on one node (node1 here) and copy them to the other nodes in the cluster.

    Refer to: https://developer.hashicorp.com/vault/tutorials/secrets-management/pki-engine#scenario-introduction for generating SSL certs.

    • Copy tls.crt tls.key tls_ca.pem to /etc/vault.d/ssl/ 
    • Change ownership to `vault`
    [user@node1 ~]$ cd /etc/vault.d/ssl/           
    [user@node1 ssl]$ sudo chown vault. tls*

    • Copy tls* from /etc/vault.d/ssl to the same path on the other nodes (see the example commands below)
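
    For example, assuming SSH access between the nodes and that /etc/vault.d/ssl already exists on node2 and node3 (usernames are illustrative), the certificates could be copied roughly like this:

    [user@node1 ssl]$ scp tls.crt tls.key tls_ca.pem user@node2:/tmp/
    [user@node1 ssl]$ scp tls.crt tls.key tls_ca.pem user@node3:/tmp/
    # then, on node2 and node3:
    [user@node2 ~]$ sudo mv /tmp/tls* /etc/vault.d/ssl/ && sudo chown vault. /etc/vault.d/ssl/tls*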

    3. Configure the enterprise license. Copy license on all nodes:

    cp /root/vault.hclic /etc/vault.d/vault.hclic
    chown root:vault /etc/vault.d/vault.hclic
    chmod 0640 /etc/vault.d/vault.hclic

    4. Create the storage directory for raft storage on all nodes:

    sudo mkdir --parents /opt/raft
    sudo chown --recursive vault:vault /opt/raft

    5. Set firewall rules on all nodes:

    sudo firewall-cmd --permanent --add-port=8200/tcp
    sudo firewall-cmd --permanent --add-port=8201/tcp
    sudo firewall-cmd --reload

    6. Create vault configuration file on all nodes:

    ### Node 1 ###
    [user@node1 vault.d]$ cat vault.hcl
    storage "raft" {
        path = "/opt/raft"
        node_id = "node1"
        retry_join {
            leader_api_addr = "https://node2.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
        retry_join {
            leader_api_addr = "https://node3.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
    }
    
    listener "tcp" {
       address = "0.0.0.0:8200"
       tls_disable = false
       tls_cert_file = "/etc/vault.d/ssl/tls.crt"
       tls_key_file = "/etc/vault.d/ssl/tls.key"
       tls_client_ca_file = "/etc/vault.d/ssl/tls_ca.pem"
       tls_cipher_suites = "TLS_TEST_128_GCM_SHA256,TLS_TEST20_POLY1305,TLS_TEST_256_GCM_SHA384"
    }
    api_addr = "https://node1.int.us-west-1-dev.central.example.com:8200"
    cluster_addr = "https://node1.int.us-west-1-dev.central.example.com:8201"
    disable_mlock = true
    ui = true
    log_level = "trace"
    disable_cache = true
    cluster_name = "POC"
    
    # Enterprise license_path
    # This will be required for enterprise as of v1.8
    license_path = "/etc/vault.d/vault.hclic"

    ### Node 2 ###
    [user@node2 vault.d]$ cat vault.hcl
    storage "raft" {
        path = "/opt/raft"
        node_id = "node2"
        retry_join {
            leader_api_addr = "https://node1.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
        retry_join {
            leader_api_addr = "https://node3.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        } 
    }
    
    listener "tcp" {
       address = "0.0.0.0:8200"
       tls_disable = false
       tls_cert_file = "/etc/vault.d/ssl/tls.crt"
       tls_key_file = "/etc/vault.d/ssl/tls.key"
       tls_client_ca_file = "/etc/vault.d/ssl/tls_ca.pem"
       tls_cipher_suites = "TLS_TEST_128_GCM_SHA256,TLS_TEST20_POLY1305,TLS_TEST_256_GCM_SHA384"
    }
    api_addr = "https://node2.int.us-west-1-dev.central.example.com:8200"
    cluster_addr = "https://node2.int.us-west-1-dev.central.example.com:8201"
    disable_mlock = true
    ui = true
    log_level = "trace"
    disable_cache = true
    cluster_name = "POC"
    
    # Enterprise license_path
    # This will be required for enterprise as of v1.8
    license_path = "/etc/vault.d/vault.hclic"

    ### Node 3 ###
    [user@node3 ~]$ cat /etc/vault.d/vault.hcl
    storage "raft" {
        path = "/opt/raft"
        node_id = "node3"
        retry_join {
            leader_api_addr = "https://node1.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
        retry_join {
            leader_api_addr = "https://node2.int.us-west-1-dev.central.example.com:8200"
            leader_ca_cert_file = "/etc/vault.d/ssl/tls_ca.pem"
            leader_client_cert_file = "/etc/vault.d/ssl/tls.crt"
            leader_client_key_file = "/etc/vault.d/ssl/tls.key"
        }
    }
    
    listener "tcp" {
       address = "0.0.0.0:8200"
       tls_disable = false
       tls_cert_file = "/etc/vault.d/ssl/tls.crt"
       tls_key_file = "/etc/vault.d/ssl/tls.key"
       tls_client_ca_file = "/etc/vault.d/ssl/tls_ca.pem"
       tls_cipher_suites = "TLS_TEST_128_GCM_SHA256,TLS_TEST20_POLY1305,TLS_TEST_256_GCM_SHA384"
    }
    api_addr = "https://node3.int.us-west-1-dev.central.example.com:8200"
    cluster_addr = "https://node3.int.us-west-1-dev.central.example.com:8201"
    disable_mlock = true
    ui = true
    log_level = "trace"
    disable_cache = true
    cluster_name = "POC"
    
    # Enterprise license_path
    # This will be required for enterprise as of v1.8
    license_path = "/etc/vault.d/vault.hclic"

    7. Set environment variables on all nodes:

    export VAULT_ADDR=https://$(hostname):8200
    export VAULT_CACERT=/etc/vault.d/ssl/tls_ca.pem
    export CA_CERT=`cat /etc/vault.d/ssl/tls_ca.pem`
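
    Optionally, persist these variables so they survive new login sessions (a small sketch; adjust the profile file to your shell):

    echo "export VAULT_ADDR=https://$(hostname):8200" >> ~/.bash_profile
    echo "export VAULT_CACERT=/etc/vault.d/ssl/tls_ca.pem" >> ~/.bash_profile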

    8. Start Vault as a service on all nodes:

    You can view the systemd unit file if interested, then enable and start the service:

    cat /etc/systemd/system/vault.service
    systemctl enable vault.service
    systemctl start vault.service
    systemctl status vault.service
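
    For reference, the unit file shipped with the Vault package typically looks roughly like the following (trimmed; the packaged unit also includes several hardening options, and paths may differ by install method):

    [Unit]
    Description="HashiCorp Vault - A tool for managing secrets"
    Requires=network-online.target
    After=network-online.target
    ConditionFileNotEmpty=/etc/vault.d/vault.hcl

    [Service]
    User=vault
    Group=vault
    ExecStart=/usr/bin/vault server -config=/etc/vault.d/vault.hcl
    ExecReload=/bin/kill --signal HUP $MAINPID
    Restart=on-failure
    LimitMEMLOCK=infinity

    [Install]
    WantedBy=multi-user.target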

    9. Check Vault status on all nodes:

    vault status
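
    At this point the cluster is not yet initialized, so the output on each node should report roughly the following key fields (abridged):

    Initialized        false
    Sealed             true
    Storage Type       raft
    HA Enabled         true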

    10. Initialize Vault with the following command on node1 only, and store the unseal key securely. (A single key share and a threshold of 1 are used here for simplicity; production clusters should use more shares and a higher threshold.)

    [user@node1 vault.d]$ vault operator init -key-shares=1 -key-threshold=1
    Unseal Key 1: HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
    Initial Root Token: hvs.j4qTq1IZP9nscILMtN2p9GE0
    Vault initialized with 1 key shares and a key threshold of 1.
    Please securely distribute the key shares printed above. 
    When the Vault is re-sealed, restarted, or stopped, you must supply at least 1 of these keys to unseal it
    before it can start servicing requests.
    Vault does not store the generated root key. 
    Without at least 1 keys to reconstruct the root key, Vault will remain permanently sealed!
    It is possible to generate new unseal keys, provided you have a
    quorum of existing unseal keys shares. See "vault operator rekey" for more information.

    11. Set the Vault token environment variable so the vault CLI can authenticate to the server. Use the following command, replacing <initial-root-token> with the value generated in the previous step.

    export VAULT_TOKEN=<initial-root-token>
    echo "export VAULT_TOKEN=$VAULT_TOKEN" >> /root/.bash_profile
    ### Repeat this step for the other 2 servers.

    12. Unseal Vault1 using the unseal key generated in step 10. Notice the Unseal Progress value change as you provide each key. Once the key threshold is met, the value of Sealed should change from true to false.

    [user@node1 vault.d]$ vault operator unseal HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
    Key                         Value
    ---                         -----
    Seal Type                   shamir
    Initialized                 true
    Sealed                      false
    Total Shares                1
    Threshold                   1
    Version                     1.11.0
    Build Date                  2022-06-17T15:48:44Z
    Storage Type                raft
    Cluster Name                POC
    Cluster ID                  109658fe-36bd-7d28-bf92-f095c77e860c
    HA Enabled                  true
    HA Cluster                  https://node1.int.us-west-1-dev.central.example.com:8201
    HA Mode                     active
    Active Since                2022-06-29T12:50:46.992698336Z
    Raft Committed Index        36
    Raft Applied Index          36

    13. Unseal Vault2 (Use the same unseal key generated in step 10 for Vault1):

    [user@node2 vault.d]$ vault operator unseal HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
    Key                Value
    ---                -----
    Seal Type          shamir
    Initialized        true
    Sealed             true
    Total Shares       1
    Threshold          1
    Unseal Progress    0/1
    Unseal Nonce       n/a
    Version            1.11.0
    Build Date         2022-06-17T15:48:44Z
    Storage Type       raft
    HA Enabled         true
    
    [user@node2 vault.d]$ vault status
    Key                   Value
    ---                   -----
    Seal Type             shamir
    Initialized           true
    Sealed                true
    Total Shares          1
    Threshold             1
    Version               1.11.0
    Build Date            2022-06-17T15:48:44Z
    Storage Type          raft
    Cluster Name          POC
    Cluster ID            109658fe-36bd-7d28-bf92-f095c77e860c
    HA Enabled            true
    HA Cluster            https://node1.int.us-west-1-dev.central.example.com:8201
    HA Mode               standby
    Active Node Address   https://node1.int.us-west-1-dev.central.example.com:8200
    Raft Committed Index  37
    Raft Applied Index    37

    14. Unseal Vault3 (Use the same unseal key generated in step 10 for Vault1):

    [user@node3 ~]$ vault operator unseal HPY/g5OiT8ivD6L4Bqfjx9L1We2MVb4WZAqKZk6zFf8=
    Key                Value
    ---                -----
    Seal Type          shamir
    Initialized        true
    Sealed             true
    Total Shares       1
    Threshold          1
    Unseal Progress    0/1
    Unseal Nonce       n/a
    Version            1.11.0
    Build Date         2022-06-17T15:48:44Z
    Storage Type       raft
    HA Enabled         true
    
    [user@node3 ~]$ vault status
    Key                       Value
    ---                       -----
    Seal Type                 shamir
    Initialized               true
    Sealed                    false
    Total Shares              1
    Threshold                 1
    Version                   1.11.0
    Build Date                2022-06-17T15:48:44Z
    Storage Type              raft
    Cluster Name              POC
    Cluster ID                109658fe-36bd-7d28-bf92-f095c77e860c
    HA Enabled                true
    HA Cluster                https://node1.int.us-west-1-dev.central.example.com:8201
    HA Mode                   standby
    Active Node Address       https://node1.int.us-west-1-dev.central.example.com:8200
    Raft Committed Index      39
    Raft Applied Index        39

    15. Check the cluster’s raft status with the following command:

    [user@node3 ~]$ vault operator raft list-peers
    Node      Address                                            State       Voter
    ----      -------                                            -----       -----
    node1    node1.int.us-west-1-dev.central.example.com:8201    leader      true
    node2    node2.int.us-west-1-dev.central.example.com:8201    follower    true
    node3    node3.int.us-west-1-dev.central.example.com:8201    follower    true

    16. Currently, node1 is the active node. We can experiment to see what happens if node1 steps down from its active node duty.

    In the terminal where VAULT_ADDR is set to: https://node1.int.us-west-1-dev.central.example.com, execute the step-down command.

    $ vault operator step-down # forces the active node to give up leadership (similar in effect to stopping the node or its systemd service)
    Success! Stepped down: https://node1.int.us-west-1-dev.central.example.com:8200

    In the terminal where VAULT_ADDR is set to https://node2.int.us-west-1-dev.central.example.com:8200, examine the raft peer set.

    [user@node1 ~]$ vault operator raft list-peers
    Node      Address                                            State       Voter
    ----      -------                                            -----       -----
    node1    node1.int.us-west-1-dev.central.example.com:8201    follower    true
    node2    node2.int.us-west-1-dev.central.example.com:8201    leader      true
    node3    node3.int.us-west-1-dev.central.example.com:8201    follower    true

    Conclusion 

    Vault servers are now operational in High Availability mode. We can test this by writing a secret from either the active or a standby Vault instance and seeing it succeed, which exercises request forwarding. We can also shut down the active Vault instance (sudo systemctl stop vault) to simulate a system failure and watch a standby instance assume leadership.
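
    For example, a minimal smoke test might look like the following, assuming VAULT_ADDR and VAULT_TOKEN are set as in the steps above (the mount path and secret name are arbitrary):

    [user@node2 ~]$ vault secrets enable -path=secret kv-v2
    [user@node2 ~]$ vault kv put secret/smoke-test message="request forwarding works"
    [user@node3 ~]$ vault kv get secret/smoke-test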

  • Modern Data Stack: The What, Why and How?

    This post will provide you with a comprehensive overview of the modern data stack (MDS), including its benefits, how its components differ from their predecessors, and what its future holds.

    “Modern” has the connotation of being up-to-date, of being better. This is true for MDS, but how exactly is MDS better than what was before?

    What was the data stack like?…

    A few decades back, the MapReduce technological breakthrough made it possible to efficiently process large amounts of data in parallel on multiple machines.

    It provided the backbone of a standard pipeline that looked like:

    It was common to see HDFS used for storage, Spark for computing, and Hive to perform SQL queries on top.

    To run this, we had people handling the deployment and maintenance of Hadoop on their own.

    This self-managed, on-prem nature of the setup eventually became a pain point and made it complex and inefficient in the long run.

    Being on-prem while facing ever-growing loads meant scalability became a huge concern.

    Hence, unlike today, the process was much more manual. Adding more RAM, increasing storage, and rolling out updates by hand reduced productivity.

    Moreover,

    • The pipeline wasn’t modular; components were tightly coupled, causing failures when deciding to shift to something new.
    • Teams committed to specific vendors and found themselves locked in, by design, for years.
    • Setup was complex, and the infrastructure was not resilient. Random surges in data crashed the systems. (This randomness in demand has only increased since the early days of the internet, due to social-media-triggered virality.)
    • Self-service was non-existent. If you wanted to do anything with your data, you needed data engineers.
    • Observability was a myth. Your pipeline is failing, but you’re unaware, and then you don’t know why, where, how…Your customers become your testers, knowing more about your system’s issues.
    • Data protection laws weren’t as formalized, and policies within organizations were often lacking. These issues made the traditional setup inefficient at solving modern problems.

    For an upgraded, modern setup, we needed something that is scalable, has a smaller learning curve, and is feasible for both a seed-stage startup and a Fortune 500 company.

    Standing on the shoulders of tech innovations from the 2000s, data engineers started building a blueprint for MDS tooling with three core attributes: 

    Cloud Native (or the ocean)

    Arguably the definitive change of the MDS era, the cloud removes the hassle of on-prem infrastructure and enables horizontal or vertical auto-scaling, a technical necessity in an era of virality and traffic spikes.

    Modularity

    The M in MDS could stand for modular.

    You can integrate any MDS tool into your existing stack, like LEGO blocks.

    You can test out multiple tools, whether they’re open source or managed, choose the best fit, and iteratively build out your data infrastructure.

    This mindset helps instill a habit of avoiding vendor lock-in by continuously upgrading your architecture with relative ease.

    By moving away from the ancient, one-size-fits-all model, MDS recognizes the uniqueness of each company’s budget, domain, data types, and maturity—and provides the correct solution for a given use case.

    Ease of Use

    MDS tools are easier to set up. You can start playing with these tools within a day.

    Importantly, the ease of use is not limited to technical engineers.

    Owing to the rise of self-serve and no-code tools like Tableau, data is finally democratized for all kinds of consumers. SQL remains crucial, but for basic metric calculations, PMs, Sales, Marketing, etc., can use a simple drag and drop in the UI (sometimes even simpler than Excel pivot tables).

    MDS also enables one to experiment with different architectural frameworks for their use case. For example, ELT vs. ETL (explained under Data Transformation).

    But, one might think such improvements mean MDS is the v1.1 of Data Stack, a tech upgrade that ultimately uses data to solve similar problems.

    Fortunately, that’s far from the case.

    MDS enables data to solve more human problems across the org—problems that employees have long been facing but could never systematically solve for, helping generate much more value from the data.

    Beyond these, employees want transparency and visibility into how any metric was calculated and which data source in Snowflake was used to build which specific Tableau dashboard.

    Critically, with compliance finally being focused on, orgs need solutions for giving the right people the right access at the right time.

    Lastly, as opposed to previous eras, these days even startups have varied data infrastructure components; if you’re a PM tasked with bringing insights, how do you know where to start? What data assets does the organization have?

    Besides these problem statements being tackled, MDS builds a culture of upskilling employees in various data concepts.

    Data security, governance, and data lineage are important irrespective of department or persona in the organization.

    From designers to support executives, the need for a data-driven culture is a given.

    You’re probably bored of hearing how good the MDS is and want to deconstruct it into its components.

    Let’s dive in.

    SOURCES

    In our modern era, every product is inevitably becoming a tech product.

    From a smart bulb to an orbiting satellite, each generates data in its own unique flavor of frequency of generation, data format, data size, etc.

    Social media, microservices, IoT devices, smart devices, DBs, CRMs, ERPs, flat files, and a lot more…

    INGESTION

    Once data is created, how does one “ingest” or take in that data for actual usage (the whole point of investing in data)?

    Roughly, there are three categories to help describe the ingestion solutions:

    Generic tools allow us to connect various data sources with data stores.

    E.g.: we can connect Google Ads or Salesforce to dump data into BigQuery or S3.

    These generic tools highlight the modularity and low-/no-code aspects of MDS.

    Things are as easy as drag and drop, and one doesn’t need to be fluent in scripting.

    Then we have programmable tools as well, where we get more control over how we ingest data through code.

    For example, we can write Apache Airflow DAGs in Python to load data from S3 and dump it to Redshift.

    Intermediary – these tools cater to a specific use case or are coupled with the source itself.

    E.g. – Snowpipe, which is part of Snowflake itself, allows us to load data from files as soon as they are available at the source.

    DATA STORAGE

    Where do you ingest data into?

    Here, we’ve expanded from HDFS & SQL DBs to a wider variety of formats (NoSQL, document DBs).

    Depending on the use case and the way you interact with data, you can choose from a DW, DB, DL, ObjectStores, etc.

    You might need a standard relational DB for transactions in finance, or you might be collecting logs. You might be experimenting with your product at an early stage and be fine with NoSQL, without worrying about prescribing schemas.

    One key feature to note is that most are cloud-based, so there is no more worrying about scalability, and we pay only for what we use.

    PS: Do stick around till the end for newer concepts like the lakehouse and reverse ETL (already prevalent in the industry).

    DATA TRANSFORMATION

    The stored raw data must be cleaned and restructured into the shape we deem best for actual usage. This slicing and dicing is different for every kind of data.

    For example, we have tools for the ETL way, which can be categorized into SaaS offerings and frameworks, e.g., Fivetran and Spark, respectively.

    Interestingly, the cloud era has given storage systems computational capability, such that we sometimes don’t even need an external system for transformation.

    With this rise of ELT, we leverage the processing capabilities of cloud data warehouses or lakehouses. Using tools like dbt, we write templated SQL queries to transform our data in the warehouse or lakehouse itself.

    This enables analysts to perform the heavy lifting of traditional data engineering problems.

    We also see stream processing, for applications where small units of data are processed in real time (analyzed as soon as they’re produced, as opposed to in large batches).

    DATA VISUALIZATION

    The ability to visually learn from data has only improved in the MDS era with advanced design, methodology, and integration.

    With Embedded analytics, one can integrate analytical capabilities and data visualizations into the software application itself.

    External analytics, on the other hand, is built using your processed data. You choose your source, create a chart, and let it run.

    DATA SCIENCE, MACHINE LEARNING, MLOps

    Source: https://medium.com/vertexventures/thinking-data-the-modern-data-stack-d7d59e81e8c6

    In the last decade, we have moved beyond ad-hoc insight generation in Jupyter notebooks to production-ready, real-time ML workflows, like recommendation systems and price predictions. Any startup can and does integrate ML into its products.

    Most cloud service providers offer machine learning models and automated model building as a service.

    MDS concepts like data observability are used to build tools for ML practitioners, whether it’s feature stores (a feature store is a central repository that provides entity feature values as of a certain time) or model monitoring (checking data drift, tracking model performance, and improving model accuracy).

    This is extremely important, as statisticians can focus on the business problem, not the infrastructure.

    This is an ever-expanding field where concepts such as MLOps (DevOps for ML pipelines—optimizing workflows, efficient transformations) and synthetic media (using AI to generate content itself) arrive and quickly become mainstream.

    ChatGPT is the current buzz, but by the time you’re reading this, I’m sure there’s going to be an updated one—such is the pace of development.

    DATA ORCHESTRATION

    With a higher number of modularized tools and source systems comes increased complexity.

    More steps, processes, connections, settings, and synchronization are required.

    Data orchestration in MDS needs to be Cron on steroids.

    Using a wide variety of products, MDS tools help bring the right data for the right purposes based on complex logic.

     

    DATA OBSERVABILITY

    Data observability is the ability to monitor and understand the state and behavior of data as it flows through an organization’s systems.

    In a traditional data stack, organizations often rely on reactive approaches to data management, only addressing issues as they arise. In contrast, data observability in an MDS involves adopting a proactive mindset, where organizations actively monitor and understand the state of their data pipelines to identify potential issues before they become critical.

    • Monitoring – a dashboard that provides an operational view of your pipeline or system
    • Alerting – both for expected events and anomalies
    • Tracking – the ability to set and track specific events
    • Analysis – automated issue detection that adapts to your pipeline and data health
    • Logging – a record of an event in a standardized format for faster resolution
    • SLA Tracking – measuring data quality against predefined standards (cost, performance, reliability)
    • Data Lineage – a graph representation of data assets showing upstream/downstream steps

    DATA GOVERNANCE & SECURITY

    Data security is a critical consideration for organizations of all sizes and industries and needs to be prioritized to protect sensitive information, ensure compliance, and preserve business continuity. 

    The introduction of stricter data protection regulations, such as the General Data Protection Regulation (GDPR) and CCPA, created a huge need in the market for MDS tools that efficiently and painlessly help organizations govern and secure their data.

    DATA CATALOG

    Now that we have all the components of MDS, from ingestion to BI, we have so many sources, as well as dashboards, reports, views, and other metadata, that we need a Google-like search engine just to navigate our components.

    This is where a data catalog helps; it allows people to stitch together the metadata (data about your data: the number of rows in a table, the column names, types, etc.) across sources.

    This is necessary to help efficiently discover, understand, trust, and collaborate on data assets.

    We don’t want PMs & GTM to look at different dashboards for adoption data.

    Take Netflix as an example. Previously, the sole purpose of its original data pipeline was to aggregate and upload events to Hadoop/Hive for batch processing. Chukwa collected events and wrote them to S3 in Hadoop sequence file format. In those days, end-to-end latency was up to 10 minutes, which was sufficient for batch jobs that usually scan data at daily or hourly frequency.

    With the emergence of Kafka and Elasticsearch over the last decade, there has been a growing demand for real-time analytics at Netflix. By real-time, we mean sub-minute latency. Instead of starting from scratch, Netflix was able to iteratively grow its MDS as market requirements changed.

    Source: https://blog.transform.co/data-talks/the-metric-layer-why-you-need-it-examples-and-how-it-fits-into-your-modern-data-stack/

     

    This is a snapshot of the MDS a data-mature company like Netflix had some years back, where, instead of a few all-in-one tools, each data category was handled by a specialized tool.

    FUTURE COMPONENTS OF MDS?

    DATA MESH

    Source: https://martinfowler.com/articles/data-monolith-to-mesh.html

    The top picture shows how teams currently operate: no matter the feature or product on the Y axis, the data pipeline’s journey remains the same moving along the X axis. But in an ideal world of data mesh, those who know the data should own its journey.

    As decentralization is the name of the game, data mesh is MDS’s response to this demand for an architecture shift where domain owners use self-service infrastructure to shape how their data is consumed.

    DATA LAKEHOUSE

    Source: https://www.altexsoft.com/blog/data-lakehouse/

    We have talked about data warehouses and data lakes being used for data storage.

    Initially, when we only needed structured data, data warehouses were used. Later, with big data, we started getting all kinds of data, structured and unstructured.

    So, we started using Data Lakes, where we just dumped everything.

    The lakehouse tries to combine the best of both worlds by adding an intelligent metadata layer on top of the data lake. This layer basically classifies and categorizes data such that it can be interpreted in a structured manner.

    Also, all the data in the lakehouse is open, meaning it can be utilized by all kinds of tools. Lakehouses are generally built on top of open data formats like Parquet so that they can be easily accessed by all the tools.

    End users can simply run their SQLs as if they’re querying a DWH. 

    REVERSE ETL

    Suppose you’re a salesperson using Salesforce and want to know if a lead you just got is warm or cold (warm indicating a higher chance of conversion).

    The attributes of your lead, like salary and age, are fetched from your OLTP system into a DWH and analyzed, and then the flag “warm” is sent back to the Salesforce UI, ready to be used in live operations.

    METRICS LAYER

    The Metric layer will be all about consistency, accessibility, and trust in the calculations of metrics.

    Earlier, for metrics, you had v1 and v1.1 Excel files with logic scattered around.

    Currently, in the modern data stack world, each team’s calculation is isolated in the tool they use. For example, BI would store metrics in Tableau dashboards while DEs would use code.

    A metric layer would exist to ensure global access of the metrics to every other tool in the data stack.

    For example, the dbt metrics layer helps define these in the warehouse—something accessible to both BI and engineers. Similarly, Looker, Mode, and others have their own unique approach to it.

    In summary, this blog post discussed the modern data stack and its advantages over older approaches. We examined the components of the modern data stack, including data sources, ingestion, transformation, and more, and how they work together to create an efficient and effective system for data management and analysis. We also highlighted the benefits of the modern data stack, including increased efficiency, scalability, and flexibility. 

    As technology continues to advance, the modern data stack will evolve and incorporate new components and capabilities.

  • Best Practices for Kafka Security

    Overview

    We will cover the security concepts of Kafka and walk through the implementation of encryption, authentication, and authorization for the Kafka cluster.

    This article will explain how to configure SASL_SSL security for your Kafka cluster and how to protect data in transit. SASL (Simple Authentication and Security Layer) over SSL is a communication type in which clients use authentication mechanisms like PLAIN, SCRAM, etc., and the server uses SSL certificates to establish secure communication. We will use the SCRAM authentication mechanism here for the client, which helps establish mutual authentication between the client and server. We’ll also discuss authorization and ACLs, which are important for securing your cluster.

    Prerequisites

    A running Kafka cluster and a basic understanding of security components.

    Need for Kafka Security

    The primary reason is to prevent unauthorized access and the misuse, modification, disruption, or disclosure of data. So, to understand security in a Kafka cluster, we need to know three terms:

    • Authentication – The process by which the server verifies the identity of a client before granting it access.
    • Authorization – Implemented alongside authentication, it determines what an authenticated client is allowed to do. Basically, it gives each client only the limited access that is sufficient for it.
    • Encryption – The process of transforming data so that it is unreadable without a decryption key. Encryption ensures that no other party can intercept and steal or read the data.

    Here is the quick start guide by Apache Kafka, so check it out if you still need to set up Kafka.

    https://kafka.apache.org/quickstart

    We’ll not cover the theoretical aspects here, but you can find a ton of sources on how these three components work internally. For now, we’ll focus on the implementation part and how Kafka revolves around security.

    This image illustrates SSL communication between the Kafka client and server.

    We are going to implement the steps in the below order:

    • Create a Certificate Authority
    • Create a Truststore & Keystore

    Certificate Authority – It is a trusted entity that issues SSL certificates. As such, a CA is an independent entity that acts as a trusted third party, issuing certificates for use by others. A certificate authority validates the credentials of a person or organization that requests a certificate before issuing one.

    Truststore – A truststore contains certificates from other parties with which you want to communicate or certificate authorities that you trust to identify other parties. In simple words, a list of CAs that can validate the certificate signed by the trusted CA.

    KeyStore – A KeyStore contains private keys and certificates with their corresponding public keys. Keystores can have one or more CA certificates depending upon what’s needed.

    For the Kafka server, we need a server certificate, and here the KeyStore comes into the picture, since it stores the server certificate. The server certificate should be signed by a Certificate Authority (CA): a signing request is generated from the KeyStore, and in response, the CA sends back a signed certificate that is imported into the KeyStore.

    We will create our own certificate authority for demonstration purposes. If you don’t want to create a private certificate authority, there are many certificate providers you can go with, like IdenTrust and GoDaddy. Since we are creating one, we need to tell our Kafka client to trust our private certificate authority using the Trust Store.

    This block diagram shows you how all the components communicate with each other and their role to generate the final certificate.

    So, let’s create our Certificate Authority. Run the below command in your terminal:

    openssl req -new -x509 -keyout <private_key_name> -out <public_certificate_name> -days <validity_days>

    It will ask for a passphrase; keep it safe for future use. After successfully executing the command, we should have two files named private_key_name and public_certificate_name.

    Now, let’s create a KeyStore and trust store for brokers; we need both because brokers also interact internally with each other. Let’s understand with the help of an example: Broker A wants to connect with Broker B, so Broker A acts as a client and Broker B as a server. We are using the SASL_SSL protocol, so A needs SASL credentials, and B needs a certificate for authentication. The reverse is also possible where Broker B wants to connect with Broker A, so we need both a KeyStore and a trust store for authentication.

    Now let’s create a trust store. Execute the below command in the terminal, and it should ask for the password. Save the password for future use:

    keytool -keystore <truststore_name.jks> -alias <alias name of the entry to process> -import -file <public_certificate_name>

    Here, we are using the .jks extension for the file, which stands for Java KeyStore. You can also use Public-Key Cryptography Standards #12 (pkcs12) instead of .jks, but that’s totally up to you. public_certificate_name is the same certificate we generated when creating the CA.
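
    To confirm that the CA certificate was imported, you can list the truststore contents with keytool (it will prompt for the truststore password):

    keytool -list -v -keystore <truststore_name.jks>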

    For the KeyStore configuration, run the below command and store the password:

    keytool -genkey -keystore <keystore_name.jks> -validity <number_of_days> -storepass <store_password> -alias <alias_name> -keyalg <key_algorithm_name> -ext SAN=DNS:localhost

    This action creates the KeyStore file in the current working directory. The question “First and Last Name” requires you to enter a fully qualified domain name because some certificate authorities, such as VeriSign, expect this property to be a fully qualified domain name. Not all CAs require a fully qualified domain name, but I recommend using a fully qualified domain name for portability. All other information should be valid. If the information cannot be verified, a certificate authority such as VeriSign will not sign the CSR generated for that record. I’m using localhost for the domain name here, as seen in the above command itself.

    The KeyStore now has an entry under alias_name, containing the private key and the information needed for generating a CSR. Now let’s create a certificate signing request, which will be used to get a signed certificate from the Certificate Authority.

    Execute the below command in your terminal:

    keytool -keystore <keystore_name.jks> -alias <alias_name> -certreq -file <file_name.csr>

    So, we have generated a signing certificate request using a KeyStore (the KeyStore name and alias name should be the same). It should ask for the KeyStore password, so enter the same one used while creating the KeyStore.

    Now, execute the below command. It will ask for the password, so enter the CA password, and now we have a signed certificate:

    openssl x509 -req -CA <public_certificate_name> -CAkey <private_key_name> -in <csr_file> -out <signed_file_name> -CAcreateserial

    Finally, we need to add the public certificate of CA and signed certificate in the KeyStore, so run the below command. It will add the CA certificate to the KeyStore.

    keytool -keystore <keystore_name.jks> -alias <public_certificate_name> -import -file <public_certificate_name>

    Now, let’s run the below command; it will add the signed certificate to the KeyStore.

    keytool -keystore <keystore_name.jks> -alias <alias_name> -import -file <signed_file_name>

    As of now, we have generated all the security files for the broker. For internal broker communication, we are using SASL_SSL (see security.inter.broker.protocol in server.properties). Now we need to create a broker username and password using the SCRAM mechanism.

    Run the below command:

    kafka-configs.sh --zookeeper <host:port> --entity-type users --entity-name <username> --alter --add-config 'SCRAM-SHA-512=[password=<password>]'

    NOTE: Credentials for inter-broker communication must be created before Kafka brokers are started.

    Now, we need to configure the Kafka broker property file, so update the file as given below:

    listeners=SASL_SSL://localhost:9092
    advertised.listeners=SASL_SSL://localhost:9092
    ssl.truststore.location={path/to/truststore_name.jks}
    ssl.truststore.password={truststore_password}
    ssl.keystore.location={/path/to/keystore_name.jks}
    ssl.keystore.password={keystore_password}
    security.inter.broker.protocol=SASL_SSL
    ssl.client.auth=none
    ssl.protocol=TLSv1.2
    sasl.enabled.mechanisms=SCRAM-SHA-512
    sasl.mechanism.inter.broker.protocol=SCRAM-SHA-512
    listener.name.sasl_ssl.scram-sha-512.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username={username} password={password};
    super.users=User:{username}

    NOTE: If you are using an external JAAS config file, then remove the ScramLoginModule line and set this environment variable before starting the broker: export KAFKA_OPTS=-Djava.security.auth.login.config={path/to/broker.conf}

    Now, if we run Kafka, the broker should be running on port 9092 without any failure. If you have multiple brokers, the same config file can be replicated among them, but the port should be different for each broker.
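
    As a quick sanity check of the TLS listener (assuming the broker runs on localhost:9092), you can inspect the handshake with openssl; the broker’s certificate should be printed:

    openssl s_client -connect localhost:9092 -tls1_2 </dev/null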

    Producers and consumers need a username and a password to access the broker, so let’s create their credentials and update respective configurations.

    Create a producer user and update producer.properties inside the config directory. Execute the below command in your terminal:

    bin/kafka-configs.sh --zookeeper <host:port> --entity-type users --entity-name <producer_name> --alter --add-config 'SCRAM-SHA-512=[password=<password>]'

    We need a truststore file for our clients (producer and consumer), but as we already know how to create a truststore, this is a small task for you. It is suggested that producers and consumers have separate truststores because, when we move Kafka to production, there could be multiple producers and consumers on different machines. Then add the following to producer.properties:

    security.protocol=SASL_SSL
    ssl.protocol=TLSv1.2
    ssl.truststore.location={path/to/client.truststore.jks}
    ssl.truststore.password={password}
    sasl.mechanism=SCRAM-SHA-512
    sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username={producer_name} password={password};

    The below command creates a consumer user. Run it, then update consumer.properties inside the config directory with the properties that follow:

    bin/kafka-configs.sh --zookeeper <host:port> --entity-type users --entity-name <consumer_name> --alter --add-config 'SCRAM-SHA-512=[password=<password>]'

    security.protocol=SASL_SSL
    ssl.protocol=TLSv1.2
    ssl.truststore.location={path/to/client.truststore.jks}
    ssl.truststore.password={password}
    sasl.mechanism=SCRAM-SHA-512
    sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username={consumer_name} password={password};

    As of now, we have implemented encryption and authentication for Kafka brokers. To verify that our producer and consumer are working properly with SCRAM credentials, run the console producer and consumer on some topics.

    Authorization is not implemented yet. Kafka uses access control lists (ACLs) to specify which users can perform which actions on specific resources or groups of resources. Each ACL has a principal, a permission type, an operation, a resource type, and a name.

    The default authorizer is AclAuthorizer, provided by Kafka; Confluent also provides the Confluent Server Authorizer, which is quite different from AclAuthorizer. An authorizer is a server plugin used by Kafka to authorize actions. Specifically, the authorizer controls whether operations should be authorized based on the principal and the resource being accessed.

    Format of ACLs – Principal P is [Allowed/Denied] Operation O from Host H on any Resource R matching ResourcePattern RP

    Execute the below command to create an ACL with writing permission for the producer:

    bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<host:port> --add --allow-principal User:<producer_name> --operation WRITE --topic <topic_name>

    The above command creates an ACL allowing the WRITE operation for producer_name on topic_name.

    Now, execute the below command to create an ACL with reading permission for the consumer:

    bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<host:port> --add --allow-principal User:<consumer_name> --operation READ --topic <topic_name>

    Now we need to grant access to the consumer group this consumer will use, so the below command allows the consumer to READ from a given consumer group:

    bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<host:port> --add --allow-principal User:<consumer_name> --operation READ --group <consumer_group_name>
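
    To verify what was created, the ACLs can be listed for the topic and the consumer group:

    bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<host:port> --list --topic <topic_name>
    bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<host:port> --list --group <consumer_group_name>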

    Now, we need to add some configuration in two files: server.properties and consumer.properties.

    # Authorizer class
    authorizer.class.name=kafka.security.authorizer.AclAuthorizer

    The above line, added to server.properties, indicates that the AclAuthorizer class is used for authorization.

    # consumer group id
    group.id=<consumer_group_name>

    The consumer group ID is mandatory; if we do not specify any group, a consumer will not be able to access data from topics, so a group ID should be provided before starting a consumer.

    Let’s test the producer and consumer one by one: run the console producer, and run the console consumer in another terminal; both should run without errors.

    console-producer
    console-consumer
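
    For reference, the console clients can be pointed at the secured listener roughly like this (topic name, paths, and port are examples; adjust them to your setup):

    bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic <topic_name> --producer.config config/producer.properties
    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic <topic_name> --group <consumer_group_name> --consumer.config config/consumer.properties --from-beginning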

    Voila!! Your Kafka is secured.

    Summary

    In a nutshell, we have implemented security in our Kafka cluster using the SASL_SSL mechanism and learned how to create ACLs and give different permissions to different users.

    Apache Kafka is the wild west without security. By default, there is no encryption, authentication, or access control list. Any client can communicate with the Kafka broker using the PLAINTEXT port. Access using this port should be restricted to trusted clients only. You can use network segmentation and/or authentication ACLs to restrict access to trusted IP addresses in these cases. If none of these are used, the cluster is wide open and available to anyone. A basic knowledge of Kafka authentication, authorization, encryption, and audit trails is required to safely move a system into production.

  • Discover the Benefits of Android Clean Architecture

    All architectures have one common goal: to manage the complexity of our application. We may not need to worry about it on a smaller project, but it becomes a lifesaver on larger ones. The purpose of Clean Architecture is to minimize code complexity by keeping implementation details from leaking into the business logic.

    We must first understand a few things to implement the Clean Architecture in an Android project.

    • Entities: Encapsulate enterprise-wide critical business rules. An entity can be an object with methods or data structures and functions.
    • Use cases: Orchestrate the flow of data to and from the entities.
    • Controllers, gateways, presenters: A set of adapters that convert data from the use cases and entities format to the most convenient way to pass the data to the upper level (typically the UI).
    • UI, external interfaces, DB, web, devices: The outermost layer of the architecture, generally composed of frameworks such as database and web frameworks.

    Here is one rule of thumb we need to follow. First, look at the direction of the arrows in the diagram: entities do not depend on use cases, use cases do not depend on controllers, and so on. An inner, higher-level layer must never depend on an outer, lower-level one; the dependencies between the layers must point inwards.

    Advantages of Clean Architecture:

    • Strict architecture—hard to make mistakes
    • Business logic is encapsulated, easy to use, and easily tested
    • Enforcement of dependencies through encapsulation
    • Allows for parallel development
    • Highly scalable
    • Easy to understand and maintain
    • Testing is facilitated

    Let’s understand this using a small case study of an Android project, which gives more practical knowledge than theory alone.

    A pragmatic approach

    A typical Android project needs to separate the concerns between the UI, the business logic, and the data model, so taking “the theory” into account, we decided to split the project into three modules:

    • Domain Layer: contains the definitions of the business logic of the app, the data models, the abstract definition of repositories, and the definition of the use cases.
    Domain Module
    • Data Layer: This layer provides the abstract definition of all the data sources. Any application can reuse this without modifications. It contains repositories and data sources implementations, the database definition and its DAOs, the network APIs definitions, some mappers to convert network API models to database models, and vice versa.
    Data Module
    • Presentation layer: This is the layer that mainly interacts with the UI. It’s Android-specific and contains fragments, view models, adapters, activities, composable, and so on. It also includes a service locator to manage dependencies.
    Presentation Module

    Marvel’s comic characters App

    To elaborate on all the above concepts related to Clean Architecture, we are creating an app that lists Marvel’s comic characters using Marvel’s developer API. The app shows a list of Marvel characters, and clicking on each character will show details of that character. Users can also bookmark their favorite characters. It seems like nothing complicated, right?

    Before proceeding further into the sample, it’s good to have an idea of the following frameworks because the example is wholly based on them.

    • Jetpack Compose – Android’s recommended modern toolkit for building native UI.
    • Retrofit 2 – A type-safe HTTP client for Android for Network calls.
    • ViewModel – A class responsible for preparing and managing the data for an activity or a fragment.
    • Kotlin – Kotlin is a cross-platform, statically typed, general-purpose programming language with type inference.

    To get a characters list, we have used Marvel’s developer API, which returns the list of Marvel characters.

    http://gateway.marvel.com/v1/public/characters

    The domain layer

    In the domain layer, we define the data model, the use cases, and the abstract definition of the character repository. The API returns a list of characters, with some info like name, description, and image links.

    data class CharacterEntity(
        val id: Long,
        val name: String,
        val description: String,
        val imageUrl: String,
        val bookmarkStatus: Boolean
    )

    interface MarvelDataRepository {
        suspend fun getCharacters(dataSource: DataSource): Flow<List<CharacterEntity>>
        suspend fun getCharacter(characterId: Long): Flow<CharacterEntity>
        suspend fun toggleCharacterBookmarkStatus(characterId: Long): Boolean
        suspend fun getComics(dataSource: DataSource, characterId: Long): Flow<List<ComicsEntity>>
    }

    class GetCharactersUseCase(
        private val marvelDataRepository: MarvelDataRepository,
        private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO
    ) {
        operator fun invoke(forceRefresh: Boolean = false): Flow<List<CharacterEntity>> {
            return flow {
                emitAll(
                    marvelDataRepository.getCharacters(
                        if (forceRefresh) {
                            DataSource.Network
                        } else {
                            DataSource.Cache
                        }
                    )
                )
            }
                .flowOn(ioDispatcher)
        }
    }

    The data layer

    As we said before, the data layer must implement the abstract definition of the domain layer, so we need to put the repository’s concrete implementation in this layer. To do so, we can define two data sources, a “local” data source to provide persistence and a “remote” data source to fetch the data from the API.

    class MarvelDataRepositoryImpl(
        private val marvelRemoteService: MarvelRemoteService,
        private val charactersDao: CharactersDao,
        private val comicsDao: ComicsDao,
        private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO
    ) : MarvelDataRepository {
    
        override suspend fun getCharacters(dataSource: DataSource): Flow<List<CharacterEntity>> =
            flow {
                emitAll(
                    when (dataSource) {
                        is DataSource.Cache -> getCharactersCache().map { list ->
                            if (list.isEmpty()) {
                                getCharactersNetwork()
                            } else {
                                list.toDomain()
                            }
                        }
                            .flowOn(ioDispatcher)
    
                        is DataSource.Network -> flowOf(getCharactersNetwork())
                            .flowOn(ioDispatcher)
                    }
                )
            }
    
        private suspend fun getCharactersNetwork(): List<CharacterEntity> =
            marvelRemoteService.getCharacters().body()?.data?.results?.let { remoteData ->
                if (remoteData.isNotEmpty()) {
                    charactersDao.upsert(remoteData.toCache())
                }
                remoteData.toDomain()
            } ?: emptyList()
    
        private fun getCharactersCache(): Flow<List<CharacterCache>> =
            charactersDao.getCharacters()
    
        override suspend fun getCharacter(characterId: Long): Flow<CharacterEntity> =
            charactersDao.getCharacterFlow(id = characterId).map {
                it.toDomain()
            }
    
        override suspend fun toggleCharacterBookmarkStatus(characterId: Long): Boolean {
    
            val status = charactersDao.getCharacter(characterId)?.bookmarkStatus?.not() ?: false
    
            return charactersDao.toggleCharacterBookmarkStatus(id = characterId, status = status) > 0
        }
    
        override suspend fun getComics(
            dataSource: DataSource,
            characterId: Long
        ): Flow<List<ComicsEntity>> = flow {
            emitAll(
                when (dataSource) {
                    is DataSource.Cache -> getComicsCache(characterId = characterId).map { list ->
                        if (list.isEmpty()) {
                            getComicsNetwork(characterId = characterId)
                        } else {
                            list.toDomain()
                        }
                    }
                    is DataSource.Network -> flowOf(getComicsNetwork(characterId = characterId))
                        .flowOn(ioDispatcher)
                }
            )
        }
    
        private suspend fun getComicsNetwork(characterId: Long): List<ComicsEntity> =
            marvelRemoteService.getComics(characterId = characterId)
                .body()?.data?.results?.let { remoteData ->
                    if (remoteData.isNotEmpty()) {
                        comicsDao.upsert(remoteData.toCache(characterId = characterId))
                    }
                    remoteData.toDomain()
                } ?: emptyList()
    
        private fun getComicsCache(characterId: Long): Flow<List<ComicsCache>> =
            comicsDao.getComics(characterId = characterId)
    }

Since we defined a data source to manage persistence, in this layer we also need to set up the database; here we use the Room database. In addition, it’s good practice to create some mappers to map the API response to the corresponding database entity.

    fun List<Characters>.toCache() = map { character -> character.toCache() }
    
    fun Characters.toCache() = CharacterCache(
        id = id ?: 0,
        name = name ?: "",
        description = description ?: "",
        imageUrl = thumbnail?.let {
            "${it.path}.${it.extension}"
        } ?: ""
    )
    
    fun List<Characters>.toDomain() = map { character -> character.toDomain() }
    
    fun Characters.toDomain() = CharacterEntity(
        id = id ?: 0,
        name = name ?: "",
        description = description ?: "",
        imageUrl = thumbnail?.let {
            "${it.path}.${it.extension}"
        } ?: "",
        bookmarkStatus = false
    )

    @Entity
    data class CharacterCache(
        @PrimaryKey
        val id: Long,
        val name: String,
        val description: String,
        val imageUrl: String,
        val bookmarkStatus: Boolean = false
    ) : BaseCache

    The presentation layer

In this layer, we need a UI component like a fragment, activity, or composable to display the list of characters; here, we can use the widely adopted MVVM approach. The view model takes the use cases in its constructor and invokes the corresponding use case according to user actions (get a character, characters & comics, etc.).

    Each use case will invoke the appropriate method in the repository.

    class CharactersListViewModel(
        private val getCharacters: GetCharactersUseCase,
        private val toggleCharacterBookmarkStatus: ToggleCharacterBookmarkStatus
    ) : ViewModel() {
    
        private val _characters = MutableStateFlow<UiState<List<CharacterViewState>>>(UiState.Loading())
        val characters: StateFlow<UiState<List<CharacterViewState>>> = _characters
    
        init {
            _characters.value = UiState.Loading()
            getAllCharacters()
        }
    
        private fun getAllCharacters(forceRefresh: Boolean = false) {
            getCharacters(forceRefresh)
                .catch { error ->
                    error.printStackTrace()
                    when (error) {
                        is UnknownHostException, is ConnectException, is SocketTimeoutException -> _characters.value =
                            UiState.NoInternetError(error)
                        else -> _characters.value = UiState.ApiError(error)
                    }
                }.map { list ->
                    _characters.value = UiState.Loaded(list.toViewState())
                }.launchIn(viewModelScope)
        }
    
        fun refresh(showLoader: Boolean = false) {
            if (showLoader) {
                _characters.value = UiState.Loading()
            }
            getAllCharacters(forceRefresh = true)
        }
    
        fun bookmarkCharacter(characterId: Long) {
            viewModelScope.launch {
                toggleCharacterBookmarkStatus(characterId = characterId)
            }
        }
    }

    /*
    * Scaffold(Layout) for Characters list page
    * */
    
    
    @SuppressLint("UnusedMaterialScaffoldPaddingParameter")
    @Composable
    fun CharactersListScaffold(
        showComics: (Long) -> Unit,
        closeAction: () -> Unit,
        modifier: Modifier = Modifier,
        charactersListViewModel: CharactersListViewModel = getViewModel()
    ) {
        Scaffold(
            modifier = modifier,
            topBar = {
                TopAppBar(
                    title = {
                        Text(text = stringResource(id = R.string.characters))
                    },
                    navigationIcon = {
                        IconButton(onClick = closeAction) {
                            Icon(
                                imageVector = Icons.Filled.Close,
                                contentDescription = stringResource(id = R.string.close_icon)
                            )
                        }
                    }
                )
            }
        ) {
            val state = charactersListViewModel.characters.collectAsState()
    
            when (state.value) {
    
                is UiState.Loading -> {
                    Loader()
                }
    
                is UiState.Loaded -> {
                    state.value.data?.let { characters ->
                        val isRefreshing = remember { mutableStateOf(false) }
                        SwipeRefresh(
                            state = rememberSwipeRefreshState(isRefreshing = isRefreshing.value),
                            onRefresh = {
                                isRefreshing.value = true
                                charactersListViewModel.refresh()
                            }
                        ) {
                            isRefreshing.value = false
    
                            if (characters.isNotEmpty()) {
    
                                LazyVerticalGrid(
                                    columns = GridCells.Fixed(2),
                                    modifier = Modifier
                                        .padding(5.dp)
                                        .fillMaxSize()
                                ) {
                                    items(characters) { state ->
                                        CharacterTile(
                                            state = state,
                                            characterSelectAction = {
                                                showComics(state.id)
                                            },
                                            bookmarkAction = {
                                                charactersListViewModel.bookmarkCharacter(state.id)
                                            },
                                            modifier = Modifier
                                                .padding(5.dp)
                                                .fillMaxHeight(fraction = 0.35f)
                                        )
                                    }
                                }
    
                            } else {
                                Info(
                                    messageResource = R.string.no_characters_available,
                                    iconResource = R.drawable.ic_no_data
                                )
                            }
                        }
                    }
                }
    
                is UiState.ApiError -> {
                    Info(
                        messageResource = R.string.api_error,
                        iconResource = R.drawable.ic_something_went_wrong
                    )
                }
    
                is UiState.NoInternetError -> {
                    Info(
                        messageResource = R.string.no_internet,
                        iconResource = R.drawable.ic_no_connection,
                        isInfoOnly = false,
                        buttonAction = {
                            charactersListViewModel.refresh(showLoader = true)
                        }
                    )
                }
            }
        }
    }
    
    @Preview
    @Composable
    private fun CharactersListScaffoldPreview() {
        MarvelComicTheme {
            CharactersListScaffold(showComics = {}, closeAction = {})
        }
    }

Let’s see what the communication between the layers looks like.

    Source: Clean Architecture Tutorial for Android

As you can see, each layer communicates only with the closest one, keeping the inner layers independent of the outer layers. This way, we can quickly test each module separately, and the separation of concerns helps developers collaborate on the different modules of the project.

    Thank you so much!

  • API Testing Using Postman and Newman

In the last few years, we have seen an exponential increase in the development and use of APIs. We are in the era of API-first companies like Stripe, Twilio, Mailgun, etc., where the entire product or service is exposed via REST APIs. Most web applications today are also powered by REST-based web services. APIs today encapsulate critical business logic with high SLAs. Hence it is important to test APIs as part of the continuous integration process to reduce errors, improve predictability, and catch nasty bugs.

In the context of API development, Postman is a great REST client for testing APIs. Postman is not just a REST client, though; it contains a full-featured testing sandbox that lets you write and execute JavaScript-based tests for your API.

Postman comes with a nifty CLI tool called Newman. Newman is Postman’s collection runner engine: it sends API requests, receives the responses, and then runs your tests against them. Newman lets developers easily integrate Postman into continuous integration systems like Jenkins. Some of the important features of Postman & Newman include:

    1. Ability to test any API and see the response instantly.
    2. Ability to create test suites or collections using a collection of API endpoints.
    3. Ability to collaborate with team members on these collections.
    4. Ability to easily export/import collections as JSON files.

    We are going to look at all these features, some are intuitive and some not so much unless you’ve been using Postman for a while.

    Setting up Your Postman

    You can install Postman either as a Chrome extension or as a native application

Once installed, you can look it up in your installed apps and open it. You can choose to sign up and create an account if you want; this is especially important for saving your API collections and accessing them anytime on any machine. However, for this article, we can skip this. There’s a button for that towards the bottom when you first launch the app.

    Postman Collections

A Postman Collection, in simple words, is a collection of tests. It is essentially a test suite of related tests. These tests can be scenario-based tests or sequence/workflow-based tests.

    There’s a Collections tab on the top left of Postman, with an example Postman Echo collection. You can open and go through it.

Just like in the above screenshot, select an API request and click on the Tests tab. Check the first line:

    tests["response code is 200"] = responseCode.code === 200;

The above line is a simple test to check if the response code for the API is 200. This is the pattern for writing assertions/tests in Postman (using JavaScript), and this is how you are going to write tests for the APIs that need to be tested. You can open the other API requests in the Postman Echo collection to get a sense of how requests are made.

    Adding a COLLECTION

    To make your own collection, click on the ‘Add Collection‘ button on the top left of Postman and call it “Test API”

    You will be prompted to give details about the collection, I’ve added a name Github API and given it a description.

    Clicking on Create should add the collection to the left pane, above, or below the example “POSTMAN Echo” collection.

If you need a hierarchy for maintaining relevance between multiple APIs inside a collection, APIs can further be added to folders inside that collection. Folders are a great way of separating different parts of your API workflow. You can add folders through the “3 dot” button beside the collection name:

    Eg.: name the folder “Get Calls” and give a description once again.

    Now that we have the folder, the next task is to add an API call that is related to the TEST_API_COLLECTION to that folder. That API call is to https://api.github.com/.

    If you still have one of the TEST_API_COLLECTION collections open, you can close it the same way you close tabs in a browser, or just click on the plus button to add a new tab on the right pane where we make requests.

    Type in or paste in https://api.github.com/ and press Send to see the response.

    Once you get the response, you can click on the arrow next to the Save button on the far right, and select Save As, a pop up will be displayed asking where to save the API call.

    Give a name, it can be the request URL, or a name like “GET Github Basic”, and a description, then choose the collection and folder, in this case, TEST_API_COLLECTION> GET CALLS, then click on Save. The API call will be added to the Github Root API folder on the left pane.

    Whenever you click on this request from the collection, it will open in the center pane.

    Write the Tests

We’ve seen that the GET Github Basic request has a JSON response, which is usually the case for most APIs. This response has properties such as current_user_url, emails_url, followers_url, and following_url, to pick a few. The current_user_url has a value of https://api.github.com/user. Let’s add a test for this URL. Click on ‘GET Github Basic‘ and click on the Tests tab in the section just below where the URL is entered.

    You will notice on the right pane, we have some snippets which Postman creates when you click so that you don’t have to write a lot of code. Let’s add Response Body: JSON value check. Clicking on it produces the following snippet.

    var jsonData = JSON.parse(responseBody);
    tests["Your test name"] = jsonData.value === 100;

    From these two lines, it is apparent that Postman stores the response in a global object called responseBody, and we can use this to access response and assert values in tests as required.

    Postman also has another global variable object called tests, which is an object you can use to name your tests, and equate it to a boolean expression. If the boolean expression returns true, then the test passes.

    tests['some random test'] = x === y

    If you click on Send to make the request, you will see one of the tests failing.

Let’s create a test that is relevant to our use case.

    var jsonData = JSON.parse(responseBody);
    var usersURL = "https://api.github.com/user"
    tests["Gets the correct users url"] = jsonData.current_user_url === usersURL;

    Clicking on ‘Send‘, you’ll see the test passing.

    Let’s modify the test further to test some of the properties we want to check

Ideally, the things to be tested in an API response should be:

• Response code (assert the correct response code for any request)
• Response time (check that the API responds in an acceptable time range / is not delayed)
• Response body (check that it is not empty / null)

tests["Status code is 200"] = responseCode.code === 200;
tests["Response time is less than 200ms"] = responseTime < 200;
tests["Response time is acceptable"] = _.inRange(responseTime, 0, 500);
tests["Body is not empty"] = (responseBody !== null && responseBody.length !== 0);

    Newman CLI

Once you’ve set up all your collections and written tests for them, it may be tedious to go through them one by one, clicking Send to see if a given collection’s tests pass. This is where Newman comes in. Newman is a command-line collection runner for Postman.

    All you need to do is export your collection and the environment variables, then use Newman to run the tests from your terminal.

    NOTE: Make sure you’ve clicked on ‘Save’ to save your collection first before exporting.

    USING NEWMAN

    So the first step is to export your collection and environment variables. Click on the Menu icon for Github API collection, and select export.

    Select version 2, and click on “Export”

    Save the JSON file in a location you can access with your terminal. I created a local directory/folder called “postman” and saved it there.

Install the Newman CLI globally, then navigate to the directory where you saved the collection.

    npm install -g newman 
    cd postman

    Using Newman is quite straight-forward, and the documentation is extensive. You can even require it as a Node.js module and run the tests there. However, we will use the CLI.

Once you are in the directory, run newman run <collection_name.json>, replacing collection_name with the name you used to save the collection.

    newman run TEST_API_COLLECTION.postman_collection.json     

    NEWMAN CLI Options

    Newman provides a rich set of options to customize a run. A list of options can be retrieved by running it with the -h flag.

    
$ newman run -h

Utility:
-h, --help                      Output usage information
-v, --version                   Output the version number

Basic setup:
--folder [folderName]           Specify a single folder to run from a collection
-e, --environment [file|URL]    Specify a Postman environment as a JSON [file]
-d, --data [file]               Specify a data file to use, either JSON or CSV
-g, --global [file]             Specify a Postman globals file as JSON [file]
-n, --iteration-count [number]  Define the number of iterations to run

Request options:
--delay-request [number]        Specify a delay (in ms) between requests
--timeout-request [number]      Specify a request timeout (in ms) for a request

Misc.:
--bail                          Stop the runner when a test case fails
--silent                        Disable terminal output
--no-color                      Disable colored output
-k, --insecure                  Disable strict SSL
-x, --suppress-exit-code        Continue running tests even after a failure, but exit with code=0
--ignore-redirects              Disable automatic following of 3XX responses

Let’s try out some of these options.

    Iterations

Let’s use the -n option to set the number of iterations to run the collection.

    $ newman run mycollection.json -n 10 # runs the collection 10 times

    To provide a different set of data, i.e. variables for each iteration, you can use the -d to specify a JSON or CSV file. For example, a data file such as the one shown below will run 2 iterations, with each iteration using a set of variables.

[{
  "url": "http://127.0.0.1:5000",
  "user_id": "1",
  "id": "1",
  "token_id": "123123"
},{
  "url": "http://postman-echo.com",
  "user_id": "2",
  "id": "2",
  "token_id": "899899"
}]

$ newman run mycollection.json -d data.json

    Alternately, the CSV file for the above set of variables would look like:

    url, user_id, id, token_id 
    http://127.0.0.1:5000, 1, 1, 123123123 
    http://postman-echo.com, 2, 2, 899899

    Environment Variables

    Each environment is a set of key-value pairs, with the key as the variable name. These Environment configurations can be used to differentiate between configurations specific to your execution environments eg. Dev, Test & Production.

To provide a different execution environment, you can use the -e option to specify a JSON file. For example, an environment file such as the one shown below will provide the environment variables globally to all tests during execution.

postman_dev_env.json

{
  "id": "b5c617ad-7aaf-6cdf-25c8-fc0711f8941b",
  "name": "dev env",
  "values": [
    {
      "enabled": true,
      "key": "env",
      "value": "dev.example.com",
      "type": "text"
    }
  ],
  "timestamp": 1507210123364,
  "_postman_variable_scope": "environment",
  "_postman_exported_at": "2017-10-05T13:28:45.041Z",
  "_postman_exported_using": "Postman/5.2.1"
}

    Bail FLAG

Newman, by default, exits with a status code of 0 if everything runs well, i.e., without any exceptions. Continuous integration tools respond to these exit codes and correspondingly pass or fail a build. You can use the --bail flag to tell Newman to halt on a test case failure with a status code of 1, which can then be picked up by a CI tool or build system.

$ newman run PostmanCollection.json -e environment.json --bail

    Conclusion

Postman and Newman can be used for a number of testing needs, including creating usage scenarios, suites, and packs for your API test cases. Further, Newman and Postman integrate very well with CI/CD tools such as Jenkins, Travis, etc.

  • A Practical Guide to Deploying Multi-tier Applications on Google Container Engine (GKE)

    Introduction

All modern-era programmers can attest that containerization has afforded more flexibility and allows us to build truly cloud-native applications. Containers provide portability: the ability to easily move applications across environments. However, complex applications comprise many (tens or hundreds of) containers, and managing such applications is a real challenge. That’s where container orchestration and scheduling platforms like Kubernetes, Mesosphere, Docker Swarm, etc. come into the picture.
Kubernetes, backed by Google, is leading the pack, given that Red Hat, Microsoft, and now Amazon are putting their weight behind it.

    Kubernetes can run on any cloud or bare metal infrastructure. Setting up & managing Kubernetes can be a challenge but Google provides an easy way to use Kubernetes through the Google Container Engine(GKE) service.

    What is GKE?

Google Container Engine is a management and orchestration system for containers. In short, it is hosted Kubernetes. The goal of GKE is to increase the productivity of DevOps and development teams by hiding the complexity of setting up the Kubernetes cluster, the overlay network, etc.

    Why GKE? What are the things that GKE does for the user?

    • GKE abstracts away the complexity of managing a highly available Kubernetes cluster.
    • GKE takes care of the overlay network
    • GKE also provides built-in authentication
    • GKE also provides built-in auto-scaling.
    • GKE also provides easy integration with the Google storage services.

    In this blog, we will see how to create your own Kubernetes cluster in GKE and how to deploy a multi-tier application in it. The blog assumes you have a basic understanding of Kubernetes and have used it before. It also assumes you have created an account with Google Cloud Platform. If you are not familiar with Kubernetes, this guide from Deis  is a good place to start.

Google provides a command-line interface, gcloud, to interact with all Google Cloud Platform products and services. The gcloud tool can be used in scripts to automate tasks or directly from the command line. Follow this guide to install the gcloud tool.

    Now let’s begin! The first step is to create the cluster.

    Basic Steps to create cluster

In this section, I would like to explain how to create a GKE cluster. We will use the command-line tool to set up the cluster.

    Set the zone in which you want to deploy the cluster

    $ gcloud config set compute/zone us-west1-a

    Create the cluster using following command,

$ gcloud container --project <project-name> \
    clusters create <cluster-name> \
    --machine-type n1-standard-2 \
    --image-type "COS" --disk-size "50" \
    --num-nodes 2 --network default \
    --enable-cloud-logging --no-enable-cloud-monitoring

    Let’s try to understand what each of these parameters mean:

--project: Project name.

--machine-type: Type of the machine, like n1-standard-2 or n1-standard-4.

--image-type: OS image. “COS” is the Container-Optimized OS from Google; more info here.

--disk-size: Disk size of each instance.

--num-nodes: Number of nodes in the cluster.

--network: Network to use for the cluster. In this case, we are using the default network.

    Apart from the above options, you can also use the following to provide specific requirements while creating the cluster:

--scopes: Scopes enable containers to directly access Google services without needing credentials. You can specify a comma-separated list of scope APIs. For example:

    • Compute: Lets you view and manage your Google Compute Engine resources
    • Logging.write: Submit log data to Stackdriver.

You can find all the scopes that Google supports here.

--additional-zones: Specify additional zones for high availability. E.g., --additional-zones us-east1-b,us-east1-d. Here Kubernetes will create the cluster across 3 zones (the one specified at the beginning and the 2 additional ones here).

--enable-autoscaling: Enables the autoscaling option. If you specify this option, then you have to specify the minimum and maximum number of nodes as follows; you can read more about how auto-scaling works here. E.g., --enable-autoscaling --min-nodes=15 --max-nodes=50

    You can fetch the credentials of the created cluster. This step is to update the credentials in the kubeconfig file, so that kubectl will point to required cluster.

    $ gcloud container clusters get-credentials my-first-cluster --project project-name

    Now, your First Kubernetes cluster is ready. Let’s check the cluster information & health.

    $ kubectl get nodes
    NAME    STATUS    AGE   VERSION
    gke-first-cluster-default-pool-d344484d-vnj1  Ready  2h  v1.6.4
    gke-first-cluster-default-pool-d344484d-kdd7  Ready  2h  v1.6.4
    gke-first-cluster-default-pool-d344484d-ytre2  Ready  2h  v1.6.4

After creating the cluster, let’s see how to deploy a multi-tier application on it. We’ll use a simple Python Flask app which greets the user, stores employee data, and gets employee data.

    Application Deployment

I have created a simple Python Flask application to deploy on the Kubernetes cluster created using GKE. You can go through the source code here. If you check the source code, you will find the directory structure as follows:

TryGKE/
├── Dockerfile
├── mysql-deployment.yaml
├── mysql-service.yaml
├── src
│   ├── app.py
│   └── requirements.txt
├── testapp-deployment.yaml
└── testapp-service.yaml

    In this, I have written a Dockerfile for the Python Flask application in order to build our own image to deploy. For MySQL, we won’t build an image of our own. We will use the latest MySQL image from the public docker repository.

    Before deploying the application, let’s re-visit some of the important Kubernetes terms:

    Pods:

    The pod is a Docker container or a group of Docker containers which are deployed together on the host machine. It acts as a single unit of deployment.

    Deployments:

A Deployment is an entity that manages ReplicaSets and provides declarative updates to pods. It is recommended to use Deployments instead of directly using ReplicaSets. We can use a Deployment to create, remove, and update ReplicaSets. Deployments have the ability to roll out and roll back changes.

    Services:

A Service in K8s is an abstraction that connects you to one or more pods. You can connect to a pod using the pod’s IP address, but since pods come and go, their IP addresses change. Services get their own IP and DNS name, and those remain for the entire lifetime of the service.

    Each tier in an application is represented by a Deployment. A Deployment is described by the YAML file. We have two YAML files – one for MySQL and one for the Python application.

    1. MySQL Deployment YAML

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: mysql
    spec:
      template:
        metadata:
          labels:
            app: mysql
        spec:
          containers:
            - env:
                - name: MYSQL_DATABASE
                  value: admin
                - name: MYSQL_ROOT_PASSWORD
                  value: admin
              image: 'mysql:latest'
              name: mysql
              ports:
                - name: mysqlport
                  containerPort: 3306
                  protocol: TCP

    2. Python Application Deployment YAML

    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: test-app
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - name: test-app
            image: ajaynemade/pymy:latest
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 5000

    Each Service is also represented by a YAML file as follows:

    1. MySQL service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: mysql-service
    spec:
      ports:
      - port: 3306
        targetPort: 3306
        protocol: TCP
        name: http
      selector:
        app: mysql

    2. Python Application service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: test-service
    spec:
      type: LoadBalancer
      ports:
      - name: test-service
        port: 80
        protocol: TCP
        targetPort: 5000
      selector:
        app: test-app

    You will find a ‘kind’ field in each YAML file. It is used to specify whether the given configuration is for deployment, service, pod, etc.

    In the Python app service YAML, I am using type = LoadBalancer. In GKE, There are two types of cloud load balancers available to expose the application to outside world.

    1. TCP load balancer: This is a TCP Proxy-based load balancer. We will use this in our example.
    2. HTTP(s) load balancer: It can be created using Ingress. For more information, refer to this post that talks about Ingress in detail.

In the MySQL service, I’ve not specified any type; in that case, the default type ‘ClusterIP’ is used, which makes sure that the MySQL container is exposed within the cluster and the Python app can access it.

If you check app.py, you can see that I have used “mysql-service.default” as the hostname. “mysql-service.default” is the DNS name of the service (the service name followed by the namespace). The Python application refers to that DNS name while accessing the MySQL database.
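
To make this concrete, here is a minimal, hypothetical sketch of how such a Flask app can reach MySQL through the service DNS name. It assumes the PyMySQL client library and an illustrative employees table; the actual app.py in the repository may differ.

# Hypothetical sketch: Flask app reaching MySQL via the Kubernetes service DNS name.
# Assumes PyMySQL and an illustrative "employees" table; the real app.py may differ.
import pymysql
from flask import Flask, jsonify

app = Flask(__name__)

def get_connection():
    # "mysql-service.default" = <service-name>.<namespace>, resolved by the cluster DNS
    return pymysql.connect(
        host="mysql-service.default",
        port=3306,
        user="root",
        password="admin",    # MYSQL_ROOT_PASSWORD from the MySQL deployment YAML
        database="admin"     # MYSQL_DATABASE from the MySQL deployment YAML
    )

@app.route("/getdata/<int:emp_id>")
def get_data(emp_id):
    conn = get_connection()
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT id, name FROM employees WHERE id = %s", (emp_id,))
            row = cur.fetchone()
        return jsonify({"id": row[0], "name": row[1]}) if row else ("Not found", 404)
    finally:
        conn.close()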

    Now, let’s actually setup the components from the configurations. As mentioned above, we will first create services followed by deployments.

    Services:

    $ kubectl create -f mysql-service.yaml
    $ kubectl create -f testapp-service.yaml

    Deployments:

    $ kubectl create -f mysql-deployment.yaml
    $ kubectl create -f testapp-deployment.yaml

Check the status of the pods and services. Wait till all the pods reach the Running state and the Python application service gets an external IP, like below:

    $ kubectl get services
    NAME            CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
    kubernetes      10.55.240.1     <none>        443/TCP        5h
    mysql-service   10.55.240.57    <none>        3306/TCP       1m
    test-service    10.55.246.105   35.185.225.67     80:32546/TCP   11s

Once you get the external IP, you should be able to make API calls using simple curl requests.

    Eg. To Store Data :

    curl -H "Content-Type: application/x-www-form-urlencoded" -X POST  http://35.185.225.67:80/storedata -d id=1 -d name=NoOne

    Eg. To Get Data :

    curl 35.185.225.67:80/getdata/1

    At this stage your application is completely deployed and is externally accessible.

    Manual scaling of pods

    Scaling your application up or down in Kubernetes is quite straightforward. Let’s scale up the test-app deployment.

    $ kubectl scale deployment test-app --replicas=3

    Deployment configuration for test-app will get updated and you can see 3 replicas of test-app are running. Verify it using,

    kubectl get pods

    In the same manner, you can scale down your application by reducing the replica count.

    Cleanup :

    Un-deploying an application from Kubernetes is also quite straightforward. All we have to do is delete the services and delete the deployments. The only caveat is that the deletion of the load balancer is an asynchronous process. You have to wait until it gets deleted.

    $ kubectl delete service mysql-service
    $ kubectl delete service test-service

    The above command will deallocate Load Balancer which was created as a part of test-service. You can check the status of the load balancer with the following command.

    $ gcloud compute forwarding-rules list

    Once the load balancer is deleted, you can clean-up the deployments as well.

    $ kubectl delete deployments test-app
    $ kubectl delete deployments mysql

    Delete the Cluster:

    $ gcloud container clusters delete my-first-cluster

    Conclusion

    In this blog, we saw how easy it is to deploy, scale & terminate applications on Google Container Engine. Google Container Engine abstracts away all the complexity of Kubernetes and gives us a robust platform to run containerised applications. I am super excited about what the future holds for Kubernetes!

    Check out some of Velotio’s other blogs on Kubernetes.

  • An Introduction To Cloudflare Workers And Cloudflare KV store

    Cloudflare Workers

    This post gives a brief introduction to Cloudflare Workers and Cloudflare KV store. They address a fairly common set of problems around scaling an application globally. There are standard ways of doing this but they usually require a considerable amount of upfront engineering work and developers have to be aware of the ‘scalability’ issues to some degree. Serverless application tools target easy scalability and quick response times around the globe while keeping the developers focused on the application logic rather than infra nitty-gritties.

    Global responsiveness

    When an application is expected to be accessed around the globe, requests from users sitting in different time-zones should take a similar amount of time. There can be multiple ways of achieving this depending upon how data intensive the requests are and what those requests actually do.

Data-intensive requests are harder and more expensive to globalize, but again, not all requests are the same. On the other hand, static requests, like getting a documentation page or a blog post, can be globalized by generating the markup at build time and deploying it on a CDN.

And there are semi-dynamic requests. They render static content either with some small amount of data, or their content changes based on the timezone the request came from.

    The above is a loose classification of requests but there are exceptions, for example, not all the static requests are presentational.

Serverless frameworks are particularly useful in scaling static and semi-dynamic requests.

    Cloudflare Workers Overview

Cloudflare Workers is essentially a function deployment service. It provides a serverless execution environment that can be used to develop and deploy small (although not necessarily) and modular cloud functions with minimal effort.

It is trivial to get started with Workers. First, let’s install wrangler, a tool for managing Cloudflare Worker projects.

    npm i @cloudflare/wrangler -g

    Wrangler handles all the standard stuff for you like project generation from templates, build, config, publishing among other things.

    A worker primarily contains 2 parts: an event listener that invokes a worker and an event handler that returns a response object. Creating a worker is as easy as adding an event listener to a button.

    addEventListener('fetch', event => {
        event.respondWith(handleRequest(event.request))
    })
    
    async function handleRequest(request) {
        return new Response("hello world")
    }

    Above is a simple hello world example. Wrangler can be used to build and get a live preview of your worker.

    wrangler build

    will build your worker. And 

    wrangler preview 

    can be used to take a live preview on the browser. The preview is only meant to be used for testing(either by you or others). If you want the workers to be triggered by your own domain or a workers.dev subdomain, you need to publish it.

Publishing is fairly straightforward and requires very little configuration in both wrangler and your project.

    Wrangler Configuration

    Just create an account on Cloudflare and get API key. To configure wrangler, just do:

    wrangler config

    It will ask for the registered email and API key, and you are good to go.

    To publish your worker on a workers.dev subdomain, just fill your account ID in the wrangler.toml and hit wrangler publish. The worker will be deployed and live at a generated workers.dev subdomain.

    Regarding Routes

    When you publish on a {script-name}.{subdomain}.workers.dev domain, the script or project associated with script-name will be invoked. There is no way to call a script just from {subdomain}.workers.dev.

    Worker KV

Workers alone can’t be used to make anything complex without persistent storage; that’s where Workers KV comes into the picture. Workers KV, as it sounds, is a low-latency, high-volume key-value store that is designed for efficient reads.

    It optimizes the read latency by dynamically spreading the most frequently read entries to the edges(replicated in several regions) and storing less frequent entries centrally.

    Newly added keys(or a CREATE) are immediately reflected in every region while a value change in the keys(or an UPDATE) may take as long as 60 seconds to propagate, depending upon the region.

    Workers KV is only available to paid users of Cloudflare.

    Writing Data in Workers KV

    curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces" 
    -X POST 
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" 
    -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" 
    -H "Content-Type: application/json" 
    --data '{"title": "Requests"}'
    The above HTTP request will create a namespace by the name Requests. The response should look something like this:
    {
        "result": {
            "id": "30b52f55aafb41d88546d01d5f69440a",
            "title": "Requests",
            "supports_url_encoding": true
        },
        "success": true,
        "errors": [],
        "messages": []
    }

    Now we can write KV pairs in this namespace. The following HTTP requests will do the same:

    curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/values/first-key" 
    -X PUT 
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" 
    -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" 
    --data 'My first value!'

Here NAMESPACE_ID is the same ID that we received in the last request, first-key is the key name, and ‘My first value!’ is the value.

    Let’s complicate things a little

The above overview just introduces the managed cloud Workers with a ‘hello world’ app and the basics of Workers KV, but now let’s make something more complicated. We will make an app that tells how many requests have been made from your country so far. For example, if you pinged the worker from the US, it will return the number of requests made so far from the US.

    We will need: 

• Some place to store the count of requests for each country.
• A way to find out from which country the Worker was invoked.

    For the first part, we will use the Workers KV to store the count for every request.

    Let’s start

    First, we will create a new project using wrangler: wrangler generate request-count.

    We will be making HTTP calls to write values in the Workers KV, so let’s add ‘node-fetch’ to the project:

    npm install node-fetch

Now, how do we find out which country each request is coming from? The answer is the cf object that is provided with each request to a worker.

    The cf object is a special object that is passed with each request and can be accessed with request.cf. This mainly contains region specific information along with TLS and Auth information. The details of what is provided in the cf, can be found here.

    As we can see from the documentation, we can get country from

    request.cf.country.

    The cf object is not correctly populated in the wrangler preview, you will need to publish your worker in order to test cf’s usage. An open issue mentioning the same can be found here.

    Now, the logic is pretty straightforward here. When we get a request from a country for which we don’t have an entry in the Worker’s KV, we make an entry with value 1, else we increment the value of the country key.

    To use Workers KV, we need to create a namespace. A namespace is just a collection of key-value pairs where all the keys have to be unique.

    A namespace can be created under the KV tab in Cloudflare web UI by giving the name or using the API call above. You can also view/browse all of your namespaces from the web UI. Following API call can be used to read the value of a key from a namespace:

    curl "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/values/first-key" 
    -H "X-Auth-Email: $CLOUDFLARE_EMAIL" 
    -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" 

But it is neither the fastest nor the easiest way. Cloudflare provides a better and faster way to read data from your namespaces: it’s called binding. Each KV namespace can be bound to a worker script so as to make it available in the script via a variable name. Any namespace can be bound to any worker. A KV namespace can be bound to a worker by going to the editing menu of a worker from the Cloudflare UI.

    Following steps show you how to bind a namespace to a worker:

    Go to the edit page of the worker in Cloudflare web UI and click on the KV tab:

    Then add a binding by clicking the ‘Add binding’ button.

    You can select the namespace name and the variable name by which it will be bound. More details can be found here. A binding that I’ve made can be seen in the above image.

    That’s all we need to get this to work. Following is the relevant part of the script:

    const fetch = require('node-fetch')
    
    addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request))
    })
    
    /**
    * Fetch and log a request
    * @param {Request} request
    */
    async function handleRequest(request) {
        const country = request.cf.country
    
        const url = `https://api.cloudflare.com/client/v4/accounts/account-id/storage/kv/namespaces/namespace-id/values/${country}`
    
        let count = await requests.get(country)
    
        if (!count) {
            count = 1
        } else {
            count = parseInt(count) + 1
        }
    
        try {
    const response = await fetch(url, {
        method: 'PUT',
        headers: {"X-Auth-Email": "email", "X-Auth-Key": "auth-key"},
        body: `${count}`
    })
        } catch (error) {
            return new Response(error, { status: 500 })
        }
    
        return new Response(`${country}: ${count}`, { status: 200 }) 
    }

In the above code, I bound the Requests namespace that we created to the requests variable, which is dynamically resolved when we publish.

    The full source of this can be found here.

    This small application also demonstrates some of the practical aspects of the workers. For example, you would notice that the updates take some time to get reflected and response time of the workers is quick, especially when they are deployed on a .workers.dev subdomain here.

    Side note: You will have to recreate the namespace-worker binding everytime you deploy the worker or you do wrangler publish.

    Workers vs. AWS Lambda

AWS Lambda has been a major player in the serverless market for a while now. So, how do Cloudflare Workers compare to it? Let’s see.

    Architecture:

Cloudflare Workers use `Isolates` instead of a container-based underlying architecture. `Isolates` is the technology that allows V8 (Google Chrome’s JavaScript engine) to run thousands of processes on a single server in an efficient and secure manner. This effectively translates into faster code execution and lower memory usage. More details can be found here.

    Price:

    The above mentioned architectural difference allows Workers to be significantly cheaper than Lambda. While a Worker offering 50 milliseconds of CPU costs $0.50 per million requests, the equivalent Lambda costs $1.84 per million. A more detailed price comparison can be found here.

    Speed:

    Workers also show significantly better performance numbers than Lambda and Lambda@Edge. Tests run by Cloudflare claim that they are 441% faster than Lambda and 192% faster than Lambda@Edge. A detailed performance comparison can be found here.

    This better performance is also confirmed by serverless-benchmark.

    Wrapping Up:

As we have seen, Cloudflare Workers along with the KV store make it very easy to get started with a serverless application. They provide fantastic performance at a lower cost, along with intuitive deployment. These properties make them ideal for building globally accessible serverless applications.

  • Explanatory vs. Predictive Models in Machine Learning

My view of data analysis is that there is a continuum between explanatory models on one side and predictive models on the other. The decisions you make during the modeling process depend on your goal. Let’s take customer churn as an example: you can ask yourself, why are customers leaving? Or you can ask yourself, which customers are leaving? The first question has as its primary goal to explain churn, while the second question has as its primary goal to predict churn. These are two fundamentally different questions, and this has implications for the decisions you make along the way. The predictive side of data analysis is closely related to terms like data mining and machine learning.

    SPSS & SAS

When we look at SPSS and SAS, both of these languages originate from the explanatory side of data analysis. They were developed in an academic environment, where hypothesis testing plays a major role. As a result, they have significantly fewer methods and techniques in comparison to R and Python. Nowadays, SAS and SPSS both have data mining tools (SAS Enterprise Miner and SPSS Modeler); however, these are separate tools and you’ll need extra licenses.

I have spent some time building extensive macros in SAS EG to seamlessly create predictive models, which also do a decent job of explaining feature importance. While a neural network may do a fair job at making predictions, it is extremely difficult to explain such models, let alone feature importance. The macros that I have built in SAS EG do precisely the job of explaining the features, apart from producing excellent predictions.

    Open source TOOLS: R & PYTHON

    One of the major advantages of open source tools is that the community continuously improves and increases functionality. R was created by academics, who wanted their algorithms to spread as easily as possible. R has the widest range of algorithms, which makes R strong on the explanatory side and on the predictive side of Data Analysis.

    Python is developed with a strong focus on (business) applications, not from an academic or statistical standpoint. This makes Python very powerful when algorithms are directly used in applications. Hence, we see that the statistical capabilities are primarily focused on the predictive side. Python is mostly used in Data Mining or Machine Learning applications where a data analyst doesn’t need to intervene. Python is therefore also strong in analyzing images and videos. Python is also the easiest language to use when using Big Data Frameworks like Spark. With the plethora of packages and ever improving functionality, Python is a very accessible tool for data scientists.

    MACHINE LEARNING MODELS

While procedures like Logistic Regression are very good at explaining the features used in a prediction, some others, like Neural Networks, are not. The latter may be preferred over the former when only prediction accuracy matters and explaining the model does not. Interpreting or explaining the model becomes an issue for Neural Networks: you can’t just peek inside a deep neural network to figure out how it works. A network’s reasoning is embedded in the behavior of numerous simulated neurons, arranged into dozens or even hundreds of interconnected layers. In most cases, a product marketing officer is interested in knowing which factors matter most for a specific advertising project, and what they can concentrate on to get the response rates higher, rather than what their response rate or revenues will be in the upcoming year. These questions are better answered by procedures that can be interpreted more easily. This is a great article about the technical and ethical consequences of the lack of explanations provided by complex AI models.
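
As a quick illustration of that interpretability, the sketch below fits a logistic regression on synthetic data with scikit-learn and ranks purely hypothetical features by their coefficients; the sign and magnitude of each coefficient are exactly the kind of output a marketer can act on.

# Illustration on synthetic data: logistic regression coefficients give a direct,
# interpretable ranking of drivers. Feature names here are purely hypothetical.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=2000, n_features=6, n_informative=3, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

X_scaled = StandardScaler().fit_transform(X)   # put coefficients on a comparable scale
model = LogisticRegression().fit(X_scaled, y)

# Largest absolute coefficients = strongest drivers of the predicted outcome
for name, coef in sorted(zip(feature_names, model.coef_[0]), key=lambda t: -abs(t[1])):
    print(f"{name}: {coef:+.3f}")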

Procedures like Decision Trees are very good at explaining and visualizing what exactly the decision points are (features and their metrics). However, they do not produce the best models. Random Forests and Boosting are procedures that use Decision Trees as the basic building block, and they are by far some of the best methods for building sophisticated prediction models.

Random Forests use fully grown (highly complex) trees built on random samples drawn from the training set (a process called bootstrapping), and each split uses only a proper subset of features from the entire feature set, rather than all of the features. Bootstrapping helps when the amount of training data is limited (in many cases there is no way to get more data). Subsetting the features has a tremendous effect on de-correlating the trees grown in the forest (hence the randomization), leading to a drop in test-set error. A fresh subset of features is chosen at each split, making the method robust; it also stops the strongest feature from appearing every time a split is considered, which would otherwise make all the trees in the forest look similar. The final result is obtained by averaging the result over all trees (in the case of regression problems) or by taking a majority class vote (in the case of classification problems).

On the other hand, Boosting is a method where a forest is grown using trees that are NOT fully grown, in other words, weak learners. One has to specify the number of trees to be grown; the observation weights start out uniform. At each iteration, the method fits a weak learner and evaluates its errors; the weights of the examples that were misclassified are then increased, so that subsequent trees concentrate better on those failed examples. This way, the method proceeds by improving the accuracy of the boosted ensemble, stopping when the improvement falls below a threshold. One particular implementation of Boosting, AdaBoost, has very good accuracy compared to other implementations. AdaBoost uses trees of depth 1, known as decision stumps, as the members of the forest. These are only slightly better than random guessing to start with, but over time the ensemble learns the pattern and performs extremely well on the test set. The method behaves much like a feedback control mechanism, where the system learns from its errors. To address overfitting, one can tune the learning-rate hyper-parameter (lambda), choosing values in the range (0, 1]. Very small values of lambda take more time to converge, while larger values may have difficulty converging. Selecting lambda can be done with an iterative process: plot the test error rate against values of lambda and choose the value with the lowest test error.
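
A rough sketch of that lambda-selection loop is shown below, using scikit-learn’s AdaBoostClassifier on synthetic data (its default base learner is already a depth-1 decision stump); the candidate values and the data are illustrative only.

# Sketch of the learning-rate (lambda) selection loop described above, on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

test_errors = {}
for lam in [0.01, 0.05, 0.1, 0.5, 1.0]:                  # candidate values in (0, 1]
    clf = AdaBoostClassifier(n_estimators=200, learning_rate=lam, random_state=0)
    clf.fit(X_train, y_train)
    test_errors[lam] = 1 - clf.score(X_test, y_test)     # test error rate for this lambda

best_lambda = min(test_errors, key=test_errors.get)      # lowest test error wins
print(test_errors, "->", best_lambda)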

In all these methods, as we move from Logistic Regression to Decision Trees to Random Forests and Boosting, the complexity of the models increases, making it almost impossible to EXPLAIN the Boosting model to marketers/product managers. Decision Trees are easy to visualize, and Logistic Regression results can be used to demonstrate the most important factors in a customer acquisition model, so they are well received by business leaders. On the other hand, the Random Forest and Boosting methods are extremely good predictors, without much scope for explaining. But there is hope: these models have functions for revealing the most important variables, although it is not possible to visualize why.
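
For example, the sketch below (scikit-learn on synthetic data, with hypothetical feature names) pulls the feature_importances_ ranking out of a Random Forest; it tells you which variables matter, even though it cannot show why.

# Even "black box" ensembles expose a ranking of the most important variables.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=8, n_informative=4, random_state=0)
names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=300, max_features="sqrt", random_state=0).fit(X, y)

for name, importance in sorted(zip(names, forest.feature_importances_), key=lambda t: -t[1]):
    print(f"{name}: {importance:.3f}")   # which features matter, not why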

    USING A BALANCED APPROACH

So I use a mixed strategy: use the more explainable methods described above as a step in exploratory data analysis, present the importance of features and the characteristics of the data to the business leaders in phase one, and then use the more complicated models to build the prediction models for deployment, after building competing models. That way, one not only gets to understand what is happening and why, but also gets the best predictive power. In most cases that I have worked on, I have rarely seen a mismatch between the explanation and the predictions across different methods. After all, this is all math, and the way of delivery should not change the end results. Now that’s a happy ending for all sides of the business!

  • Installing Redis Service in DC/OS With Persistent Storage Using AWS Volumes

    Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker.

It supports various data structures such as strings, hashes, lists, sets, etc. DC/OS offers Redis as a service.

    Why Do We Use External Persistent Storage for Redis Mesos Containers?

    Since Redis is an in-memory database, an instance/service restart will result in loss of data. To counter this, it is always advisable to snapshot the Redis in-memory database from time to time.

    This helps Redis instance to recover from the point in time failure.

In DC/OS, Redis is deployed as a stateless service. To make it stateful and persist data, we can configure local volumes or external volumes.

The disadvantage of having a local volume mapped to Mesos containers is that when a slave node goes down, your local volume becomes unavailable and data loss occurs.

    However, with external persistent volumes, as they are available on each node of the DCOS cluster, a slave node failure does not impact the data availability.

    Rex-Ray

    REX-Ray is an open source, storage management solution designed to support container runtimes such as Docker and Mesos.

    REX-Ray enables stateful applications such as databases to persist and maintain its data after the life cycle of the container has ended. Built-in high availability enables orchestrators such as Docker Swarm, Kubernetes, and Mesos Frameworks like Marathon to automatically orchestrate storage tasks between hosts in a cluster.

    Built on top of the libStorage framework, REX-Ray’s simplified architecture consists of a single binary and runs as a stateless service on every host using a configuration file to orchestrate multiple storage platforms.

    Objective: To create a Redis service in DC/OS environment with persistent storage.

    Warning: The Persistent Volume feature is still in beta Phase for DC/OS Version 1.11.

    Prerequisites:

    • Make sure the rexray service is running and is in a healthy state for the cluster.

    Steps:

    • Click on the Add button in Services component of DC/OS GUI.
    • Click on JSON Configuration.  

    Note: For persistent storage, below code should be added in the normal Redis service configuration JSON file to mount external persistent volumes.

    "volumes": [
          {
            "containerPath": "/data",
            "mode": "RW",
            "external": {
              "name": "redis4volume",
              "provider": "dvdi",
              "options": {
                "dvdi/driver": "rexray"
              }
            }
          }
        ],

    • Make sure the service is up and in a running state:

    If you look closely, the service was suspended and respawned on a different slave node. We populated the database with dummy data and saved the snapshot in the data directory.

When the service did come up on a different node (10.0.3.204), the data persisted and the volume was visible on the new node.

    core@ip-10-0-3-204 ~ $ /opt/mesosphere/bin/rexray volume list
    
    - name: datavolume
      volumeid: vol-00aacade602cf960c
      availabilityzone: us-east-1a
      status: in-use
      volumetype: standard
      iops: 0
      size: "16"
      networkname: ""
      attachments:
      - volumeid: vol-00aacade602cf960c
        instanceid: i-0d7cad91b62ec9a64
        devicename: /dev/xvdb
    

    •  Check the volume tab :

    Note: For external volumes, the status will be unavailable. This is an issue with DC/OS.

    The Entire Service JSON file:

    {
      "id": "/redis4.0-new-failover-test",
      "instances": 1,
      "cpus": 1.001,
      "mem": 2,
      "disk": 0,
      "gpus": 0,
      "backoffSeconds": 1,
      "backoffFactor": 1.15,
      "maxLaunchDelaySeconds": 3600,
      "container": {
        "type": "DOCKER",
        "volumes": [
          {
            "containerPath": "/data",
            "mode": "RW",
            "external": {
              "name": "redis4volume",
              "provider": "dvdi",
              "options": {
                "dvdi/driver": "rexray"
              }
            }
          }
        ],
        "docker": {
          "image": "redis:4",
          "network": "BRIDGE",
          "portMappings": [
            {
              "containerPort": 6379,
              "hostPort": 0,
              "servicePort": 10101,
              "protocol": "tcp",
              "name": "default",
              "labels": {
                "VIP_0": "/redis4.0:6379"
              }
            }
          ],
          "privileged": false,
          "forcePullImage": false
        }
      },
      "healthChecks": [
        {
          "gracePeriodSeconds": 60,
          "intervalSeconds": 5,
          "timeoutSeconds": 5,
          "maxConsecutiveFailures": 3,
          "portIndex": 0,
          "protocol": "TCP"
        }
      ],
      "upgradeStrategy": {
        "minimumHealthCapacity": 0.5,
        "maximumOverCapacity": 0
      },
      "unreachableStrategy": {
        "inactiveAfterSeconds": 300,
        "expungeAfterSeconds": 600
      },
      "killSelection": "YOUNGEST_FIRST",
      "requirePorts": true
    }

    Redis entrypoint

    To connect to the Redis service, use the below host:port in your applications (a minimal client sketch follows the address):

    redis.marathon.l4lb.thisdcos.directory:6379
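    As a rough sketch, an application running on the cluster can reach the service through this VIP with any standard Redis client. The example below uses the Node.js node-redis package; the file name and keys are illustrative only:

    // connect.js – minimal connection sketch using the "redis" npm package (node-redis v4)
    const { createClient } = require('redis');

    const client = createClient({
      url: 'redis://redis.marathon.l4lb.thisdcos.directory:6379',
    });

    async function main() {
      await client.connect();
      await client.set('pet:1', 'dog');        // write a sample key
      console.log(await client.get('pet:1'));  // prints "dog"
      await client.quit();
    }

    main().catch(console.error);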

    Conclusion

    We learned about deploying a standalone Redis service on DC/OS and saw how to add persistent storage to it using REX-Ray. We also learned how REX-Ray automatically manages volumes on AWS EBS and how to integrate them into DC/OS apps and services. Finally, we saw how other applications can communicate with this Redis service.

  • Micro Frontends: Reinventing UI In The Microservices World

    It is amazing how the software industry has evolved. Back in the day, software was a simple program. Some of the first software applications, such as the guidance software of the Apollo mission landing modules and the programs run on the Manchester Baby, were basic stored programs. Software was primarily used for research and mathematical purposes.

    The invention of personal computers and the prominence of the Internet changed the software world. Desktop applications like word processors, spreadsheets, and games grew, and websites gradually emerged. Back then, simple pages were delivered to the client as static documents for viewing. By the mid-1990s, with Netscape introducing the client-side scripting language JavaScript and Macromedia bringing in Flash, the browser became more powerful, allowing websites to become richer and more interactive. In the late 1990s, Java introduced Servlets, and thus the web application was born. Nevertheless, these applications were still relatively simple; engineers didn’t put much emphasis on structuring them and mostly built unstructured monolithic applications.

    The advent of disruptive technologies like cloud computing and big data paved the way for more intricate, convoluted web and native mobile applications. From e-commerce and video streaming apps to social media and photo editing, we had applications doing some of the most complicated data processing and storage tasks. The traditional monolithic approach now posed several challenges in terms of scalability, team collaboration, and integration/deployment, and often led to huge, messy “big ball of mud” codebases.

    Fig: Monolithic Application Problems – Source

    To untangle this ball of software, a number of service-oriented architectures emerged. The most promising of them was Microservices – breaking an application into smaller chunks that can be developed, deployed, and tested independently but work as a single cohesive unit. Its benefits of scalability and ease of deployment by multiple teams proved to be a panacea for most of the architectural problems. A few front-end architectures also came up – MVC, MVVM, and Web Components, to name a few – but none of them were fully able to reap the benefits of Microservices.

    Fig: Microservice Architecture – Source

    Micro Frontends: The Concept

    Micro Frontends first came up in the ThoughtWorks Technology Radar, where the technique was assessed, trialed, and eventually adopted after significant benefits were observed. It is a Microservice approach to front-end web development in which independently deliverable front-end applications are composed into a whole.

    Together with Microservices, Micro Frontends break up the last monolith to create a complete micro-architecture design pattern for web applications. The application is composed entirely of loosely coupled vertical slices of business functionality rather than horizontal layers. We can term these verticals ‘Microapps’. The concept is not new and has appeared before in Scaling with Microservices and Vertical Decomposition, which presented the idea of every vertical being responsible for a single business domain and having its own presentation layer, persistence layer, and separate database. From the development perspective, every vertical is implemented by exactly one team, and no code is shared among the different systems.

    Fig: Micro Frontends with Microservices (Micro-architecture)

    Why Micro Frontends?

    Like a microservice architecture, Micro Frontends bring a whole slew of advantages when compared to a monolithic architecture.

    Ease of Upgrades – Micro Frontends establish strict bounded contexts within the application. Each part can be updated in an incremental, isolated fashion without worrying about breaking another part of the application.

    Scalability – Horizontal scaling is easy for Micro Frontends; keeping each Micro Frontend stateless makes scaling even easier.

    Ease of Deployability – Each Micro Frontend has its own CI/CD pipeline that builds, tests, and deploys it to production. It doesn’t matter whether another team is working on a feature, has just pushed a bug fix, or is in the middle of a cutover or refactoring: there should be little risk in pushing changes to a Micro Frontend as long as only one team works on it.

    Team Collaboration and Ownership – The Scrum Guide says that the “Optimal Development Team size is small enough to remain nimble and large enough to complete significant work within a Sprint”. Micro Frontends are perfect for multiple cross-functional teams that can completely own a stack (a Micro Frontend) of an application from UX to database design. In the case of an e-commerce site, the Product team and the Payment team can work on the app concurrently without stepping on each other’s toes.

    Micro Frontend Integration Approaches

    There is a multitude of ways to implement Micro Frontends. It is recommended to take a runtime integration route rather than a build-time integration, because with build-time integration every single Micro Frontend has to be re-compiled and released in order to ship a change to any one of them.

    We shall learn some of the prominent approaches to Micro Frontends by building a simple pet store e-commerce site. The site has the following aspects (or Microapps, if you will) – Home or Search, Cart, Checkout, Product, and Contact Us. We shall only be working on the front-end aspect of the site; you can assume that each Microapp has a dedicated microservice in the backend. You can view the project demo here and the code repository here. Each integration approach has a branch in the repo that you can check out.

    Single Page Frontends –

    The simplest way (but not the most elegant) to implement Micro Frontends is to treat each Micro Frontend as a single page.

    Fig: Single Page Micro Frontends: Each HTML file is a frontend.
    <!DOCTYPE html>
    <html lang="zxx">
    <head>
      <title>The MicroFrontend - eCommerce Template</title>
    </head>
    <body>
      <header class="header-section header-normal">
        <!-- Header is repeated in each frontend, which is difficult to maintain -->
        ....
        ....
      </header>
      <main>
      </main>
      <footer>
        <!-- Footer is repeated in each frontend, so every change has to be made across all frontends -->
      </footer>
      <script>
        // Cross-cutting features like notification and authentication are replicated in all frontends
      </script>
    </body>
    </html>

    It is one of the purest ways of doing Micro Frontends because no container or stitching element binds the front ends together into an application. Each Micro Frontend is a standalone app with each dependency encapsulated in it and no coupling with the others. The flipside of this approach is that each frontend has a lot of duplication in terms of cross-cutting concerns like headers and footers, which adds redundancy and maintenance burden.

    JavaScript Rendering Components (or Web Components / Custom Elements) –

    As we saw above, the single-page Micro Frontend architecture has its share of drawbacks. To overcome these, we can opt for an architecture with a container element that builds the context of the app, handles cross-cutting concerns like authentication, and stitches all the Micro Frontends together into a cohesive application.

    // A base class from which all Micro Frontends extend
    class MicroFrontend {
      
      beforeMount() {
        // do things before the micro front-end mounts
      }
    
      onChange() {
        // do things when the attributes of a micro front-end changes
      }
    
      render() {
        // html of the micro frontend
        return '<div></div>';
      }
    
      onDismount() {
        // do things after the micro front-end dismounts 
      }
    }

    class Cart extends MicroFrontend {
      beforeMount() {
        // get previously saved cart from backend
      }
    
      render() {
        return `<!-- Page -->
        <div class="page-area cart-page spad">
          <div class="container">
            <div class="cart-table">
              <table>
                <thead>
                .....
                
         `
      }
    
      addItemToCart(){
        ...
      }
        
      deleteItemFromCart () {
        ...
      }
    
      applyCouponToCart() {
        ...
      }
        
      onDismount() {
        // save Cart for the user to get back to afterwards
      }
    }

    class Product extends MicroFrontend {
      static get productDetails() {
        return {
          '1': {
            name: 'Cat Table',
            img: 'img/product/cat-table.jpg'
          },
          '2': {
            name: 'Dog House Sofa',
            img: 'img/product/doghousesofa.jpg'
          },
        }
      }
      getProductDetails() {
        var urlParams = new URLSearchParams(window.location.search);
        const productId = urlParams.get('productId');
        return this.constructor.productDetails[productId];
      }
      render() {
        const product = this.getProductDetails();
        return `	<!-- Page -->
        <div class="page-area product-page spad">
          <div class="container">
            <div class="row">
              <div class="col-lg-6">
                <figure>
                  <img class="product-big-img" src="${product.img}" alt="">`
      }
      selectProductColor(color) {}
    
      selectProductSize(size) {}
     
      addToCart() {
        // delegate call to MicroFrontend Cart.addToCart function
      }
      
    }

    <!DOCTYPE html>
    <html lang="zxx">
    <head>
    	<title>PetStore - because Pets love pampering</title>
      <meta charset="UTF-8">
      <link rel="stylesheet" href="css/style.css"/>
    
    </head>
    <body>
    	<!-- Header section -->
    	<header class="header-section">
      ....
      </header>
    	<!-- Header section end -->
    	<main id='microfrontend'>
        <!-- This is where the Micro-frontend gets rendered by utility renderMicroFrontend.js-->
    	</main>
    	<!-- Footer section -->
    	<footer class="header-section">
      ....
      </footer>
    	<!-- Footer section end -->
      	<script src="frontends/MicroFrontend.js"></script>
    	<script src="frontends/Home.js"></script>
    	<script src="frontends/Cart.js"></script>
    	<script src="frontends/Checkout.js"></script>
    	<script src="frontends/Product.js"></script>
    	<script src="frontends/Contact.js"></script>
    	<script src="routes.js"></script>
    	<script src="renderMicroFrontend.js"></script>

    utility renderMicroFrontend.js (renders the Micro Frontend that matches the current hash route)
    function renderMicroFrontend(pathname) {
      const microFrontend = routes[pathname || window.location.hash];
      const root = document.getElementById('microfrontend');
      root.innerHTML = microFrontend ? new microFrontend().render() : new Home().render();
      $(window).scrollTop(0);
    }

    $(window).bind('hashchange', function(e) { renderMicroFrontend(window.location.hash); });
    renderMicroFrontend(window.location.hash);
    
    utility routes.js (A map of the hash route to the Microfrontend class)
    const routes = {
      '#': Home,
      '': Home,
      '#home': Home,
      '#cart': Cart,
      '#checkout': Checkout,
      '#product': Product,
      '#contact': Contact,
    };

    As you can see, this approach is fairly clean: a single base MicroFrontend class is defined, and all the other Micro Frontends extend from it. Notice how all the functionality related to a Microapp is encapsulated in its respective Micro Frontend, which ensures that concurrent work on one Micro Frontend doesn’t break the others.

    The same paradigm applies when using Web Components and Custom Elements, as sketched below.
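    As a rough sketch (the tag and class names are illustrative, not from the demo repository), the Cart Micro Frontend could be packaged as a Custom Element, and the container page would simply drop the tag into the DOM:

    // cart-element.js – a minimal Custom Element sketch; names are hypothetical
    class CartFrontend extends HTMLElement {
      // runs when the element is attached to the DOM (the "mount" hook)
      connectedCallback() {
        this.innerHTML = `
          <div class="page-area cart-page">
            <h2>Your Cart</h2>
          </div>`;
      }

      // runs when the element is removed (the "dismount" hook)
      disconnectedCallback() {
        // e.g. save the cart for the user to come back to later
      }
    }

    customElements.define('cart-frontend', CartFrontend);

    // The container page then renders it with: <cart-frontend></cart-frontend>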

    React

    With client-side JavaScript frameworks being so popular, it is impossible to leave React out of any front-end discussion. React being a component-based JS library, much of what was discussed above also applies to it. I am going to discuss some of the technicalities and challenges of building Micro Frontends with React.

    Styling

    Since there should be minimal sharing of code between Micro Frontends, styling React components can be challenging given the global, cascading nature of CSS. We should make sure styles are scoped to a specific Micro Frontend without spilling over into other Micro Frontends. Inline CSS, CSS-in-JS libraries like Radium, and CSS Modules can all be used with React; a small sketch follows.
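    A minimal sketch of the simplest option, plain inline style objects (the component and style names are hypothetical); because the styles live inside the component, nothing leaks into other Micro Frontends:

    // Cart.jsx – styles stay local to the Cart Micro Frontend via inline style objects
    import React from 'react';

    const cartStyles = {
      page: { padding: '40px 0', background: '#f5f5f5' },
      title: { fontSize: '24px', fontWeight: 600 },
    };

    export function Cart() {
      return (
        <div style={cartStyles.page}>
          <h2 style={cartStyles.title}>Your Cart</h2>
        </div>
      );
    }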

    Redux

    Using React with Redux is more or less the norm in today’s front-end world. The convention is to use Redux as a single global store for the entire app and as the channel for cross-application communication. A Micro Frontend, however, should be self-contained with no external dependencies, so each Micro Frontend should have its own Redux store, moving towards a multiple-store architecture, as sketched below.
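    A rough sketch of one store per Micro Frontend (file and action names are hypothetical): the Cart Micro Frontend creates its own Redux store and wraps only its own component tree in a Provider, so its state never leaks into, or depends on, other Micro Frontends.

    // cartStore.js – a store owned exclusively by the Cart Micro Frontend
    import { createStore } from 'redux';

    function cartReducer(state = { items: [] }, action) {
      switch (action.type) {
        case 'cart/addItem':
          return { ...state, items: [...state.items, action.payload] };
        default:
          return state;
      }
    }

    export const cartStore = createStore(cartReducer);

    // CartApp.jsx – only the Cart tree sees this store
    import React from 'react';
    import { Provider } from 'react-redux';
    import { cartStore } from './cartStore';
    import { Cart } from './Cart';

    export function CartApp() {
      return (
        <Provider store={cartStore}>
          <Cart />
        </Provider>
      );
    }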

    Other Noteworthy Integration Approaches  –

    Server-side Rendering – One can use a server to assemble Micro Frontend templates before dispatching them to the browser. Server-Side Include (SSI) techniques can be used too.

    iframes – Each Micro Frontend can be an iframe (a small sketch follows). iframes also offer a good degree of isolation in terms of styling, and global variables don’t interfere with each other.
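    A rough sketch of a container page embedding each Micro Frontend as an iframe (the paths are hypothetical; each frontend would be built and served independently):

    <!-- container.html – each Micro Frontend is deployed on its own and embedded here -->
    <main>
      <iframe src="/cart/index.html" title="Cart"></iframe>
      <iframe src="/product/index.html" title="Product"></iframe>
    </main>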

    Summary

    Together with Microservices, Micro Frontends promise to bring a lot of benefits when it comes to structuring a complex application and simplifying its development, deployment, and maintenance.

    But there is a wonderful saying: “there is no one-size-fits-all approach that anyone can offer you. The same hot water that softens a carrot hardens an egg.” Micro Frontends are no silver bullet for your architectural problems and come with their own share of downsides. With more repositories, more tools, more build/deploy pipelines, more servers, and more domains to maintain, Micro Frontends can increase the complexity of an app. They may make cross-application communication difficult to establish, and can also lead to duplication of dependencies and an increase in application size.

    Your decision to implement this architecture will depend on many factors, such as the size of your organization and the complexity of your application. Whether it is a new or a legacy codebase, it is advisable to apply the technique gradually and review its efficacy over time.