Category: Engineering blogs

  • Policy Insights: Chatbots and RAG in Health Insurance Navigation

    Introduction

    Understanding health insurance policies can often be complicated, leaving individuals to tackle lengthy and difficult documents. The complexity introduced by these policies’ language not only adds to the confusion but also leaves policyholders uncertain about the actual extent of their coverage, the best plan for their needs, and how to seek answers to their specific policy-related questions. In response to these ongoing challenges and to facilitate better access to information, a fresh perspective is being explored—an innovative approach to revolutionize how individuals engage with their health insurance policies.

    Challenges in Health Insurance Communication

    Health insurance queries are inherently complex, often involving nuanced details that require precision. Traditional chatbots, lacking the finesse of generative AI (GenAI), struggle to handle the intricacies of healthcare-related questions. The envisioned health insurance chatbot powered by GenAI overcomes these limitations, offering a sophisticated understanding of queries and delivering responses that align with the complexities of the healthcare sphere.

    Retrieval-Augmented Generation

    Retrieval-augmented generation, or RAG, is an architectural approach that can improve the efficacy of large language model (LLM) applications by leveraging custom data. This is done by retrieving data or documents relevant to a question or task and providing them as context for the LLM. RAG has shown success in supporting chatbots and Q&A systems that need to maintain up-to-date information or access domain-specific knowledge.

    To learn more about this topic, see the following resources for technical insights and additional information:

    1. https://www.oracle.com/in/artificial-intelligence/generative-ai/retrieval-augmented-generation-rag/

    2. https://research.ibm.com/blog/retrieval-augmented-generation-RAG

    The Dual Phases of RAG: Retrieval and Content Generation

    Retrieval-augmented generation (RAG) combines two essential steps: retrieval and content generation. First, retrieval algorithms search external knowledge bases for data that matches the user's query. This gathered information then becomes the foundation for the next phase, content generation, in which the large language model uses both the augmented prompt and its internal training data to create responses that are not only accurate but also contextually appropriate.

    Advantages of Deploying RAG in AI Chatbots

    Scalability is a key advantage of RAG over traditional models. Instead of relying on a monolithic model attempting to memorize vast amounts of information, RAG models can easily scale by updating or expanding the external database. This flexibility enables them to manage and incorporate a broader range of data efficiently.

    Memory efficiency is another strength of RAG in comparison to models like GPT. While traditional models have limitations on the volume of data they can store and recall, RAG efficiently utilizes external databases. This approach allows RAG to fetch fresh, updated, or detailed information as needed, surpassing the memory constraints of conventional models.

    Moreover, RAG offers flexibility in its knowledge sources. By modifying or enlarging the external knowledge base, a RAG model can be adapted to specific domains without the need for retraining the underlying generative model. This adaptability ensures that RAG remains a versatile and efficient solution for various applications.

    The application flow works as follows. In the development of our health insurance chatbot, we follow a comprehensive ingestion process. Initially, essential PDF documents are loaded to familiarize our model with the intricacies of health insurance. These documents are split into smaller chunks for in-depth analysis. Each chunk is then transformed into a numerical vector through a process known as embedding, or vectorization. These numerical representations are efficiently stored in ChromaDB for quick retrieval.

    When a user poses a question, the chatbot converts the query into the same kind of numerical representation and retrieves the most similar vectors from ChromaDB. Employing a large language model (LLM), the chatbot then crafts a nuanced response grounded in the retrieved context. This method ensures a smooth and efficient conversational experience. Armed with a wealth of health insurance information, the chatbot delivers precise and contextually relevant responses to user inquiries, establishing itself as a valuable resource for navigating the complexities of health insurance queries.

    Role of Vector Embedding

    Traditional search engines mainly focus on finding specific words in your search. For example, if you search “best smartphone,” it looks for pages with exactly those words. On the other hand, semantic search is like a more understanding search engine. It tries to figure out what you really mean by considering the context of your words.

    Imagine you are planning a vacation and want to find a suitable destination, and you input the query “warm places to visit in winter.” In a traditional search, the engine would look for exact matches of these words on web pages. Results might include pages with those specific terms, but the relevance might vary. A semantic search, by contrast, would also surface pages about tropical destinations in December even if they never use those exact words.

    Text, audio, and video can be embedded:

    An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness, and large distances suggest low relatedness.

    For example:

    bat: [0.6, -0.3, 0.8, …]

    ball: [0.4, -0.2, 0.7, …]

    wicket: [-0.5, 0.6, -0.2, …]

    In this cricket-themed example, each word (bat, ball, wicket) is represented as a vector in a multi-dimensional space, capturing the semantic relationships between cricket-related terms.

    For a deeper understanding, you may explore additional insights in the following articles:

    1. https://www.datastax.com/guides/what-is-a-vector-embedding

    2. https://www.pinecone.io/learn/vector-embeddings/

    3. https://weaviate.io/blog/vector-embeddings-explained/

    A specialized type of database known as a vector database is essential for storing these numerical representations. In a vector database, data is stored as mathematical vectors, providing a unique way to store and retrieve information. This specialized database greatly facilitates machine learning models in retaining and recalling previous inputs, enabling powerful applications in search, recommendations, and text generation.

    Vector retrieval in a database involves finding the nearest neighbors or most similar vectors to a given query vector. These are common metrics for comparing vectors (a short sketch after this list illustrates all three):

    1. Euclidean distance measures the straight-line distance between two vectors’ endpoints, so it reflects both magnitude and direction.

    2. Cosine similarity considers only the angle between vectors, measuring how closely their directions align regardless of magnitude.

    3. The dot product takes both magnitude and direction into account; for normalized vectors it reduces to cosine similarity.
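
    As a quick illustration, here is a minimal NumPy sketch (using the toy cricket embeddings from earlier; the numbers are illustrative only) that computes all three metrics:

    import numpy as np

    # Toy embeddings from the cricket example above
    bat = np.array([0.6, -0.3, 0.8])
    ball = np.array([0.4, -0.2, 0.7])
    wicket = np.array([-0.5, 0.6, -0.2])

    def euclidean(a, b):
        # Straight-line distance: smaller means more related
        return np.linalg.norm(a - b)

    def cosine(a, b):
        # Direction only: closer to 1 means more aligned
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    print(euclidean(bat, ball), euclidean(bat, wicket))  # bat is nearer to ball
    print(cosine(bat, ball), cosine(bat, wicket))        # bat aligns with ball
    print(np.dot(bat, ball), np.dot(bat, wicket))        # dot product comparison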

    ChromaDB, Pinecone, and Milvus are a few examples of vector databases.

    For our application, we will use LangChain, OpenAI embeddings, an OpenAI LLM, and ChromaDB.

    1. We need to install Python packages required for this application.
    !pip install -U langchain openai chromadb langchainhub pypdf tiktoken

    A. LangChain is a tool that helps you build intelligent applications using language models. It allows you to develop chatbots, personal assistants, and applications that can summarize, analyze, or respond to questions about documents or data. It’s useful for tasks like coding assistance, working with APIs, and other activities that gain an advantage from AI technology.

    B. OpenAI is a renowned artificial intelligence research lab. Installing the OpenAI package provides access to OpenAI’s language models, including powerful ones like GPT-3. This library is crucial if you plan to integrate OpenAI’s language models into your applications.

    C. As mentioned earlier, ChromaDB is a vector database package designed to handle vector data efficiently, making it suitable for applications that involve similarity searches, clustering, or other operations on vectors.

    D. LangChainHub is a handy tool to make your language tasks easier. It begins with helpful prompts and will soon include even more features like chains and agents.

    E. pypdf is a library for working with PDF files in Python. It allows reading and manipulating PDF documents, making it useful for tasks such as extracting text or merging PDF files.

    F. Tiktoken is a Python library designed for counting the number of tokens in a text string without making an API call. This can be particularly useful for managing token limits when working with language models or APIs that have usage constraints.

    2. Importing Libraries
    from langchain.chat_models import ChatOpenAI 
    from langchain.embeddings import OpenAIEmbeddings 
    from langchain.vectorstores import Chroma 
    from langchain.prompts import ChatPromptTemplate, PromptTemplate
    from langchain.schema import StrOutputParser 
    from langchain.schema.runnable import RunnablePassthrough

    3. Initializing OpenAI LLM
    llm = ChatOpenAI(
        api_key=OPENAI_API_KEY,
        model_name="gpt-4",
        temperature=0.1
    )

    This code initializes a chat model using OpenAI’s GPT-4, which supports an 8,192-token context window. The temperature parameter influences the randomness of the generated text: a higher temperature results in more creative responses, while a lower temperature leads to more focused and deterministic answers.

    4. We will load a PDF containing the training material for the model and divide it into chunks of text that can be fed to the model.
    from langchain.document_loaders import PyPDFLoader

    loader = PyPDFLoader("HealthInsureBot_GenerativeAI_TrainingGuide.pdf")
    docs = loader.load_and_split()

    5. We will load these chunks of text into the vector database ChromaDB using OpenAI embeddings; the store is later used for retrieval.
    vectorstore = Chroma.from_documents(
        documents=docs,
        embedding=OpenAIEmbeddings(api_key=OPENAI_API_KEY)
    )

    6. Creating a retriever object that returns the top three most similar vector matches for a query.
    retriever = vectorstore.as_retriever(
        search_type="similarity",
        search_kwargs={"k": 3}
    )

    7. Creating a prompt to pass to the LLM for obtaining specific information involves crafting a well-structured question or instruction that clearly outlines the desired details. The RAG chain starts with the retriever and formatted documents, passes through the custom prompt template, invokes the LLM, and concludes with a string output parser (StrOutputParser()) to handle the resulting response.

    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)

    template = """Use the following pieces of context as a virtual health insurance agent to answer the question and provide a relevance score out of 10 for each response. If you don't know the answer, just say that you don't know; don't try to make up an answer. {context} Question: {question} Helpful Answer:"""
    rag_prompt_custom = PromptTemplate.from_template(template)
    rag_chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | rag_prompt_custom
        | llm
        | StrOutputParser()
    )

    8. Create a function to get a response from the chatbot.
    def chatbot_response(user_query):
        return rag_chain.invoke(user_query)

    We can integrate Streamlit to build a simple UI for the app, using this function in the Streamlit application to get the AI response.

    import openai
    import streamlit as st
    from health_insurance_bot import chatbot_response

    st.title("Health Insurance Chatbot")

    if "messages" not in st.session_state:
        st.session_state["messages"] = [
            {"role": "assistant", "content": "How can I help you?"}
        ]

    for msg in st.session_state.messages:
        st.chat_message(msg["role"]).write(msg["content"])

    if prompt := st.chat_input():
        openai.api_key = st.secrets["openai_api_key"]
        st.session_state.messages.append({"role": "user", "content": prompt})
        st.chat_message(name="user").write(prompt)
        response = chatbot_response(prompt)
        st.session_state.messages.append({"role": "assistant", "content": response})
        st.chat_message(name="assistant").write(response)


    Conclusion

    In our exploration of developing health insurance chatbots, we’ve dived into the innovative world of retrieval-augmented generation (RAG), where advanced technologies are seamlessly combined to reshape user interactions. The adoption of RAG has proven to be a game-changer, significantly enhancing the chatbot’s abilities to understand, retrieve, and generate contextually relevant responses. However, it’s worth mentioning a couple of limitations, including challenges in accurately calculating premium quotes and occasional inaccuracies in semantic searches.

  • Revolutionizing Android UI with MotionLayout: A Beginner’s Guide

    In the ever-evolving world of Android app development, seamless integration of compelling animations is key to a polished user experience. MotionLayout, a robust tool in the Android toolkit, has an effortless and elegant ability to embed animations directly into the UI. Join us as we navigate through its features and master the skill of effortlessly designing stunning visuals.

    1. Introduction to MotionLayout

    MotionLayout transcends conventional layouts, standing as a specialized tool to seamlessly synchronize a myriad of animations with screen updates in your Android application.

    1.1 Advantages of MotionLayout

    Animation Separation:

    MotionLayout distinguishes itself with the ability to compartmentalize animation logic into a separate XML file. This not only optimizes Java or Kotlin code but also enhances its overall manageability.

    No Dependence on Manager or Controller:

    An exceptional feature of MotionLayout is its user-friendly approach, enabling developers to attach intricate animations to screen changes without requiring a dedicated animation manager or controller.

    Backward Compatibility:

    Of paramount importance, MotionLayout maintains backward compatibility, ensuring its applicability across Android systems starting from API level 14.

    Android Studio Integration:

    Empowering developers further is the seamless integration with Android Studio. The graphical tooling provided by the visual editor facilitates the design and fine-tuning of MotionLayout animations, offering an intuitive workflow.

    Derivation from ConstraintLayout:

    MotionLayout, being a subclass of ConstraintLayout, serves as an extension specifically designed to facilitate the implementation of complex motion and animation design within a ConstraintLayout.

    1.2 Important Tags

    As elucidated earlier, Animation XML is separated into the following important tags and attributes:

    <MotionScene>: The topmost tag in XML, wrapping all subsequent tags.

    <ConstraintSet>: Describes one screen state, with two sets required for animations between states. For example, if we desire an animation where the screen transitions from state A to state B, we necessitate the definition of two ConstraintSets.

    <Transition>: Attaches to two ConstraintSets, triggering animation between them.

    <ViewTransition>: Utilized for changes within a single ConstraintSet.

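    Putting these tags together, a minimal MotionScene skeleton (the ids here are placeholders) nests them like this:

    <MotionScene xmlns:android="http://schemas.android.com/apk/res/android"
        xmlns:motion="http://schemas.android.com/apk/res-auto">

        <!-- State A: starting positions of the views -->
        <ConstraintSet android:id="@+id/start">
            <!-- one <Constraint> per view describing state A -->
        </ConstraintSet>

        <!-- State B: ending positions of the views -->
        <ConstraintSet android:id="@+id/end">
            <!-- one <Constraint> per view describing state B -->
        </ConstraintSet>

        <!-- Animates between the two states -->
        <Transition
            motion:constraintSetStart="@id/start"
            motion:constraintSetEnd="@id/end"
            motion:duration="1000" />
    </MotionScene>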

    1.3 Why It’s Better Than Its Alternatives

    It’s important to note that MotionLayout is not the sole solution for every animation scenario. Similar to the saying that a sword cannot replace a needle, MotionLayout can be a better solution when planning for complex animations. MotionLayout can replace animation created using threads and runnables. Apart from MotionLayout, several common alternatives for creating animations include:

    • Animated Vector Drawable
    • Property animation frameworks
    • LayoutTransition animation
    • Layout Transitions with TransitionManager
    • CoordinatorLayout

    Each alternative has unique advantages and disadvantages compared to MotionLayout. For smaller animations like icon changes, Animated Vector Drawable might be preferred. The choice between alternatives depends on the specific requirements of the animation task at hand.

    MotionLayout is a comprehensive solution, bridging the gap between layout transitions and complex motion handling. It seamlessly integrates features from the property animation framework, TransitionManager, and CoordinatorLayout. Developers can describe transitions between layouts, animate any property, handle touch interactions, and achieve a fully declarative implementation, all through the expressive power of XML.

    2. Configuration

    2.1 System setup

    For optimal development and utilization of the Motion Editor, Android Studio is a prerequisite. Kindly follow this link for the Android Studio installation guide.

    2.2 Project Implementation

    1. Initiate a new Android project and opt for the “Empty View Activity” template.

    2. Since MotionLayout is an extension of ConstraintLayout, it’s essential to include ConstraintLayout in the build.gradle file.

    implementation 'androidx.constraintlayout:constraintlayout:x.x.x'

    Substitute “x.x.x” with the most recent version of ConstraintLayout.

    3. Replace “ConstraintLayout” with “MotionLayout.” Opting for the right-click method is recommended, as it facilitates automatically creating the necessary animation XML.
    Figure 1

    When converting our existing layout to MotionLayout by right-clicking, a new XML file named “activity_main_scene.xml” is generated in the XML directory. This file is dedicated to storing animation details for MotionLayout.

    4. Execute the following steps:
    1. Click on the “start” ConstraintSet.
    2. Move the Text View by dragging it to a desired position on your screen.
    3. Click on the “end” ConstraintSet.
    4. Move the Text View to another position on your screen.
    5. Click on the arrow above “start” and “end” ConstraintSet.
    6. Click on the “+” symbol in the “Attributes” tab.
    7. Add the attribute “autoTransition” with the value “jumpToEnd.”
    8. Click the play button on the “Transition” tab.

    Preview the animation in real time by running the application. The animation will initiate when called from the associated Java class.

    Note: You can also manually edit the activity_main_scene.xml file to make these changes.
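
    For reference, the manual edit corresponding to steps 5 to 7 would add an autoTransition attribute to the transition, roughly like this (assuming the default “start”/“end” ConstraintSet ids):

    <Transition
        motion:constraintSetStart="@id/start"
        motion:constraintSetEnd="@id/end"
        motion:autoTransition="jumpToEnd" />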

    3. Sample Project and Result

    Until now, we’ve navigated through the complexities of MotionLayout and laid the groundwork for an Android project. Now, let’s transition from theory to practical application by crafting a sample project. In this endeavor, we’ll keep the animation simple and accessible for a clearer understanding.

    3.1 Adding Dependencies

    Include the following lines of code in your gradle.build file (Module: app), and then click on “Sync Now” to ensure synchronization with the project:

    android {
       buildFeatures {
           viewBinding true
       }
    }
    
    dependencies {
       implementation 'androidx.lifecycle:lifecycle-viewmodel-ktx:2.5.1'
       implementation 'androidx.lifecycle:lifecycle-runtime-ktx:2.5.1'
       implementation 'androidx.annotation:annotation:1.5.0'
    }

    3.2 Adding code

    Include the following code snippets in their corresponding classes:

    <?xml version="1.0" encoding="utf-8"?>
    <androidx.constraintlayout.motion.widget.MotionLayout xmlns:android="http://schemas.android.com/apk/res/android"
       xmlns:app="http://schemas.android.com/apk/res-auto"
       xmlns:tools="http://schemas.android.com/tools"
       android:layout_width="match_parent"
       android:layout_height="match_parent"
       android:background="@android:color/darker_gray"
       app:layoutDescription="@xml/activity_main_scene"
       android:id="@+id/layoutMain"
       tools:context=".MainActivity">
    
       <ImageView
           android:id="@+id/image_view_00"
           android:layout_width="80dp"
           android:layout_height="100dp"
           android:layout_margin="32dp"
           android:src="@drawable/card_back"
           app:layout_constraintEnd_toEndOf="parent"
           app:layout_constraintStart_toStartOf="parent"
           app:layout_constraintTop_toTopOf="parent" />
    
       <ImageView
           android:id="@+id/image_view_01"
           android:layout_width="80dp"
           android:layout_height="100dp"
           android:layout_margin="32dp"
           android:src="@drawable/card_back"
           app:layout_constraintEnd_toEndOf="parent"
           app:layout_constraintStart_toStartOf="parent"
           app:layout_constraintTop_toTopOf="parent" />
    
       <ImageView
           android:id="@+id/image_view_02"
           android:layout_width="80dp"
           android:layout_height="100dp"
           android:layout_margin="32dp"
           android:src="@drawable/card_back"
           app:layout_constraintEnd_toEndOf="parent"
           app:layout_constraintStart_toStartOf="parent"
           app:layout_constraintTop_toTopOf="parent" />
    
    </androidx.constraintlayout.motion.widget.MotionLayout>

    <?xml version="1.0" encoding="utf-8"?>
    <MotionScene xmlns:android="http://schemas.android.com/apk/res/android"
       xmlns:motion="http://schemas.android.com/apk/res-auto">
    
       <ConstraintSet android:id="@+id/image_00">
           <Constraint
               android:id="@+id/image_view_00"
               android:layout_width="80dp"
               android:layout_height="100dp"
               android:layout_marginTop="32dp"
               motion:layout_constraintEnd_toEndOf="parent"
               motion:layout_constraintStart_toStartOf="parent"
               motion:layout_constraintTop_toTopOf="parent" />
           <Constraint
               android:id="@+id/image_view_01"
               android:layout_width="80dp"
               android:layout_height="100dp"
               android:layout_marginTop="32dp"
               motion:layout_constraintEnd_toEndOf="parent"
               motion:layout_constraintStart_toStartOf="parent"
               motion:layout_constraintTop_toTopOf="parent" />
           <Constraint
               android:id="@+id/image_view_02"
               android:layout_width="80dp"
               android:layout_height="100dp"
               android:layout_marginTop="32dp"
               motion:layout_constraintEnd_toEndOf="parent"
               motion:layout_constraintStart_toStartOf="parent"
               motion:layout_constraintTop_toTopOf="parent" />
       </ConstraintSet>
       <ConstraintSet android:id="@+id/image_01">
           <Constraint
               android:id="@+id/image_view_00"
               android:layout_width="80dp"
               android:layout_height="100dp"
               android:layout_marginBottom="32dp"
               android:layout_marginEnd="44dp"
               android:src="@drawable/card_back"
               motion:layout_constraintBottom_toBottomOf="parent"
               motion:layout_constraintEnd_toEndOf="@id/image_view_01" />
           <Constraint
               android:id="@+id/image_view_01"
               android:layout_width="80dp"
               android:layout_height="100dp"
               android:layout_marginBottom="32dp"
               android:src="@drawable/card_back"
               motion:layout_constraintBottom_toBottomOf="parent"
               motion:layout_constraintEnd_toEndOf="parent"
               motion:layout_constraintStart_toStartOf="parent" />
           <Constraint
               android:id="@+id/image_view_02"
               android:layout_width="80dp"
               android:layout_height="100dp"
               android:layout_marginBottom="32dp"
               android:layout_marginStart="44dp"
               android:src="@drawable/card_back"
               motion:layout_constraintBottom_toBottomOf="parent"
               motion:layout_constraintStart_toStartOf="@id/image_view_01" />
       </ConstraintSet>
    
       <Transition
           motion:autoTransition="animateToEnd"
           motion:constraintSetEnd="@+id/image_01"
           motion:constraintSetStart="@id/image_00"
           motion:duration="5000" />
    </MotionScene>

    // Imports assumed for this activity (ActivityMainBinding is generated by view binding)
    import android.os.Bundle
    import androidx.appcompat.app.AppCompatActivity
    import androidx.lifecycle.lifecycleScope
    import kotlinx.coroutines.delay
    import kotlinx.coroutines.launch

    class MainActivity : AppCompatActivity() {
        private lateinit var binding: ActivityMainBinding
        override fun onCreate(savedInstanceState: Bundle?) {
            super.onCreate(savedInstanceState)
            // Inflate the binding once and use its root; a second setContentView is not needed
            binding = ActivityMainBinding.inflate(layoutInflater)
            setContentView(binding.root)
            binding.apply {
                lifecycleScope.launch {
                    delay(1000)
                    // Run the MotionScene transition from image_00 to image_01
                    layoutMain.transitionToEnd()
                }
            }
        }
    }

    3.3 Result

    For a thorough understanding of the implementation details and full access to the source code, please refer to this repository.

    4. Assignment

    The animation’s complexity can be expanded seamlessly by incorporating additional, carefully handled elements. Here’s an assignment for you: try to create the output specified below.

    4.1 Assignment 1

    4.2 Assignment 2

    5. Conclusion

    In conclusion, this guide has explored the essentials of using MotionLayout in Android development, highlighting its superiority over other animation methods. While we’ve touched on its basic capabilities here, future installments will explore more advanced features and uses. We hope this piece has ignited your interest in MotionLayout’s potential to enhance your Android apps.

    Thank you for dedicating your time to this informative read!

    6. References

    1. https://developer.android.com/reference/androidx/constraintlayout/motion/widget/MotionLayout
    2. https://developer.android.com/reference/androidx/constraintlayout/widget/ConstraintLayout
    3. https://android-developers.googleblog.com/2021/02/mad-skills-motion-layout-wrap-up.html
    4. https://www.bacancytechnology.com/blog/motionlayout-in-android
    5. https://medium.com/mindful-engineering/getting-started-with-motion-layout-in-android-c52af5d5076c
    6. https://blog.mindorks.com/getting-started-with-motion-layout-android-tutorials/
    7. https://gorillalogic.com/blog/a-motionlayout-tutorial-create-motions-and-animations-for-android
    8. https://taglineinfotech.com/motion-layout-in-android/
    9. https://applover.com/blog/how-to-create-motionlayout-part1/
    10. https://medium.com/simform-engineering/animations-in-android-featuring-motionlayout-from-scratch-3ec5cbd6b616
    11. https://www.nomtek.com/blog/motionlayout
  • The Responsible Use of Artificial Intelligence – Shaping a Safer Tomorrow

    Introduction:

    Artificial intelligence (AI) stands at the forefront of technological innovation, promising transformative changes in our lives. With its continuous advancements, AI has become an integral part of our daily routines, from virtual assistants and personalized recommendations to healthcare diagnostics and autonomous vehicles. However, the rapid integration of AI into our society raises pertinent ethical questions, necessitating a closer examination of its responsible use. In this blog post, we delve into the responsible use of artificial intelligence, exploring the principles that guide its ethical deployment and emphasizing the collaborative efforts required to shape a safer future.

    Understanding Responsible AI:

    Responsible AI signifies more than just technological progress; it embodies the ethical development and deployment of AI systems, emphasizing principles such as fairness, transparency, accountability, and privacy. To ensure that AI benefits society as a whole, it is crucial to address the following key aspects:

    1. Ethical Considerations:
      Ethics serve as the cornerstone of AI development. Collaboration among engineers, data scientists, and policymakers is paramount in establishing ethical guidelines that prevent AI from being used for harmful purposes. It is imperative to avoid deploying AI in situations that could lead to discrimination, manipulation, or privacy erosion. Ethical considerations must permeate every stage of AI development, guiding decisions and actions.

    2. Transparency and Accountability:
      Understanding the functioning of AI algorithms is pivotal for their ethical deployment. Striving for transparency, developers should elucidate, in plain language, how AI-driven decisions are made. Accountability mechanisms must be in place to address errors, biases, and unintended consequences. Regular audits and assessments ensure that AI systems remain ethical and accountable, promoting trust among users.

    3. Bias Mitigation:
      The quality of AI algorithms hinges on the data they are trained on. Identifying and mitigating biases in datasets is imperative to create fair and equitable AI applications. Diverse and representative datasets are essential to reducing biases, ensuring that AI systems work impartially for everyone, irrespective of their background. Bias mitigation is an ongoing process, demanding continuous vigilance throughout AI development.

    4. Privacy Protection:
      Responsible AI use involves safeguarding user privacy. As AI applications require extensive data, concerns arise about how this data is collected, stored, and utilized. Regulations and standards, such as GDPR in Europe, play a pivotal role in protecting user privacy rights and ensuring responsible handling of personal data. Developers and companies must prioritize user privacy in all facets of AI development to foster user trust and confidence.

    5. Continuous Monitoring and Adaptation:
      AI’s landscape is in constant flux, necessitating continuous monitoring and adaptation to evolving ethical standards. Regular updates, feedback loops, and adaptive learning enable AI technologies to remain responsive to societal needs and concerns. Developers and companies must vigilantly monitor AI system performance, ready to make necessary changes to align outcomes with ethical standards.

    6. Public Awareness and Education:
      Raising public awareness about AI and its implications is crucial. Educating the public about the ethical use of AI, potential biases, and privacy concerns empowers individuals to make informed decisions. Workshops, seminars, and accessible online resources can bridge the knowledge gap, ensuring that society comprehends the responsible use of AI and actively participates in shaping its trajectory.

    7. Collaboration Across Sectors:
      Collaboration between governments, private sectors, and non-profit organizations is vital. By working together, diverse perspectives can be integrated, leading to comprehensive policies and guidelines. Initiatives like joint research projects, cross-industry collaborations, and international summits facilitate the exchange of ideas and foster a unified approach to responsible AI deployment.

    Potential Risks of Irresponsible AI Use:

    Irresponsible AI use can have detrimental consequences, impacting individuals and society profoundly. Here are the key risks associated with irresponsible AI deployment:

    1. Bias and Discrimination:
      AI systems can perpetuate existing biases, leading to discriminatory outcomes, particularly in areas such as hiring, lending, and law enforcement.

      – For example, a study revealed that AI algorithms used in criminal justice systems exhibited racial bias, leading to disproportionately harsher sentences for people of color. This instance underscores the critical need for rigorous bias detection and mitigation strategies in AI development.

    2. Privacy Violations:
      Improper AI use can result in unauthorized access to personal data, compromising individuals’ privacy and security. This breach can lead to identity theft, financial fraud, and other cybercrimes, highlighting the urgency of robust data protection measures.

      – For instance, the Cambridge Analytica scandal demonstrated how AI-driven data analysis could lead to unauthorized access to millions of users’ personal information on social media platforms, emphasizing the need for stringent data privacy regulations and ethical data management practices.

    3. Job Displacement:
      AI-driven automation could lead to job displacement, posing economic challenges and social unrest. Industries reliant on routine tasks susceptible to automation are particularly vulnerable, necessitating proactive measures to address potential workforce transitions.

      – An example can be found in the manufacturing sector, where AI-driven robotics have significantly reduced the need for human workers in certain tasks, leading to workforce challenges and economic disparities. Initiatives focusing on retraining and upskilling programs can help mitigate these challenges.

    4. Security Threats:
      AI systems are vulnerable to attacks and manipulation. Malicious actors could exploit these vulnerabilities, causing widespread disruption, manipulating financial markets, or spreading misinformation. Vigilance and robust security measures are paramount to mitigate these threats effectively.

      – For instance, the rise of deepfake technology, enabled by AI, poses significant threats to political, social, and economic stability. Misinformation and manipulated media can influence public opinion and decision-making processes, emphasizing the need for advanced AI-driven detection tools and awareness campaigns to combat this issue.

    5. Loss of Human Control:
      Inadequate regulation could lead to AI systems, especially in military applications and autonomous vehicles, operating beyond human control. This lack of oversight might result in unintended consequences and ethical dilemmas, underscoring the need for stringent regulation and ethical guidelines.

      – A notable example is the debate surrounding autonomous vehicles, where AI-driven decision-making processes raise ethical questions, such as how these vehicles prioritize different lives in emergency situations. Robust ethical frameworks and regulatory guidelines are essential to navigate these complex scenarios responsibly.

    Best AI Practices:

    To ensure responsible AI development and usage, adhering to best practices is imperative:

    1. Research and Development Ethics:
      Organizations should establish research ethics boards to oversee AI projects, ensuring strict adherence to ethical guidelines during the development phase.
    2. Collaboration and Knowledge Sharing:
      Encourage collaboration between industry, academia, and policymakers to facilitate knowledge sharing and establish common ethical standards for AI development and deployment. Collaboration fosters a holistic approach, incorporating diverse perspectives and expertise.
    3. User Education:
      Educating users about AI capabilities and limitations is essential. Raising awareness about the collected data and how it will be used promotes informed decision-making and user empowerment.

      – For instance, companies can provide interactive online resources and workshops explaining how AI algorithms work, demystifying complex concepts for the general public. Transparent communication about data collection and usage practices can enhance user trust and confidence.
    4. Responsible Data Management:
      Implement robust data management practices to ensure data privacy, security, and compliance with regulations. Regularly update privacy policies to reflect the evolving nature of AI technology, demonstrating a commitment to user privacy and data protection.

      – Companies can employ advanced encryption techniques to protect user data and regularly undergo independent audits to assess their data security measures. Transparent reporting of these security practices can build trust with users and regulatory bodies.
    5. Ethical AI Training:
      AI developers and engineers should receive training in ethics, emphasizing the significance of fairness, accountability, and transparency in AI algorithms. Ethical AI training fosters a culture of responsibility, guiding developers to create systems that benefit society.

      Educational institutions and industry organizations can collaborate to offer specialized courses on AI ethics, ensuring that future developers are well-equipped to navigate the ethical challenges of AI technology. Industry-wide certification programs can validate developers’ expertise in ethical AI practices, setting industry standards.

    Examples of Responsible AI Implementation:

    Google’s Ethical Guidelines:

    Google has been a trailblazer in implementing responsible AI practices, demonstrating a commitment to ethical use. The establishment of the Advanced Technology External Advisory Council (ATEAC) underscores Google’s dedication to addressing ethical challenges related to AI. Additionally, their release of guidelines for the ethical use of AI emphasizes fairness, accountability, and transparency. Google’s AI Principles outline their pledge to avoid bias, ensure safety, and provide broad benefits to humanity, setting a commendable precedent for the industry.

    An illustrative example of Google’s commitment to transparency can be seen in their Explainable AI (XAI) research, where efforts are made to make AI algorithms interpretable for users. By providing users with insights into how AI systems make decisions, Google enhances transparency and user understanding, contributing to responsible AI usage.

    Microsoft’s AETHER Committee:

    Microsoft has taken significant strides to ensure the responsible use of AI. The creation of the AI and Ethics in Engineering and Research (AETHER) Committee exemplifies Microsoft’s proactive approach to addressing ethical considerations. Furthermore, their active involvement in AI policy advocacy emphasizes the need for regulation to prevent the misuse of facial recognition technology. Microsoft’s initiatives promote transparency, accountability, and privacy protection, exemplifying a commitment to responsible AI implementation.

    A noteworthy initiative by Microsoft is its collaboration with academic institutions to research bias detection techniques in AI algorithms. By actively engaging in research, Microsoft contributes valuable knowledge to the industry, addressing the challenge of bias mitigation in AI systems and promoting responsible AI development.

    Conclusion:

    Artificial Intelligence holds immense potential to revolutionize our world, but this power must be wielded responsibly. By adhering to ethical guidelines, promoting transparency, and prioritizing privacy and fairness, we can harness the benefits of AI while mitigating its risks. Understanding the responsible use of AI empowers us as users and consumers to demand ethical practices from the companies and developers creating these technologies. Let us collectively work towards ensuring that AI becomes a force for good, shaping a safer and more equitable future for all.

  • Confluent Kafka vs. Amazon Managed Streaming for Apache Kafka (AWS MSK) vs. on-premise Kafka

    Overview:

    In the evolving landscape of distributed streaming platforms, Kafka has become the foundation of real-time data processing. As organizations weigh multiple deployment options, choosing among Confluent Kafka, AWS MSK, and on-premise Kafka becomes important. This blog aims to provide an overview and comparison of these three Kafka deployment methods to help readers decide based on their needs.

    Here’s a list of key bullet points to consider when comparing Confluent Kafka, AWS MSK, and on-premise Kafka:

    1. Deployment and Management
    2. Communication
    3. Scalability
    4. Performance
    5. Cost Considerations
    6. Security

    So, let’s go through all the parameters one by one.

    Deployment and Management: 

    Deployment and management refer to the processes involved in setting up, configuring, and overseeing the operation of a Kafka infrastructure. Each deployment option—Confluent Kafka, AWS MSK, and On-Premise Kafka—has its own approach to these aspects.

    1. Deployment:

    -> Confluent Kafka

    • Confluent Kafka deployment involves setting up the Confluent Platform, which extends the open-source Kafka with additional tools.
    • Users must install and configure Confluent components, such as Confluent Server, Connect, and Control Center.
    • Deployment may vary based on whether it’s on-premise or in the cloud using Confluent Cloud.

    -> AWS MSK

    • AWS MSK streamlines deployment as it is a fully managed service.
    • Users create an MSK cluster through the AWS Management Console or API, specifying configurations like instance types, storage, and networking (see the CLI sketch after this list).
    • AWS handles the underlying infrastructure, automating tasks like scaling, patching, and maintenance.
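
    As a rough illustration of the API route mentioned above (the cluster name, subnets, instance type, and sizes below are placeholder values, not a definitive setup), creating a cluster with the AWS CLI might look like:

    aws kafka create-cluster \
      --cluster-name demo-msk-cluster \
      --kafka-version "3.5.1" \
      --number-of-broker-nodes 3 \
      --broker-node-group-info '{"InstanceType": "kafka.m5.large", "ClientSubnets": ["subnet-aaaa", "subnet-bbbb", "subnet-cccc"], "StorageInfo": {"EbsStorageInfo": {"VolumeSize": 100}}}'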

    -> On-Premise Kafka:

    • On-premise deployment requires manual setup and configuration of Kafka brokers, Zookeeper, and other components.
    • Organizations must provision and maintain the necessary hardware and networking infrastructure.
    • Deployment may involve high availability, fault tolerance, and scalability considerations based on organizational needs.

    2. Management:

    -> Confluent Kafka:

    • Confluent provides additional management tools beyond what’s available in open-source Kafka, including Confluent Control Center for monitoring and Confluent Hub for extensions.
    • Users have control over configurations and can leverage Confluent’s tools for streamlining tasks like data integration and schema management.

    -> AWS MSK:

    • AWS MSK offers a managed environment where AWS takes care of routine management tasks.
    • AWS Console and APIs provide tools for monitoring, scaling, and configuring MSK clusters.
    • AWS handles maintenance tasks such as applying patches and updates to the Kafka software.

    -> On-Premise Kafka:

    • On-premise management involves manual oversight of the entire Kafka infrastructure.
    • Organizations have full control but must handle tasks like software updates, monitoring configurations, and addressing any issues that arise.
    • Management may require coordination between IT and operations teams to ensure smooth operation.

    Communication

    Communication involves three main components: protocols, network configuration, and inter-cluster communication. Let’s look at them one by one.

    1. Protocols

    -> Confluent Kafka:

    • Utilizes standard Kafka protocols such as TCP for communication between Kafka brokers and clients.
    • Confluent components may communicate using REST APIs and other protocols specific to Confluent Platform extensions.

    -> AWS MSK:

    • Relies on standard Kafka protocols for communication between clients and brokers.
    • Also employs protocols like TLS/SSL for secure communication.

    -> On-Premise Kafka:

    • Standard Kafka protocols are used for communication between components, including TCP for broker-client communication.
    • Specific protocols may vary based on the organization’s network configuration.

    2. Network Configuration

    -> Confluent Kafka:

    • Requires network configuration for Kafka brokers and other Confluent components.
    • Configuration may include specifying listener ports, security settings, and inter-component communication settings.

    -> AWS MSK:

    • Network configuration involves setting up Virtual Private Cloud (VPC) settings, security groups, and subnet configurations.
    • AWS MSK integrates with AWS networking services, allowing organizations to define network parameters.

    -> On-Premise Kafka:

    • Network configuration includes defining IP addresses, ports, and firewall rules for Kafka brokers.
    • Organizations have full control over the on-premise network infrastructure, allowing for custom configurations.

    3. Inter-Cluster Communication:

    -> Confluent Kafka:

    • Confluent components within the same cluster communicate using Kafka protocols.
    • Communication between Confluent Kafka clusters or with external systems may involve additional configurations.

    -> AWS MSK:

    • AWS MSK clusters can communicate with each other within the same VPC or across different VPCs using standard Kafka protocols.
    • AWS networking services facilitate secure and efficient inter-cluster communication.

    -> On-Premise Kafka:

    • Inter-cluster communication on-premise involves configuring network settings to enable communication between Kafka clusters.
    • Organizations have full control over the network architecture for inter-cluster communication.

    Scaling

    Scalability is crucial for ensuring that Kafka deployments can handle varying workloads and grow as demands increase. Whether through vertical or horizontal scaling, the goal is to maintain performance, reliability, and efficiency in the face of changing requirements. Each deployment option—Confluent Kafka, AWS MSK, and on-premise Kafka—provides mechanisms for achieving scalability, with differences in how scaling is implemented and managed.

    -> Confluent Kafka:

    • Horizontal scaling is a fundamental feature of Kafka, allowing for adding more broker nodes to a Kafka cluster to handle increased message throughput.
    • Kafka’s partitioning mechanism allows data to be distributed across multiple broker nodes, enabling parallel processing and improved scalability.
    • Scaling can be achieved dynamically by adding or removing broker nodes, providing flexibility in adapting to varying workloads.

    -> AWS MSK:

    • AWS MSK leverages horizontal scaling by allowing users to adjust the number of broker nodes in a cluster based on demand (see the CLI sketch after this list).
    • The managed service handles the underlying infrastructure and scaling tasks automatically, simplifying the process for users.
    • Scaling with AWS MSK is designed to be seamless, with the service managing the distribution of partitions across broker nodes.
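
    For instance, a broker-count expansion can be requested with the AWS CLI; the ARN and version values below are hypothetical placeholders:

    aws kafka update-broker-count \
      --cluster-arn arn:aws:kafka:us-east-1:123456789012:cluster/demo-msk-cluster/abcd1234 \
      --current-version K3AEGXETSR30VB \
      --target-number-of-broker-nodes 6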

    -> On-Premise Kafka:

    • Scaling Kafka on-premise involves manually adding or removing broker nodes based on the organization’s infrastructure and capacity planning.
    • Organizations need to consider factors such as hardware limitations, network configurations, and load balancing when scaling on-premise Kafka.

    Performance

    Performance in Kafka, whether it’s Confluent Kafka, AWS MSK (managed streaming for Kafka), or on-premise Kafka, is a critical aspect that directly impacts the throughput, latency, and efficiency of the system. Here’s a brief overview of performance considerations for each:

    -> Confluent Kafka

    • Enhancements and Extensions: Confluent Kafka builds upon the open-source Kafka with additional tools and features, potentially providing optimizations and performance enhancements. This may include tools for monitoring, data integration, and schema management.
    • Customization: Organizations using Confluent Kafka have the flexibility to customize configurations and performance settings based on their specific use cases and requirements.
    • Scalability: Confluent Kafka inherits Kafka’s fundamental scalability features, allowing for horizontal scaling by adding more broker nodes to a cluster.

    -> AWS MSK

    • Managed Environment: AWS MSK is a fully managed service, meaning AWS takes care of many operational aspects, including patches, updates, and scaling. This managed environment can contribute to a more streamlined and optimized performance.
    • Scalability: AWS MSK allows for horizontal scaling by adjusting the number of broker nodes in a cluster dynamically. This elasticity contributes to the efficient handling of varying workloads.
    • Integration with AWS Services: Integration with other AWS services can enhance performance in specific scenarios, such as leveraging high-performance storage solutions or integrating with AWS networking services.

    -> On-Premise Kafka

    • Control and Customization: On-premise Kafka deployments provide organizations with complete control over the infrastructure, allowing for fine-tuning and customization to meet specific performance requirements.
    • Scalability Challenges: Scaling on-premise Kafka may present challenges related to manual provisioning of hardware, potential limitations in scalability due to physical constraints, and increased complexity in managing a distributed system.
    • Hardware Considerations: Performance is closely tied to the quality of hardware chosen for on-premise deployment. Organizations must invest in suitable hardware to meet performance expectations.

    Cost Considerations

    Cost considerations for Kafka deployments involve evaluating the direct and indirect expenses associated with setting up, managing, and maintaining a Kafka infrastructure. Kafka cost considerations span licensing, infrastructure, scalability, operations, data transfer, and ongoing support. The choice between Confluent Kafka, AWS MSK, or on-premise Kafka should be made by evaluating these costs in alignment with the organization’s budget, requirements, and preferences.

    Here’s a brief overview of key cost considerations:

    1. Deployment Costs

    -> Confluent Kafka:

    • Involves licensing costs for Confluent Platform, which provides additional tools and features beyond open-source Kafka.
    • Infrastructure costs for hosting Confluent Kafka, whether on-premise or in the cloud.

    -> AWS MSK:

    • Pay-as-you-go pricing model for AWS MSK, where users pay for the resources consumed by the Kafka cluster.
    • Costs include AWS MSK service charges, as well as associated AWS resource costs such as EC2 instances, storage, and data transfer.

    -> On-Premise Kafka:

    • Upfront costs for hardware, networking equipment, and software licenses.
    • Ongoing operational costs, including maintenance, power, cooling, and any required hardware upgrades.

    2. Scalability Costs

    -> Confluent Kafka:

    • Scaling Confluent Kafka may involve additional licensing costs if more resources are needed.
    • Infrastructure costs increase with the addition of more nodes or resources.

    -> AWS MSK:

    • Scaling AWS MSK is dynamic, and users are billed based on the resources consumed during scaling events.
    • Costs may increase with additional broker nodes and associated AWS resources.

    -> On-Premise Kafka:

    • Scaling on-premise Kafka requires manual provisioning of additional hardware, incurring upfront and ongoing costs.
    • Organizations must consider the total cost of ownership (TCO) when planning for scalability.

    3. Operational Costs

    -> Confluent Kafka:

    • Operational costs include manpower for managing and monitoring Confluent Kafka.
    • Costs associated with maintaining Confluent extensions and tools.

    -> AWS MSK:

    • AWS MSK is a managed service, reducing the operational burden on the user.
    • Operational costs may include AWS support charges and personnel for configuring and monitoring the Kafka environment.

    -> On-Premise Kafka:

    • Organizations bear full operational responsibility, incurring staffing, maintenance, monitoring tools, and ongoing support costs.

    4. Data Transfer Costs:

    -> Confluent Kafka:

    • Costs associated with data transfer depend on the chosen deployment model (e.g., cloud-based deployments may have data transfer costs).

    -> AWS MSK:

    • Data transfer costs may apply for communication between AWS MSK clusters and other AWS services.

    -> On-Premise Kafka:

    • Data transfer costs may be associated with network usage, especially if there are multiple geographically distributed Kafka clusters.


    Security

    Security involves implementing measures to protect data, ensure confidentiality, integrity, and availability, and mitigate risks associated with unauthorized access or data breaches. Here’s a brief overview of key security considerations for Kafka deployments:

    1. Authentication:

    -> Confluent Kafka:

    • Supports various authentication mechanisms, including SSL/TLS for encrypted communication and SASL (Simple Authentication and Security Layer) for user authentication.
    • Role-based access control (RBAC) allows administrators to define and manage user roles and permissions.

    -> AWS MSK:

    • Integrates with AWS Identity and Access Management (IAM) for user authentication and authorization.
    • IAM policies control access to AWS MSK resources and actions.

    -> On-Premise Kafka:

    • Authentication mechanisms depend on the chosen Kafka distribution, but SSL/TLS and SASL are common.
    • LDAP or other external authentication systems may be integrated for user management.

    2. Authorization:

    -> Confluent Kafka:

    • RBAC allows fine-grained control over what users and clients can do within the Kafka environment.
    • Access Control Lists (ACLs) can be used to restrict access at the topic level.

    -> AWS MSK:

    • IAM policies define permissions for accessing and managing AWS MSK resources.
    • Fine-grained access control can be enforced using IAM roles and policies.

    -> On-Premise Kafka:

    • ACLs are typically used to specify which users or applications have access to specific Kafka topics or resources (a CLI sketch follows this list).
    • Access controls depend on the Kafka distribution and configuration.
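
    As a sketch of the ACL route (the user, topic, and broker address below are hypothetical), granting a user read access to one topic with the stock CLI might look like:

    kafka-acls.sh --bootstrap-server broker1:9092 \
      --command-config admin.properties \
      --add --allow-principal User:claims-reader \
      --operation Read \
      --topic claims-events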

    3. Encryption:

    -> Confluent Kafka:

    • Supports encryption in transit using SSL/TLS for secure communication between clients and brokers.
    • Optionally supports encryption at rest for data stored on disks.

    -> AWS MSK:

    • Encrypts data in transit using SSL/TLS for communication between clients and brokers.
    • Provides options for encrypting data at rest using AWS Key Management Service (KMS).

    -> On-Premise Kafka:

    • Encryption configurations depend on the Kafka distribution and may involve configuring SSL/TLS for secure communication and encryption at rest using platform-specific tools.
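
    For instance, a minimal broker-side TLS setup in server.properties might look like the sketch below (paths and passwords are placeholders):

    listeners=SSL://0.0.0.0:9093
    security.inter.broker.protocol=SSL
    ssl.keystore.location=/etc/kafka/ssl/broker.keystore.jks
    ssl.keystore.password=changeit
    ssl.truststore.location=/etc/kafka/ssl/broker.truststore.jks
    ssl.truststore.password=changeit
    ssl.client.auth=required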


    Conclusion

    • The best approach is to evaluate the level of Kafka expertise in-house and the consistency of your workload.
    • On one side of the spectrum, with less expertise and uncertain workloads, Confluent’s breadth of features and support comes out ahead.
    • On the other side, with experienced Kafka engineers and a very predictable workload, you can save money with MSK.
    • For specific requirements and full control, you can go with on-premise Kafka.
  • Streamline Kubernetes Storage Upgrades

    Introduction:

    As technology advances, organizations are constantly seeking ways to optimize their IT infrastructure to enhance performance, reduce costs, and gain a competitive edge. One such approach involves migrating from traditional storage solutions to more advanced options that offer superior performance and cost-effectiveness. 

    In this blog post, we’ll explore a recent project (On Azure) where we successfully migrated our client’s applications from Disk type Premium SSD to Premium SSD v2. This migration led to performance improvements and cost savings for our client.

    Prerequisites:

    Before initiating this migration, ensure the following prerequisites are in place:

    1. Kubernetes Cluster: Ensure you have a working K8S cluster to host your applications.
    2. Velero Backup Tool: Install Velero, a widely-used backup and restoration tool tailored for Kubernetes environments.

    Overview of Velero:

    Velero stands out as a powerful tool designed for robust backup, restore, and migration solutions within Kubernetes clusters. It plays a crucial role in ensuring data safety and continuity during complex migration operations.

    Refer to the article on Velero installation and configuration.

    Strategic Plan Overview:

    There are two methods for upgrading storage classes:

    • Migration via Velero and CSI Integration: 

    This approach leverages Velero’s capabilities in conjunction with CSI integration to achieve a seamless and efficient migration.

    • Using Cloud Methods: 

    This method involves leveraging cloud provider-specific procedures. It includes steps like taking a snapshot of the disk, creating a new disk from the snapshot, and then establishing a Kubernetes volume using disk referencing. 

    Step-by-Step Guide:

    Migration via Velero and CSI Integration:

    Step 1: Storage Class for Premium SSD v2

    Define a new storage class that supports Azure Premium SSD v2 disks. This storage class will be used to provision new persistent volumes during the restore process.

    # Example: Azure storage class for Premium SSD v2
    
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: premium-ssd-v2
    parameters:
      cachingMode: None
      skuName: PremiumV2_LRS # (Disk Type)
    provisioner: disk.csi.azure.com
    reclaimPolicy: Delete
    volumeBindingMode: WaitForFirstConsumer
    allowVolumeExpansion: true

    Step 2: Volume Snapshot Class

    Introduce a Volume Snapshot Class to enable snapshot creation for persistent volumes. This class will be utilized for capturing the current state of persistent volumes before restoring them using Premium SSD v2.

    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshotClass
    metadata:
      name: disk-snapshot-class
    driver: disk.csi.azure.com
    deletionPolicy: Delete
    parameters:
      incremental: "false"

    Step 3: Update Velero Deployment and Daemonset

    Enable CSI (Container Storage Interface) support in both the Velero deployment and the node-agent daemonset. This modification allows Velero to interact with the Cloud Disk CSI driver for provisioning and managing persistent volumes. Additionally, configure the Velero client to utilize the CSI plugin, ensuring that Velero utilizes the Cloud Disk CSI driver for backup and restore operations.

    # Enable CSI support on the server side
    
    $ kubectl -n velero edit deployment/velero
    $ kubectl -n velero edit daemonset/node-agent
    
    # Add the --features=EnableCSI flag to the server args in both resources
    
        spec:
          containers:
          - args:
            - server
            - --features=EnableCSI
    
    # Enable the feature on the client side
    
    $ velero client config set features=EnableCSI

    Step 4: Take Velero Backup

    Create a Velero backup of all existing persistent volumes stored on Premium SSD disks. These backups serve as a safety net in case of any unforeseen issues during the migration process. You can use the include and exclude flags with the velero backup commands to control what gets backed up.

    Reference Article : https://velero.io/docs/v1.12/resource-filtering 

    # run the below command for taking backup 
    $ velero backup create backup_name --include-namespaces namespace_name

    Step 5: ConfigMap Deployment 

    Deploy a ConfigMap in the Velero namespace. This ConfigMap defines the mapping between the old storage class (Premium SSD) and the new storage class (Premium SSD v2). During the restore process, Velero will use this mapping to recreate the persistent volumes using the new storage class.

    apiVersion: v1
    data:
      # old storage class name : new storage class name
      managed-premium: premium-ssd-v2
    kind: ConfigMap
    metadata:
      labels:
        velero.io/change-storage-class: RestoreItemAction
        velero.io/plugin-config: ""
      name: storage-class-config
      namespace: velero

    Step 6: Velero Restore Operation

    Initiate the Velero restore process. This will replace the existing persistent volumes with new ones provisioned using Disk Premium SSD v2. The ConfigMap will ensure that the restored persistent volumes utilize the new storage class. 

    Reference article: https://velero.io/docs/v1.12/restore-reference 

    # run the below command for restoring from backups to different namespace 
    $ velero restore create restore-name --from-backup backup-name --namespace-mappings namespace1:namespace2
    # verify the new restored resources in namespace2
    $ kubectl get pvc,pv,pod -n namespace2

    Step 7: Verification & Testing

    Verify that all applications continue to function correctly after the restore process. Check for any performance improvements and cost savings as a result of the migration to Premium SSD v2.
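
    As a quick check, the following command (using the namespace from the restore step) lists the restored PVCs together with their storage class, which should now be premium-ssd-v2:

    $ kubectl get pvc -n namespace2 -o custom-columns=NAME:.metadata.name,STORAGECLASS:.spec.storageClassName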

    Step 8: Post-Migration Cleanup

    Remove any temporary resources created during the migration, such as the volume snapshots and the custom Volume Snapshot Class. Then delete the old persistent volume claims (PVCs) that were associated with the Premium SSD disks; this triggers the automatic deletion of the corresponding persistent volumes (PVs) and the underlying Azure Disk storage.
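
    A minimal cleanup sketch, assuming the resource names used earlier and that the original namespace (namespace1) contains only the old claims; otherwise, delete them by name:

    # Delete the temporary snapshots and the custom snapshot class
    $ kubectl delete volumesnapshot --all -n namespace1
    $ kubectl delete volumesnapshotclass disk-snapshot-class
    
    # Delete the old PVCs; the corresponding PVs and Azure Disks are removed automatically
    $ kubectl delete pvc --all -n namespace1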

    Impact:

    This approach is relatively low-risk because all new objects are created while the originals are retained along with their snapshots. When new pods are scheduled, the new Premium SSD v2 disks are provisioned in the same zone as the node where the pod is being scheduled. Some downtime should be expected while the content of the new disks is restored from the snapshots; its duration depends on the size of the disks being restored.

    Conclusion:

    Migrating from any storage class to a newer, more performant one using Velero can provide significant benefits for your organization, whether you’re upgrading from Premium SSD to Premium SSD v2 or transitioning to a completely different storage provider. By leveraging Velero’s comprehensive backup and restore functionality, you can migrate your applications to the new storage class while maintaining data integrity and application functionality. Organizations that adopt this approach can reap the rewards of enhanced performance, reduced costs, and simplified storage management.

  • Unlocking the Potential of Knowledge Graphs: Exploring Graph Databases

    There is a growing demand for data-driven insights to help businesses make better decisions and stay competitive. To meet this need, organizations are turning to knowledge graphs as a way to access and analyze complex data sets. In this blog post, I will discuss what knowledge graphs are, what graph databases are, how they differ from hierarchical databases, the benefits of graphical representation of data, and more. Lastly, we’ll discuss some of the challenges of graph databases and how they can be overcome.

    What Is a Knowledge Graph?

    A knowledge graph is a visual representation of data or knowledge. In order to make the relationships between various types of facts and data easy to see and understand, facts and data are organized into a graph structure. A knowledge graph typically consists of nodes, which stand in for entities like people or objects, and edges, which stand in for the relationships among these entities.

    Each node in a knowledge graph has characteristics and attributes that describe it. For instance, the node of a person might contain properties like name, age, and occupation. Edges between nodes reveal information about their connections. This makes knowledge graphs a powerful tool for representing and understanding data.

    Benefits of a Knowledge Graph

    There are a number of benefits to using knowledge graphs. 

    • Knowledge graphs (KGs) provide a visual representation of data that can be easily understood, making it easier to quickly identify patterns and correlations.
    • Additionally, knowledge graphs make it simple to locate linked data: we can quickly access a particular node and obtain all of its child information.
    • These graphs are highly scalable, meaning they can support huge volumes of data. This makes them ideal for applications such as artificial intelligence (AI) and machine learning (ML).
    • Finally, knowledge graphs can be used to connect various types of data, including text, images, and videos, in addition to plain text. This makes them a great tool for data mining and analysis.

    What are Graph Databases?

    Graph databases are used to store and manage data in the form of a graph. Unlike traditional databases, they offer a more flexible representation of data using nodes, edges, and properties. Graph databases are designed to support queries that require traversing relationships between different types of data.

    Graph databases are well-suited for applications that require complex data relationships, such as AI and ML. They are also more efficient than traditional databases for queries involving intricate data relationships, as they can traverse relationships directly rather than executing multiple join queries.


    Comparing Graph Databases to Hierarchical Databases

    It is important to understand the differences between graph databases and hierarchical databases. But first, what is a hierarchical database? Hierarchical databases are structured in a tree-like form, with each record in the database linked to one or more other records. This structure makes hierarchical databases ideal for storing data that is organized in a hierarchical manner, such as an organizational chart. However, hierarchical databases are less efficient at handling complex data relationships. For example, suppose we have an organization with a CEO at the top, followed by several vice presidents, each responsible for several managers, who in turn are responsible for teams of employees.

     

    In a hierarchical database, this structure would be represented as a tree, with the CEO at the root, and each level of the organization represented by a different level of the tree. For example:
    
    CEO
    ├── Vice President A
    │   └── Manager A1
    │       └── Employee A1.1
    └── Vice President B
        └── Manager B1
            ├── Employee B1.1
            └── Employee B1.2

    In a graph database, this same structure would be represented as a graph, with each node representing an entity (e.g., a person), and each edge representing a relationship (e.g., reporting to). For example:

    (Vice President A) --reports_to--> (CEO)
    (Vice President B) --reports_to--> (CEO)
    (Vice President A) --manages--> (Manager A1)
    (Vice President B) --manages--> (Manager B1)
    (Manager A1) --manages--> (Employee A1.1)
    (Manager B1) --manages--> (Employee B1.1)
    (Manager B1) --manages--> (Employee B1.2)

    As you can see, in a graph database the relationships between entities are explicit and can be easily queried and traversed. In a hierarchical database, the relationships are implicit and become harder to work with as the hierarchy grows more complex. This is why graph databases are better suited to complex data relationships: they have the flexibility to easily store and query such data.

    Creating a Knowledge Graph from Scratch

    We will now walk through creating a knowledge graph with an example. We’ll use a simple XML file that contains information about some movies, an XSLT stylesheet to transform the XML data into RDF format, and some Python libraries to help us in the overall process.

    Let’s consider an XML file having movie information:

    <movies>
      <movie id="tt0083658">
        <title>Blade Runner</title>
        <year>1982</year>
        <director rid="12341">Ridley Scott</director>
        <genre>Action</genre>
      </movie>
    
      <movie id="tt0087469">
        <title>Top Gun</title>
        <year>1986</year>
        <director rid="65217">Tony Hank</director>
        <genre>Thriller</genre>
      </movie>
    </movies>

    As discussed, to convert this data into a knowledge graph, we will be using an XSL file. A question may arise: what is an XSL file? XSL files are stylesheet documents that are used to transform XML data. To explore more on XSL, visit here, but don’t worry, as we will be starting from scratch.

    Moving ahead, we also need to know that converting any data into graph data requires an ontology; there are many ontologies available, such as OWL-based ontologies or the EBUCore ontology. But what is an ontology? In the context of knowledge graphs, an ontology is a formal specification of the relationships and constraints that exist within a specific domain. It provides a vocabulary and a set of rules for representing and sharing knowledge, allowing machines to reason about the data they are working with. EBUCore is an ontology developed by the European Broadcasting Union (EBU) to provide a standardized metadata model for the broadcasting industry (OTT platforms, media companies, etc.). Further references on EBUCore can be found here.

    We will be using the below XSL for transforming the above XML with movie info.

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                    xmlns:ebucore="http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#"
                    >
        <xsl:template match="movies">
            <rdf:RDF>
                <xsl:apply-templates select="movie"/>
            </rdf:RDF>
        </xsl:template>
    
        <!-- Templates cannot be nested in XSLT 1.0, so each child element
             gets its own top-level template, applied from the movie template. -->
        <xsl:template match="movie">
            <ebucore:Feature>
                <xsl:apply-templates select="title|year|director|genre"/>
            </ebucore:Feature>
        </xsl:template>
    
        <xsl:template match="title">
            <ebucore:title>
                <xsl:value-of select="."/>
            </ebucore:title>
        </xsl:template>
    
        <xsl:template match="year">
            <ebucore:dateBroadcast>
                <xsl:value-of select="."/>
            </ebucore:dateBroadcast>
        </xsl:template>
    
        <xsl:template match="director">
            <ebucore:hasParticipatingAgent>
                <ebucore:Agent>
                    <ebucore:hasRole>Director</ebucore:hasRole>
                    <ebucore:agentName>
                        <xsl:value-of select="."/>
                    </ebucore:agentName>
                </ebucore:Agent>
            </ebucore:hasParticipatingAgent>
        </xsl:template>
    
        <xsl:template match="genre">
            <ebucore:hasGenre>
                <xsl:value-of select="."/>
            </ebucore:hasGenre>
        </xsl:template>
    </xsl:stylesheet>

    To start with the XSL: the first line, <?xml version="1.0"?>, declares the XML version of the document. The second line opens a stylesheet, defining the XSL version we will be using and declaring the XSL, RDF, and EBUCore namespaces. These namespaces are required because we will be using elements from those vocabularies and want to avoid name conflicts in our XML document. The xsl:template match attribute defines which element to match in the XML; since movies is the root element of our XML and we want to match from the start, we begin with xsl:template match="movies".

    After that, we open an RDF tag to start our knowledge graph; this element will contain all the movie details. We use xsl:apply-templates on "movie" because our XML has multiple <movie> elements nested inside the <movies> tag. The template matching movie wraps each movie's contents in an <ebucore:Feature> element, Feature being the EBUCore term for a movie, and applies templates to the movie's children. Separate top-level templates then match details like title, year, director, and genre from the XML and emit the corresponding EBUCore properties: ebucore:title, ebucore:dateBroadcast, ebucore:hasParticipatingAgent, and ebucore:hasGenre, respectively.

    Now that we have the XSL ready, we will need to apply this XSL on our XML and get RDF data out of it by following the below Python code:

    import lxml.etree as ET
    import xml.dom.minidom as xm
    
    movie_data = """
                <movies>
                    <movie id="tt0083658">
                        <title>Blade Runner</title>
                        <year>1982</year>
                        <director rid="12341">Ridley Scott</director>
                        <genre>Action</genre>
                    </movie>
                        <movie id="tt0087469">
                        <title>Top Gun</title>
                        <year>1986</year>
                        <director rid="65217">Tony Hank</director>
                        <genre>Thriller</genre>
                    </movie>
                </movies>
                """
    
    xslt_file = "transform_movies.xsl"
    xslt_root = ET.parse(xslt_file)
    transform = ET.XSLT(xslt_root)
    rss_root = ET.fromstring(movie_data)
    result_tree = transform(rss_root)
    result_string = ET.tostring(result_tree)
    
    # Converting bytes to string and pretty formatting
    dom_transformed = xm.parseString(result_string)
    pretty_xml_as_string = dom_transformed.toprettyxml()
    
    # Saving the output
    with open('Path_to_downloads/output_movie.xml', "w") as f:
        f.write(pretty_xml_as_string)
    # Print Output
    print(pretty_xml_as_string)

    The above code will generate the following output:

    <?xml version="1.0" ?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:ebucore="http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#">
    
    <ebucore:Feature>
       <ebucore:title>Blade Runner</ebucore:title>
       <ebucore:dateBroadcast>1982</ebucore:dateBroadcast>
       <ebucore:hasParticipatingAgent>
           <ebucore:Agent>
               <ebucore:hasRole>Director</ebucore:hasRole>
               <ebucore:agentName>Ridley Scott</ebucore:agentName>
           </ebucore:Agent>
       </ebucore:hasParticipatingAgent>
       <ebucore:hasGenre>Action</ebucore:hasGenre>
    </ebucore:Feature>
    
    <ebucore:Feature>
       <ebucore:title>Top Gun</ebucore:title>
       <ebucore:dateBroadcast>1986</ebucore:dateBroadcast>
       <ebucore:hasParticipatingAgent>
           <ebucore:Agent>
               <ebucore:hasRole>Director</ebucore:hasRole>
               <ebucore:agentName>Tony Hank</ebucore:agentName>
           </ebucore:Agent>
       </ebucore:hasParticipatingAgent>
       <ebucore:hasGenre>Thriller</ebucore:hasGenre>
    </ebucore:Feature>
    
    </rdf:RDF>

    This output is RDF/XML, which we will now convert to a graph and visualize using the following code:

    Note: Install the following libraries before proceeding (rendering the graph also requires the Graphviz system binaries to be installed).

    pip install graphviz rdflib

    from rdflib import Graph, Namespace
    from rdflib.tools.rdf2dot import rdf2dot
    from graphviz import render
    
    # Creating an empty graph object and parsing the RDF/XML generated above
    graph = Graph()
    graph.parse(data=result_string, format="xml")
    # Saving the graph as ttl (Terse RDF Triple Language) in the movies.ttl file
    graph.serialize(destination="Downloads/movies.ttl", format="ttl")
    
    # Steps to visualize the generated graph
    # Define a namespace for the RDF data
    ns = Namespace("http://example.com/movies#")
    graph.bind("ex", ns)
    
    # Serialize the RDF data to a DOT file
    with open("Downloads/movies.dot", "w") as dot_file:
        rdf2dot(graph, dot_file, opts={"label": "Movies Graph", "rankdir": "LR"})
    
    # Render the DOT file to a PNG image
    render("dot", "png", "Downloads/movies.dot")

    Finally, the above code will yield a movies.dot.png file in the Downloads folder location, which will look something like this:

    The rendered image clearly represents the nodes, the edges connecting them, and all the associated information in a well-formatted way.

    Examples of Knowledge Graphs

    Now that we have knowledge of how we can create a knowledge graph, let’s explore the big players that are using such graphs for their operations.

    Google Knowledge Graph: This is one of the most well-known examples of a knowledge graph. It is used by Google to enhance its search results with additional information about entities, such as people, places, and things. For example, if you search for “Barack Obama,” the Knowledge Graph will display a panel with information about his birthdate, family members, education, career, and more. All this information is stored in the form of nodes and edges making it easier for the Google search engine to retrieve related information of any topic.

    DBpedia: This is a community-driven project that extracts structured data from Wikipedia and makes it available as a linked data resource. It is primarily used for graph analysis and executing SPARQL queries. It contains information on millions of entities, such as people, places, and things, and their relationships with one another. DBpedia can be used to power applications like question-answering systems, recommendation engines, and more. One of the key advantages of DBpedia is that it is an open and community-driven project, which means that anyone can contribute to it and use it for their own applications. This has led to a wide variety of applications built on top of DBpedia, from academic research to commercial products.

    Having discussed these examples of knowledge graphs, one should know that such systems commonly use SPARQL queries to retrieve data from their huge corpora of graphs. So, let’s write one such query against the knowledge graph we created for the movie data: we will retrieve every movie’s genre along with its title.

    from rdflib.plugins.sparql import prepareQuery
    
    # Define the SPARQL query
    query = prepareQuery('''
        PREFIX ebucore: <http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#>
        SELECT ?genre ?title
        WHERE {
          ?movie a ebucore:Feature ;
                 ebucore:hasGenre ?genre ;
                 ebucore:title ?title .
        }
    ''')
    
    
    # Execute the query and print the results
    results = graph.query(query)
    for row in results:
        genre, title = row
        print(f"Movie Genre: {genre}, Movie Title: {title}")

    Challenges with Graph Databases:

    Data Complexity: One of the primary challenges with graph databases and knowledge graphs is data complexity. As the size and complexity of the data increase, it can become challenging to manage and query the data efficiently.

    Data Integration: Graph databases and knowledge graphs often need to integrate data from different sources, which can be challenging due to differences in data format, schema, and structure.

    Query Performance: Knowledge graphs are often used for complex queries, which can be slow to execute, especially for large datasets.

    Knowledge Representation: Representing knowledge in a graph database or knowledge graph can be challenging due to the diversity of concepts and relationships that need to be modeled accurately. One should have experience with ontologies, relationships, and business use cases to curate an accurate representation.

    Bonus: How to Overcome These Challenges:

    • Use efficient indexing and query optimization techniques to handle data complexity and improve query performance.
    • Use data integration tools and techniques to standardize data formats and structures to improve data integration.
    • Use distributed computing and partitioning techniques to scale the database horizontally.
    • Use caching and precomputing techniques to speed up queries.
    • Use ontology modeling and semantic reasoning techniques to accurately represent knowledge and relationships in the graph database or knowledge graph.

    Conclusion

    In conclusion, graph databases and knowledge graphs are powerful tools that offer several advantages over traditional relational databases. They enable flexible modeling of complex data and relationships, which can be difficult to achieve using a traditional tabular structure. Moreover, they enhance query performance for complex queries and enable new use cases such as recommendation engines, fraud detection, and knowledge management.

    Despite the aforementioned challenges, graph databases and knowledge graphs are gaining popularity in various industries, ranging from finance to healthcare, and are expected to continue playing a significant role in the future of data management and analysis.

  • Vector Search: The New Frontier in Personalized Recommendations

    Introduction

    Imagine you are a modern-day treasure hunter, not in search of hidden gold, but rather the wealth of knowledge and entertainment hidden within the vast digital ocean of content. In this realm, where every conceivable topic has its own sea of content, discovering what will truly captivate you is like finding a needle in an expansive haystack.

    This challenge leads us to the marvels of recommendation services, acting as your compass in this digital expanse. These services are the unsung heroes behind the scenes of your favorite platforms, from e-commerce sites that suggest enticing products to streaming services that understand your movie preferences better than you might yourself. They sift through immense datasets of user interactions and content features, striving to tailor your online experience to be more personalized, engaging, and enriching.

    But what if I told you that there is a cutting-edge technology that can take personalized recommendations to the next level? Today, I will take you through a journey to build a blog recommendation service that understands the contextual similarities between different pieces of content, transcending beyond basic keyword matching. We’ll harness the power of vector search, a technology that’s revolutionizing personalized recommendations. We’ll explore how recommendation services are traditionally implemented, and then briefly discuss how vector search enhances them.

    Finally, we’ll put this knowledge to work, using OpenAI’s embedding API and Elasticsearch to create a recommendation service that not only finds content but also understands and aligns it with your unique interests.

    Exploring the Landscape: Traditional Recommendation Systems and Their Limits

    Traditionally, these digital compasses, or recommendation systems, employ methods like collaborative and content-based filtering. Imagine sitting in a café where the barista suggests a coffee based on what others with similar tastes enjoyed (collaborative filtering) or based on your past coffee choices (content-based filtering). While these methods have been effective in many scenarios, they come with some limitations. They often stumble when faced with the vast and unstructured wilderness of web data, struggling to make sense of the diverse and ever-expanding content landscape. Additionally, when user preferences are ambiguous or when you want to recommend content by truly understanding it on a semantic level, traditional methods may fall short.

    Enhancing Recommendation with Vector Search and Vector Databases

    Our journey now takes an exciting turn with vector search and vector databases, the modern tools that help us navigate this unstructured data. These technologies transform our café into a futuristic spot where your coffee preference is understood on a deeper, more nuanced level.

    Vector Search: The Art of Finding Similarities

    Vector search operates like a seasoned traveler who understands the essence of every place visited. Text, images, or sounds can be transformed into numerical vectors, like unique coordinates on a map. The magic happens when these vectors are compared, revealing hidden similarities and connections, much like discovering that two seemingly different cities share a similar vibe.

    Vector Databases: Navigating Complex Data Landscapes

    Imagine a vast library of books where each book captures different aspects of a place along with its coordinates. Vector databases are akin to this library, designed to store and navigate these complex data points. They easily handle intricate queries over large datasets, making them perfect for our recommendation service, ensuring no blog worth reading remains undiscovered.

    Embeddings: Semantic Representation

    In our journey, embeddings are akin to a skilled artist who captures not just the visuals but the soul of a landscape. They map items like words or entire documents into real-number vectors, encapsulating their deeper meaning. This helps in understanding and comparing different pieces of content on a semantic level, letting the recommendation service show you things that really match your interests.

    Sample Project: Blog Recommendation Service

    Project Overview

    Now, let’s craft a simple blog recommendation service using OpenAI’s embedding APIs and Elasticsearch as a vector database. The goal is to recommend blogs similar to the current one the user is reading, which can be shown in the read more or recommendation section.

    Our blog service will be responsible for indexing blogs, finding similar ones, and interacting with the UI Service.

    Tools and Setup

    We will need the following tools to build our service:

    • OpenAI Account: We will be using OpenAI’s embedding API to generate the embeddings for our blog content. You will need an OpenAI account to use the APIs. Once you have created an account, please create an API key and store it in a secure location.
    • Elasticsearch: A popular database renowned for its full-text search capabilities, which can also be used as a vector database, adept at storing and querying complex embeddings with its dense_vector field type.
    • Docker: A tool that allows developers to package their applications and all the necessary dependencies into containers, ensuring that the application runs smoothly and consistently across different computing environments.
    • Python: A versatile programming language for developers across diverse fields, from web development to data science.

    The APIs will be created using the FastAPI framework, but you can choose any framework.

    Steps

    First, we’ll create a BlogItem class to represent each blog. It has only three fields, which will be enough for this demonstration, but real-world entities would have more details to accommodate a wider range of properties and functionalities.

    class BlogItem:
        blog_id: int
        title: str
        content: str
    
        def __init__(self, blog_id: int, title: str, content: str):
            self.blog_id = blog_id
            self.title = title
            self.content = content

    Elasticsearch Setup:

    • To store the blog data along with its embedding in Elasticsearch, we need to set up a local Elasticsearch cluster and then create an index for our blogs. You can also use a cloud-based version if you have already procured one for personal use.
    • Install Docker or Docker Desktop on your machine and create Elasticsearch and Kibana docker containers using the below docker compose file. Run the following command to create and start the services in the background:
    
    docker compose -f /path/to/your/docker-compose/file up -d
    
    • You can exclude the file path if you are in the same directory as your docker-compose.yml file. The advantage of using docker compose is that it allows you to clean up these resources with just one command:
    
    docker compose -f /path/to/your/docker-compose/file down
    version: '3'
    
    services:
    
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:<version>
        container_name: elasticsearch
        environment:
          - node.name=docker-cluster
          - discovery.type=single-node
          - cluster.routing.allocation.disk.threshold_enabled=false
          - bootstrap.memory_lock=true
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
          - xpack.security.enabled=true
          - ELASTIC_PASSWORD=YourElasticPassword
          - "ELASTICSEARCH_USERNAME=elastic"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - esdata:/usr/share/elasticsearch/data
        ports:
          - "9200:9200"
        networks:
          - esnet
    
      kibana:
        image: docker.elastic.co/kibana/kibana:<version>
        container_name: kibana
        environment:
          ELASTICSEARCH_HOSTS: http://elasticsearch:9200
          ELASTICSEARCH_USERNAME: elastic
          ELASTICSEARCH_PASSWORD: YourElasticPassword
        ports:
          - "5601:5601"
        networks:
          - esnet
        depends_on:
          - elasticsearch
    
    networks:
      esnet:
        driver: bridge
    
    volumes:
      esdata:
        driver: local

    • Connect to the local ES instance and create an index. Our “blogs” index will have a unique blog ID, blog title, blog content, and an embedding field to store the vector representation of the blog content. The text-embedding-ada-002 model we use here produces vectors with 1536 dimensions; hence, it’s important to use the same dimension count for the embedding field in the blogs index.
    import logging
    from elasticsearch import Elasticsearch
    
    # Setting up logging
    logging.basicConfig(level=logging.INFO, format='%(asctime)s [%(levelname)s] %(filename)s:%(lineno)d - %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    # Elasticsearch client setup
    es = Elasticsearch(hosts="http://localhost:9200",
                       basic_auth=("elastic", "YourElasticPassword"))
    
    
    # Create an index with a mapping for embeddings
    def create_index(index_name: str, embedding_dimensions: int):
        try:
            es.indices.create(
                index=index_name,
                body={
                    'mappings': {
                        'properties': {
                            'blog_id': {'type': 'long'},
                            'title': {'type': 'text'},
                            'content': {'type': 'text'},
                            'embedding': {'type': 'dense_vector', 'dims': embedding_dimensions}
                        }
                    }
                }
            )
        except Exception as e:
            logging.error(f"An error occurred while creating index {index_name} : {e}")
    
    
    # Sample usage
    create_index("blogs", 1536)

    Create Embeddings and Index Blogs:

    • We use OpenAI’s Embedding API to get a vector representation of our blog title and content. I am using the 002 model here, which is recommended by OpenAI for most use cases. The input to text-embedding-ada-002 must not exceed 8,191 tokens (1,000 tokens are roughly equal to 750 words) and cannot be empty.
    from openai import OpenAI, OpenAIError
    
    api_key = 'YourApiKey'
    client = OpenAI(api_key=api_key)
    
    def create_embeddings(text: str, model: str = "text-embedding-ada-002") -> list[float]:
        try:
            # Replace newlines, which can negatively affect embedding quality
            text = text.replace("\n", " ")
            response = client.embeddings.create(input=[text], model=model)
            logging.info("Embedding created successfully")
            return response.data[0].embedding
        except OpenAIError as e:
            logging.error(f"An OpenAI error occurred while creating embedding: {e}")
            raise
        except Exception as e:
            logging.exception(f"An unexpected error occurred while creating embedding: {e}")
            raise

    • When a blog gets created or its content gets updated, we will call the create_embeddings function to get the text embedding and store it in our blogs index.
    # Define the index name as a global constant
    
    ELASTICSEARCH_INDEX = 'blogs'
    
    def index_blog(blog_item: BlogItem):
        try:
            es.index(index=ELASTICSEARCH_INDEX, body={
                'blog_id': blog_item.blog_id,
                'title': blog_item.title,
                'content': blog_item.content,
                'embedding': create_embeddings(blog_item.title + "\n" + blog_item.content)
            })
        except Exception as e:
            logging.error(f"Failed to index blog with blog id {blog_item.blog_id}: {e}")
            raise

    • Create a Pydantic model for the request body:
    from pydantic import BaseModel
    
    class BlogItemRequest(BaseModel):
        title: str
        content: str

    • Create an API to save blogs to Elasticsearch. The UI Service would call this API when a new blog post gets created.
    @app.post("/blogs/")
    def save_blog(response: Response, blog_item: BlogItemRequest) -> dict[str, str]:
        # Create a BlogItem instance from the request data
        try:
            blog_id = get_blog_id()
            blog_item_obj = BlogItem(
                blog_id=blog_id,
                title=blog_item.title,
                content=blog_item.content,
            )
    
            # Call the index_blog method to index the blog
            index_blog(blog_item_obj)
            return {"message": "Blog indexed successfully", "blog_id": str(blog_id)}
        except Exception as e:
            logging.error(f"An error occurred while indexing the blog: {e}")
            response.status_code = status.HTTP_500_INTERNAL_SERVER_ERROR
            return {"error": "Failed to index blog"}

    Finding Relevant Blogs:

    • To find blogs that are similar to the current one, we will compare the current blog’s vector representation with other blogs present in the Elasticsearch index using the cosine similarity function.
    • Cosine similarity is a mathematical measure used to determine the cosine of the angle between two vectors in a multi-dimensional space, often employed to assess the similarity between two documents or data points. 
    • The cosine similarity score ranges from -1 to 1; as the score increases, it indicates a greater degree of similarity between the vectors. A short pure-Python sketch of the measure follows the exception class below.
    • Create a custom exception to handle a scenario when a blog for a given ID is not present in Elasticsearch.
    class BlogNotFoundException(Exception):
        def __init__(self, message="Blog not found"):
            self.message = message
            super().__init__(self.message)
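
    To make the cosine similarity measure above concrete, here is a minimal pure-Python sketch (the toy 3-dimensional vectors are illustrative; the real embeddings have 1536 dimensions):

    import math
    
    def cosine_similarity(a: list[float], b: list[float]) -> float:
        # Dot product of the two vectors
        dot = sum(x * y for x, y in zip(a, b))
        # Euclidean norms (vector lengths)
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)
    
    print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # ~1.0 (same direction)
    print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0 (orthogonal)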

    • First, we check whether the current blog is present in the blogs index and fetch its embedding. This is done to prevent unnecessary calls to the OpenAI API, as it consumes tokens. Then, we construct an Elasticsearch DSL query to find the nearest neighbors and return their blog content.
    def get_blog_embedding(blog_id: int) -> Optional[Dict]:
        try:
            response = es.search(index=ELASTICSEARCH_INDEX, body={
                'query': {
                    'term': {
                        'blog_id': blog_id
                    }
                },
                '_source': ['title', 'content', 'embedding']  # Fetch title, content and embedding
            })
    
            if response['hits']['hits']:
                logging.info(f"Blog found with blog_id {blog_id}")
                return response['hits']['hits'][0]['_source']
            else:
                logging.info(f"No blog found with blog_id {blog_id}")
                return None
        except Exception as e:
            logging.error(f"Error occurred while searching for blog with blog_id {blog_id}: {e}")
            raise
    
    
    def find_similar_blog(current_blog_id: int, num_neighbors=2) -> list[dict[str, str]]:
        try:
            blog_data = get_blog_embedding(current_blog_id)
            if not blog_data:
                raise BlogNotFoundException(f"Blog not found for id:{current_blog_id}")
            blog_embedding = blog_data['embedding']
            if not blog_embedding:
            blog_embedding = create_embeddings(blog_data['title'] + '\n' + blog_data['content'])
            # Find similar blogs using the embedding
            response = es.search(index=ELASTICSEARCH_INDEX, body={
                'size': num_neighbors + 1,  # Retrieve extra result as we'll exclude the current blog
                '_source': ['title', 'content', 'blog_id', '_score'],
                'query': {
                    'bool': {
                        'must': {
                            'script_score': {
                                'query': {'match_all': {}},
                                'script': {
                                    'source': "cosineSimilarity(params.query_vector, 'embedding')",
                                    'params': {'query_vector': blog_embedding}
                                }
                            }
                        },
                        'must_not': {
                            'term': {
                                'blog_id': current_blog_id  # Exclude the current blog
                            }
                        }
                    }
                }
            })
    
            # Extract and return the hits
            hits = [
                {
                    'title': hit['_source']['title'],
                    'content': hit['_source']['content'],
                    'blog_id': hit['_source']['blog_id'],
                    'score': f"{hit['_score'] * 100:.2f}%"
                }
                for hit in response['hits']['hits']
                if hit['_source']['blog_id'] != current_blog_id
            ]
    
            return hits
        except Exception as e:
            logging.error(f"An error while finding similar blogs: {e}")
            raise

    • Define a Pydantic model for the response:
    class BlogRecommendation(BaseModel):
        blog_id: int
        title: str
        content: str
        score: str

    • Create an API that the UI Service would use to find blogs similar to the one the user is currently reading:
    @app.get("/recommend-blogs/{current_blog_id}")
    def recommend_blogs(
            response: Response,
            current_blog_id: int,
            num_neighbors: Optional[int] = 2) -> Union[Dict[str, str], List[BlogRecommendation]]:
        try:
            # Call the find_similar_blog function to get recommended blogs
            recommended_blogs = find_similar_blog(current_blog_id, num_neighbors)
            return recommended_blogs
        except BlogNotFoundException as e:
            response.status_code = status.HTTP_400_BAD_REQUEST
            return {"error": f"Blog not found for id:{current_blog_id}"}
        except Exception as e:
            response.status_code = status.HTTP_500_INTERNAL_SERVER_ERROR
            return {"error": "Unable to process the request"}

    • The below flow diagram summarizes all the steps we have discussed so far:

    Testing the Recommendation Service

    • Ideally, we would receive the blog ID from the UI Service and pass the recommendations back, but for illustration purposes, we’ll call the recommend-blogs API with some inputs from my test dataset. The blogs in this sample dataset have concise titles and content, which is sufficient for testing, but real-world blogs will be much more detailed and carry a significant amount of data. The test dataset has around 1,000 blogs across various categories like healthcare, tech, travel, and entertainment.
    • A sample from the test dataset:

     

    • Test Result 1: Medical Research Blog

      Input Blog: Blog_Id: 1, Title: Breakthrough in Heart Disease Treatment, Content: Researchers have developed a new treatment for heart disease that promises to be more effective and less invasive. This breakthrough could save millions of lives every year.


    • Test Result 2: Travel Blog

      Input Blog: Blog_Id: 4, Title: Travel Tips for Sustainable Tourism, Content: How to travel responsibly and sustainably.


    I manually tested multiple blogs from the test dataset of 1,000 blogs, representing distinct topics and content, and assessed the quality and relevance of the recommendations. The recommended blogs had scores in the range of 87% to 95%, and upon examination, the blogs often appeared very similar in content and style.

    Based on the test results, it’s evident that utilizing vector search enables us to effectively recommend blogs to users that are semantically similar. This approach ensures that the recommendations are contextually relevant, even when the blogs don’t share identical keywords, enhancing the user’s experience by connecting them with content that aligns more closely with their interests and search intent.

    Limitations

    This approach for finding similar blogs is good enough for our simple recommendation service, but it might have certain limitations in real-world applications.

    • Our similarity search returns the nearest k neighbors as recommendations, but there might be scenarios where no similar blog exists or where the neighbors have significantly different scores. To deal with this, you can set a threshold to filter out recommendations below a certain score. Experiment with different threshold values and observe their impact on recommendation quality; a minimal filtering sketch follows this list.
    • If your use case involves a small dataset and the relationships between user preferences and item features are straightforward and well-defined, traditional methods like content-based or collaborative filtering might be more efficient and effective than vector search.
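
    As mentioned above, a minimal post-filtering sketch over the hits returned by find_similar_blog (the 0.90 threshold is an assumed starting point to tune against your own data):

    MIN_SCORE = 0.90  # assumed threshold; tune empirically
    
    def filter_recommendations(hits: list[dict]) -> list[dict]:
        filtered = []
        for hit in hits:
            # 'score' was formatted earlier as a percentage string such as "92.41%"
            score = float(hit['score'].rstrip('%')) / 100
            if score >= MIN_SCORE:
                filtered.append(hit)
        return filtered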

    Further Improvements

    • Using LLM for Content Validation: Implement a verification step using large language models (LLMs) to assess the relevance and validity of recommended content. This approach can ensure that the suggestions are not only similar in context but also meaningful and appropriate for your audience.
    • Metadata-based Embeddings: Instead of generating embeddings from the entire blog content, utilize LLMs to extract key metadata such as themes, intent, tone, or key points. Create embeddings based on this extracted metadata, which can lead to more efficient and targeted recommendations, focusing on the core essence of the content rather than its entirety.

    Conclusion

    Our journey concludes here, but yours is just beginning. Armed with the knowledge of vector search, vector databases, and embeddings, you’re now ready to build a recommendation service that doesn’t just guide users to content but connects them to the stories, insights, and experiences they seek. It’s not just about building a service; it’s about enriching the digital exploration experience, one recommendation at a time.

  • Mage: Your New Go-To Tool for Data Orchestration

    In our journey to automate data pipelines, we’ve used tools like Apache Airflow, Dagster, and Prefect to manage complex workflows. However, as data automation continues to change, we’ve added a new tool to our toolkit: Mage AI.

    Mage AI isn’t just another tool; it’s a solution to the evolving demands of data automation. This blog aims to explain how Mage AI is changing the way we automate data pipelines by addressing challenges and introducing innovative features. Let’s explore this evolution, understand the problems we face, and see why we’ve adopted Mage AI.

    What is Mage AI?

    Mage is a user-friendly open-source framework created for transforming and merging data. It’s a valuable tool for developers handling substantial data volumes efficiently. At its heart, Mage relies on “data pipelines” made up of code blocks. These blocks can run independently or as part of a larger pipeline, and together they form a structure known as a directed acyclic graph (DAG), which manages dependencies. For example, you can use Mage for tasks like loading data, transforming it, or exporting it.

    Mage Architecture:

    Before we delve into Mage’s features, let’s take a look at how it works.

    When you use Mage, your request begins its journey in the Mage Server Container, which serves as the central hub for handling requests, processing data, and validation. Here, tasks like data processing and real-time interactions occur. The Scheduler Process ensures tasks are scheduled with precision, while Executor Containers, designed for specific tasks like Python or AWS, carry out the instructions.

    Mage’s scalability is impressive, allowing it to handle growing workloads effectively. It can expand both vertically and horizontally to maintain top-notch performance. Mage efficiently manages data, including code, data, and logs, and takes security seriously when handling databases and sensitive information. This well-coordinated system, combined with Mage’s scalability, guarantees reliable data pipelines, blending technical precision with seamless orchestration.

    Scaling Mage:

    To enhance Mage’s performance and reliability as your workload expands, it’s crucial to scale its architecture effectively. In this concise guide, we’ll concentrate on four key strategies for optimizing Mage’s scalability:

    1. Horizontal Scaling: Ensure responsiveness by running multiple Mage Server and Scheduler instances. This approach keeps the system running smoothly, even during peak usage.
    2. Multiple Executor Containers: Deploy several Executor Containers to handle concurrent task execution. Customize them for specific executors (e.g., Python, PySpark, or AWS) to scale task processing horizontally as needed.
    3. External Load Balancers: Utilize external load balancers to distribute client requests across Mage instances. This not only boosts performance but also ensures high availability by preventing overloading of a single server.
    4. Scaling for Larger Datasets: To efficiently handle larger datasets, consider:

    a. Allocating more resources to executors, empowering them to tackle complex data transformations.

    b. Using Mage’s direct data warehouse transformation and native Spark integration for massive datasets.

    Features: 

    1) Interactive Coding Experience

    Mage offers an interactive coding experience tailored for data preparation. Each block in the editor is a modular file that can be tested, reused, and chained together to create an executable data pipeline. This means you can build your data pipeline piece by piece, ensuring reliability and efficiency.

    2) UI/IDE for Building and Managing Data Pipelines

    Mage takes data pipeline development to the next level with a user-friendly integrated development environment (IDE). You can build and manage your data pipelines through an intuitive user interface, making the process efficient and accessible to both data scientists and engineers.

    3) Multiple Languages Support

    Mage supports writing pipelines in multiple languages such as Python, SQL, and R. This language versatility means you can work with the languages you’re most comfortable with, making your data preparation process more efficient.

    4) Multiple Types of Pipelines

    Mage caters to diverse data pipeline needs. Whether you require standard batch pipelines, data integration pipelines, streaming pipelines, Spark pipelines, or DBT pipelines, Mage has you covered.

    5) Built-In Engineering Best Practices

    Mage is not just a tool; it’s a promoter of good coding practices. It enables reusable code, data validation in each block, and operationalizes data pipelines with built-in observability, data quality monitoring, and lineage. This ensures that your data pipelines are not only efficient but also maintainable and reliable.

    6) Dynamic Blocks

    Dynamic blocks in Mage allow the output of a block to dynamically create additional blocks. These blocks are spawned at runtime, and the total number created equals the number of items in the dynamic block’s output data multiplied by the number of its downstream blocks. For example, a dynamic block whose output contains 3 items and which has 2 downstream blocks spawns 6 blocks at runtime.

    7) Triggers

    • Schedule Triggers: These triggers allow you to set specific start dates and intervals for pipeline runs. Choose from daily, weekly, or monthly, or even define custom schedules using Cron syntax. Mage’s Schedule Triggers put you in control of when your pipelines execute.
    • Event Triggers: With Event Triggers, your pipelines respond instantly to specific events, such as the completion of a database query or the creation of a new object in cloud storage services like Amazon S3 or Google Storage. Real-time automation at your fingertips.
    • API Triggers: API Triggers enable your pipelines to run in response to specific API calls. Whether it’s customer requests or external system interactions, these triggers ensure your data workflows stay synchronized with the digital world.

    Different types of Block: 

    Data Loader: Within Mage, Data Loaders are ready-made templates designed to seamlessly link up with a multitude of data sources, spanning from Postgres, BigQuery, Redshift, and S3 to various others. Additionally, Mage allows for the creation of custom data loaders, enabling connections to APIs. The primary role of Data Loaders is to facilitate the retrieval of data from these designated sources.

    Data Transformer: Much like Data Loaders, Data Transformers provide predefined functions such as handling duplicates, managing missing data, and excluding specific columns. Alternatively, you can craft your own data transformations or merge outputs from multiple data loaders to preprocess and sanitize the data before it advances through the pipeline.

    Data Exporter: Data Exporters within Mage empower you to dispatch data to a diverse array of destinations, including databases, data lakes, data warehouses, or local storage. You can opt for predefined export templates or craft custom exporters tailored to your precise requirements.

    Custom Blocks: Custom blocks in the Mage framework are incredibly flexible and serve various purposes. They can store configuration data and facilitate its transmission across different pipeline stages. Additionally, they prove invaluable for logging purposes, allowing you to categorize and visually distinguish log entries for enhanced organization.

    Sensor: A Sensor, a specialized block within Mage, continuously assesses a condition until it’s met or until a specified time duration has passed. When a block depends on a sensor, it remains inactive until the sensor confirms that its condition has been satisfied. Sensors are especially valuable when there’s a need to wait for external dependencies or handle delayed data before proceeding with downstream tasks.

    Getting Started with Mage

    There are two ways to run Mage: using Docker or using pip.
    Docker Command

    Create a new working directory where all the mage files will be stored.

    Then, in that working directory, execute this command:

    Windows CMD: 

    docker run -it -p 6789:6789 -v %cd%:/home/src mageai/mageai /app/run_app.sh mage start [project_name]

    Linux CMD:

    docker run -it -p 6789:6789 -v $(pwd):/home/src mageai/mageai /app/run_app.sh mage start [project_name]

    Using Pip (Working directory):
    
    pip install mage-ai
    
    mage start [project_name]

    You can browse to http://localhost:6789/overview to get to the Mage UI.

    Let’s build our first pipeline to load invoice CSV data, do some useful transformations, and export that data to our local database.

    The dataset is an invoices CSV file stored in the current directory, with the following columns:
    
    (1) First Name; (2) Last Name; (3) E-mail; (4) Product ID; (5) Quantity; (6) Amount; (7) Invoice Date; (8) Address; (9) City; (10) Stock Code

    Create a new pipeline and select a standard batch (we’ll be implementing a batch pipeline) from the dashboard and give it a unique ID.

    Project structure:

    ├── mage_data
    └── [project_name]
        ├── charts
        ├── custom
        ├── data_exporters
        ├── data_loaders
        ├── dbt
        ├── extensions
        ├── pipelines
        │   └── [pipeline_name]
        │       ├── __init__.py
        │       └── metadata.yaml
        ├── scratchpads
        ├── transformers
        ├── utils
        ├── __init__.py
        ├── io_config.yaml
        ├── metadata.yaml
        └── requirements.txt

    This pipeline consists of all the block files, including the data loader, transformer, and charts, plus the pipeline’s configuration files, io_config.yaml and metadata.yaml. Each block file contains an inbuilt, decorator-annotated function where we will write our code.

    1. We begin by loading a CSV file from our local directory, specifically located at /home/src/invoice.csv. To achieve this, we select the “Local File” option from the Templates dropdown and configure the Data Loader block accordingly. Running this configuration will allow us to confirm if the CSV file loads successfully.
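
    For reference, a Data Loader block generated from the Local File template looks roughly like the sketch below (modeled on Mage's stock template; the path matches the file mounted earlier):

    import pandas as pd
    
    if 'data_loader' not in globals():
        from mage_ai.data_preparation.decorators import data_loader
    
    
    @data_loader
    def load_data_from_file(*args, **kwargs):
        # Load the invoices CSV mounted into the container's working directory
        filepath = '/home/src/invoice.csv'
        return pd.read_csv(filepath)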

    2. In the next step, we introduce a Transformer block using a generic template. On the right side of the user interface, we can observe the directed acyclic graph (DAG) tree. To establish the data flow, we set the Data Loader block as the parent of the Transformer block, either by editing the block’s upstream settings or by connecting the two in the DAG view.

    The Transformer block operates on the data frame received from the upstream Data Loader block, which is passed as the first argument to the Transformer function.

    3. Our final step involves exporting the DataFrame to a locally hosted PostgreSQL database. We add a Data Exporter block and connect it to the Transformer block.

    To establish a connection with the PostgreSQL database, configure the database credentials in the io_config.yaml file. Alternatively, these credentials can be supplied through environment variables.

    With these steps completed, we have successfully constructed a foundational batch pipeline. This pipeline efficiently loads, transforms, and exports data, serving as a fundamental building block for more advanced data processing tasks.
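
    Put together, the three blocks might look roughly like the sketch below. In a real project each block lives in its own generated file; the import guards mirror Mage’s scaffolding, while the table name, cleanup rule, and config profile are assumptions (I/O import paths can also vary between Mage versions):

    import pandas as pd
    from pandas import DataFrame
    from os import path

    if 'data_loader' not in globals():
        from mage_ai.data_preparation.decorators import data_loader
    if 'transformer' not in globals():
        from mage_ai.data_preparation.decorators import transformer
    if 'data_exporter' not in globals():
        from mage_ai.data_preparation.decorators import data_exporter


    @data_loader
    def load_invoices(*args, **kwargs) -> DataFrame:
        # Step 1: read the CSV mounted into the container.
        return pd.read_csv('/home/src/invoice.csv')


    @transformer
    def clean_invoices(df: DataFrame, *args, **kwargs) -> DataFrame:
        # Step 2: the upstream block's output arrives as the first argument.
        df = df.drop_duplicates()
        df = df.dropna(subset=['Amount'])  # Assumed cleanup rule for the demo data.
        return df


    @data_exporter
    def export_invoices(df: DataFrame, **kwargs) -> None:
        # Step 3: write to Postgres using credentials from io_config.yaml.
        from mage_ai.io.config import ConfigFileLoader
        from mage_ai.io.postgres import Postgres
        from mage_ai.settings.repo import get_repo_path

        config = ConfigFileLoader(path.join(get_repo_path(), 'io_config.yaml'), 'default')
        with Postgres.with_config(config) as loader:
            loader.export(df, 'public', 'invoices', index=False, if_exists='replace')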

    Mage vs. Other Tools:

    Consistency Across Environments: Some orchestration tools may exhibit inconsistencies between local development and production environments due to varying configurations. Mage tackles this challenge by providing a consistent and reproducible workflow environment through a single configuration file that can be executed uniformly across different environments.

    Reusability: Achieving reusability in workflows can be complex in some tools. Mage simplifies this by allowing tasks and workflows to be defined as reusable components within a Mage project, making it effortless to share them across projects and teams.

    Data Passing: Efficiently passing data between tasks can be a challenge in certain tools, especially when dealing with large datasets. Mage streamlines data passing through straightforward function arguments and returns, enabling seamless data flow and versatile data handling.

    Testing: Some tools lack user-friendly testing utilities, resulting in manual testing and potential coverage gaps. Mage simplifies testing with a robust framework that lets you define test cases, inputs, and expected outputs directly within the block files.

    Debugging: Debugging failed tasks can be time-consuming with certain tools. Mage enhances debugging with detailed logs and error messages, offering clear insights into the causes of failures and expediting issue resolution.

    Conclusion: 

    Mage offers a streamlined and user-friendly approach to data pipeline orchestration, addressing common challenges with simplicity and efficiency. Its single-container deployment, visual interface, and robust features make it a valuable tool for data professionals seeking an intuitive and consistent solution for managing data workflows.

  • Discover App Features with TipKit

    In today’s digital age, the user experience is paramount. Mobile applications need to be intuitive and user-friendly so that users not only enjoy the app’s main functionalities but also easily navigate through its features. There are instances where a little extra guidance can go a long way, whether it’s introducing users to a fresh feature, showing them shortcuts to complete tasks more efficiently, or simply offering tips on getting the most out of an app. Many developers have traditionally crafted custom overlays or tooltips to bridge this gap, often requiring a considerable amount of effort. But the wait for a streamlined solution is over. After much anticipation, Apple has introduced the TipKit framework, a dedicated tool to simplify this endeavor, enhancing user experience with finesse.

    TipKit

    Introduced at WWDC 2023, TipKit emerges as a beacon for app developers aiming to enhance user engagement and experience. This framework is ingeniously crafted to present mini-tutorials, shining a spotlight on new, intriguing, or yet-to-be-discovered features within an application. Its utility isn’t just confined to a single platform—TipKit boasts integration with iCloud to ensure data synchronization across various devices.

    At the heart of TipKit lies its two cornerstone components: the Tip Protocol and the TipView. These components serve as the foundation, enabling developers to craft intuitive and informative tips that resonate with their user base.

    Tip Protocol

    The essence of TipKit lies in its Tip Protocol, which acts as the blueprint for crafting and configuring content-driven tips. To create tips tailored to your application’s needs, you must conform to the Tip Protocol.

    While every Tip demands a title for identification, the protocol offers flexibility by introducing a suite of properties that can be optionally integrated, allowing developers to craft a comprehensive and informative tip.

    1. title (Text): The title of the Tip. This is the only required property.
    2. message (Text): A concise description that further elaborates on the Tip, giving users a deeper understanding.
    3. image (Image): An image to display on the left side of the Tip view.
    4. id (String): A unique identifier for your tip. It defaults to the name of the type that conforms to the Tip protocol.
    5. rules (Array of Tips.Rule): Rules that determine when the Tip should be displayed.
    6. options (Array of Tips.Option): Options that define the behavior of the Tip.
    7. actions (Array of Tips.Action): Primary and secondary buttons in the TipView that help the user learn more about the Tip or execute a custom action when tapped.

    Creating a Custom Tip

    Let’s create our first Tip. Here, we are going to show a Tip to help the user understand the functionality of the cart button.

    struct CartItemsTip: Tip {
        var title: Text {
            Text("Click the cart button to see what's in your cart")
        }
        var message: Text? {
            Text("You can edit/remove the items from your cart")
        }
        var image: Image? {
            Image(systemName: "cart")
        }
    }

    TipView

    As the name suggests, TipView is the user interface that represents an inline tip. Its initializer takes an instance of a type conforming to the Tip protocol we discussed above, along with an optional Edge parameter that decides which edge of the tip view displays the arrow.
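
    For example, a minimal sketch that points the arrow from the bottom edge (assuming the optional parameter is labeled arrowEdge):

    TipView(CartItemsTip(), arrowEdge: .bottom)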

    Displaying a Tip

    A Tip can be displayed in one of two ways.

    • Inline

    You can display the tip inline, alongside other views: a TipView takes an instance of a type conforming to the Tip protocol. Handling multiple views on a screen can be complex and time-consuming for developers; the TipKit framework makes this easy by automatically adjusting the layout and position of the TipView so that the other views remain accessible to the user.

    struct ProductList: View {
        private let cartTip = CartItemsTip()
        var body: some View {
            // Other views

            TipView(cartTip)

            // Other views
        }
    }

    • Popover

    The TipKit framework lets you show a popover Tip for any UI element, e.g., a Button or an Image. The popover tip appears over the screen, blocking the other views from user interaction until the tip is dismissed. The popoverTip modifier displays a popover Tip for any UI element. Consider the example below, where a popover tip is displayed for a cart image.

    private let cartTip = CartItemsTip()
    Button {
        // Dismiss the tip once the user performs the action it suggests.
        cartTip.invalidate(reason: .actionPerformed)
    } label: {
        Image(systemName: "cart")
            .popoverTip(cartTip) // Attach the popover tip to the cart image.
    }

    Dismissing the Tip

    A TipView can be dismissed in two ways.

    1. The user taps the X (close) icon.
    2. Developers can dismiss the Tip programmatically using the invalidate(reason:) method.

    There are three reasons you can pass when dismissing a tip:

    actionPerformed, userClosedTip, maxDisplayCountExceeded

    private let cartTip = CartItemsTip()
    cartTip.invalidate(reason: .actionPerformed)

    Tips Center

    We have covered how to define and display a tip using the Tip protocol and TipView, respectively. One last and most important step remains: configuring and loading the tips using the configure method, as shown in the example below. This step is mandatory; without it, no tips will be displayed in your application.

    import SwiftUI
    import TipKit
    
    @main
    struct TipkitDemoApp: App {
        var body: some Scene {
            WindowGroup {
                ContentView()
                    .task {
                        try? Tips.configure([
                            .displayFrequency(.immediate),
                            .datastoreLocation(.applicationDefault)
                        ])
                    }
            }
        }
    }

    The definition of the configure method looks something like this:

    static func configure(@Tips.ConfigurationBuilder options: @escaping () -> some TipsConfiguration = { defaultConfiguration }) async throws

    Notice that the configure method accepts a list of types conforming to TipsConfiguration. Two options are available: DisplayFrequency and DatastoreLocation. You can set these values as per your requirements.

    DisplayFrequency

    DisplayFrequency allows you to control how frequently your tips appear and has multiple options.

    • Use the immediate option when you do not want to set any restrictions.
    • Use the hourly, daily, weekly, and monthly values to display no more than one tip per hour, day, week, or month, respectively.
    • When none of the built-in options fits, you can supply a custom display frequency as a TimeInterval. In the example below, a custom frequency restricts tips to one every two days.
    let customDisplayFrequency: TimeInterval = 2 * 24 * 60 * 60
    try? Tips.configure([
         .displayFrequency(customDisplayFrequency),
         .datastoreLocation(.applicationDefault)
    ])

    DatastoreLocation

    This will be used for persisting tips and associated data. 

    You can use the following initializers to decide how to persist tips and data.

    public init(url: URL, shouldReset: Bool = false)

    url: A specific URL location where you want to persist the data.

    shouldReset: If set to true, it erases all data from the datastore, resetting every tip in the application.

    public init(_ location: DatastoreLocation, shouldReset: Bool = false)

    location: A predefined datastore location. The default value, applicationDefault, persists the datastore in the app’s support directory.

    public init(groupIdentifier: String, directoryName: String? = nil, shouldReset: Bool = false) throws

    groupIdentifier: The name of the App Group whose shared directory is used by your team’s applications.

    directoryName: An optional directory name within that group in which to persist the datastore.
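
    For example, a hedged sketch that uses the group-identifier initializer above to persist tips in a shared App Group container (the identifier is a placeholder):

    try? Tips.configure([
        .displayFrequency(.immediate),
        .datastoreLocation(.init(groupIdentifier: "group.com.example.shop"))
    ])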

    Max Display Count

    As discussed earlier, we can set options to define tip behavior. One such option is MaxDisplayCount. Consider that you want to show CartItemsTip whenever the user is on the Home screen. Showing the tip every time a user comes to the Home screen can be annoying or frustrating. To prevent this, one of the solutions, perhaps the easiest, is using MaxDisplayCount. The other solution could be defining a Rule that determines when the tip needs to be displayed. Below is an example showcasing the use of the MaxDisplayCount option for defining CartItemsTip.

    struct CartItemsTip: Tip {
        var title: Text {
            Text("Click here to see what's in your cart")
        }
        
        var message: Text? {
            Text("You can edit/remove the items from your cart")
        }
        
        var image: Image? {
            Image(systemName: "cart")
        }
        
        var options: [TipOption] {
            [ MaxDisplayCount(2) ]
        }
    }

    Rule-Based Tips

    Let’s understand how Rules can help you gain more control over displaying your tips. There are two types of Rules: parameter-based rules and event-based rules.

    Parameter Rules

    Parameter rules are persistent and are best suited to state and Boolean comparisons. The #Rule and @Parameter macros are available for defining them.

    In the below example, we define a rule that checks whether the value stored in the static itemsInCart property is greater than or equal to 3.

    Defining rules ensures that tips are displayed only when all the conditions are satisfied.

    struct CartTip: Tip {
        
        var title: Text {
            Text("Proceed with buying cart items.")
        }
        
        var message: Text? {
            Text("There are 3 or more items in your cart.")
        }
        
        var image: Image? {
            Image(systemName: "cart")
        }
        
        @Parameter
        static var itemsInCart: Int = 0
        
        var rules: [Rule] {
            #Rule(Self.$itemsInCart) { $0 >= 3 }
        }
    }
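
    To drive this rule, update the parameter wherever the cart changes. A minimal sketch (the button and its placement are hypothetical):

    // Updating the @Parameter re-evaluates the rule, so CartTip becomes
    // eligible for display once itemsInCart reaches 3.
    Button("Add to cart") {
        CartTip.itemsInCart += 1
    }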

    Event Rules

    Event-based rules are useful when we want to track occurrences of certain actions in the app. Each event has a unique identifier id of type String, with which we can differentiate between various events. Whenever the action occurs, we donate the event to increment its counter, using the donate() or sendDonation() method.

    Let’s consider the below example where we want to show a Tip to the user when the user selects the iPhone 14 Pro (256 GB) – Purple product more than 2 times.

    The example below creates a didViewProductDetail event with an associated donation value and donates it anytime the ProductDetailsView appears:

    struct ProductDetailsView: View {
        static let didViewProductDetail = Tips.Event<DidViewProduct>(id: "didViewProductDetail")
        var product: ProductModel
        var body: some View {
            VStack(alignment: .leading) {
                HStack(alignment: .top, content: {
                    Spacer()
                    Image(product.productImage, bundle: Bundle.main)
                    Spacer()
                })
                Text(product.productName)
                    .font(.title3)
                    .lineLimit(nil)
                Text(product.productPrice)
                    .font(.title2)
                Text("Get it by Wednesday, 18 October")
                    .font(.caption2)
                    .lineLimit(nil)
                Spacer()
            }
            .padding()
        .onAppear {
            Self.didViewProductDetail.sendDonation(.init(productID: product.productID, productName: product.productName))
        }
        }
    }
    
    struct DidViewProduct: Codable, Sendable {
        let productID: UUID
        let productName: String
    }

    The example below creates a display rule for ProductDetailsTip based on the didViewProductDetail event.

    struct ProductDetailsTip: Tip {
        var title: Text {
            Text("Add iPhone 14 Pro (256 GB) - Purple to your cart")
        }
        
        var message: Text? {
            Text("You can edit/remove the items from your cart")
        }
        
        var image: Image? {
            Image(systemName: "cart")
        }
        
        var rules: [Rule] {
            // Tip will only display when the didViewProductDetail event for product name 'iPhone 14 Pro (256 GB) - Purple' has been donated 3 or more times in a day.
            #Rule(ProductDetailsView.didViewProductDetail) {
                $0.donations.donatedWithin(.day).filter( { $0.productName == "iPhone 14 Pro (256 GB) - Purple" }).count >= 3
            }
        }
        
        var actions: [Action] {
            [
                Tip.Action(id: "add-product-to-cart", title: "Add to cart", perform: {
                    print("Product added into the cart")
                })
            ]
        }
    }

    Customization for Tip

    Customization is a key feature, as every app has its own theme throughout the application. Customizing tips to blend in with that theme surely enhances the user experience. As of now, the TipKit framework does not offer much customization, but we expect it to expand in the future. Below are the available methods for customizing tips.

    public func tipAssetSize(_ size: CGSize) -> some View
    
    public func tipCornerRadius(_ cornerRadius: Double, 
                                antialiased: Bool = true) -> some View
    
    public func tipBackground(_ style: some ShapeStyle) -> some View
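
    Since these are view modifiers, they can be chained onto a TipView. A small sketch using the cart tip from earlier (the styling values are arbitrary):

    TipView(cartTip)
        .tipBackground(.thinMaterial) // Any ShapeStyle works here.
        .tipCornerRadius(16)
        .tipAssetSize(CGSize(width: 32, height: 32))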

    Testing

    Testing tips is very important, as a small issue in the implementation of this framework can ruin your app’s user experience. We can construct UI test cases for various scenarios, and the following methods can be helpful for testing tips.

    • showAllTips
    • hideAllTips
    • showTips([<instance-of-your-tip>])
    • hideTips([<instance-of-your-tip>])

    Pros

    • Compatibility: TipKit is compatible across all Apple platforms, including iOS, macOS, watchOS, and visionOS.
    • Supports both SwiftUI and UIKit
    • Easy implementation and testing
    • Avoiding dependency on third-party libraries

    Cons 

    • Availability: Only available from iOS 17.0, iPadOS 17.0, macOS 14.0, Mac Catalyst 17.0, tvOS 17.0, watchOS 10.0, and visionOS 1.0 Beta, so there is no backward compatibility as of now.
    • It might frustrate the user if the application incorrectly implements this framework

    Conclusion

    The TipKit framework is a great way to introduce new features in our application to the user. It is easy to implement, and it enhances the user experience. Having said that, we should avoid extensive use of it as it may frustrate the user. We should always avoid displaying promotional and error messages in the form of tips.

  • Optimizing iOS Memory Usage with Instruments Xcode Tool

    Introduction

    Developing iOS applications that deliver a smooth user experience requires more than just clean code and engaging features. Efficient memory management helps ensure that your app performs well and avoids common pitfalls like crashes and excessive battery drain. 

    In this blog, we’ll explore how to optimize memory usage in your iOS app using Xcode’s powerful Instruments and other memory management tools.

    Memory Management and Usage

    Before we delve into the details of memory optimization, it’s important to understand why it’s so essential:

    Memory management in iOS refers to the process of allocating and deallocating memory for objects in an iOS application to ensure efficient and reliable operation. Proper memory management prevents issues like memory leaks, crashes, and excessive memory usage, which can degrade an app’s performance and user experience. 

    Memory management in iOS primarily involves the use of Automatic Reference Counting (ARC) and understanding how to manage memory effectively.

    Here are some key concepts and techniques related to memory management in iOS:

    1. Automatic Reference Counting (ARC): ARC is a memory management technique introduced by Apple to automate memory management in Objective-C and Swift. With ARC, the compiler automatically inserts retain, release, and autorelease calls, ensuring that memory is allocated and deallocated as needed. Developers don’t need to manually manage memory by calling “retain,” “release,” or “autorelease” methods as they did in the pre-ARC era of manual memory management.
    2. Strong and Weak References: In ARC, objects have strong, weak, and unowned references. A strong reference keeps an object in memory as long as at least one strong reference to it exists. A weak reference, on the other hand, does not keep an object alive. It’s commonly used to avoid strong reference cycles (retain cycles) and potential memory leaks.
    3. Retain Cycles: A retain cycle occurs when two or more objects hold strong references to each other, creating a situation where they cannot be deallocated, even if they are no longer needed. To prevent retain cycles, you can use weak references, unowned references, or break the cycle manually by setting references to “nil” when appropriate.
    4. Avoiding Strong Reference Cycles: To avoid retain cycles, use weak references (and unowned references when appropriate) in situations where two objects reference each other. Also, consider using closure capture lists to prevent strong reference cycles when using closures (see the sketch after this list).
    5. Resource Management: Memory management also includes managing other resources like files, network connections, and graphics contexts. Ensure you release or close these resources when they are no longer needed.
    6. Memory Profiling: The Memory Report in the Debug Navigator of Xcode is a tool used for monitoring and analyzing the memory usage of your iOS or macOS application during runtime. It provides valuable insights into how your app utilizes memory, helps identify memory-related issues, and allows you to optimize the application’s performance.
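
    For illustration, here is a minimal sketch of a closure capture list breaking a would-be cycle (ImageLoader and ProfileViewController are hypothetical types):

    final class ImageLoader {
        // The stored closure would keep captured objects alive.
        var onComplete: (() -> Void)?
    }

    final class ProfileViewController {
        let loader = ImageLoader()
        var username = "rohit"

        func load() {
            // Without [weak self], self -> loader -> onComplete -> self
            // would form a retain cycle.
            loader.onComplete = { [weak self] in
                guard let self else { return }
                print("Finished loading for \(self.username)")
            }
        }
    }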

    Also, use tools like Instruments to profile your app’s memory usage and identify memory leaks and excessive memory consumption.

    Instruments: Your Ally for Memory Optimization

    In Xcode, “Instruments” refers to a set of performance analysis and debugging tools integrated into the Xcode development environment. Developers use these instruments to monitor and analyze the performance of their iOS, macOS, watchOS, and tvOS applications during development and testing. Instruments help developers identify and address performance bottlenecks, memory issues, and other problems in their code.

     

    Some of the common instruments available in Xcode include:

    1. Allocations: The Allocations instrument helps you track memory allocations and deallocations in your app. It’s useful for detecting memory leaks and excessive memory usage.
    2. Leaks: The Leaks instrument finds memory leaks in your application. It can identify objects that are not properly deallocated.
    3. Time Profiler: Time Profiler helps you measure and analyze the CPU usage of your application over time. It can identify which functions or methods are consuming the most CPU resources.
    4. Custom Instruments: Xcode also allows you to create custom instruments tailored to your specific needs using the Instruments development framework.

    To use these instruments, you can run your application with profiling enabled, and then choose the instrument that best suits your performance analysis goals. 

    Launching Instruments

    Because Instruments is located inside Xcode’s app bundle, you won’t be able to find it in the Finder. 

    To launch Instruments on macOS, follow these steps:

    1. Open Xcode: Instruments is bundled with Xcode, Apple’s integrated development environment for macOS, iOS, watchOS, and tvOS app development. If you don’t have Xcode installed, you can download it from the Mac App Store or Apple’s developer website.
    2. Open Your Project: Launch Xcode and open the project for which you want to use Instruments. You can do this by selecting “File” > “Open” and then navigating to your project’s folder.
    3. Choose Instruments: Once your project is open, go to the “Xcode” menu at the top-left corner of the screen. From the drop-down menu, select “Open Developer Tool” and choose “Instruments.”
    4. Select a Template: Instruments will open, and you’ll see a window with a list of available performance templates on the left-hand side. These templates correspond to the different types of analysis you can perform. Choose the template that best matches the type of analysis you want to conduct. For example, you can select “Time Profiler” for CPU profiling or “Leaks” for memory analysis.
    5. Configure Settings: Depending on the template you selected, you may need to configure some settings or choose the target process (your app) you want to profile. These settings can typically be adjusted in the template configuration area.
    6. Start Recording: Click the red record button in the top-left corner of the Instruments window to start profiling your application. This will launch your app with the selected template and begin collecting performance data.
    7. Analyze Data: Interact with your application as you normally would to trigger the performance scenarios you want to analyze. Instruments will record data related to CPU usage, memory usage, network activity, and other aspects of your app’s performance.
    8. Stop Recording: When you’re done profiling your app, click the square “Stop” button in Instruments to stop recording data.
    9. Analyze Results: After stopping the recording, Instruments will display a detailed analysis of your app’s performance. You can explore various graphs, timelines, and reports to identify and address performance issues.
    10. Save or Share Results: You can save your Instruments session for future reference or share it with colleagues if needed.

    Using the Allocations Instrument

    The “Allocations” instrument helps you monitor memory allocation and deallocation. Here’s how to use it:

    1. Start the Allocations Instrument: In Instruments, select “Allocations” as your instrument.

    2. Profile Your App: Use your app as you normally would to trigger the scenarios you want to profile.

    3. Examine the Memory Allocation Graph: The graph displays memory usage over time. Look for spikes or steady increases in memory usage.

    4. Inspect Objects: The instrument provides a list of objects that have been allocated and deallocated. You can inspect these objects and their associated memory usage.

    5. Call Tree and Source Code: To pinpoint memory issues, use the Call Tree to identify the functions or methods responsible for memory allocation. You can then inspect the associated source code in the Source View.

    Detecting Memory Leaks with the Leaks Instrument

    Retain Cycle

    A retain cycle in Swift occurs when two or more objects hold strong references to each other in a way that prevents them from being deallocated, causing a memory leak. This situation is also known as a “strong reference cycle.” It’s essential to understand retain cycles because they can lead to increased memory usage and potential app crashes.  

    A common scenario for retain cycles is when two objects reference each other, both using strong references. 

    Here’s an example to illustrate a retain cycle:

    class Person {
        var name: String
        var pet: Pet? // Strong reference: one half of the cycle.

        init(name: String) {
            self.name = name
        }

        deinit {
            print("\(name) has been deallocated")
        }
    }

    class Pet {
        var name: String
        var owner: Person? // Strong reference back: completes the cycle.

        init(name: String) {
            self.name = name
        }

        deinit {
            print("\(name) has been deallocated")
        }
    }

    var rohit: Person? = Person(name: "Rohit")
    var jerry: Pet? = Pet(name: "Jerry")

    rohit?.pet = jerry
    jerry?.owner = rohit

    // Neither deinit runs: the objects still hold strong references to each other.
    rohit = nil
    jerry = nil

    In this example, we have two classes, Person and Pet, representing a person and their pet. Both classes have a property to store a reference to the other class (person.pet and pet.owner).  

    The “Leaks” instrument is designed to detect memory leaks in your app. 

    Here’s how to use it:

    1. Launch Instruments in Xcode: First, open your project in Xcode.  

    2. Commence Profiling: To commence the profiling process, navigate to the “Product” menu and select “Profile.”  

    3. Select the Leaks Instrument: Within the Instruments interface, choose the “Leaks” instrument from the available options.  

    4. Trigger the Memory Leak Scenario: To trigger the scenario where memory is leaked, interact with your application. This interaction, such as creating a retain cycle, will induce the memory leak.

    5. Identify Leaked Objects: The Leaks Instrument will automatically detect and pinpoint the leaked objects, offering information about their origins, including backtraces and the responsible callers.  

    6. Analyze Backtraces and Responsible Callers: To gain insights into the context in which the memory leak occurred, you can inspect the source code in the Source View provided by Instruments.  

    7. Address the Leaks: Armed with this information, you can proceed to fix the memory leaks by making the necessary adjustments in your code to ensure memory is released correctly, preventing future occurrences of memory leaks.

    You should see the resulting memory leaks reported in Instruments.

    The issue in the above code is that both Person and Pet are holding strong references to each other. When you create a Person and a Pet and set their respective references, a retain cycle is established. Even when you set rohit and jerry to nil, the objects are not deallocated, and the deinit methods are not called. This is a memory leak caused by the retain cycle. 

    To break the retain cycle and prevent this memory leak, you can use weak or unowned references. In this case, you can make the owner property in Pet a weak reference because a pet should not own its owner:

    class Pet {
        var name: String
        weak var owner: Person? // Weak reference breaks the cycle.

        init(name: String) {
            self.name = name
        }

        deinit {
            print("\(name) has been deallocated")
        }
    }

    By making owner a weak reference, the retain cycle is broken, and when you set rohit and jerry to nil, the objects will be deallocated, and the deinit methods will be called. This ensures proper memory management and avoids memory leaks.

    Best Practices for Memory Optimization

    In addition to using Instruments, consider the following best practices for memory optimization:

    1. Release Memory Properly: Ensure that memory is released when objects are no longer needed.

    2. Use Weak References: Use weak references when appropriate to prevent strong reference cycles.

    3. Use Unowned References to Break Retain Cycles: An unowned reference does not increase an object’s reference count, so it can break a cycle when the referenced object is guaranteed to outlive the reference.

    4. Minimize Singletons and Global Variables: These can lead to retained objects. Use them judiciously.

    5. Implement Lazy Loading: Load resources lazily to reduce initial memory usage (see the sketch below).
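
    A minimal sketch of lazy loading (ChartRenderer is a hypothetical, expensive-to-build type):

    struct ChartRenderer {
        init() {
            // Stand-in for expensive setup work (large buffers, file I/O, etc.).
            print("ChartRenderer built")
        }
    }

    final class ReportScreen {
        // Not allocated at init time; built only on first access.
        lazy var chartRenderer = ChartRenderer()
    }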

    Conclusion

    Optimizing memory usage is an essential part of creating high-quality iOS apps. 

    Instruments, integrated into Xcode, is a versatile tool that provides insights into memory allocation, leaks, and CPU-intensive code. By mastering these tools and best practices, you can ensure your app is memory-efficient, stable, and provides a superior user experience. Happy profiling!