How to create your own custom ChatGPT-like chatbot in less than 5 minutes using your own data and no OpenAI API

Wilbert Misingo

May 29, 2023 • 5 min read

Chatbot with Hugging Face

Introduction

In today's world, businesses are constantly looking for new ways to improve their customer service and engagement. One way to do this is by creating a chatbot that can quickly and accurately answer customer questions. In this article, we will show you how to create a chatbot that is based on your own company documents using Python and some powerful AI tools.

Chatbots have become increasingly popular as a way to provide instant, personalized customer support. Building one that can understand and respond to user queries based on your company's own documents can greatly improve the efficiency and effectiveness of your customer service.

In a previous article, which can be found here, I described how to create a chatbot using your custom data and the OpenAI API. I later wrote another article, which can be found here, on integrating that chatbot with your WhatsApp Business number.

Thus in this article, we will guide you through the process of creating a chatbot using the code provided below.

To achieve this goal, I initially considered fine-tuning the GPT model on my own data. However, fine-tuning is expensive and requires a sizable dataset of examples. It is also impractical to fine-tune again every time a document changes. Perhaps more importantly, fine-tuning teaches the model a new skill rather than simply letting it "know" the information in the documents. Consequently, fine-tuning is not the best approach for (multi-)document QA.

Prompt engineering, which means including context in the prompt, is the second strategy that springs to mind. For instance, instead of asking the question directly, I could insert the original document's text before the question. However, the GPT model has a limited attention span and can only take in a few thousand words in the prompt (about 4000 tokens or 3000 words). Given that we have tens of thousands of customer feedback emails and hundreds of product documents, it is impossible to fit all that information into the prompt. Because pricing is based on the number of tokens used, passing a lengthy context to the API is also expensive.
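
To make the limit concrete, here is a rough sketch that estimates whether a piece of text would fit in a prompt. It relies on the ratio quoted above (about 4000 tokens for 3000 words), which is only an approximation. The function names and the 4000-token default are assumptions made for illustration; a real tokenizer would give exact counts.

```python
# Rough estimate only: assumes about 4 tokens for every 3 words, as quoted above.
TOKENS_PER_WORD = 4000 / 3000


def estimate_tokens(text: str) -> int:
    """Approximate the token count of `text` from its word count."""
    return round(len(text.split()) * TOKENS_PER_WORD)


def fits_in_prompt(text: str, token_limit: int = 4000) -> bool:
    """Return True if the estimated token count is within the limit."""
    return estimate_tokens(text) <= token_limit


short_doc = "word " * 300      # a 300-word document
long_doc = "word " * 30000     # a 30,000-word document

print(estimate_tokens(short_doc), fits_in_prompt(short_doc))  # 400 True
print(estimate_tokens(long_doc), fits_in_prompt(long_doc))    # 40000 False
```

Even a modest collection of documents quickly exceeds the limit, which is why the whole corpus cannot simply be pasted into the prompt.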

Because the prompt limits the number of input tokens, I came up with the idea of first using an algorithm to search the documents and select the pertinent extracts, and then passing only these relevant extracts to the GPT model along with my question. While researching this idea, I found a library called llama-index (formerly known as gpt-index) that does exactly this and is easy to use.
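
The "search first, then ask" idea can be sketched in a few lines. This is a toy illustration, not how llama-index works internally: it ranks extracts by simple word overlap with the question, and the names `tokenize` and `top_k_extracts` as well as the sample extracts are made up for the example.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_k_extracts(question: str, extracts: list, k: int = 2) -> list:
    """Rank extracts by how many distinct question words they share; return the best k."""
    q_words = tokenize(question)
    ranked = sorted(extracts, key=lambda e: len(q_words & tokenize(e)), reverse=True)
    return ranked[:k]


extracts = [
    "A refund is processed within five business days.",
    "Our office is closed on public holidays.",
    "To request a refund, contact support with your order number.",
]
question = "How do I request a refund?"

# Only the most relevant extracts go into the prompt, not the whole corpus.
context = "\n".join(top_k_extracts(question, extracts, k=2))
prompt = f"Context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The prompt stays short no matter how many documents exist, because only the top-ranked extracts are included.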

Since the OpenAI API is somewhat expensive, I looked for a way to build a similar chatbot with alternatives to it: free open-source models such as GPT4All, OpenAssistant, etc.

The process

Step 01: Preparing your training data

The first step is to gather all the documents that you want to use to create the chatbot. These documents can include product manuals, FAQs, and other helpful resources that your customers may need to reference. Once you have gathered your documents, you need to organize them into a folder called 'data' and save them in a format that can be easily read by Python.
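
Before indexing, it can help to check which files in the folder are actually readable as text. The sketch below is one way to do that, assuming plain-text documents encoded as UTF-8. The helper name `list_readable_documents` and the extension list are assumptions for illustration, and the demo uses a temporary folder rather than your real 'data' folder.

```python
import tempfile
from pathlib import Path


def list_readable_documents(folder: str, extensions=(".txt", ".md")) -> list:
    """Return names of files in `folder` with an allowed extension that decode as UTF-8."""
    readable = []
    for path in sorted(Path(folder).glob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        try:
            path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip files that are not plain UTF-8 text
        readable.append(path.name)
    return readable


# Demo in a temporary folder so nothing on disk is touched.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    data_dir.mkdir()
    (data_dir / "faq.txt").write_text("Q: How do I reset my password?", encoding="utf-8")
    (data_dir / "logo.png").write_bytes(b"\x89PNG\xff\xfe")
    found = list_readable_documents(str(data_dir))

print(found)  # ['faq.txt']
```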

Step 02: Installing all required libraries

To build a chatbot, we need some Python libraries designed for natural language processing and machine learning. In this article we use the llama_index, transformers, and langchain libraries, along with torch, which the custom model code imports. The code targets the llama_index and langchain APIs as they were in mid-2023; later releases have renamed several of these classes. You can install the libraries using pip:

$ pip install llama_index
$ pip install transformers
$ pip install langchain
$ pip install torch

Step 03: Importing Libraries and Modules

To begin, we need to import the necessary libraries and modules that will be used throughout the chatbot creation process. The code snippet below demonstrates the required imports:

import torch
from langchain.llms.base import LLM
from llama_index import SimpleDirectoryReader, GPTListIndex, PromptHelper
from llama_index import LLMPredictor, ServiceContext, QuestionAnswerPrompt
from transformers import pipeline
from typing import Optional, List, Mapping, Any

Step 04: Defining Prompt Variables

Next, we define some variables that will be used as prompt variables for the chatbot. These variables determine the maximum input size, the number of desired output tokens, and the maximum overlap between chunks. Here is the code segment that defines the prompt variables:

max_input_size = 2048
num_output = 256
max_chunk_overlap = 20
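
As a rough way to read these numbers, the space left for document context is what remains after the output tokens (and the tokens of the prompt template itself) are set aside. The sketch below shows that arithmetic; the function `context_budget` and the 40-token template size are assumptions for illustration, and the library's exact accounting may differ.

```python
max_input_size = 2048   # total tokens the model can accept
num_output = 256        # tokens reserved for the generated answer


def context_budget(max_input_size: int, num_output: int, template_tokens: int = 0) -> int:
    """Tokens left for retrieved context after reserving output and template tokens."""
    budget = max_input_size - num_output - template_tokens
    if budget <= 0:
        raise ValueError("No room left for context; lower num_output or shorten the template.")
    return budget


print(context_budget(max_input_size, num_output))                      # 1792
print(context_budget(max_input_size, num_output, template_tokens=40))  # 1752
```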

Step 05: Defining and Using the Prompt Helper

The PromptHelper class helps in handling prompts and chunking long documents. We initialize the prompt helper by passing the previously defined prompt variables. The code snippet below demonstrates the creation of the prompt helper:

prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
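
To illustrate what chunking with overlap means, here is a minimal sketch that splits a token list into fixed-size chunks where each chunk repeats the last few tokens of the previous one. It is not the PromptHelper implementation; `chunk_with_overlap` is a hypothetical helper, and the small sizes are chosen only to keep the output readable.

```python
def chunk_with_overlap(tokens: list, chunk_size: int, overlap: int) -> list:
    """Split `tokens` into chunks of `chunk_size`, each sharing `overlap` tokens with the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the input
    return chunks


tokens = list(range(10))
chunks = chunk_with_overlap(tokens, chunk_size=4, overlap=1)
print(chunks)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk.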

Step 06: Creating a Custom Language Model (LLM)

In order to generate responses, we need to download and load a pre-trained language model. The code snippet below defines a custom LLM class that uses the facebook/opt-iml-max-30b model from Hugging Face:

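class CustomLLM(LLM):
    # Hugging Face model ID; a 30B-parameter model needs a GPU with a lot of memory
    model_name = "facebook/opt-iml-max-30b"
    # Load the text-generation pipeline once, on the first CUDA device, in bfloat16
    pipeline = pipeline("text-generation", model=model_name, device="cuda:0", model_kwargs={"torch_dtype":torch.bfloat16})

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        prompt_length = len(prompt)
        response = self.pipeline(prompt, max_new_tokens=num_output)[0]["generated_text"]
        # The generated text starts with the prompt itself, so return only what follows it
        return response[prompt_length:]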

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"name_of_model": self.model_name}

    @property
    def _llm_type(self) -> str:
        return "custom"
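
The slicing in `_call` is there because a text-generation pipeline returns the prompt followed by the newly generated text by default. The sketch below shows the same idea with a stand-in pipeline so it runs without downloading a model; `fake_pipeline` and its canned reply are assumptions for illustration.

```python
def fake_pipeline(prompt: str, max_new_tokens: int = 256) -> list:
    """Stand-in for a text-generation pipeline: echoes the prompt plus a canned reply."""
    return [{"generated_text": prompt + " Paris."}]


def generate(prompt: str) -> str:
    """Return only the newly generated text, without the echoed prompt."""
    full_text = fake_pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
    return full_text[len(prompt):]


answer = generate("Q: What is the capital of France?\nA:")
print(repr(answer))  # ' Paris.'
```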

NB:

Before using this model, consider whether it is suitable for you, taking into account:

The license of the model
The size of the model
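
Model size matters mainly because the weights have to fit in memory. A back-of-the-envelope estimate multiplies the parameter count by the bytes per parameter. In the sketch below, the 30 billion figure is read off the model name, the 1.3 billion figure is a hypothetical smaller model for comparison, and the estimate covers the weights only, not activations or other runtime overhead.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate memory for model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


print(weight_memory_gb(30e9, "bfloat16"))   # 60.0
print(weight_memory_gb(1.3e9, "bfloat16"))  # about 2.6
```

At roughly 60 GB for the weights alone, the 30B model will not fit on most single consumer GPUs, so a smaller model may be the more practical choice.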

Step 07: Initializing the Language Model and Service Context

Once we have defined our custom LLM, we can initialize it and create a service context. The service context encapsulates the necessary components for our chatbot, including the LLM and prompt helper. Here's the code to initialize the LLM and service context:

llm_predictor = LLMPredictor(llm=CustomLLM())
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)

Step 08: Defining the Question-Answer Prompt Template

To structure the interaction with the chatbot, we define a template for the question-answer prompt. This template includes placeholders for the context information and the user's question. The code snippet below shows the template definition:

QA_PROMPT_TMPL = (
    "We have provided context information below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this information, please answer the question: {query_str}\n"
)

QA_PROMPT = QuestionAnswerPrompt(QA_PROMPT_TMPL)
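
To see what the model eventually receives, the placeholders can be filled by hand with Python's `str.format`. The library does this substitution for you at query time; the snippet below is only an illustration, and the sample context and question are made up.

```python
QA_TEMPLATE = (
    "We have provided context information below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this information, please answer the question: {query_str}\n"
)

# Fill both placeholders with sample values to inspect the final prompt.
filled = QA_TEMPLATE.format(
    context_str="Support is available Monday to Friday, 9am to 5pm.",
    query_str="When can I reach support?",
)
print(filled)
```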

Step 09: Loading training data

To make the chatbot knowledgeable about your company, you need to load your company documents into the chatbot's index. The code snippet below demonstrates loading the data from a specified directory:

documents = SimpleDirectoryReader('./data').load_data()

Step 10: Generating the Index

Once the documents are loaded, we generate an index using the GPTListIndex class. The index stores the document chunks so that relevant information can be retrieved in response to user queries. Here's the code to generate the index:

index = GPTListIndex.from_documents(documents, service_context=service_context)
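
As I understand it, a list index keeps the document chunks in order and, by default, a query walks through them one after another, updating the answer as it goes. The toy class below illustrates that idea only. It is not the library's implementation: `ToyListIndex` is hypothetical, and `keyword_refine` is a stand-in for a call to the language model.

```python
from typing import Callable, List


class ToyListIndex:
    """Keeps text chunks in insertion order, like a simple list."""

    def __init__(self, chunks: List[str]):
        self.chunks = list(chunks)

    def query(self, question: str, refine: Callable[[str, str, str], str]) -> str:
        """Visit every chunk in order, letting `refine` update the running answer."""
        answer = ""
        for chunk in self.chunks:
            answer = refine(question, chunk, answer)
        return answer


def keyword_refine(question: str, chunk: str, answer: str) -> str:
    """Stand-in for an LLM call: adopt the chunk if it mentions the question's last word."""
    keyword = question.rstrip("?").split()[-1].lower()
    return chunk if keyword in chunk.lower() else answer


toy_index = ToyListIndex(["We ship worldwide.", "Returns are accepted within 30 days."])
print(toy_index.query("What is your policy on returns?", keyword_refine))
```

Because every chunk is visited, this approach is simple but gets slower as the number of documents grows.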

Step 11: Saving and Loading the Index

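To avoid re-indexing the documents every time the chatbot is restarted, we can save the index to disk and load it later. In llama_index 0.6 and later, which the as_query_engine call in Step 12 requires, the older save_to_disk and load_from_disk methods are no longer available; the index is persisted through its storage context instead. Here's how you can save and load the index:

index.storage_context.persist(persist_dir="./storage")  # save the index to a folder
from llama_index import StorageContext, load_index_from_storage
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context, service_context=service_context)  # reload it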
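
In practice, saving and loading are usually combined into a "load if a saved copy exists, otherwise build and save" pattern. The sketch below shows that pattern with a plain JSON file; `build_index` is a hypothetical stand-in for the real indexing step, and the demo writes to a temporary folder.

```python
import json
import tempfile
from pathlib import Path


def build_index(documents: list) -> dict:
    """Hypothetical stand-in for the expensive indexing step."""
    return {"chunks": documents}


def load_or_build(path: Path, documents: list) -> dict:
    """Load a saved index if one exists; otherwise build it and save it for next time."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    index = build_index(documents)
    path.write_text(json.dumps(index), encoding="utf-8")
    return index


with tempfile.TemporaryDirectory() as tmp:
    index_path = Path(tmp) / "index.json"
    first = load_or_build(index_path, ["doc one", "doc two"])  # builds and saves
    second = load_or_build(index_path, [])                     # loads the saved copy

print(first == second)  # True
```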

Step 12: Querying the Chatbot and Getting a Response

Finally, we can interact with the chatbot by querying it with user input. The chatbot will process the query and provide a response based on the indexed company documents. The code snippet below demonstrates querying the chatbot and printing the response:

# Pass the custom prompt template when building the query engine
query_engine = index.as_query_engine(text_qa_template=QA_PROMPT)
response = query_engine.query("Hello, what is your function?")
print(response)
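
To answer more than one question, the query call can be wrapped in a small helper. The sketch below is one possible shape for it: `answer_all` is a hypothetical helper, and `stub_query` stands in for the real query function so the example runs on its own. In the chatbot you would pass the query engine's `query` method instead.

```python
from typing import Callable, Dict, Iterable


def answer_all(query_fn: Callable[[str], object], questions: Iterable[str]) -> Dict[str, str]:
    """Run each non-empty question through `query_fn` and collect the answers as text."""
    answers = {}
    for question in questions:
        question = question.strip()
        if not question:
            continue  # ignore blank input
        answers[question] = str(query_fn(question))
    return answers


def stub_query(question: str) -> str:
    """Stand-in for the real query function."""
    return f"(answer to: {question})"


results = answer_all(
    stub_query,
    ["Hello, what is your function?", "   ", "What are your opening hours?"],
)
for question, answer in results.items():
    print(question, "->", answer)
```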

Conclusion

Congratulations! You have successfully created a chatbot based on your own company documents. This chatbot can provide accurate and relevant responses to user queries, leveraging the power of machine learning and natural language processing.

Remember, the provided code is just a starting point, and you can customize and extend it according to your specific requirements. Building a chatbot is an iterative process, so feel free to experiment and enhance the functionality based on user feedback and additional data.

Happy coding!