Introduction
In this tutorial, we will build a custom chatbot that draws on private data to answer questions about a specific domain. It was inspired by completing the Scrimba course Build LLM Apps with JavaScript and OpenAI by Tom Chant. We will use a Scrimba FAQ document as our knowledge base, along with OpenAI's language models.
Key Concepts
AI (Artificial Intelligence)
Artificial Intelligence is the ability of a machine to simulate human-like intelligence and mimic human skills such as learning, problem-solving, and decision-making. This is achieved using various mechanisms and techniques, like neural networks, machine learning, deep learning, and data science.
LLM (Large Language Models)
Large Language Models are AI systems trained using vast amounts of data and parameters. When provided with a prompt, input, or question, they attempt to generate the most probable and relevant response.
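For example, here is a minimal sketch of prompting an LLM through LangChain's ChatOpenAI wrapper, which we will also use later (it assumes an OPENAI_API_KEY environment variable is set):
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({ openAIApiKey: process.env.OPENAI_API_KEY });

// The model generates the most probable response to the prompt
const response = await llm.invoke("What is Scrimba?");
console.log(response.content);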
RAG (Retrieval Augmented Generation)
RAG stands for Retrieval Augmented Generation. While LLMs are trained on public data, they may lack knowledge about specific domains. RAG allows the use of a knowledge base with LLMs to get the best responses for domain-specific queries.
Embeddings
Computers only understand numbers, so text and other data must be converted into numerical form before a model can work with it. These numerical representations, arrays of numbers that capture the meaning of the data so that similar content ends up with similar vectors, are called embeddings.
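As a quick sketch, here is how a piece of text becomes an embedding with LangChain's OpenAIEmbeddings (again assuming OPENAI_API_KEY is set):
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({ openAIApiKey: process.env.OPENAI_API_KEY });

// One string in, one array of numbers out
const vector = await embeddings.embedQuery("How do I reset my password?");
console.log(vector.length); // 1536 for OpenAI's default embedding model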
Vector Store
A vector store is a database optimized for storing and querying high-dimensional vectors (embeddings). It lets us quickly find the stored vectors nearest to a given query vector.
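"Nearest" is typically measured with a metric such as cosine similarity. The sketch below shows the idea in plain JavaScript; in our project, the pgvector extension inside Supabase does this work for us:
// Cosine similarity: close to 1 means very similar, close to 0 means unrelated
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}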
Creating Our RAG Chatbot
To create our RAG chatbot, we will follow these steps:
- Set up a Vector Store on Supabase.
- Create an OpenAI account, buy some credits, and generate an API key.
- Generate embeddings and upload them to the Vector Store.
- Process user questions and generate responses.
In the following sections, we'll dive deeper into each of these steps and implement our chatbot using OpenAI, HTML, CSS, JavaScript, and LangChain.
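If you want to follow along outside Scrimba, the snippets below assume a Node.js (ESM) project with the following packages installed (a sketch; versions may vary):
npm install langchain @langchain/openai @langchain/community @langchain/core @supabase/supabase-js
They also expect these environment variables, matching the names used in the code:
OPENAI_API_KEY=...
SUPABASE_API_KEY=...
SUPABASE_URL_CHAT_BOT=...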
Setting up Vector Store on Supabase
- Go to Supabase and set up an account.
- Create a new project.
- Go to the SQL editor on the left panel and create a new query named “match_documents.”
- Paste the SQL below into the editor and run it. It enables the pgvector extension, creates the “documents” table (which you can confirm afterwards in the table editor on the left panel), and defines the matching function:
-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents
create table documents (
  id bigserial primary key,
  content text, -- corresponds to Document.pageContent
  metadata jsonb, -- corresponds to Document.metadata
  embedding vector(1536) -- 1536 works for OpenAI embeddings, change if needed
);

-- Create a function to search for documents
create function match_documents (
  query_embedding vector(1536),
  match_count int DEFAULT null,
  filter jsonb DEFAULT '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  embedding jsonb,
  similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    (embedding::text)::jsonb as embedding,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
This SQL sets up everything we need to find the nearest matches in the vector store when the user enters their input. The <=> operator is pgvector's cosine distance, so 1 - (embedding <=> query_embedding) gives a similarity score between each stored chunk and the query.
Generating Embeddings and Uploading to Vector Store
The flow for generating embeddings from our document and uploading them to the Supabase vector store works as follows: the document is split into multiple chunks, each chunk is turned into an embedding, and the embeddings are stored in the database. Below is the code to do that:
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { createClient } from "@supabase/supabase-js";
import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { OpenAIEmbeddings } from "@langchain/openai";
import { promises as fs } from "fs";

try {
  const text = await fs.readFile("scrimba-info.text", "utf-8");

  // Split the document into overlapping 500-character chunks
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 500,
    separators: ["\n\n", "\n", " ", ""],
    chunkOverlap: 50
  });
  const output = await splitter.createDocuments([text]);

  const sbApiKey = process.env.SUPABASE_API_KEY;
  const sbUrl = process.env.SUPABASE_URL_CHAT_BOT;
  const openAIApiKey = process.env.OPENAI_API_KEY;

  // Embed each chunk and insert it into the "documents" table
  const embeddings = new OpenAIEmbeddings({ openAIApiKey });
  const client = createClient(sbUrl, sbApiKey);
  await SupabaseVectorStore.fromDocuments(output, embeddings, { client, tableName: "documents" });
} catch (error) {
  console.error(error);
}
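As an optional sanity check (a sketch, not part of the course code), you can query the freshly populated store and confirm that relevant chunks come back:
import { createClient } from "@supabase/supabase-js";
import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { OpenAIEmbeddings } from "@langchain/openai";

const client = createClient(process.env.SUPABASE_URL_CHAT_BOT, process.env.SUPABASE_API_KEY);
const embeddings = new OpenAIEmbeddings({ openAIApiKey: process.env.OPENAI_API_KEY });

const store = new SupabaseVectorStore(embeddings, {
  client,
  tableName: "documents",
  queryName: "match_documents"
});

// Embeds the query, calls match_documents, and returns the two closest chunks
const results = await store.similaritySearch("What is Scrimba?", 2);
console.log(results.map((doc) => doc.pageContent));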
Getting User Questions and Generating Responses
To produce a relevant, user-friendly response, the user's input goes through the following steps:
- We create a standalone question (a self-contained rephrasing that keeps the meaningful context) from the user input.
- Embeddings are created from the standalone question.
- The question embeddings are used to find the nearest match in the vector store.
- We then use OpenAI to generate an answer, combining the nearest match retrieved from our vector store with the original user input. We can also add conversation memory so the chatbot is aware of previous conversations; this is optional but can be a challenge to implement.
// retriever.js
import { createClient } from "@supabase/supabase-js";
import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { OpenAIEmbeddings } from "@langchain/openai";

const sbApiKey = process.env.SUPABASE_API_KEY;
const sbUrl = process.env.SUPABASE_URL_CHAT_BOT;
const openAIApiKey = process.env.OPENAI_API_KEY;

const embeddings = new OpenAIEmbeddings({ openAIApiKey });
const client = createClient(sbUrl, sbApiKey);

// Point the vector store at the table and matching function created earlier
const vectorStore = new SupabaseVectorStore(embeddings, {
  client,
  tableName: "documents",
  queryName: "match_documents"
});

const retriever = vectorStore.asRetriever();

export { retriever };
// combineDocuments.js
// Join the retrieved chunks into a single context string for the prompt
function combineDocuments(docs) {
  return docs.map((doc) => doc.pageContent).join("\n\n");
}

export { combineDocuments };
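For example, two retrieved chunks get merged into one context string (the sample contents here are made up):
const docs = [
  { pageContent: "Scrimba offers interactive coding courses." },
  { pageContent: "You can contact support at help@scrimba.com." }
];
console.log(combineDocuments(docs)); // prints both chunks separated by a blank line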
Finally, we wire everything together into a single chain:
import { ChatOpenAI } from "@langchain/openai";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { RunnableSequence, RunnablePassthrough } from "@langchain/core/runnables";
import { retriever } from "./retriever.js";
import { combineDocuments } from "./combineDocuments.js";
import { PromptTemplate } from "@langchain/core/prompts";

const openAIApiKey = process.env.OPENAI_API_KEY;
const llm = new ChatOpenAI({ openAIApiKey });

// Step 1: rephrase the user's input as a standalone question
const standaloneQuestionTemplate = `Given a question, convert the question to a standalone question.
Question: {question}
Standalone question: `;
const standaloneQuestionPrompt = PromptTemplate.fromTemplate(standaloneQuestionTemplate);
const standaloneQuestionChain = standaloneQuestionPrompt.pipe(llm).pipe(new StringOutputParser());

// Step 2: embed the standalone question, retrieve the nearest chunks,
// and combine them into one context string
const retrievalChain = RunnableSequence.from([
  (prevResult) => prevResult.standalone_question,
  retriever,
  combineDocuments
]);

// Step 3: answer the original question using the retrieved context
const answerTemplate = `You are a helpful and enthusiastic support bot who can answer a given question about Scrimba based on the context provided. Try to find the answer in the context. If you really don't know the answer, say "I'm sorry, I don't know the answer to that." Direct the questioner to email help@scrimba.com. Don't try to make up an answer. Always speak as if you are chatting with a friend.
Context: {context}
Question: {question}
Answer: `;
const answerPrompt = PromptTemplate.fromTemplate(answerTemplate);
const answerChain = answerPrompt.pipe(llm).pipe(new StringOutputParser());

const chain = RunnableSequence.from([
  {
    standalone_question: standaloneQuestionChain,
    original_input: new RunnablePassthrough()
  },
  {
    context: retrievalChain,
    // original_input is the object we invoked the chain with, i.e. { question }
    question: ({ original_input }) => original_input.question
  },
  answerChain
]);

export async function progressConversation(question) {
  const response = await chain.invoke({ question });
  return response;
}
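The UI can then call progressConversation with whatever the user typed, for example (the filename chain.js is hypothetical; use whatever you named the module above):
import { progressConversation } from "./chain.js"; // hypothetical filename

const answer = await progressConversation("How do I get help with a course?");
console.log(answer);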
Conclusion
You can find the template here. It contains the full code for both the UI, built with HTML, CSS, and JavaScript, and the backend using Node.js. I am not an AI expert, so any feedback or questions in the comments section will be appreciated.