This article will guide you in creating a chatbot that allows you to upload a CSV dataset. You can then ask questions about the data, and the system, powered by a language model, will provide answers based on the uploaded CSV data.
The following is a sample of the chatbot:
We will use Reflex to build this chatbot.
Outline
- Get an OpenAI API Key
- Create a new folder, open it with a code editor
- Create a virtual environment and activate
- Install requirements
- reflex setup
- my_dataframe_chatbot.py
- state.py
- style.py
- .gitignore
- Run app
- Conclusion
Get an OpenAI API Key
First, get your own OpenAI API key:
- Go to https://platform.openai.com/account/api-keys.
- Click on the + Create new secret key button.
- Enter an identifier name (optional) and click on the Create secret key button.
- Copy the API key to be used in this tutorial
Create a new folder, open it with a code editor
Create a new folder named my_dataframe_chatbot, then open it with a code editor such as VS Code.
Create a virtual environment and activate
Open the terminal and run the following commands to create a virtual environment named .venv and activate it (on Windows, run .venv\Scripts\activate instead of the source command):
python3 -m venv .venv
source .venv/bin/activate
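To confirm that the virtual environment is actually active before installing anything, a quick check like the one below can help. This is a minimal sketch that relies only on the standard library; the helper name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same `python3` you will use for the rest of the tutorial; if it prints `False`, the activation step did not take effect in the current shell.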
Install requirements
We need reflex to build the app, pandas to read the CSV file, and openai, langchain, and langchain-experimental to initialize an agent that answers a user's questions about an uploaded CSV file.
Run the following command in the terminal:
pip install reflex==0.3.1 pandas==2.1.1 openai==0.28.1 langchain==0.0.326 langchain-experimental==0.0.36
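Because the tutorial pins exact versions, it can be worth checking what actually got installed. The following is a small sketch using the standard library's `importlib.metadata`; the `PINNED` table simply mirrors the pip command above, and the `check_versions` helper is a hypothetical name introduced here.

```python
from importlib import metadata

# Versions pinned in the pip command above.
PINNED = {
    "reflex": "0.3.1",
    "pandas": "2.1.1",
    "openai": "0.28.1",
    "langchain": "0.0.326",
    "langchain-experimental": "0.0.36",
}


def check_versions(pinned: dict) -> dict:
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        status = "ok" if installed == expected else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

A mismatch is not necessarily fatal, but the code in this article was written against these versions, so newer releases of these libraries may behave differently.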
reflex setup
Now we need to create the project with Reflex. Run the following command to initialize the template app in the my_dataframe_chatbot directory:
reflex init --template blank
The above command creates the project's file structure in the my_dataframe_chatbot directory, including a my_dataframe_chatbot subdirectory that contains my_dataframe_chatbot.py.
You can run the app with the following command in your terminal, then go to http://localhost:3000/ in your browser to see a welcome page:
reflex run
my_dataframe_chatbot.py
Next, we build the structure and interface of the app. Go to the my_dataframe_chatbot subdirectory and open the my_dataframe_chatbot.py file, which is where the components are defined. Replace the template code in it with the following:
import reflex as rx

from my_dataframe_chatbot import style
from my_dataframe_chatbot.state import State


def error_text() -> rx.Component:
    """Return a text component to show errors."""
    return rx.text(State.error_texts, text_align="center", font_weight="bold", color="red")


def head_text() -> rx.Component:
    """The header: return a title, a note, and a divider."""
    return rx.vstack(
        rx.text("Chat with your data", font_size="2em", text_align="center", font_weight="bold", color="white"),
        rx.text(
            "(Note: input your openai api key, upload your csv file then click submit to start chat)",
            text_align="center",
            color="white",
        ),
        rx.divider(border_color="white"),
    )


def openai_key_input() -> rx.Component:
    """Return a password component for the OpenAI API key."""
    return rx.password(
        value=State.openai_api_key,
        placeholder="Enter your openai key",
        on_change=State.set_openai_api_key,
        style=style.openai_input_style,
    )


color = "rgb(107,99,246)"


def upload_csv() -> rx.Component:
    """The upload component."""
    return rx.vstack(
        rx.upload(
            rx.vstack(
                rx.button(
                    "Select File",
                    color=color,
                    bg="white",
                    border=f"1px solid {color}",
                ),
                rx.text("Drag and drop files here or click to select files"),
            ),
            multiple=False,
            accept={
                "text/csv": [".csv"],  # CSV format
            },
            max_files=1,
            border=f"1px dotted {color}",
            padding="2em",
        ),
        rx.hstack(rx.foreach(rx.selected_files, rx.text)),
        rx.button(
            "Submit to start chat",
            on_click=lambda: State.handle_upload(rx.upload_files()),
        ),
        padding="2em",
    )


def confirm_upload() -> rx.Component:
    """Text component to show upload confirmation."""
    return rx.text(State.upload_confirmation, text_align="center", font_weight="bold", color="green")


def qa(question: str, answer: str) -> rx.Component:
    """Return the chat component for one question/answer pair."""
    return rx.box(
        rx.box(
            rx.text(question, text_align="right", color="black"),
            style=style.question_style,
        ),
        rx.box(
            rx.text(answer, text_align="left", color="black"),
            style=style.answer_style,
        ),
        margin_y="1em",
    )


def chat() -> rx.Component:
    """Iterate over chat_history."""
    return rx.box(
        rx.foreach(
            State.chat_history,
            lambda messages: qa(messages[0], messages[1]),
        )
    )


def loading_skeleton() -> rx.Component:
    """Return the skeleton component."""
    return rx.container(
        rx.skeleton_circle(
            size="30px",
            is_loaded=State.is_loaded_skeleton,
            speed=1.5,
            text_align="center",
        ),
        display="flex",
        justify_content="center",
        align_items="center",
    )


def action_bar() -> rx.Component:
    """Return the chat input and ask button."""
    return rx.hstack(
        rx.input(
            value=State.question,
            placeholder="Ask a question about your data",
            on_change=State.set_question,
            style=style.input_style,
        ),
        rx.button(
            "Ask",
            on_click=State.answer,
            style=style.button_style,
        ),
        margin_top="3rem",
    )


def index() -> rx.Component:
    return rx.container(
        error_text(),
        head_text(),
        openai_key_input(),
        upload_csv(),
        confirm_upload(),
        chat(),
        loading_skeleton(),
        action_bar(),
    )


app = rx.App()
app.add_page(index)
app.compile()
The above code renders the text heading, an input field for your OpenAI API key, a component to upload your CSV file, the chat component, and an input field with an Ask button for submitting questions.
state.py
Create a new file state.py in the my_dataframe_chatbot subdirectory and add the following code:
# import reflex
import reflex as rx
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent
from langchain.chat_models import ChatOpenAI
from langchain.agents.agent_types import AgentType
import pandas as pd
import os


class State(rx.State):
    # The current question being asked.
    question: str
    error_texts: str
    # Keep track of the chat history as a list of (question, answer) tuples.
    chat_history: list[tuple[str, str]]
    openai_api_key: str
    # Names of the uploaded CSV files.
    csv_file: list[str]
    upload_confirmation: str = ""
    file_path: str
    is_loaded_skeleton: bool = True

    async def handle_upload(self, files: list[rx.UploadFile]):
        """Handle the upload of file(s).

        Args:
            files: The uploaded files.
        """
        for file in files:
            upload_data = await file.read()
            outfile = rx.get_asset_path(file.filename)
            self.file_path = outfile
            # Save the file.
            with open(outfile, "wb") as file_object:
                file_object.write(upload_data)
            # Update the csv_file var.
            self.csv_file.append(file.filename)
            self.upload_confirmation = "csv file uploaded successfully, you can now interact with your data"

    def answer(self):
        # Clear any previous error and put the skeleton into its loading state.
        self.error_texts = ""
        self.is_loaded_skeleton = False
        yield
        # Return an error if openai_api_key is empty.
        if self.openai_api_key == "":
            self.error_texts = "enter your openai api key"
            self.is_loaded_skeleton = True  # stop the loading animation
            return
        # Return an error if no CSV file has been uploaded.
        if not self.csv_file:
            self.error_texts = "ensure you upload a csv file and enter your openai api key"
            self.is_loaded_skeleton = True
            return
        if os.path.exists(self.file_path):
            df = pd.read_csv(self.file_path)
        else:
            self.error_texts = "ensure you upload a csv file"
            self.is_loaded_skeleton = True
            return
        # Initialize an agent that answers questions about the Pandas DataFrame.
        agent = create_pandas_dataframe_agent(
            ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613", openai_api_key=self.openai_api_key),
            df,
            verbose=True,
            agent_type=AgentType.OPENAI_FUNCTIONS,
        )
        self.upload_confirmation = ""
        # Add the question with an empty answer so it shows up immediately.
        answer = ""
        self.chat_history.append((self.question, answer))
        yield
        # Run the agent against the question.
        output = agent.run(self.question)
        self.is_loaded_skeleton = True
        # Clear the question input.
        self.question = ""
        # Yield here to clear the frontend input before continuing.
        yield
        # Reveal the answer piece by piece, updating the last chat entry each time.
        for item in output:
            answer += item
            self.chat_history[-1] = (
                self.chat_history[-1][0],
                answer,
            )
            yield
The above code handles file uploads, takes in questions, and generates answers.
The handle_upload function manages the asynchronous upload of file(s) provided as a list of rx.UploadFile objects. It reads the uploaded data, builds an output file path outfile with rx.get_asset_path, stores that path in self.file_path, and saves the uploaded file there. It also appends the uploaded file's name to self.csv_file and sets self.upload_confirmation to a message confirming that the CSV file was uploaded successfully.
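The read-then-write flow described above can be tried outside Reflex. The sketch below is framework-free and makes a few assumptions: `FakeUpload` is a hypothetical stand-in for `rx.UploadFile` (only a `filename` and an async `read()`), `save_upload` is an illustrative helper, and the filename check is an extra precaution that the tutorial's handler does not perform.

```python
import asyncio
import os
import tempfile


class FakeUpload:
    """Hypothetical stand-in for rx.UploadFile: a filename plus an async read()."""

    def __init__(self, filename: str, data: bytes):
        self.filename = filename
        self._data = data

    async def read(self) -> bytes:
        return self._data


async def save_upload(file, dest_dir: str) -> str:
    """Write the uploaded bytes into dest_dir and return the saved path."""
    data = await file.read()
    # Keep only the final path component so a crafted filename such as
    # "../../secrets.csv" cannot escape dest_dir.
    safe_name = os.path.basename(file.filename)
    if not safe_name.lower().endswith(".csv"):
        raise ValueError("only .csv files are accepted")
    path = os.path.join(dest_dir, safe_name)
    with open(path, "wb") as file_object:
        file_object.write(data)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        upload = FakeUpload("people.csv", b"name,age\nAda,36\n")
        print(asyncio.run(save_upload(upload, tmp)))
```

In the real handler, `rx.get_asset_path` decides where the file goes; the temporary directory here only keeps the example self-contained.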
The answer function interacts with OpenAI's GPT-3.5 Turbo model. It first sets the loading state of the skeleton component and performs error checks, ensuring that an OpenAI API key has been entered and a CSV file has been uploaded. If the CSV file exists, it reads the data into a Pandas DataFrame df. The function then initializes a Pandas DataFrame agent, runs it on the question, and updates the chat history with the response.
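The answer handler is a generator: each `yield` marks a point where the current state would be sent to the browser before the handler continues. The plain-Python sketch below imitates that pattern without Reflex or OpenAI. `ChatState` and `fake_agent` are hypothetical names, the error check is condensed into one branch, and the canned reply stands in for `agent.run(...)`.

```python
class ChatState:
    """Plain-Python stand-in for the Reflex State class (no UI involved)."""

    def __init__(self, api_key: str = "", csv_file=None):
        self.question = ""
        self.error_texts = ""
        self.chat_history = []
        self.openai_api_key = api_key
        self.csv_file = csv_file or []
        self.is_loaded_skeleton = True

    def answer(self, run_agent):
        """Generator handler: each yield marks a point where the UI would refresh."""
        self.is_loaded_skeleton = False  # show the loading indicator
        yield
        if not self.openai_api_key or not self.csv_file:
            self.error_texts = "enter your openai api key and upload a csv file"
            self.is_loaded_skeleton = True
            return
        self.chat_history.append((self.question, ""))  # placeholder answer
        yield
        output = run_agent(self.question)  # stub for agent.run(...)
        self.chat_history[-1] = (self.chat_history[-1][0], output)
        self.question = ""
        self.is_loaded_skeleton = True
        yield


def fake_agent(question: str) -> str:
    """Hypothetical agent that returns a canned reply instead of calling OpenAI."""
    return f"(stub answer to: {question})"


if __name__ == "__main__":
    state = ChatState(api_key="sk-placeholder", csv_file=["people.csv"])
    state.question = "How many rows are there?"
    for step, _ in enumerate(state.answer(fake_agent), start=1):
        print(step, state.is_loaded_skeleton, state.chat_history)
```

Stepping through the generator shows the three visible stages: loading indicator on, question shown with an empty answer, and finally the answer filled in with the indicator off.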
style.py
Create a new file style.py in the my_dataframe_chatbot subdirectory and add the following code. This will add styling to the page and components:
shadow = "rgba(0, 0, 0, 0.15) 0px 2px 8px"
chat_margin = "20%"
message_style = dict(
    padding="1em",
    border_radius="5px",
    margin_y="0.5em",
    box_shadow=shadow,
)

# Set specific styles for questions and answers.
question_style = message_style | dict(
    bg="#F5EFFE", margin_left=chat_margin
)
answer_style = message_style | dict(
    bg="#DEEAFD", margin_right=chat_margin
)

# Styles for the action bar.
input_style = dict(
    border_width="1px", padding="1em", box_shadow=shadow
)
button_style = dict(box_shadow=shadow)

# Style for the OpenAI key input.
openai_input_style = {
    "color": "white",
    "margin-top": "3rem",
    "margin-bottom": "0.5rem",
}
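The question and answer styles above are built with the dictionary merge operator `|`, which requires Python 3.9 or newer. The short sketch below, using a trimmed-down copy of `message_style`, illustrates how the merge behaves; the `override` and `question_style_compat` names are introduced here only for the demonstration.

```python
# Shared base style, trimmed down from style.py.
message_style = dict(padding="1em", border_radius="5px")

# `|` builds a new dict: keys from the right operand are added, and they
# win if both sides define the same key. The base dict is left untouched.
question_style = message_style | dict(bg="#F5EFFE", margin_left="20%")
override = message_style | dict(padding="2em")

# Equivalent spelling for Python versions older than 3.9.
question_style_compat = {**message_style, "bg": "#F5EFFE", "margin_left": "20%"}

if __name__ == "__main__":
    print(question_style)
    print(override)
```

This is why question_style and answer_style can share the padding, radius, and shadow of message_style while differing only in background color and margin.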
.gitignore
You can add the .venv directory to the .gitignore file to get the following:
*.db
*.py[cod]
.web
__pycache__/
.venv/
Run app
Run the following in the terminal to start the app:
reflex run
You should see the chatbot interface when you go to http://localhost:3000/ in your browser.
First, you can enter your OpenAI API key. Then, upload a CSV file. Afterward, you can inquire with the chatbot about your dataset, and it will provide responses.
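If you do not have a CSV file at hand, a small one can be generated for a first test. The sketch below uses only the standard library; the rows are made up purely for illustration, and `write_sample_csv` and the `sample.csv` filename are hypothetical choices.

```python
import csv

# Made-up rows purely for testing the upload flow.
SAMPLE_ROWS = [
    {"name": "Ada", "age": 36, "city": "London"},
    {"name": "Grace", "age": 45, "city": "New York"},
    {"name": "Linus", "age": 28, "city": "Helsinki"},
]


def write_sample_csv(path: str) -> None:
    """Write SAMPLE_ROWS to path with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["name", "age", "city"])
        writer.writeheader()
        writer.writerows(SAMPLE_ROWS)


if __name__ == "__main__":
    write_sample_csv("sample.csv")
```

With this file, the average age is 109 / 3, roughly 36.3, which gives a known value to compare against when you ask the chatbot a question such as "What is the average age?".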
I tested the app with a CSV file that contains an age column, and the chatbot produced correct responses to the questions I asked.
Conclusion
You can get the code here: https://github.com/emmakodes/my_dataframe_chatbot