Prashant Iyer for LLMWare


πŸ€–AI-Powered Data Queries: Ask Questions in Plain English, Get Instant Results!πŸš€

You might have heard of SQL. It's a widely used language for storing and processing information in relational databases. Simply put, relational databases store data in tables, where each row represents an entity and each column stores an attribute of that entity.

Let's say we have a table called customers in a relational database. If I wanted to access the names of all customers (customer_names) that have an annual_spend of at least $1000, I would have to formulate an SQL query like this:

SELECT customer_names FROM customers WHERE annual_spend >= 1000

I would then run this query against the database to access my results.
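For instance, using Python's built-in sqlite3 module, running that query by hand could look like the sketch below. The table and rows here are made up purely for illustration:

import sqlite3

# hypothetical customers table, built in-memory purely for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_names TEXT, annual_spend REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [("Ada", 1500.0), ("Grace", 800.0), ("Alan", 2300.0)])

rows = conn.execute("SELECT customer_names FROM customers WHERE annual_spend >= 1000").fetchall()
print(rows)   # [('Ada',), ('Alan',)]
conn.close()

Notice that we had to know both SQL syntax and the exact table structure to get our answer.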

But what if AI πŸ€– could do all this for us?


LLMWare allows us to do just that, making use of small language models such as slim-sql-1b-v0, which has only 1 billion parameters.


SLIM SQL Tool

We'll be making use of the SLIM (Structured Language Instruction Model) SQL Tool, which is a GGUF-quantized version of the slim-sql-1b-v0 model. Quantization stores the model's weights at lower precision, so the Tool has a much smaller footprint than the original model. To our advantage, it doesn't require much computational power to run, so it can run locally on a CPU without an internet connection or a GPU!

This Tool is designed for small, fast, local prototyping and works well for SQL operations that involve a single table. It lets us ask questions about our data entirely in natural language and still get accurate results! Let's walk through an example from start to finish.


Step 1: Loading our Model πŸͺ«πŸ”‹

We'll start off by loading in our SLIM SQL Tool. Here, we check whether the model has already been downloaded locally; if not, we download it using the ModelCatalog class.

import os
from llmware.configs import LLMWareConfig
from llmware.models import ModelCatalog
sql_tool_repo_path = os.path.join(LLMWareConfig().get_model_repo_path(), "slim-sql-tool")
if not os.path.exists(sql_tool_repo_path):
    ModelCatalog().load_model("llmware/slim-sql-tool")
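As a quick sanity check (just a sketch, not part of the original example), we can print where llmware keeps its downloaded models and confirm the tool is now on disk:

# where llmware stores downloaded models/tools on this machine
print("model repo path: ", LLMWareConfig().get_model_repo_path())
print("slim-sql-tool downloaded: ", os.path.exists(sql_tool_repo_path))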

Step 2: Loading our Data πŸ“Š

We then load in the sample customer_table.csv file containing our data. This sample file is provided by the llmware library!

# list the files downloaded with the slim-sql-tool - the sample CSV is among them
files = os.listdir(sql_tool_repo_path)
csv_file = "customer_table.csv"
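If you're curious what this data looks like, you can peek at the first few rows with Python's built-in csv module (a quick sketch, not part of the llmware example):

import csv

# print the header row plus the first three data rows of the sample CSV
with open(os.path.join(sql_tool_repo_path, csv_file), newline="") as f:
    for i, row in enumerate(csv.reader(f)):
        print(row)
        if i >= 3:
            break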

Next, we create a new SQL table called customer1 from our CSV file using the SQLTables class provided by the llmware library. This is important because we can run SQL queries on an SQL table, but not on a CSV file!

from llmware.agents import SQLTables   # import path may vary slightly by llmware version
sql_db = SQLTables(experimental=True)
table_name = "customer1"
sql_db.create_new_table_from_csv(sql_tool_repo_path, csv_file, table_name=table_name)
print("update: successfully created new db table")

Setting experimental=True tells SQLTables to create the table in a testing database provided by llmware.



Step 3: Querying with AI πŸ€–

We're finally getting to the good stuff! Now that we have our model and data, we can begin to ask our questions and get back some results.

We'll first load our agent, an instance of the LLMfx class, which gives us a way to interact with various models through function calls. Our agent takes a natural language question from the user, passes it to the appropriate model to generate an SQL query, runs that query against the database, and returns the results to us. Essentially, this is where the magic happens!

from llmware.agents import LLMfx

agent = LLMfx()
agent.load_tool("sql", sample=False, get_logits=True, temperature=0.0)

Next, we create a list of natural language questions we're going to be asking given our customer data. Let's see if our agent can answer them!

query_list = ["What is the highest annual spend of any customer?",
              "Which customer has account number 1234953?",
              "Which customer has the lowest annual spend?"]

We can loop through each of the queries, and let the query_db() function do all the work for us. All our results will be stored in the agent object.

for query in query_list:
    response = agent.query_db(query, table=table_name)

Results! βœ…

Now that we have our results in the agent's research_list, we can print them out.

for i, research in enumerate(agent.research_list):
    print("research: ", i, research)

For the example we have seen so far, consider the output for the first question ("What is the highest annual spend of any customer?").


The output is a dictionary containing a lot of detailed information about the steps carried out by the agent, but here are some of the more interesting parts of it:

  • db_response gives us what we want: the answer to the question! In this case, the response is 93540, meaning that the highest annual spend of any customer was $93,540. (See the short sketch after this list for pulling these fields out programmatically.)
  • sql_query shows us the SQL query that the SLIM SQL Tool generated from our natural language question. In this case, the query generated was:
SELECT MAX(annual_spend) FROM customer1
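If all you want are the final answers, you can pull just these two fields out of each entry in agent.research_list. This is a sketch that relies only on the keys shown above; the full set of keys in each dictionary may vary between llmware versions:

for entry in agent.research_list:
    # "sql_query" and "db_response" are the keys shown in the output above
    print("sql_query:  ", entry.get("sql_query"))
    print("db_response:", entry.get("db_response"))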

Conclusion

And just like that, we've done it! All we gave the program was a list of natural language questions and a CSV file with data. Behind the scenes, the llmware library:

  1. created a table in a database with our data,
  2. passed our questions into an AI model to get SQL queries,
  3. ran the queries against the database, and
  4. returned the results of the queries!

And if you're still not impressed, remember that we can run this example locally on just a CPU πŸ’»!

Check out our YouTube video on this topic to see us explain the source code and analyze the results!

If you made it this far, thank you for taking the time to go through this topic with us ❀️! For more content like this, make sure to visit our page at https://dev.to/llmware.

The source code for many more examples like this one are on our GitHub at https://github.com/llmware-ai/llmware. Find this example at https://github.com/llmware-ai/llmware/blob/main/examples/SLIM-Agents/text2sql-end-to-end-2.py.

Our repository also contains a notebook for this example that you can run yourself using Google Colab, Jupyter or any other platform that supports .ipynb notebooks: https://github.com/llmware-ai/llmware/blob/main/examples/Notebooks/NoteBook_Examples/text2sql-end-to-end-2-notebook.ipynb.

Lastly, join our Discord to interact with a growing community of AI enthusiasts of all levels of experience at https://discord.gg/fCztJQeV7J!
