Introduction
When building fun projects with Retrieval-Augmented Generation (RAG) applications, we often run into limitations such as browsing restrictions, which make it hard to get the latest information or current data like weather updates (I hope your use case is funnier than that). To solve this, we can equip our RAG application with a tool that searches the internet. Let’s dive in!
Our Tool-bench
- LangChain (Framework for building applications with large language models)
- SearXNG (free metasearch engine)
- CPython (the reference Python interpreter, written in C :> )
- Docker
Setup
First, we start with the SearXNG installation.
1 -) Get SearXNG-docker
git clone https://github.com/searxng/searxng-docker.git
2 -) Edit the .env file to set the hostname and an email
3 -) Generate the secret key
<Linux>
sed -i "s|ultrasecretkey|$(openssl rand -hex 32)|g" searxng/settings.yml
<MacOS>
sed -i '' -e "s|ultrasecretkey|$(openssl rand -hex 32)|g" searxng/settings.yml
<Windows>
$randomBytes = New-Object byte[] 32
(New-Object Security.Cryptography.RNGCryptoServiceProvider).GetBytes($randomBytes)
$secretKey = -join ($randomBytes | ForEach-Object { "{0:x2}" -f $_ })
(Get-Content searxng/settings.yml) -replace 'ultrasecretkey', $secretKey | Set-Content searxng/settings.yml
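If juggling three shell variants feels awkward, the same substitution can be sketched in Python, which the demo needs anyway. This is only a sketch and not part of the searxng-docker instructions: the helper names `replace_secret` and `rewrite_settings` are made up for illustration, and it assumes the placeholder `ultrasecretkey` is still present in `searxng/settings.yml`.

```python
import secrets
from pathlib import Path
from typing import Optional

PLACEHOLDER = "ultrasecretkey"


def replace_secret(text: str, key: Optional[str] = None) -> str:
    """Return text with every placeholder swapped for a hex key."""
    if key is None:
        key = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    return text.replace(PLACEHOLDER, key)


def rewrite_settings(path: str = "searxng/settings.yml") -> None:
    """Rewrite the settings file in place (hypothetical helper)."""
    settings = Path(path)
    settings.write_text(replace_secret(settings.read_text()))
```

Calling `rewrite_settings()` from the searxng-docker directory should have the same effect as the `sed` and PowerShell commands above.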
4 -) Update searxng/settings.yml to enable the JSON search format (alongside HTML) and disable the limiter for our LangChain instance:
use_default_settings: true
server:
  # base_url is defined in the SEARXNG_BASE_URL environment variable, see .env and docker-compose.yml
  secret_key: "<secret-key>" # change this!
  limiter: false
  image_proxy: true
ui:
  static_use_hash: true
redis:
  url: redis://redis:6379/0
search:
  formats:
    - html
    - json
5 -) Run the SearXNG instance
docker compose up
Check the SearXNG deployment in Docker. If everything looks good, you’re ready to continue.
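One way to check more than the container status is to ask the instance for JSON directly, since that is the format the LangChain wrapper relies on. Here is a minimal sketch using only the standard library. It assumes the instance listens on `http://localhost:8080` (the host used in the demo), and the helper names `build_search_url` and `check_instance` are hypothetical. If `json` is missing from the `formats` list, the request will most likely fail with an HTTP 403 error.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(host: str, query: str) -> str:
    """Build a SearXNG search URL that requests JSON output."""
    params = urlencode({"q": query, "format": "json"})
    return f"{host.rstrip('/')}/search?{params}"


def check_instance(host: str = "http://localhost:8080", query: str = "test") -> int:
    """Return the number of results; raises urllib.error.HTTPError on a 4xx/5xx reply."""
    with urlopen(build_search_url(host, query), timeout=10) as response:
        payload = json.load(response)
    return len(payload.get("results", []))
```

Running `print(check_instance())` while the containers are up should print a result count instead of raising an error.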
Demo Application
1 -) Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate
2 -) Install LangChain
pip install langchain langchain-community
3 -) Create main.py
## Simple Get Results
from langchain_community.utilities import SearxSearchWrapper
import pprint
# Point the wrapper at the local SearXNG instance
s = SearxSearchWrapper(searx_host="http://localhost:8080")
# Ask for up to 10 results, using only the Google engine
result = s.results("What is RAG?", num_results=10, engines=["google"])
pprint.pprint(result)
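To use these results in a RAG prompt, they still have to become context text. A minimal sketch of that step, assuming each result is a dict with `title`, `link`, and `snippet` keys (the printed output above should confirm the exact shape). The helper `format_context` and the sample data are made up for illustration.

```python
def format_context(results: list[dict], max_results: int = 3) -> str:
    """Turn search results into a numbered context block for a prompt."""
    blocks = []
    for i, item in enumerate(results[:max_results], start=1):
        # .get() keeps this from crashing if a result lacks a field
        title = item.get("title", "(no title)")
        snippet = item.get("snippet", "")
        link = item.get("link", "")
        blocks.append(f"[{i}] {title}\n{snippet}\nSource: {link}")
    return "\n\n".join(blocks)


# Sample data standing in for the output of s.results(...)
sample = [
    {
        "title": "RAG explained",
        "snippet": "RAG combines retrieval with generation.",
        "link": "https://example.com/rag",
    },
]
prompt = (
    "Answer using only the context below.\n\n"
    f"{format_context(sample)}\n\n"
    "Question: What is RAG?"
)
print(prompt)
```

Numbering the sources makes it easy to ask the model to cite them as `[1]`, `[2]`, and so on.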
## GitHub Tool
from langchain_community.tools.searx_search.tool import SearxSearchResults
# Assumes the same local SearXNG instance as above
wrapper = SearxSearchWrapper(searx_host="http://localhost:8080")
github_tool = SearxSearchResults(
    name="Github",
    wrapper=wrapper,
    # Restrict this tool to the GitHub engine
    kwargs={"engines": ["github"]},
)
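Once you have one or more tools like this, the application needs some way to pick a tool by name and hand it a query. A minimal sketch of that lookup, with a stub in place of the real tool so it runs without a SearXNG instance. The helper `call_tool` is hypothetical; in main.py the dictionary entry would presumably be `github_tool.invoke` instead of the stub.

```python
from typing import Callable, Dict


def call_tool(tools: Dict[str, Callable[[str], str]], name: str, query: str) -> str:
    """Look up a tool by name and run it on the query."""
    if name not in tools:
        raise KeyError(f"Unknown tool: {name!r}. Available: {sorted(tools)}")
    return tools[name](query)


# Stand-in so the sketch runs offline.
# With the real tool this would be: {"Github": github_tool.invoke}
tools = {"Github": lambda query: f"(stub) GitHub results for: {query}"}
print(call_tool(tools, "Github", "searxng"))
```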
And there you have it! Your RAG application now has search capabilities. This guide doesn’t introduce anything new but aims to bring together the steps for adding web searching functionality to your RAG application. I hope it helps!