Supercharging Obsidian Search with AI and Ollama

Have you ever torn your hair out trying to find a note you know you saved, but the search bar just stares back at you? That was me last week. I was desperately searching for a one-liner command to clear Time Machine's local storage on my Mac. I typed in "clear time machine", "remove backups", "free space" - nothing. It felt like my notes had swallowed the command into a black hole.

Turns out I had saved it under "storage" and "space", not "delete" or "remove". Classic memory lapse. This got me thinking: our brains often don't remember the exact words we use when taking notes. In personal knowledge management, this "memory storage paradox" can turn finding information into a needle-in-a-haystack problem.

The Search for a Better Search

I love Obsidian for note-taking, but its search functionality relies on exact matches. I needed a way to bridge the gap between how I remember and how I write. So, I explored some existing solutions:

  1. Vector Embeddings: They offer semantic search but require complex setup and heavy resources.
  2. Full-Text Search with Indexing: Fast but limited to literal matches.
  3. Manual Tagging: Effective but demands discipline and foresight.
  4. GPT-Based Solutions: Great semantic understanding but pose privacy concerns and depend on external services.

I have adapted some of these powerful RAG-based solutions in the past, but this time I wanted to see if there was a simpler way to implement search without embeddings or indexing.

Essentially, the solution is to let the AI *formulate the search expression* rather than perform the search itself (similar to the concept of generating a SQL statement instead of executing it, as in https://github.com/vanna-ai/vanna).

Instead of overhauling my entire note collection, why not enhance the search query itself? By using a local Language Model (LLM) to expand my search terms, I could get a semantically rich search without sacrificing privacy or simplicity.

I find this approach appealing for a number of reasons.

  • Semantic Understanding: Captures related terms and concepts.
  • Privacy Preservation: Everything runs locally; no data leaves my machine.
  • Immediate Implementation: No need for indexing or pre-processing notes.
  • Simplicity: Minimal changes to my existing workflow.

Building the Solution

How It Works

  1. User Inputs a Search Term: Let's say "clear time machine."
  2. Local LLM Generates Related Terms: The model outputs terms like "time machine cleanup," "delete backups," etc.
  3. Construct Enhanced Search Query: Combines original and related terms using Obsidian's search syntax.
  4. Execute Search in Obsidian: Retrieves notes that match any of the expanded terms. (The full flow is sketched below.)
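
To make the flow concrete, here's a minimal sketch of how the four steps could be wired together (assuming, for simplicity, that both helpers shown in the next section live on the same plugin class). One caveat: Obsidian has no public API for opening global search, so the `openGlobalSearch` call below goes through the internal global-search plugin that community plugins commonly reach into; it's undocumented and may change between Obsidian versions.

async runEnhancedSearch(searchTerm: string): Promise<void> {
    // Steps 1-2: ask the local LLM for related terms.
    const relatedTerms = await this.getRelatedTerms(searchTerm);

    // Step 3: combine the original and expanded terms into one query string.
    this.searchTerm = searchTerm;
    const query = this.buildSearchQuery(relatedTerms);

    // Step 4: hand the query to Obsidian's global search pane.
    // Caveat: internalPlugins is not part of the public plugin API.
    const globalSearch = (this.app as any).internalPlugins
        .getPluginById('global-search');
    globalSearch?.instance.openGlobalSearch(query);
}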

Diving into the Code

Here's the function that queries the local LLM for related terms:

async getRelatedTerms(searchTerm: string): Promise<string[]> {
    try {
        const response = await fetch(`${this.settings.llamaEndpoint}/api/generate`, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({
                model: "llama3.1:latest",
                prompt: `For the search term "${searchTerm}", provide a list of:
                        - Common misspellings
                        - Similar terms
                        - Alternative spellings
                        - Related words
                        Return ONLY the actual terms, one per line, with no explanations or headers.
                        Focus on finding variations of the exact term first.`,
                stream: false
            })
        });

        // Ollama's non-streaming response wraps the model's output in a `response` field.
        const data = await response.json();

        // Split the newline-separated list into clean, non-empty terms.
        return data.response
            .split('\n')
            .map((term: string) => term.trim())
            .filter((term: string) => term.length > 0);
    } catch (error) {
        // If the local model is unreachable, fall back to searching the original term alone.
        console.error('Failed to fetch related terms:', error);
        return [];
    }
}

And here's how the enhanced search query is built:

buildSearchQuery(terms: string[]): string {
    // Quote every term so Obsidian matches exact phrases rather than individual words.
    const allTerms = [`"${this.searchTerm.trim()}"`, ...terms.map(term => `"${term.trim()}"`)];

    // Optionally scope the search to a tag, then OR the original and expanded terms together.
    if (this.plugin.settings.includeTag && this.plugin.settings.defaultTag.trim() !== '') {
        return `tag:#${this.plugin.settings.defaultTag} AND (${allTerms.join(' OR ')})`;
    }
    return allTerms.join(' OR ');
}

Real-World Example

Let's revisit my Time Machine dilemma.

  • Original Search: "clear time machine"
  • LLM-Expanded Terms:

    • "time machine cleanup"
    • "delete time machine backups"
    • "remove old backups"
    • "free up time machine space"
  • Enhanced Search Query:

tag:#howto AND (
    "clear time machine" OR 
    "time machine cleanup" OR 
    "delete time machine backups" OR 
    "remove old backups" OR 
    "free up time machine space"
)

With this query, Obsidian pulled up the elusive note instantly!

Getting It Up and Running

Requirements

  • Obsidian: Your go-to note-taking app.
  • Local LLM API: I used Ollama running Llama 3.1 locally.
  • Basic Knowledge of Obsidian: Familiarity with search syntax helps.

Steps

  1. Set Up a Local LLM: Install and run Ollama or any other local LLM API (a quick way to verify the endpoint is sketched after this list).
  2. Install the Plugin: Place the plugin files into your Obsidian plugins directory.
  3. Configure Settings: Set your LLM endpoint and default tag in the plugin settings.
  4. Start Searching: Use the enhanced search to find notes more effectively.
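
Before leaning on the plugin, it's worth a quick sanity check that the local model answers at all. Here's a minimal sketch, assuming Ollama's default address (http://localhost:11434) and that you've pulled the model with `ollama pull llama3.1`; point `endpoint` at whatever your plugin's LLM endpoint setting uses:

// Verify the local LLM endpoint responds before wiring it into Obsidian.
async function checkLlamaEndpoint(endpoint = 'http://localhost:11434'): Promise<boolean> {
    try {
        const response = await fetch(`${endpoint}/api/generate`, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                model: 'llama3.1:latest',
                prompt: 'Reply with the single word: ok',
                stream: false
            })
        });
        return response.ok;
    } catch {
        return false;
    }
}

checkLlamaEndpoint().then(ok =>
    console.log(ok ? 'LLM endpoint is reachable' : 'LLM endpoint is unreachable'));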

Final Thoughts

With this little tweak, I was able to use on-device AI to improve an existing search capability in Obsidian, and it really worked. I'm thinking of adapting a similar solution for other tools that, unlike Obsidian, don't yet have GPT or AI-powered search. I'll be sure to share any findings with you soon.
