Daniel Rosehill


Tapping Large Language Models' Vast Potential in Knowledge Generation

Since the summer of 2024, I’ve been building a system for generating and organizing knowledge derived from AI tools. Many consider the idea of calling AI outputs "knowledge" paradoxical, but I take a different view. When my thinking diverges from mainstream opinion, I feel all the more compelled to share it and widen the range of perspectives on the topic.

Can AI-Derived Information Be Considered 'Knowledge'?

All information, whether human- or computer-generated, can be suspect. What one person regards as knowledge might seem like a conspiracy theory or mere opinion to someone else. We each curate our own collection of “knowledge” and hold personal versions of the truth. The claim that LLMs disrupt an objective canon of thought is therefore flawed, because it assumes such a canon existed in the first place.

LLMs have well-documented issues, such as hallucinations and limited context. However, discarding their utility outright is as shortsighted as abandoning libraries because some books contain inaccuracies or quitting search engines due to the prevalence of clickbait and misinformation.

While Flawed, LLM Outputs Are Far From Useless

LLMs, when properly used, are powerful tools for retrieving and synthesizing information. Although their ethical and cultural biases reflect human intervention and intentions, I believe the potential benefits of LLMs outweigh these drawbacks. LLMs offer a novel method for knowledge retrieval, which is crucial in our era of information saturation.


Why Capture and Organize LLM Outputs?

My current project aims to develop a robust system to capture and map LLM interactions, including prompts, outputs, model configurations, and context. I started this project because I found existing tools inadequate for my needs.
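To make the idea of a captured interaction concrete, here is a minimal sketch of what one record might look like. The class name, field names, and the `example-model` value are all assumptions made for illustration; the post does not describe the actual schema.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from uuid import uuid4


@dataclass
class Interaction:
    """One captured LLM exchange. Every field name here is hypothetical."""

    prompt: str                                   # what the user asked
    output: str                                   # what the model returned
    model: str                                    # model identifier
    config: dict = field(default_factory=dict)    # e.g. temperature
    context: str = ""                             # system prompt or pasted background
    id: str = field(default_factory=lambda: uuid4().hex)
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        # Serialize every field so the record can be stored as plain text.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)

    @classmethod
    def from_json(cls, raw: str) -> "Interaction":
        # Rebuild the record from its JSON form.
        return cls(**json.loads(raw))


record = Interaction(
    prompt="Summarize the trade-offs of graph databases.",
    output="Graph databases make relationship queries cheap ...",
    model="example-model",  # placeholder, not a real model name
    config={"temperature": 0.2},
)
restored = Interaction.from_json(record.to_json())
```

Keeping the prompt, output, configuration, and context as separate fields means each can be queried or exported on its own later.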


Why Are LLM Outputs Underappreciated by Vendors?

Despite the ease of digitally storing text, most mainstream LLM products lack comprehensive systems for organizing outputs. My motivations for creating my own system include:

  1. Using LLM outputs as a starting point for deeper research.
  2. Developing an AI-derived knowledge repository. Capturing prompts and outputs separately makes sense, as prompts reveal user intent and thinking, while outputs provide raw AI-generated data.

This approach has enabled me to build a large, AI-driven notepad/wiki, partially showcased at danielgoesprompting.com. However, as my collection grew, I encountered the challenges inherent in organizing vast amounts of AI-derived knowledge.

Secondary Benefits of Organized AI Data

Storing LLM outputs offers more than just record-keeping. It opens doors to advanced knowledge management features:

  • AI-assisted relationship mapping: Using AI to autonomously relate outputs can streamline knowledge discovery.
  • Topic clustering: This helps visualize the evolution of research over time.
  • Enhanced discovery tools: RAG systems can mine stored outputs for new insights.
  • Historical analysis: Tracking how LLM outputs evolve over time can reveal shifts in model capabilities and prompting strategies.
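As a rough illustration of relating stored outputs to one another, the sketch below links outputs whose keyword sets overlap. It uses plain Jaccard similarity as a deliberately simple stand-in for the AI-assisted mapping described above; the function names, stopword list, threshold, and sample texts are all assumptions.

```python
import re
from itertools import combinations

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "are", "on", "with"}


def keywords(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_pairs(outputs, threshold=0.25):
    """Return (id, id, score) for every pair of outputs at or above the threshold."""
    kw = {oid: keywords(text) for oid, text in outputs.items()}
    pairs = []
    for left, right in combinations(sorted(kw), 2):
        score = jaccard(kw[left], kw[right])
        if score >= threshold:
            pairs.append((left, right, round(score, 2)))
    return pairs


outputs = {
    "out-1": "Graph databases store relationships as edges between nodes",
    "out-2": "Edges between nodes make graph databases fast for relationships",
    "out-3": "Sourdough bread needs a long fermentation",
}
links = related_pairs(outputs)
```

Here only `out-1` and `out-2` are linked, since they share most of their vocabulary. Swapping the keyword sets for embeddings would follow the same pattern: score every pair, keep those above a threshold.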

ThoughtNet Stack - V1


Summer to September 2024


Components

  • PostgreSQL (Supabase)
  • ChatGPT web UI
  • React frontend (Laravel Nova)

Proof of concept achieved, but building the frontend was cumbersome and database performance suffered under complex relationships.

ThoughtNet Stack - V2


A restructured stack optimized for a graph database backend, designed for handling data relationships more intuitively.

Key Features

  • Efficient storage of LLM outputs directly in a graph database.
  • Markdown compatibility to leverage tools like Obsidian.
  • Recall of outputs and context snippets for follow-up prompts.
  • Topic trend visualization over time.
  • Interactive graph-based KM system for exploring relationships.
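The features listed above can be sketched with a tiny in-memory graph. This is not the V2 implementation and uses no real graph database; `OutputGraph` and its methods are hypothetical names, and the Markdown export assumes each note's title doubles as its file name so that Obsidian-style `[[wikilinks]]` resolve.

```python
from collections import defaultdict


class OutputGraph:
    """In-memory stand-in for a graph database (hypothetical API)."""

    def __init__(self):
        self.nodes = {}                 # node_id -> {"title": ..., "text": ...}
        self.edges = defaultdict(set)   # node_id -> ids of related nodes

    def add_output(self, node_id, title, text):
        self.nodes[node_id] = {"title": title, "text": text}

    def relate(self, a, b):
        # Store the relationship in both directions (undirected edge).
        self.edges[a].add(b)
        self.edges[b].add(a)

    def recall(self, node_id, max_chars=200):
        """Return trimmed snippets from directly related outputs,
        suitable for pasting into a follow-up prompt as context."""
        return [self.nodes[n]["text"][:max_chars] for n in sorted(self.edges[node_id])]

    def to_markdown(self, node_id):
        """Render one output as a Markdown note with wikilinks to its neighbours."""
        node = self.nodes[node_id]
        links = "\n".join(
            f"- [[{self.nodes[n]['title']}]]" for n in sorted(self.edges[node_id])
        )
        return f"# {node['title']}\n\n{node['text']}\n\n## Related\n\n{links}\n"


graph = OutputGraph()
graph.add_output("n1", "Graph databases", "Graph databases model relationships as edges.")
graph.add_output("n2", "Knowledge graphs", "Knowledge graphs connect entities and facts.")
graph.relate("n1", "n2")

snippets = graph.recall("n1")
note = graph.to_markdown("n1")
```

Because relationships are stored as edges rather than join tables, recalling context for a follow-up prompt is a single neighbour lookup.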



License

CC-BY-4.0 (Attribution 4.0 International)


Summary: This license allows sharing, adapting, and building upon this work, provided appropriate credit is given.


This blog highlights my exploration into leveraging AI for organized knowledge generation and management.
