Yusen Meng

Posted on Oct 29

Building a Multi-Agent Framework from Scratch with LlamaIndex

#multiagents #llamaindex #ai #llm

Tip: For a better reading experience, please visit the full article on GitHub.

Code: The complete source code for this tutorial is available in the GitHub repository.

Introduction

Multi-agent systems are like teams of specialized workers, each with their own expertise, working together to accomplish complex tasks. In this tutorial, we'll build a multi-agent system using LlamaIndex that generates high-quality Anki flashcards from any given text.

We'll start simple and gradually add more capabilities, similar to how you might build a team - starting with one person and gradually adding more specialists as needed.

Version 1: Basic Single-Agent System

In our first version, we'll create a simple system with just one agent that can generate Anki flashcards from text. Think of it as hiring your first employee who knows how to create educational content.

Architecture

graph LR
    Input[Input Text] -->|Process| QA[Q&A Generator]
    QA -->|Generate| Cards[Flashcards]

    style Input fill:#e1f5fe,stroke:#01579b
    style QA fill:#e8f5e9,stroke:#2e7d32
    style Cards fill:#fff3e0,stroke:#ef6c00

Core Components

Let's break down our code into manageable pieces:

Data Models: First, we define what our flashcards should look like using Pydantic:

class QACard(BaseModel):
    question: str
    answer: str
    extra: str

class Flashcard_model(BaseModel):
    cards: List[QACard]

LLM Setup: We configure LlamaIndex to use our preferred language model:

def get_shared_llm():
    """Returns a shared LLM instance for all agents."""
    return OpenAI(
        model="gpt-4o-mini", 
        temperature=0,
        api_base="http://127.0.0.1:4000/v1",
        api_key="sk-test"
    )

Agent Creation: Our Q&A Generator agent is specialized in creating flashcards:

def qa_generator_factory() -> OpenAIAgent:
    system_prompt = """
    You are an educational content creator specializing in Anki flashcard generation.
    Your task is to create clear, concise flashcards following these guidelines:

    1. Each card should focus on ONE specific concept
    2. Questions should be clear and unambiguous
    3. Answers should be concise but complete
    4. Include relevant extra information in the extra field
    5. Follow the minimum information principle

    Format each card as:
    <card>
        <question>Your question here</question>
        <answer>Your answer here</answer>
        <extra>Additional context, examples, or explanations</extra>
    </card>
    """

    return OpenAIAgent.from_tools(
        [],
        llm=get_shared_llm(),
        system_prompt=system_prompt,
    )

Response Handling: We use a transformer function to convert the agent's output into structured data:

@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))
def transformer(message: str) -> dict:
    chat_prompt_tmpl = ChatPromptTemplate(
        message_templates=[
            ChatMessage.from_str(message, role="user")
        ]
    )
    llm = get_shared_llm()
    structured_data = llm.structured_predict(Flashcard_model, chat_prompt_tmpl)
    return structured_data.model_dump()

Main Function

Here's how we put it all together:

def generate_anki_cards(input_text: str) -> dict:
    # Create the agent
    agent = qa_generator_factory()

    # Generate flashcards
    response = agent.chat(
        f"Generate Anki flashcards from the following text:\n\n{input_text}"
    )

    # Transform the response into structured data
    structured_response = transformer(str(response))
    return structured_response

Example Usage

Let's try it with a sample text about RSI (Relative Strength Index):

sample_text = """
The Relative Strength Index (RSI) is a momentum indicator used in technical analysis.
It measures the speed and magnitude of recent price changes to evaluate overbought
or oversold conditions. RSI is displayed as an oscillator on a scale from 0 to 100.
Readings above 70 generally indicate overbought conditions, while readings below 30
indicate oversold conditions.
"""

flashcards = generate_anki_cards(sample_text)
print(flashcards)

Key Features

Single Responsibility: Our agent focuses solely on creating flashcards
Data Validation: Pydantic ensures our data is properly structured
Error Handling: Retry mechanism helps handle temporary failures
Structured Output: Consistent XML format for easy parsing
Type Safety: Python type hints help catch errors early

Current Limitations

This basic version has some limitations we'll address in future versions:

No quality control (no one reviewing the cards)
Limited understanding of complex topics
No way to handle specialized content (like code examples)
No memory of previous messages
No way to improve cards based on feedback

In the next version, we'll add a second agent to review and improve the generated flashcards, making our system more robust.

Version 2: Two-Agent System

In this version, we'll add a second agent - the Reviewer - to improve the quality of our flashcards. Think of it like having a teacher who creates the flashcards (Q&A Generator) and an editor who reviews and improves them (Reviewer).

Architecture

graph LR
    Input[Input Text] -->|Generate| QA[Q&A Generator]
    QA -->|Review| Rev[Reviewer]
    Rev -->|Output| Cards[Final Flashcards]

    style Input fill:#e1f5fe,stroke:#01579b
    style QA fill:#e8f5e9,stroke:#2e7d32
    style Rev fill:#f3e5f5,stroke:#7b1fa2
    style Cards fill:#fff3e0,stroke:#ef6c00

New Components

Let's look at what we're adding to our system:

Agent Types: First, we define our agent types using an Enum:

class Speaker(str, Enum):
    QA_GENERATOR = "Q&A Generator"
    REVIEWER = "Reviewer"

Reviewer Agent: This agent specializes in improving flashcard quality:

def reviewer_factory() -> OpenAIAgent:
    system_prompt = """
    You are the Reviewer agent. Your task is to review and refine Anki flashcards,
    ensuring they follow these rules:

    1. Each card should test ONE piece of information
    2. Questions must be:
       - Simple and direct
       - Testing a single fact
       - Using cloze format when appropriate
    3. Answers must be:
       - Brief and precise
       - Limited to essential information
    4. Extra field must include:
       - Detailed explanations
       - Examples
       - Context
    """

    return OpenAIAgent.from_tools(
        [],
        llm=get_shared_llm(),
        system_prompt=system_prompt,
    )

State Management: We track the progress of our flashcard creation:

def get_initial_state(text: str) -> dict:
    return {
        "input_text": text,
        "qa_cards": "",
        "review_status": "pending"
    }

Main Function

Here's how we combine both agents:

def generate_anki_cards(input_text: str) -> dict:
    # Initialize state
    state = get_initial_state(input_text)

    # Step 1: Generate initial cards
    generator = qa_generator_factory()
    response = generator.chat(
        f"Generate Anki flashcards from the following text:\n\n{state['input_text']}"
    )
    state["qa_cards"] = str(response)

    # Step 2: Review and improve cards
    reviewer = reviewer_factory()
    review_response = reviewer.chat(
        f"Review and improve these flashcards:\n\n{state['qa_cards']}"
    )
    state["qa_cards"] = str(review_response)

    # Transform final cards into structured data
    return transformer(state["qa_cards"])

Example Usage

Let's see how it works with our RSI example:

sample_text = """
The Relative Strength Index (RSI) is a momentum indicator used in technical analysis.
It measures the speed and magnitude of recent price changes to evaluate overbought
or oversold conditions. RSI is displayed as an oscillator on a scale from 0 to 100.
Readings above 70 generally indicate overbought conditions, while readings below 30
indicate oversold conditions.
"""

flashcards = generate_anki_cards(sample_text)

Example Output

Here's what the process looks like:

Initial Generation:

<cards>
    <card>
        <question>What is RSI in technical analysis?</question>
        <answer>A momentum indicator that measures speed and magnitude of recent price changes</answer>
        <extra>Used to evaluate overbought/oversold conditions</extra>
    </card>
</cards>

After Review:

<cards>
    <card>
        <question>RSI (Relative Strength Index) is a _____ indicator in technical analysis.</question>
        <answer>momentum</answer>
        <extra>
        Key points about RSI:
        - Measures speed/magnitude of price changes
        - Scale: 0 to 100
        - Overbought: Above 70
        - Oversold: Below 30
        </extra>
    </card>
</cards>

Key Improvements

Quality Control: The Reviewer agent ensures better card quality
Better Format: Cloze deletions for more effective learning
Richer Context: More detailed extra field content
Progress Tracking: State management shows card status
Consistent Structure: XML format maintained throughout

Current Limitations

While better than Version 1, this system still has some limitations:

No way to analyze complex topics
Basic two-step process (could be more sophisticated)
No memory of previous messages
Limited error handling
No way to handle specialized content (like code)

In Version 3, we'll add more agents to handle these limitations and create a more robust system.

Version 3: Multi-Agent Orchestra

In this version, we enhance our system by enabling multiple agents to collaborate effectively. Each agent has a specific role, and they work together to create high-quality flashcards. The Orchestrator agent plays a crucial role in managing this collaboration, ensuring that each agent contributes at the right time. Additionally, the Topic Analyzer helps break down complex topics into simpler parts, providing a structured overview of the content.

How Multi-Agent Collaboration Works

In a multi-agent system, each agent is like a team member with a specific expertise. Here's how they collaborate:

Specialized Roles: Each agent is designed to perform a specific task. For example, the Q&A Generator creates flashcards, while the Reviewer ensures their quality.
Sequential Processing: Agents work in a sequence, with each one building on the output of the previous agent. This ensures that the final product is refined and comprehensive.
Dynamic Decision-Making: The Orchestrator agent decides which agent should work next based on the current state of the process. This allows the system to adapt to the needs of the content dynamically.
Shared Context: A memory system maintains context across interactions, allowing agents to make informed decisions based on previous outputs.

Introducing the Orchestrator Agent

The Orchestrator is the manager of our multi-agent system. It coordinates the workflow, ensuring that each agent contributes effectively. Here's how it works:

Orchestrator's Role

Decision-Making: The Orchestrator evaluates the current state and decides which agent should run next. It ensures that the workflow is efficient and that all necessary tasks are completed.
Flexibility: The Orchestrator can choose any agent at any time, allowing for a flexible and adaptive process. It can run agents multiple times if needed or skip steps if the content is already complete.
End Condition: The Orchestrator decides when the process is complete, ensuring that the final flashcards are comprehensive and high-quality.

Orchestrator Implementation

Here's how we implement the Orchestrator:

def orchestrator_factory(state: dict) -> OpenAIAgent:
    system_prompt = f"""
    You are the Orchestrator agent. Your task is to coordinate the interaction between all agents to create high-quality flashcards.

    Current State:
    {pformat(state, indent=2)}

    Available agents:
    * Topic Analyzer - Breaks down complex topics
    * Q&A Generator - Creates flashcards
    * Reviewer - Improves card quality

    Decision Guidelines:
    - Use Topic Analyzer first for topic breakdown
    - Use Q&A Generator to create cards
    - Use Reviewer to refine cards
    - Choose END when the cards are ready

    Output only the next agent to run ("Topic Analyzer", "Q&A Generator", "Reviewer", or "END")
    """
    return OpenAIAgent.from_tools(
        [],
        llm=get_shared_llm(),
        system_prompt=system_prompt,
    )

Introducing the Topic Analyzer Agent

The Topic Analyzer is responsible for breaking down complex topics into manageable parts. It provides a structured overview of the content, which helps in creating more focused and comprehensive flashcards.

Topic Analyzer's Role

Content Breakdown: The Topic Analyzer identifies key topics and subtopics from the input text, helping to structure the content.
Hierarchical Structuring: It creates a hierarchical structure of topics, making it easier to understand the relationships between different concepts.
Application Suggestions: The agent suggests real-world applications or case studies related to the topics, enriching the flashcards with practical insights.

Topic Analyzer Implementation

Here's how we implement the Topic Analyzer:

def topic_analyzer_factory(state: dict) -> OpenAIAgent:
    system_prompt = f"""
    You are the Topic Analyzer agent. Your task is to analyze the given text and identify key topics for flashcard creation.

    Current State:
    {pformat(state, indent=2)}

    Instructions:
    1. Identify main concepts and sub-concepts
    2. Create a structured list of topics
    3. Suggest real-world applications

    Output format:
    <topics>
        <topic>
            <name>Main topic name</name>
            <subtopics>
                <subtopic>Subtopic 1</subtopic>
                <subtopic>Subtopic 2</subtopic>
            </subtopics>
        </topic>
    </topics>
    """
    return OpenAIAgent.from_tools(
        [],
        llm=get_shared_llm(),
        system_prompt=system_prompt,
        verbose=True
    )

Main Function

The main function now includes the Orchestrator, which dynamically decides the workflow:

def generate_anki_cards(input_text: str) -> dict:
    # Initialize state and memory
    state = get_initial_state(input_text)
    memory = setup_memory()

    while True:
        # Get current chat history
        current_history = memory.get()

        # Let Orchestrator decide next step
        orchestrator = orchestrator_factory(state)
        next_agent = str(orchestrator.chat(
            "Decide which agent to run next based on the current state.",
            chat_history=current_history
        )).strip().strip('"').strip("'")
        print(f"\nOrchestrator selected: {next_agent}")

        if next_agent == "END":
            print("\nOrchestrator decided to end the process")
            break

        # Execute selected agent
        try:
            if next_agent == Speaker.TOPIC_ANALYZER.value:
                analyzer = topic_analyzer_factory(state)
                response = analyzer.chat(
                    f"Analyze this text for flashcard topics:\n\n{state['input_text']}",
                    chat_history=current_history
                )
                state["topics"] = str(response)
                print("\nTopic Analysis Results:")
                print(state["topics"])

            elif next_agent == Speaker.QA_GENERATOR.value:
                generator = qa_generator_factory()
                response = generator.chat(
                    f"Generate flashcards for this topic:\n\n{state['topics']}",
                    chat_history=current_history
                )
                state["qa_cards"] = str(response)
                state["review_status"] = "needs_review"
                print("\nGenerated Cards:")
                print(state["qa_cards"])

            elif next_agent == Speaker.REVIEWER.value:
                reviewer = reviewer_factory()
                response = reviewer.chat(
                    f"Review these flashcards:\n\n{state['qa_cards']}",
                    chat_history=current_history
                )
                state["qa_cards"] = str(response)
                state["review_status"] = "reviewed"
                print("\nReviewed Cards:")
                print(state["qa_cards"])

                # Update memory with new interaction
                memory.put(ChatMessage(role="assistant", content=str(response)))
                print(f"\nUpdated memory with {next_agent}'s response")

        except Exception as e:
            print(f"\nError in {next_agent}: {str(e)}")
            continue

    # Transform final cards into structured data
    final_cards = transformer(state["qa_cards"])
    return final_cards

Example Usage

Here's how the orchestrated system works with our RSI example:

if __name__ == "__main__":
    sample_text = """
    The Relative Strength Index (RSI) is a momentum indicator used in technical analysis.
    It measures the speed and magnitude of recent price changes to evaluate overbought
    or oversold conditions. RSI is displayed as an oscillator on a scale from 0 to 100.
    Readings above 70 generally indicate overbought conditions, while readings below 30
    indicate oversold conditions. The RSI can also help identify divergences, trend lines,
    and failure swings that may not be apparent on the underlying price chart.
    """

    flashcards = generate_anki_cards(sample_text)
    print("\nGenerated, Analyzed, and Reviewed Flashcards:")
    print(flashcards)

Key Improvements

Coordinated Workflow: The Orchestrator manages agent interactions, ensuring a smooth and efficient process.
Topic Analysis: The Topic Analyzer provides a deeper understanding of content structure, leading to more comprehensive flashcards.
Persistent Memory: Context is maintained across interactions, allowing for more informed decision-making.
Flexible Execution: The Orchestrator dynamically selects agents based on the current state, adapting to the needs of the content.
Hierarchical Processing: Topics are broken down into manageable chunks, making it easier to create detailed and accurate flashcards.

Limitations

While this version is more sophisticated, it still has some areas for improvement:

No specialized handling of code examples
Limited formatting options
Basic error handling

In Version 4, we'll introduce the Code and Extra Field Expert agent and a Formatter agent to handle code examples and ensure consistent formatting. We'll also improve error handling and add support for different content types.

Version 4: Enhanced Capabilities

In this version, we introduce two new agents: the Code and Extra Field Expert and the Formatter. These agents handle code examples and ensure consistent formatting, respectively. We also improve error handling and add support for different content types.

Architecture

graph TB
    Input[Input Text] -->|Start| Orch[Orchestrator]

    Orch -->|Analyze| TA[Topic Analyzer]
    Orch -->|Process Code| Code[Code Expert]
    Orch -->|Generate| QA[Q&A Generator]
    Orch -->|Format| Format[Formatter]
    Orch -->|Review| Rev[Reviewer]

    TA -->|Topics| Decision{Continue?}
    Code -->|Results| Decision
    QA -->|Results| Decision
    Format -->|Results| Decision
    Rev -->|Results| Decision

    Decision -->|Yes| Orch
    Decision -->|No| Final[Final Flashcards]

    style Input fill:#e1f5fe,stroke:#01579b
    style Orch fill:#fff3e0,stroke:#ef6c00
    style TA fill:#e8f5e9,stroke:#2e7d32
    style Code fill:#e8f5e9,stroke:#2e7d32
    style QA fill:#bbdefb,stroke:#1976d2
    style Format fill:#bbdefb,stroke:#1976d2
    style Rev fill:#f3e5f5,stroke:#7b1fa2
    style Final fill:#fff3e0,stroke:#ef6c00
    style Decision fill:#ffecb3,stroke:#ffa000

New Components

1. Updated Agent Types

We expand our agent types to include the new roles:

class Speaker(str, Enum):
    QA_GENERATOR = "Q&A Generator"
    REVIEWER = "Reviewer"
    TOPIC_ANALYZER = "Topic Analyzer"
    ORCHESTRATOR = "Orchestrator"
    CODE_AND_EXTRA_FIELD_EXPERT = "Code and Extra Field Expert"
    FORMATTER = "Formatter"

2. Code and Extra Field Expert

This agent enhances flashcards with code examples and detailed explanations:

def code_and_extra_field_expert_factory() -> OpenAIAgent:
    system_prompt = """
    You are the Code and Extra Field Expert agent. Your task is to enhance Anki flashcards
    by adding relevant code snippets and comprehensive extra content.

    Instructions:
    1. Add clear, concise code examples that illustrate key concepts
    2. Ensure code snippets are well-commented and easy to understand
    3. In the extra field, provide:
       - Step-by-step explanations of code snippets 
       - Common use cases and scenarios
       - Potential pitfalls and edge cases
       - Best practices and optimization tips
    4. Use appropriate markdown formatting for code blocks
    5. Include relevant documentation links
    6. Ensure explanations are clear for a 15-year-old
    """
    return OpenAIAgent.from_tools(
        [],
        llm=get_shared_llm(),
        system_prompt=system_prompt,
    )

3. Formatter Agent

This agent ensures that the flashcards are properly formatted:

def formatter_agent_factory() -> OpenAIAgent:
    system_prompt = """
    You are the Formatter agent. Your task is to ensure proper XML structure and markdown
    formatting in the flashcards.

    Formatting Rules:
    1. Maintain valid XML structure
    2. Properly escape special characters
    3. Format code blocks with appropriate language tags
    4. Use consistent indentation
    5. Ensure markdown compatibility
    6. Preserve code snippets exactly as provided
    7. Handle nested structures properly
    """
    return OpenAIAgent.from_tools(
        [],
        llm=get_shared_llm(),
        system_prompt=system_prompt,
    )

Enhanced Error Handling and Validation

We improve error handling with retries and validation:

@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))
def validate_and_transform(message: str) -> dict:
    try:
        # Transform to structured data
        chat_prompt_tmpl = ChatPromptTemplate(
            message_templates=[
                ChatMessage.from_str(message, role="user")
            ]
        )
        structured_data = get_shared_llm().structured_predict(
            Flashcard_model, 
            chat_prompt_tmpl
        )
        return structured_data.model_dump()

    except ET.ParseError as e:
        print(f"XML validation error: {str(e)}")
        raise TryAgain
    except Exception as e:
        print(f"Transformation error: {str(e)}")
        raise TryAgain

Example Usage

Here's how the enhanced system works with a sample text that includes code:

if __name__ == "__main__":
    sample_text = """
    To calculate the RSI (Relative Strength Index) in Python, you typically use 
    technical analysis libraries like pandas-ta or the ta library. The RSI is 
    calculated using the average gains and losses over a specified period 
    (usually 14 periods). Here's how you can implement it:

    1. Using the ta library:
    ```

python
    import pandas as pd
    import ta

    # Assuming you have price data in a DataFrame
    df['RSI'] = ta.momentum.RSIIndicator(
        close=df['close'],
        window=14
    ).rsi()


    ```

    2. Manual implementation:
    ```

python
    def calculate_rsi(data, periods=14):
        delta = data.diff()
        gain = (delta.where(delta > 0, 0)).rolling(
            window=periods).mean()
        loss = (-delta.where(delta < 0, 0)).rolling(
            window=periods).mean()
        rs = gain / loss
        return 100 - (100 / (1 + rs))


    ```
    """

    flashcards = generate_anki_cards(sample_text)
    print("Generated Flashcards with Code Examples:")
    print(flashcards)

Key Improvements

Code Handling: Specialized agent for code examples and explanations
Consistent Formatting: Dedicated formatter agent
Robust Error Handling: Improved validation and retries
Rich Extra Content: Comprehensive explanations and best practices
Markdown Support: Proper formatting of code blocks and text

Conclusion

Throughout this tutorial, we've built a sophisticated multi-agent system for generating Anki flashcards, progressing from a simple single-agent implementation to a complex orchestrated system. Let's recap the key developments across each version:

Evolution of the System

Version 1: Basic Single-Agent
- Established core data structures
- Implemented basic flashcard generation
- Set up error handling foundation
Version 2: Two-Agent Interaction
- Added quality control through review process
- Introduced state management
- Implemented iterative refinement
Version 3: Multi-Agent Orchestra
- Added orchestration layer
- Implemented topic analysis
- Enhanced context management through memory system
Version 4: Enhanced Capabilities
- Added specialized code handling
- Implemented consistent formatting
- Enhanced error handling and validation

Key Takeaways

Modular Design
- Each agent has a specific responsibility
- Easy to add new agents or modify existing ones
- Clear separation of concerns
Robust Architecture
- State management across agents
- Error handling and retry mechanisms
- Memory system for context preservation
Quality Control
- Multiple layers of review and refinement
- Consistent formatting and structure
- Rich content with code examples and explanations

Future Enhancements

While our system is already quite capable, there are several potential areas for future improvement:

Parallel Processing
- Implement concurrent card generation
- Add batch processing capabilities
- Optimize for large-scale operations
Advanced Features
- Image handling and processing
- Multi-language support
- Audio content integration
- Interactive examples
Learning Capabilities
- Feedback incorporation
- Quality metrics tracking
- Adaptive prompt refinement
Integration Options
- API endpoints
- Web interface
- Direct Anki integration
- Version control for cards

Final Thoughts

Building a multi-agent system requires careful consideration of agent interactions, state management, and error handling. This tutorial has demonstrated how to incrementally build such a system, starting from basic components and progressively adding sophistication.

The final system showcases the power of combining multiple specialized agents, each with its own expertise, to create a robust and flexible solution. While the focus has been on Anki flashcard generation, the patterns and principles demonstrated here can be applied to many other multi-agent applications.

Remember that the key to successful multi-agent systems lies in:

Clear agent responsibilities
Robust communication patterns
Comprehensive error handling
Flexible state management
Consistent output formatting

By following these principles and the patterns demonstrated in this tutorial, you can build your own multi-agent systems for various applications beyond flashcard generation.

Introduction

Version 1: Basic Single-Agent System

Architecture

Core Components

Main Function

Example Usage

Key Features

Current Limitations

Version 2: Two-Agent System

Architecture

New Components

Main Function

Example Usage

Example Output

Key Improvements

Current Limitations

Version 3: Multi-Agent Orchestra

How Multi-Agent Collaboration Works

Introducing the Orchestrator Agent

Orchestrator's Role

Orchestrator Implementation

Introducing the Topic Analyzer Agent

Topic Analyzer's Role

Topic Analyzer Implementation

Main Function

Example Usage

Key Improvements

Limitations

Version 4: Enhanced Capabilities

Architecture

New Components

1. Updated Agent Types

2. Code and Extra Field Expert

3. Formatter Agent

Enhanced Error Handling and Validation

Example Usage

Key Improvements

Conclusion

Evolution of the System

Key Takeaways

Future Enhancements

Final Thoughts

Read next

OpenAI o3 - Thinking Fast and Slow

First step and troubleshooting Docling — RAG with LlamaIndex on my CPU laptop

Salesforce vs. HubSpot: Which CRM is Right for Your Team?

Building a Local AI Code Reviewer with ClientAI and Ollama