Originally published at binyam.io

Cost-Tracking and Model-Spend Monitoring with LiteLLM

As AI models become more powerful and widely used, managing costs is crucial—especially when working with multiple LLM providers like OpenAI, Anthropic, or Mistral. Without proper tracking, expenses can spiral out of control.

Enter LiteLLM, a lightweight library that standardizes interactions with various LLM APIs while offering built-in cost-tracking features. In this post, we'll explore how to implement cost monitoring and spend analytics to keep your AI budget in check.


Why Track LLM Costs?

LLM providers charge based on:

  • Tokens processed (input + output)
  • Model choice (GPT-4 Turbo vs. Claude Haiku)
  • API usage frequency

Without monitoring, you might:

  • Accidentally exceed budgets with high-volume requests.
  • Waste money on overpriced models for simple tasks.
  • Lack visibility into which projects or users consume the most resources.

Step 1: Setting Up LiteLLM for Cost-Tracking

LiteLLM provides a unified interface for multiple LLM providers and tracks token usage and cost for every call.

Installation

pip install litellm

Basic Usage with Cost Tracking

from litellm import completion, completion_cost
import os

# Set API keys (e.g., OpenAI)
os.environ["OPENAI_API_KEY"] = "your-api-key"

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain AI in 1 sentence."}],
)

print(f"Response: {response.choices[0].message.content}")
print(f"Cost: ${completion_cost(completion_response=response):.4f}")  # LiteLLM calculates the cost for you

Output

Response: AI is the simulation of human intelligence processes by machines.
Cost: $0.0001
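
The response object also exposes OpenAI-compatible token counts, which are handy for sanity-checking the computed cost:

usage = response.usage  # OpenAI-style usage block
print(f"Tokens: {usage.prompt_tokens} in, {usage.completion_tokens} out, {usage.total_tokens} total")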

Step 2: Monitoring Spend Across Teams & Projects

LiteLLM's callback hooks can ship every request's token counts, cost, and timestamps to a store of your choice (a local SQLite file, BigQuery, or Prometheus via the LiteLLM proxy) for deeper analysis.

Logging to SQLite

A minimal sketch using LiteLLM's custom success callbacks; the llm_logs table and its schema are our own, not something LiteLLM ships:

import sqlite3
from datetime import datetime, timezone

import litellm
from litellm import completion

# A simple log table: model, cost, timestamp
conn = sqlite3.connect("./llm_spend.db", check_same_thread=False)
conn.execute("CREATE TABLE IF NOT EXISTS llm_logs (model TEXT, cost REAL, timestamp TEXT)")

def log_to_sqlite(kwargs, completion_response, start_time, end_time):
    cost = kwargs.get("response_cost", 0)  # LiteLLM passes the computed cost here
    conn.execute(
        "INSERT INTO llm_logs VALUES (?, ?, ?)",
        (kwargs.get("model"), cost, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()

litellm.success_callback = [log_to_sqlite]  # runs after every successful completion

response = completion(
    model="gpt-4",
    messages=[{"content": "Write a Python function for Fibonacci.", "role": "user"}],
)

Now, query your database:

SELECT model, SUM(cost) as total_cost 
FROM llm_logs 
GROUP BY model;

Example Output

Model            Total Cost
gpt-3.5-turbo    $12.45
claude-3-haiku   $3.20
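
Because the callback sketch above also stores a timestamp, tracking spend over time is one query away:

SELECT DATE(timestamp) AS day, ROUND(SUM(cost), 4) AS daily_cost
FROM llm_logs
GROUP BY day
ORDER BY day;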

Step 3: Setting Budget Alerts

Prevent overspending by adding hard limits or Slack alerts.

Hard Budget Limit

This follows LiteLLM's documented BudgetManager pattern: check the remaining budget before each call and record the cost after it.

from litellm import BudgetManager, completion

budget_manager = BudgetManager(project_name="marketing-campaign")

user = "content-team"
if not budget_manager.is_valid_user(user):
    budget_manager.create_budget(total_budget=100, user=user)  # $100 cap

# Only call the model if there is budget left
if budget_manager.get_current_cost(user=user) <= budget_manager.get_total_budget(user):
    response = completion(
        model="gpt-4",
        messages=[{"role": "user", "content": "Generate 10 blog ideas"}],
    )
    budget_manager.update_cost(completion_obj=response, user=user)
else:
    print("Budget exceeded: no spend left for this user")

Slack Alerts

The LiteLLM proxy has built-in Slack alerting; when using the Python SDK directly, a simple approach is to post to a Slack incoming webhook yourself once spend crosses a threshold. A sketch reusing the budget_manager from above:

import requests

spent = budget_manager.get_current_cost(user=user)
total = budget_manager.get_total_budget(user)
if spent >= 0.9 * total:  # alert at 90% of the budget
    requests.post(
        "your-slack-webhook",
        json={"text": f"Warning: 'marketing-campaign' has spent {spent / total:.0%} of its budget!"},
    )

Step 4: Optimizing Costs

Once you track spending, optimize with:

  1. Model Switching: Use cheaper models (e.g., Haiku for simple tasks).
  2. Caching: Cache frequent queries, in memory or with Redis (see the sketch after this list).
  3. Batching: Combine multiple requests into one.
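
LiteLLM ships a built-in response cache. A minimal sketch, using the in-memory default; the import path can vary between versions, and you can pass type="redis" plus connection details for Redis:

import litellm
from litellm import completion
from litellm.caching import Cache  # path may differ in newer releases

litellm.cache = Cache()  # in-memory by default

messages = [{"role": "user", "content": "Explain AI in 1 sentence."}]
first = completion(model="gpt-3.5-turbo", messages=messages, caching=True)
second = completion(model="gpt-3.5-turbo", messages=messages, caching=True)  # cache hit, costs nothing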

Example: Fallback to Cheaper Model

Recent LiteLLM versions accept a fallbacks list in completion: if the primary model call fails, the listed models are tried in order (worth verifying against your version's reliability docs):

response = completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    fallbacks=["gpt-3.5-turbo"],  # tried in order if the gpt-4 call fails
)
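
For production setups that need load balancing plus fallbacks, LiteLLM's Router covers both. A minimal sketch; the "smart" and "cheap" group names are our own:

from litellm import Router

router = Router(
    model_list=[
        {"model_name": "smart", "litellm_params": {"model": "gpt-4"}},
        {"model_name": "cheap", "litellm_params": {"model": "gpt-3.5-turbo"}},
    ],
    fallbacks=[{"smart": ["cheap"]}],  # if "smart" fails, retry on "cheap"
)

response = router.completion(
    model="smart",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
)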

Conclusion

With LiteLLM, you can:
✅ Track costs in real time across providers.
✅ Log spending per team/project.
✅ Set budget limits and alerts.
✅ Optimize model usage for cost efficiency.

Start implementing today, and never get blindsided by an unexpected AI bill again!

What's your biggest cost challenge with LLMs? Let's discuss in the comments! 🚀

