
Jannik Maierhoefer

Langfuse Launch Week #2

Langfuse Launch Week Header Image

Langfuse, the open-source LLM engineering platform, is excited to announce its second Launch Week, starting on Monday, November 18, 2024. The week features daily platform updates, a Virtual Town Hall on Wednesday, and a Product Hunt launch on Friday to close it out.


Focus of Launch Week

Langfuse's second Launch Week is all about supporting the next generation of AI models and integrating the platform more deeply into developer workflows. The updates aim to deliver end-to-end prompt engineering tools specifically designed for product teams, enhancing the robustness and versatility of AI applications.


πŸ”» Day 0: Prompt Management for Vercel AI SDK

On the first day, Langfuse introduced native integration of its Prompt Management with the Vercel AI SDK. This integration enables developers to:

  • Version and release prompts directly in Langfuse.
  • Utilize prompts via the Vercel AI SDK.
  • Seamlessly monitor metrics like latency, costs, and usage.

This update answers critical questions for developers:

  • Which prompt version caused a specific bug?
  • What’s the cost and latency impact of each prompt version?
  • Which prompt versions are most used?

πŸ†š Day 1: Dataset Experiment Run Comparison View

The second day brought a new comparison view for dataset experiment runs within Langfuse Datasets. This powerful feature allows teams to:

  • Analyze multiple experiment runs side-by-side.
  • Compare application performance across test dataset experiments.
  • Explore metrics like latency and costs.
  • Drill down into individual dataset items.

This enhancement is particularly valuable for testing different prompts, models, or application configurations, making it a must-have tool for teams working on AI-powered products.
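
For context, experiment runs are created by executing your application over a Langfuse dataset and linking each resulting trace to a named run; the comparison view then lines those runs up side-by-side. A rough sketch with the Langfuse JS SDK, where the dataset name, run names, and the `runMyApp` helper are placeholders:

```typescript
import { Langfuse } from "langfuse";

const langfuse = new Langfuse();

// Placeholder for the application (prompt + model + logic) under test.
async function runMyApp(input: unknown): Promise<string> {
  return `answer for ${JSON.stringify(input)}`;
}

async function runExperiment(runName: string) {
  const dataset = await langfuse.getDataset("qa-test-set"); // placeholder dataset

  for (const item of dataset.items) {
    // Trace one execution of the app on this dataset item.
    const trace = langfuse.trace({ name: "experiment", input: item.input });
    const output = await runMyApp(item.input);
    trace.update({ output });

    // Link the trace to the named experiment run so it appears in the
    // comparison view next to other runs.
    await item.link(trace, runName, {
      description: "placeholder run description",
    });
  }

  await langfuse.flushAsync();
}

// Run the same dataset with two configurations, then compare them in the UI.
runExperiment("prompt-v1-baseline")
  .then(() => runExperiment("prompt-v2-candidate"))
  .catch(console.error);
```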


βš–οΈ Day 2: LLM-as-a-Judge Evaluations for Datasets

Day 2 of Launch Week 2 brought managed LLM-as-a-judge evaluators to dataset experiments. Assign evaluators to your datasets and they will automatically run on new experiment runs, scoring the outputs against your evaluation criteria.

You can run any LLM-as-a-judge prompt, and Langfuse ships with templates for the following evaluation criteria: Hallucination, Helpfulness, Relevance, Toxicity, Correctness, Context Relevance, Context Correctness, and Conciseness.

Langfuse LLM-as-a-judge works with any LLM that supports tool/function calling and is accessible via one of the following APIs: OpenAI, Azure OpenAI, Anthropic, or AWS Bedrock. Through LLM gateways such as LiteLLM, virtually any popular model can be used via the OpenAI connector.
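
The managed evaluators are configured in the Langfuse UI rather than in code, but conceptually each one asks a judge model for a structured verdict, which is why tool/function calling is required. Below is a hand-rolled sketch of that idea using the Vercel AI SDK's `generateObject`; the hallucination criterion, schema, and model choice are illustrative assumptions, not Langfuse internals:

```typescript
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Schema the judge model must fill in; structured output relies on the
// model's tool/function-calling support under the hood.
const judgement = z.object({
  score: z.number().min(0).max(1).describe("1 = fully grounded, 0 = hallucinated"),
  reasoning: z.string().describe("Short justification for the score"),
});

export async function judgeHallucination(context: string, answer: string) {
  const { object } = await generateObject({
    model: openai("gpt-4o-mini"), // any tool-calling model would work here
    schema: judgement,
    prompt: [
      "You are an evaluator. Judge whether the answer is grounded in the context.",
      `Context:\n${context}`,
      `Answer:\n${answer}`,
    ].join("\n\n"),
  });
  return object; // { score, reasoning }
}
```

In the managed setup, Langfuse runs this kind of check automatically for every new experiment run and attaches the resulting score to the corresponding trace.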


Upcoming Events

πŸ“† Virtual Town Hall

Join Langfuse for a live Virtual Town Hall on Wednesday, November 20, 2024, at 10 am PT / 7 pm CET. This session will include:

  • Live demonstrations of the new features.
  • Insights into integrating these updates into workflows.
  • A sneak peek into the future of Langfuse, including the upcoming V3 release.

πŸ…ΏοΈ Product Hunt Launch

Langfuse will make its third appearance on Product Hunt on Friday, November 22, 2024, showcasing the highlights of Launch Week and engaging with the tech community.


Stay Updated

Stay connected with Langfuse during Launch Week:

  • 🌟 Star the project on GitHub to show your support.
  • Follow Langfuse on Twitter and LinkedIn for updates.
  • Subscribe to the Langfuse mailing list to receive daily updates throughout the week.

Learn more: Langfuse Blog
