Ragrank 🎯
Feel free to contribute on GitHub 💚
The story behind Ragrank: Recently, I was building an LLM application using Retrieval Augmented Generation (RAG). After pushing it to production, I received feedback that the chatbot's responses were sometimes terrible and did not make sense for the questions asked. From that point onwards, I wanted to create a RAG or LLM testing platform which is:
- Open Source 💚
- Simple to use
- Multi-platform supported
- Integrated with all LLM tools.
That's how Ragrank 🎯 was born (it is not yet an alternative to existing LLM testing tools).
When I explored the tools for evaluating my RAG application, most of them were complicated to get started with. Their metrics were also defined by very complex equations, which made it difficult to understand why those metrics were being used.
Moreover, those tools were limited to a single platform. Some were merely Python libraries while others were just UI websites. They were disjointed and difficult to integrate.
So, I decided to build a user-centric ecosystem. As a first step, I have created an open-source Python library that evaluates RAG applications with simple code, serving as a foundation for the ecosystem.
My vision is not just a Python library 😄. Eventually, we will have a full-fledged website that seamlessly integrates with the library and can track evaluations. Additionally, there will be a JavaScript library that provides support for JS and TypeScript LLM applications.
As of now, the library offers the following.
Features 🔥
- Includes 4 predefined metrics for evaluating the response and context (experimental).
- Allows creation of custom metrics.
- Supports using any LangChain LLM internally for evaluation.
- Provides visualization of evaluation results.
- Ingests data from multiple sources.
- (more features coming soon)
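To make the idea of a custom metric concrete, here is a minimal sketch in plain Python. It does not use Ragrank's actual API: the `Sample` class, the `context_overlap` metric, and the `evaluate` helper are all hypothetical names made up for illustration. The metric simply measures what fraction of the response's words also appear in the retrieved contexts, as a rough proxy for how grounded the response is.

```python
import re
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One RAG interaction (hypothetical structure, not Ragrank's data model)."""
    question: str
    contexts: list[str]
    response: str


def _tokens(text: str) -> set[str]:
    # Lowercase the text and split it into a set of alphanumeric words.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def context_overlap(sample: Sample) -> float:
    """Hypothetical custom metric: share of response words found in the contexts."""
    response_tokens = _tokens(sample.response)
    if not response_tokens:
        return 0.0  # An empty response cannot be grounded in anything.
    context_tokens = _tokens(" ".join(sample.contexts))
    return len(response_tokens & context_tokens) / len(response_tokens)


def evaluate(samples, metric):
    """Apply a metric to every sample and report per-sample scores and their mean."""
    scores = [metric(s) for s in samples]
    return {"scores": scores, "mean": mean(scores) if scores else 0.0}


samples = [
    Sample("What is the capital of France?",
           ["Paris is the capital of France."],
           "Paris is the capital."),
    Sample("What is the capital of France?",
           ["Paris is the capital of France."],
           "Berlin"),
]
result = evaluate(samples, context_overlap)
print(result)
```

A word-overlap score like this is deliberately naive. It is only meant to show the shape of a metric (a function from one sample to a score) and how per-sample scores roll up into a single number.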
Features planned for the near future, for which I need your help 💫:
- More evaluation metrics
- Integration with popular LLM tools (LangChain, LlamaIndex)
- Website for tracking evaluations
- JavaScript library for evaluation