
Jeffrey Ip for Confident AI


🔪 6 Killer Open-Source Libraries to Achieve AI Mastery in 2024 🔥🪄

TL;DR

AI has traditionally been a very difficult field for web developers to break into... until now 😌 With the introduction of large language models (LLMs) like ChatGPT, it seems like nowadays anyone can become an AI engineer. But make no mistake, this could not be further from the truth.

In this article, I will reveal the current top AI libraries that make a mediocre AI engineer exceptional. As an ex-Google, ex-Microsoft AI engineer myself, I will show you how exceptional AI engineers use these libraries to build great applications.

Are you ready to up-skill yourself and be one step closer to becoming an AI wizard before 2024? Let's begin 🤗


1. DeepEval - Open-source Evaluation Infrastructure for LLMs


A good engineer can build, but an exceptional engineer can communicate the value of what they've built. DeepEval allows you to do exactly that.

DeepEval allows you to unit test and debug your large language model (LLM, or just AI) applications at scale in both development and production in under 10 lines of code.

Why is this valuable, you ask? Because companies nowadays want to be seen as innovative AI companies, so stakeholders prefer engineers who can not only build like an indie hacker, but also ship reliable AI applications like a seasoned AI specialist.

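from deepeval import assert_test
from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric

# Assumption: your own chatbot module exposes a chatbot(prompt) -> str function
from chatbot import chatbot

def test_chatbot():
    # Named "prompt" rather than "input" to avoid shadowing the built-in
    prompt = "How to become an AI engineer in 2024?"
    test_case = LLMTestCase(input=prompt, actual_output=chatbot(prompt))
    answer_relevancy_metric = AnswerRelevancyMetric()
    # Fails the test if the answer is not relevant enough to the prompt
    assert_test(test_case, [answer_relevancy_metric])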
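To make the idea of metric-based unit testing concrete, here is a minimal library-free sketch: score an output with a metric, then assert that the score clears a threshold. The keyword-overlap metric and the names `keyword_overlap_score` and `assert_relevant` are hypothetical stand-ins chosen for illustration. This is not how DeepEval's AnswerRelevancyMetric computes its score.

```python
def keyword_overlap_score(question, answer):
    """Fraction of the question's words that also appear in the answer."""
    question_words = {w.strip("?.,!").lower() for w in question.split()}
    answer_words = {w.strip("?.,!").lower() for w in answer.split()}
    question_words.discard("")
    if not question_words:
        return 0.0
    return len(question_words & answer_words) / len(question_words)


def assert_relevant(question, answer, threshold=0.5):
    """Raise AssertionError when the toy score falls below the threshold."""
    score = keyword_overlap_score(question, answer)
    assert score >= threshold, f"score {score:.2f} below threshold {threshold}"
    return score


question = "How to become an AI engineer?"
good_answer = "To become an AI engineer, learn how models are built and evaluated."
score = assert_relevant(question, good_answer)
off_topic_score = keyword_overlap_score(question, "Bananas are yellow.")
```

A real metric would be far more robust than word overlap, but the shape of the test stays the same: a test case, a score, and a threshold.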

🌟 Star DeepEval on GitHub


2. Unstructured - Pre-processing for Unstructured Data

LLMs thrive because they are versatile and can handle a large variety of inputs, but not all of them. Unstructured helps you easily transform unstructured data like webpages, PDFs, and tables into formats that LLMs can read.

What does this mean? It means you can now customize your AI application with your internal documents. Unstructured is amazing because, in my opinion, it operates at the right level of abstraction - it handles the boring hard work while giving you enough control as a developer.

from unstructured.partition.auto import partition

elements = partition(filename="example-docs/eml/fake-email.eml")
print("\n\n".join([str(el) for el in elements]))
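Once a document has been partitioned, a common next step is to group the pieces into sections before handing them to an LLM. The sketch below assumes each element exposes a `.category` string (such as "Title" or "NarrativeText") and a `.text` string. The helper `group_by_title` and the sample elements are hypothetical, and `SimpleNamespace` objects stand in for real elements so the example runs without the library installed.

```python
from types import SimpleNamespace


def group_by_title(elements):
    """Group element text under the most recent Title element."""
    sections = []
    current = {"title": None, "body": []}
    for el in elements:
        if el.category == "Title":
            # Close the previous section before starting a new one
            if current["title"] is not None or current["body"]:
                sections.append(current)
            current = {"title": el.text, "body": []}
        else:
            current["body"].append(el.text)
    if current["title"] is not None or current["body"]:
        sections.append(current)
    return sections


# Stand-ins for partitioned elements (hypothetical sample data)
elements = [
    SimpleNamespace(category="Title", text="Quarterly Report"),
    SimpleNamespace(category="NarrativeText", text="Revenue grew this quarter."),
    SimpleNamespace(category="Title", text="Outlook"),
    SimpleNamespace(category="NarrativeText", text="We expect steady growth."),
]
sections = group_by_title(elements)
```

Each resulting section can then be sent to a model as one self-contained chunk of context.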

🌟 Star Unstructured


3. Airbyte - Data Integration for LLMs


Airbyte lets you connect data sources and move data around, which covers most of what you need to build a real-time AI application. It allows your LLMs to be connected to information outside of the data they were trained on.

Like Unstructured, Airbyte provides a great level of abstraction over the work an AI engineer does.

🌟 Star Airbyte


4. Qdrant - Fast Vector Search for LLMs

Ever wondered what happens if you feed in too much data to ChatGPT? That's right, you'll encounter a context overflow error.

That's because LLMs cannot take in infinite information. To help with that, we need a way to feed in only the relevant information. This process is known as retrieval augmented generation (RAG). Here's another great article on what RAG is.

Qdrant is a vector database that helps you do just that. It stores and retrieves relevant information at blazing-fast speed, ensuring your application stays up to date with the real world.
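The retrieval step behind RAG can be sketched in a few lines of plain Python: compare a query vector against stored document vectors and keep only the closest matches. This is a toy in-memory illustration, not Qdrant's API. The 3-dimensional "embeddings", the sample documents, and the helper names `cosine_similarity` and `top_k` are all made up for the example. A real application would get its vectors from an embedding model and store them in a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, documents, k=2):
    """Return the texts of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical documents with hand-written toy vectors
documents = [
    ("Qdrant is a vector database.", [0.9, 0.1, 0.0]),
    ("Bananas are rich in potassium.", [0.0, 0.2, 0.9]),
    ("RAG retrieves context before generation.", [0.8, 0.3, 0.1]),
]
query_vector = [1.0, 0.2, 0.0]

# Only the most relevant documents make it into the prompt
context = top_k(query_vector, documents, k=2)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

A linear scan like this works for a handful of documents. Vector databases exist because this comparison has to stay fast across millions of vectors.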

🌟 Star Qdrant


5. MemGPT - Memory Management for LLMs

So Qdrant helps give LLMs "long-term memory", but what happens if there's too much to "remember"? MemGPT helps you manage memory for this exact use case.

MemGPT is like a cache for vector databases, with its own proprietary way of clearing caches. It helps you manage redundant information in your knowledge bases, making your AI application more performant and accurate.
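As a rough illustration of managing memory when there is too much to remember, the sketch below keeps a small active context and moves the oldest messages into an archive that can still be searched. The `BoundedMemory` class is entirely hypothetical and is not MemGPT's API or algorithm. It only shows the general idea of evicting from a bounded context into external storage.

```python
from collections import deque


class BoundedMemory:
    """Keep the newest messages in context and archive the overflow."""

    def __init__(self, max_messages):
        self.max_messages = max_messages
        self.context = deque()  # what the model would see
        self.archive = []       # evicted messages, still searchable

    def add(self, message):
        self.context.append(message)
        # Evict the oldest messages once the context is over budget
        while len(self.context) > self.max_messages:
            self.archive.append(self.context.popleft())

    def recall(self, keyword):
        """Find archived messages that mention the keyword."""
        return [m for m in self.archive if keyword.lower() in m.lower()]


memory = BoundedMemory(max_messages=2)
for message in ["My name is Ada.", "I like Python.", "What is RAG?"]:
    memory.add(message)

recalled = memory.recall("name")
```

Here the first message has left the active context, yet a keyword lookup can still bring it back when it becomes relevant again.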

🌟 Star MemGPT


6. LiteLLM - LLM proxy

LiteLLM is a proxy for multiple LLMs. It is great for experimentation and, combined with DeepEval, allows you to pick the best model for your use case. The best part? It allows you to use any model it supports through the same OpenAI interface.

from litellm import completion
import os

## set ENV variables 
os.environ["OPENAI_API_KEY"] = "your-openai-key" 

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
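Because every model sits behind the same call signature, comparing models comes down to a loop. The sketch below assumes the response follows the OpenAI format, so the text lives at `response.choices[0].message.content`. The helper `ask_each_model`, the `fake_completion` stand-in, and the model name "hypothetical-model" are invented so the example runs without API keys.

```python
from types import SimpleNamespace


def ask_each_model(models, messages, completion_fn):
    """Send the same messages to every model and collect the replies."""
    answers = {}
    for model in models:
        response = completion_fn(model=model, messages=messages)
        answers[model] = response.choices[0].message.content
    return answers


def fake_completion(model, messages):
    """Stand-in that mimics the OpenAI-style response shape (no network call)."""
    text = f"[{model}] echo: {messages[-1]['content']}"
    message = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


messages = [{"role": "user", "content": "Hello, how are you?"}]
answers = ask_each_model(["gpt-3.5-turbo", "hypothetical-model"], messages, fake_completion)
```

In a real project you would pass `litellm.completion` as `completion_fn` and list models you have credentials for. The collected answers can then be scored with an evaluation tool to pick a winner.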

🌟 Star LiteLLM


Closing Remarks

That's all folks, thanks for reading, and I hope you learned a few things along the way!

Please like and comment if you enjoyed this article, and as always, don't forget to give open-source some love by starring their repos as a token of appreciation 🌟.

Top comments (23)

Matija Sosic

Great list! I agree, it's so hard to choose just 5. I'd add usemage.ai/ -> it's not a library per se, but if you want to generate a full React/Node.js app from a short description, this is the best tool out there (and it's free, no OpenAI key required!)

Keep up the great work :)

Jeffrey Ip

Great addition!

Ranjan Dailata
 
Jeffrey Ip

Excluded for a reason :) Quality over quantity

uliyahoo

Great stuff. Coding with LLMs is just getting started and people need help finding all the great tools.

Would also check out CopilotKit - React library for building in-app chatbots & Textareas.
github.com/CopilotKit/CopilotKit

Jeffrey Ip

Nice work!

Rajesh Joshi

Here's an OpenSource project, helpful in running ML models in the background.

Get Job Execution Reminders ⏰ via Webhook using WebhookPlan


Jeffrey Ip

Seems cool!

val von vorn

Another killer! Thanks for your spam post!

Jeffrey Ip

Ur welcome!

val von vorn

did you inspire from Devs Killer website maybe?
devskiller.com/

Saurabh Rai

MemGPT being on this list is awesome. It's a nice little cache for your "Vector Databases."

Jeffrey Ip

I see what you did there!

Saurabh Rai

😂

Bap

I love your banner! Really cool list thanks for sharing!

Jeffrey Ip

Thank you, glad you liked it!

Kuong Ao Ieong

I am using DeepEval for almost all of my AI projects and so far I love it! Honestly love the platform and the intuitive design

Biplob Sutradhar

Great list. ✨

Jeffrey Ip

Anytime :)

Marine

Nice list to have some motivation to try new things!

Jeffrey Ip

Glad you liked it!