Sunil Kumar Dash for Composio

9 open-source libraries that will make your CV stand out ⚡ 🚀

Tech hiring is broken, and competition has grown fierce over the past two years. However, mastering niche technologies used by big companies and building innovative applications around them can significantly enhance your CV's credibility.

So, I've compiled a list of open-source libraries to help you stand out.

Feel free to explore these libraries and let others know in the comments about the ones your organization is implementing.

Resume GIF


1. Composio 👑 - AI tooling and integrations platform

AI is eating the world, and there is no dispute that the future workforce will be a human-AI hybrid. For that to happen, AI models must be able to access external systems.

Composio is the industry-leading solution in this space. It provides an ever-expanding catalogue of tools and integrations across industry verticals, from CRM, HRM, and sales to development, productivity, and administration.

Composio Tools and Integrations

Easily integrate apps like GitHub, Slack, Jira, Gmail, etc., with AI models to automate complex real-world workflows.

It has native support for Python and JavaScript.

Quickly get started with Composio using pip or npm.

pip install composio-core

npm install composio-core

Python

Add a GitHub integration.

composio add github

Composio handles user authentication and authorization on your behalf.

Here is how you can use the GitHub integration to star a repository.

from openai import OpenAI
from composio_openai import ComposioToolSet, Action

openai_client = OpenAI(api_key="******OPENAIKEY******")

# Initialise the Composio tool set
composio_toolset = ComposioToolSet(api_key="******COMPOSIO_API_KEY******")

# Get pre-configured GitHub tools
actions = composio_toolset.get_actions(actions=[Action.GITHUB_ACTIVITY_STAR_REPO_FOR_AUTHENTICATED_USER])

my_task = "Star a repo ComposioHQ/composio on GitHub"

# Create a chat completion request to decide on the action
response = openai_client.chat.completions.create(
    model="gpt-4-turbo",
    tools=actions,  # Pass the actions we fetched earlier
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": my_task}
    ]
)

# Execute the tool calls chosen by the model
composio_toolset.handle_tool_calls(response)

Run this Python script to execute the given instruction using the agent.

JavaScript

You can install it using npm, yarn, or pnpm.

npm install composio-core

Define a method to let the user connect their GitHub account.

import { OpenAI } from "openai";
import { OpenAIToolSet } from "composio-core";

const toolset = new OpenAIToolSet({
  apiKey: process.env.COMPOSIO_API_KEY,
});

async function setupUserConnectionIfNotExists(entityId) {
  const entity = await toolset.client.getEntity(entityId);
  const connection = await entity.getConnection('github');

  if (!connection) {
      // If this entity/user hasn't connected the GitHub account yet, initiate a new connection
      const connection = await entity.initiateConnection('github');
      console.log("Log in via: ", connection.redirectUrl);
      return connection.waitUntilActive(60);
  }

  return connection;
}


Add the required tools to the OpenAI SDK and pass the entity name on to the executeAgent function.

async function executeAgent(entityName) {
  const entity = await toolset.client.getEntity(entityName)
  await setupUserConnectionIfNotExists(entity.id);

  const tools = await toolset.get_actions({ actions: ["github_activity_star_repo_for_authenticated_user"] }, entity.id);
  const instruction = "Star a repo ComposioHQ/composio on GitHub"

  const client = new OpenAI({ apiKey: process.env.OPEN_AI_API_KEY })
  const response = await client.chat.completions.create({
      model: "gpt-4-turbo",
      messages: [{
          role: "user",
          content: instruction,
      }],
      tools: tools,
      tool_choice: "auto",
  })

  console.log(response.choices[0].message.tool_calls);
  await toolset.handle_tool_call(response, entity.id);
}

executeAgent("joey")


Execute the code and let the agent do the work for you.

Composio works with popular frameworks like LangChain, LlamaIndex, CrewAI, etc.

For more information, visit the official docs, and for more complex examples, see the repository's examples section.

Composio GIF

Star the Composio repository ⭐


2. Apache Kafka - Distributed Event Streaming Platform

Apache Kafka is the backbone of event data pipelines at many Fortune 500 companies that need high throughput. Having Kafka on your CV will undoubtedly make you stand out.

It is an open-source distributed platform for handling real-time data streams. It enables the collection, storage, and processing of large volumes of event data with high fault tolerance.

It is ideal for building event-driven systems. Big companies like Netflix, LinkedIn, and Uber use Kafka to stream real-time data and analytics, manage event-driven architectures and monitoring systems, and enable real-time recommendations and notifications.

Download the latest Kafka release and extract it to get started:

$ tar -xzf kafka_2.13-3.8.0.tgz
$ cd kafka_2.13-3.8.0

Set up Kafka with KRaft.

To use Kafka with KRaft, first generate a cluster UUID.

KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)" 

Format the log directories.

bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server.properties

Start the Kafka server.

bin/kafka-server-start.sh config/kraft/server.properties

Then, you can create topics and publish and consume events.

Before you write your events, you must create topics. Run this in another shell.

bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092

Now, write some events to the topic.

bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
>This is my first event
>This is my second event

Read the events.

bin/kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
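
Beyond the console scripts, you will usually produce and consume events from application code. Below is a minimal sketch using the third-party kafka-python client (pip install kafka-python; this client is an assumption, not part of the Kafka distribution), pointed at the local broker and quickstart-events topic from the steps above.

from kafka import KafkaProducer, KafkaConsumer

# Produce a couple of events to the quickstart topic
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("quickstart-events", b"Hello from Python")
producer.flush()

# Read the topic back from the beginning
consumer = KafkaConsumer(
    "quickstart-events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating after 5s with no new messages
)
for message in consumer:
    print(message.value.decode("utf-8"))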

For comprehensive details on Kafka and its use, refer to this article I wrote a while back.

Read more about Kafka here.

Kafka GIF

Explore the Kafka Mirror repository ⭐


3. Grafana - The Open Observability Platform

Grafana is another open-source software used by many big companies. It is an analytics and monitoring platform that allows you to query, store, and visualize metrics from multiple data sources. You can also create, explore, and share dashboards with your teams.

Features of Grafana include:

  • Metrics and logs visualization.
  • Dynamic dashboards.
  • Alerting on Slack, PagerDuty, etc., based on custom metric rules.
  • Explore metrics through ad-hoc queries.
  • Mix multiple data sources in the same graph.

Check out the official documentation to explore Grafana in detail.
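
Grafana also exposes an HTTP API for dashboards, data sources, and alerting, which is handy for automation. Here is a minimal, illustrative sketch that creates a dashboard through that API; the local URL and service-account token are placeholders you would replace with your own.

import requests

GRAFANA_URL = "http://localhost:3000"  # assumed local Grafana instance
API_TOKEN = "YOUR_GRAFANA_API_TOKEN"   # placeholder service-account token

# A minimal dashboard definition with a single time-series panel
payload = {
    "dashboard": {
        "id": None,
        "title": "Example dashboard",
        "panels": [
            {"type": "timeseries", "title": "CPU usage", "gridPos": {"x": 0, "y": 0, "w": 12, "h": 8}},
        ],
    },
    "overwrite": True,
}

# Create (or overwrite) the dashboard via Grafana's HTTP API
response = requests.post(
    f"{GRAFANA_URL}/api/dashboards/db",
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)
print(response.status_code, response.json())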

Grafana GIF

Explore the Grafana repository ⭐


4. Celery - Distributed task queue

Building a robust application can be challenging, especially when multiple events need to be accounted for. Celery can come in handy in these situations.

Celery is simple, flexible, distributed open-source software that facilitates real-time processing of task queues and scheduling. It lets you offload time-consuming tasks and execute them asynchronously in the background, improving your application's performance and scalability.

Celery itself is written in Python, but its protocol can be used from other languages, and client implementations exist for Node.js, Go, and Rust.

Celery uses message brokers like Redis and RabbitMQ.

Get started quickly by installing Celery and the Redis client with pip.

pip install celery redis

Start the Redis server in the background.

redis-server

Define a simple task, like sending an email.

from celery import Celery

# Define a Celery app with Redis as the message broker
app = Celery('tasks', broker='redis://localhost:6379/0')

# Define a simple task (e.g., sending an email)
@app.task
def send_email(recipient):
    print(f"Sending email to {recipient}")
    return f"Email sent to {recipient}"


Start the Celery worker by running the following command in the terminal:

celery -A tasks worker --loglevel=info

You can now use send_email asynchronously in your Python code. Create another Python script to call the task:

from tasks import send_email

# Call the task asynchronously using `.delay()`
send_email.delay('user@example.com')

Once you call send_email.delay(), the task will be processed by the Celery worker asynchronously, and you'll see something like this in the terminal where the Celery worker is running:

[2024-09-24 12:00:00,000: INFO/MainProcess] Task tasks.send_email[abc123] succeeded in 0.001s: 'Email sent to user@example.com'
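
Celery also covers scheduling, as mentioned above. Here is a minimal sketch of a periodic task using Celery Beat; the schedule name and timing are illustrative.

from celery import Celery
from celery.schedules import crontab

app = Celery('tasks', broker='redis://localhost:6379/0')

# Run the send_email task from earlier every day at 7:30 AM via Celery Beat
app.conf.beat_schedule = {
    'daily-email': {
        'task': 'tasks.send_email',
        'schedule': crontab(hour=7, minute=30),
        'args': ('user@example.com',),
    },
}

Start the scheduler alongside the worker with celery -A tasks beat --loglevel=info, and Celery Beat will enqueue the task on the configured schedule.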

For more, refer to their official documentation.

Celery GIF

Explore the Celery repository ⭐


5. Selenium - Browser Automation Framework

Browser automation is one of those things you will inevitably encounter at some point in your tech career. Many companies use Selenium for multiple purposes, such as web automation, testing, and even scraping dynamic web content.

Selenium allows developers to interact with web browsers programmatically, simulating user actions like clicking buttons, filling out forms, and navigating between pages. This makes it an invaluable tool for testing web applications across browsers and platforms.

It is available in multiple programming languages, including Python, Java, C#, Ruby, and JavaScript.

Install Selenium in Python with pip.

pip install selenium

You must install ChromeDriver for Chromium-based browsers and geckodriver for Firefox.

Here’s an example of using Selenium with ChromeDriver:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Point Selenium at your ChromeDriver executable (Selenium 4 uses a Service object)
driver = webdriver.Chrome(service=Service('/path/to/chromedriver'))

# Open a webpage
driver.get("https://www.example.com")

# Perform actions (e.g., click a button, find elements, etc.)
print(driver.title)  # Print the page title

# Close the browser
driver.quit()

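Since the real value of Selenium is in simulating user actions, here is a slightly fuller sketch that fills in a form field and clicks a button; the element locators (q, submit-button) are hypothetical and depend on the page you are testing.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()  # Selenium 4.6+ can resolve the driver automatically
driver.get("https://www.example.com")

# Type into a text field (hypothetical locator)
search_box = driver.find_element(By.NAME, "q")
search_box.send_keys("open-source libraries")

# Wait for a button to become clickable, then click it (hypothetical locator)
button = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.ID, "submit-button"))
)
button.click()

driver.quit()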

For more, check the documentation.

Selenium GIF

Explore the Selenium repository ⭐


6. LlamaIndex - Data Framework for LLM Applications

AI is hot right now, and many companies are building products around AI models. There could not be a better time to be an AI developer.

LlamaIndex is a leading framework for building applications using large language models (LLMs). It lets you connect any data store, whether a relational, graph, or vector database, to LLMs. It provides all the bells and whistles, such as data loaders, connectors, chunkers, re-rankers, etc., to build efficient AI applications.

Quickly get started with LlamaIndex by installing it via pip.

pip install llama-index

A simple example of building a vector store index with LlamaIndex.

# custom selection of integrations to work with core
pip install llama-index-core
pip install llama-index-llms-openai
pip install llama-index-llms-replicate
pip install llama-index-embeddings-huggingface

import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("YOUR_DATA_DIRECTORY").load_data()
index = VectorStoreIndex.from_documents(documents)

Query the index.

query_engine = index.as_query_engine()
response = query_engine.query("YOUR_QUESTION")
print(response)

For more information, please refer to their documentation.

LlamaIndex GIF

Explore the LlamaIndex repository ⭐


7. PyTorch Lightning - The deep learning framework

Knowing PyTorch Lightning will serve you well if you are into AI model development.

It’s a versatile framework built on top of PyTorch that helps organize and scale deep learning projects. It offers tools for training, testing, and deploying models across different domains.

Here are some advantages of using Lightning over plain PyTorch:

  • It makes PyTorch code easier to read, better organized, and more user-friendly.
  • It reduces repetitive code by providing built-in training loops and utilities.
  • It simplifies the process of training, experimenting, and deploying models with less unnecessary code.

You can get started with Lightning by installing it with pip:

pip install lightning

Define an autoencoder using a LightningModule.

import os
from torch import optim, nn, utils, Tensor
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor
import lightning as L

# define any number of nn.Modules (or use your current ones)
encoder = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 3))
decoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 28 * 28))

# define the LightningModule
class LitAutoEncoder(L.LightningModule):
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def training_step(self, batch, batch_idx):
        # training_step defines the train loop.
        # it is independent of forward
        x, _ = batch
        x = x.view(x.size(0), -1)
        z = self.encoder(x)
        x_hat = self.decoder(z)
        loss = nn.functional.mse_loss(x_hat, x)
        # Logging to TensorBoard (if installed) by default
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        optimizer = optim.Adam(self.parameters(), lr=1e-3)
        return optimizer

# init the autoencoder
autoencoder = LitAutoEncoder(encoder, decoder)


Load MNIST data.

# setup data
dataset = MNIST(os.getcwd(), download=True, transform=ToTensor())
train_loader = utils.data.DataLoader(dataset)


The Lightning Trainer “mixes” any LightningModule with any dataset and abstracts away all the engineering complexity needed for scale.

# train the model (hint: here are some helpful Trainer arguments for rapid idea iteration)
trainer = L.Trainer(limit_train_batches=100, max_epochs=1)
trainer.fit(model=autoencoder, train_dataloaders=train_loader)


For more on Lightning, check out the official documentation.

Lightning GIF

Explore the PyTorch Lightning repository ⭐


8. PostHog - Open-source product analytics platform

Building a modern application feels incomplete without product analytics, and PostHog is the leading open-source solution in that space. It offers tools to track user behaviour, measure engagement, and improve your application with actionable insights.

This is easily one of those tools you will reach for all the time. PostHog offers both cloud and self-hosted options.

Some key features of PostHog include:

  • Event Tracking: Track user interactions and behaviour in real-time.
  • Session Recordings: Replay user sessions to understand how they navigate your app.
  • Heatmaps: Visualize where users click and engage the most on your site.
  • Feature Flags: Enable or disable features for specific user groups without redeploying code.

For more, refer to the official documentation.
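
To give a concrete feel for the API, here is a minimal sketch using PostHog's Python SDK (pip install posthog); the project API key, host, user ID, and event names below are placeholders.

from posthog import Posthog

# Placeholder project API key and host from your PostHog project settings
posthog = Posthog(project_api_key="phc_YOUR_PROJECT_API_KEY", host="https://us.i.posthog.com")

# Track a custom event for a user
posthog.capture("user_123", "signed_up", {"plan": "free"})

# Check a feature flag before enabling a feature for this user
if posthog.feature_enabled("new-dashboard", "user_123"):
    print("Show the new dashboard")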

Posthog GIF

Explore the PostHog repository ⭐


9. Auth0 by Okta - Authentication and Authorization platform

Implementing authentication is essential for any application, and knowing how to do it well can easily make you stand out.

With Auth0, you can streamline the process, enabling secure login, user management, and multi-factor authentication with minimal effort.

Some of the crucial features of Auth0:

  • Single Sign-On (SSO): Seamless login across multiple applications with a single credential.
  • Multi-Factor Authentication (MFA): Adds extra security with multiple verification methods.
  • Role-Based Access Control (RBAC): Manage user permissions based on assigned roles for secure access control.
  • Social Login Integration: Easily integrate logins via Google, Facebook, and GitHub.

Auth0 SDKs are available for almost all platforms and languages.
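
To make this concrete, here is a small, hedged sketch of validating an Auth0-issued access token on a Python backend using the generic PyJWT library rather than an Auth0 SDK; the tenant domain, API audience, and token are placeholders.

import jwt  # PyJWT, installed with: pip install "pyjwt[crypto]"
from jwt import PyJWKClient

AUTH0_DOMAIN = "your-tenant.us.auth0.com"       # placeholder Auth0 tenant domain
API_AUDIENCE = "https://your-api.example.com"   # placeholder API identifier
token = "eyJ..."                                # access token received from Auth0

# Fetch the signing key that matches the token's key ID from Auth0's JWKS endpoint
jwks_client = PyJWKClient(f"https://{AUTH0_DOMAIN}/.well-known/jwks.json")
signing_key = jwks_client.get_signing_key_from_jwt(token)

# Verify the signature, audience, and issuer claims
claims = jwt.decode(
    token,
    signing_key.key,
    algorithms=["RS256"],
    audience=API_AUDIENCE,
    issuer=f"https://{AUTH0_DOMAIN}/",
)
print(claims["sub"])  # the authenticated user's ID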

Auth0 GIF

Explore the Auth0 repository ⭐


Thank you for reading the listicle.

Let me know in the comments if you know of other essential open-source tools. ✨

Top comments (6)

Nevo David: Great list!

Sunil Kumar Dash: Thank you, Nevo.

Tomas Stveracek: Very useful, I'll save it, thanks.

Sunil Kumar Dash: Thank you, Tomas; I am glad you liked it.

tim brandom: That's a nice list, Thanks

Sunil Kumar Dash: Thanks Tim.