Sunil Kumar Dash for Composio

9 open-source libraries that will make your CV stand out ⚡ 🚀

Tech hiring is broken, and competition has grown fierce over the past two years. However, mastering niche technologies used by big companies and building innovative applications around them can significantly enhance your CV's credibility.

So, I've compiled a list of open-source libraries to help you stand out.

Feel free to explore these libraries and let others know in the comments about the ones your organization is implementing.

Resume GIF


1. Composio 👑 - AI tooling and integrations platform

AI is eating the world, and there is no dispute that the future workforce will be a human-AI hybrid. For that to happen, AI models must be able to access external systems.

Composio is the industry-leading solution in this space. It provides an ever-expanding catalogue of tools and integrations across industry verticals, from CRM, HRM, and sales to development, productivity, and administration.

Composio Tools and Integrations

Easily integrate apps like GitHub, Slack, Jira, Gmail, etc., with AI models to automate complex real-world workflows.

It has native support for Python and JavaScript.

Quickly get started with Composio using pip or npm.

pip install composio-core

npm install composio-core

Python

Add a GitHub integration.

composio add github

Composio handles user authentication and authorization on your behalf.

Here is how you can use the GitHub integration to star a repository.

from openai import OpenAI
from composio_openai import ComposioToolSet, Action

openai_client = OpenAI(api_key="******OPENAIKEY******")

# Initialise the Composio tool set
composio_toolset = ComposioToolSet(api_key="******COMPOSIO_API_KEY******")

# Get pre-configured GitHub tools
actions = composio_toolset.get_actions(actions=[Action.GITHUB_ACTIVITY_STAR_REPO_FOR_AUTHENTICATED_USER])

my_task = "Star a repo ComposioHQ/composio on GitHub"

# Create a chat completion request to decide on the action
response = openai_client.chat.completions.create(
    model="gpt-4-turbo",
    tools=actions,  # Pass the actions we fetched earlier
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": my_task}
    ]
)

# Execute the tool calls chosen by the model
composio_toolset.handle_tool_calls(response)

Run this Python script to execute the given instruction using the agent.

JavaScript

You can install it using npm, yarn, or pnpm.

npm install composio-core

Define a method to let the user connect their GitHub account.

import { OpenAI } from "openai";
import { OpenAIToolSet } from "composio-core";

const toolset = new OpenAIToolSet({
  apiKey: process.env.COMPOSIO_API_KEY,
});

async function setupUserConnectionIfNotExists(entityId) {
  const entity = await toolset.client.getEntity(entityId);
  const connection = await entity.getConnection('github');

  if (!connection) {
      // If this entity/user hasn't connected the GitHub account yet, initiate a new connection
      const connection = await entity.initiateConnection('github');
      console.log("Log in via: ", connection.redirectUrl);
      return connection.waitUntilActive(60);
  }

  return connection;
}


Add the required tools to the OpenAI SDK and pass the entity name on to the executeAgent function.

async function executeAgent(entityName) {
  const entity = await toolset.client.getEntity(entityName)
  await setupUserConnectionIfNotExists(entity.id);

  const tools = await toolset.get_actions({ actions: ["github_activity_star_repo_for_authenticated_user"] }, entity.id);
  const instruction = "Star a repo ComposioHQ/composio on GitHub"

  const client = new OpenAI({ apiKey: process.env.OPEN_AI_API_KEY })
  const response = await client.chat.completions.create({
      model: "gpt-4-turbo",
      messages: [{
          role: "user",
          content: instruction,
      }],
      tools: tools,
      tool_choice: "auto",
  })

  console.log(response.choices[0].message.tool_calls);
  await toolset.handle_tool_call(response, entity.id);
}

executeAgent("joey")


Execute the code and let the agent do the work for you.

Composio works with popular frameworks like LangChain, LlamaIndex, CrewAI, etc.

For more information, visit the official docs, and for more complex examples, see the repository's examples section.

Composio GIF

Star the Composio repository ⭐


2. Apache Kafka - Distributed Event Streaming Platform

Apache Kafka is the backbone of event data pipelines at many Fortune 500 companies that need high throughput. Having Kafka on your CV will undoubtedly make you stand out.

It is an open-source distributed platform for handling real-time data streams. It enables the collection, storage, and processing of large volumes of event data with high fault tolerance.

It is ideal for building event-driven systems. Big companies like Netflix, LinkedIn, and Uber use Kafka to stream real-time data and analytics, manage event-driven architectures and monitoring systems, and enable real-time recommendations and notifications.

Download the latest Kafka release and extract it to get started:

$ tar -xzf kafka_2.13-3.8.0.tgz
$ cd kafka_2.13-3.8.0

Set up Kafka with KRaft.

To use Kafka with KRaft, first generate a cluster UUID.

KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)" 

Format the log directories.

bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server.properties

Start the Kafka server.

bin/kafka-server-start.sh config/kraft/server.properties

Then, you can create topics and publish and consume events.

Before you write your events, you must create topics. Run this in another shell.

bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092

Now, write some events to the topic.

bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
>This is my first event
>This is my second event

Read the events.

bin/kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
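
Beyond the console scripts, you will usually produce and consume events from application code. Below is a minimal sketch using the third-party kafka-python client (pip install kafka-python; this client is an assumption, not part of the Kafka distribution), pointed at the local broker and quickstart-events topic from the steps above.

from kafka import KafkaProducer, KafkaConsumer

# Produce a couple of events to the quickstart topic
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("quickstart-events", b"Hello from Python")
producer.flush()

# Read the topic back from the beginning
consumer = KafkaConsumer(
    "quickstart-events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating after 5s with no new messages
)
for message in consumer:
    print(message.value.decode("utf-8"))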

For comprehensive details on Kafka and its use, refer to this article I wrote a while back.

Read more about Kafka here.

Kafka GIF

Explore the Kafka Mirror repository ⭐


3. Grafana - The Open Observability Platform

Grafana is another open-source software used by many big companies. It is an analytics and monitoring platform that allows you to query, store, and visualize metrics from multiple data sources. You can also create, explore, and share dashboards with your teams.

Features of Grafana include:

  • Metrics and logs visualization.
  • Dynamic dashboards.
  • Alerting on Slack, PagerDuty, etc., based on custom metric rules.
  • Explore metrics through ad-hoc queries.
  • Mix multiple data sources in the same graph.

Check out the official documentation to explore Grafana in detail.
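
Grafana also exposes an HTTP API for dashboards, data sources, and alerting, which is handy for automation. Here is a minimal, illustrative sketch that creates a dashboard through that API; the local URL and service-account token are placeholders you would replace with your own.

import requests

GRAFANA_URL = "http://localhost:3000"  # assumed local Grafana instance
API_TOKEN = "YOUR_GRAFANA_API_TOKEN"   # placeholder service-account token

# A minimal dashboard definition with a single time-series panel
payload = {
    "dashboard": {
        "id": None,
        "title": "Example dashboard",
        "panels": [
            {"type": "timeseries", "title": "CPU usage", "gridPos": {"x": 0, "y": 0, "w": 12, "h": 8}},
        ],
    },
    "overwrite": True,
}

# Create (or overwrite) the dashboard via Grafana's HTTP API
response = requests.post(
    f"{GRAFANA_URL}/api/dashboards/db",
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)
print(response.status_code, response.json())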

Grafana GIF

Explore the Grafana repository ⭐


4. Celery - Distributed task queue

Building a robust application can be challenging, especially when multiple events need to be accounted for. Celery can come in handy in these situations.

Celery is simple, flexible, distributed open-source software that facilitates real-time processing of task queues and scheduling. It lets you offload time-consuming tasks and execute them asynchronously in the background, improving your application's performance and scalability.

Celery itself is written in Python, but its protocol can be used from other languages, and client implementations exist for Node.js, Go, and Rust.

Celery uses message brokers like Redis and RabbitMQ.

Get started quickly by installing Celery and the Redis client with pip.

pip install celery redis

Start the Redis server in the background.

redis-server

Define a simple task, like sending an email.

from celery import Celery

# Define a Celery app with Redis as the message broker
app = Celery('tasks', broker='redis://localhost:6379/0')

# Define a simple task (e.g., sending an email)
@app.task
def send_email(recipient):
    print(f"Sending email to {recipient}")
    return f"Email sent to {recipient}"


Start the Celery worker by running the following command in the terminal:

celery -A tasks worker --loglevel=info

You can now use send_email asynchronously in your Python code. Create another Python script to call the task:

from tasks import send_email

# Call the task asynchronously using `.delay()`
send_email.delay('user@example.com')

Once you call send_email.delay(), the task will be processed by the Celery worker asynchronously, and you'll see something like this in the terminal where the Celery worker is running:

[2024-09-24 12:00:00,000: INFO/MainProcess] Task tasks.send_email[abc123] succeeded in 0.001s: 'Email sent to user@example.com'
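
Celery also covers scheduling, as mentioned above. Here is a minimal sketch of a periodic task using Celery Beat; the schedule name and timing are illustrative.

from celery import Celery
from celery.schedules import crontab

app = Celery('tasks', broker='redis://localhost:6379/0')

# Run the send_email task from earlier every day at 7:30 AM via Celery Beat
app.conf.beat_schedule = {
    'daily-email': {
        'task': 'tasks.send_email',
        'schedule': crontab(hour=7, minute=30),
        'args': ('user@example.com',),
    },
}

Start the scheduler alongside the worker with celery -A tasks beat --loglevel=info, and Celery Beat will enqueue the task on the configured schedule.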

For more, refer to their official documentation.

Celery GIF

Explore the Celery repository ⭐


5. Selenium - Browser Automation Framework

Browser automation is one of those things you will inevitably encounter at some point in your tech career. Many companies use Selenium for multiple purposes, such as web automation, testing, and even scraping dynamic web content.

Selenium allows developers to interact with web browsers programmatically, simulating user actions like clicking buttons, filling out forms, and navigating between pages. This makes it an invaluable tool for testing web applications across browsers and platforms.

It is available in multiple programming languages, including Python, Java, C#, Ruby, and JavaScript.

Install Selenium in Python with pip.

pip install selenium

You must install ChromeDriver for Chromium-based browsers and geckodriver for Firefox.

Here’s an example of using Selenium with ChromeDriver:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Point Selenium at your ChromeDriver executable (Selenium 4 uses a Service object)
driver = webdriver.Chrome(service=Service('/path/to/chromedriver'))

# Open a webpage
driver.get("https://www.example.com")

# Perform actions (e.g., click a button, find elements, etc.)
print(driver.title)  # Print the page title

# Close the browser
driver.quit()

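Since the real value of Selenium is in simulating user actions, here is a slightly fuller sketch that fills in a form field and clicks a button; the element locators (q, submit-button) are hypothetical and depend on the page you are testing.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()  # Selenium 4.6+ can resolve the driver automatically
driver.get("https://www.example.com")

# Type into a text field (hypothetical locator)
search_box = driver.find_element(By.NAME, "q")
search_box.send_keys("open-source libraries")

# Wait for a button to become clickable, then click it (hypothetical locator)
button = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.ID, "submit-button"))
)
button.click()

driver.quit()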

For more, check the documentation.

Selenium GIF

Explore the Selenium repository ⭐


6. LlamaIndex - Data Framework for LLM Applications

AI is hot right now, and many companies are building products around AI models. There could not be a better time to be an AI developer.

LlamaIndex is a leading framework for building applications using large language models (LLMs). It lets you connect any data store, whether a relational, graph, or vector database, to LLMs. It provides all the bells and whistles, such as data loaders, connectors, chunkers, re-rankers, etc., to build efficient AI applications.

Quickly get started with LlamaIndex by installing it via pip.

pip install llama-index

A simple example of building a vector store index with LlamaIndex.

# custom selection of integrations to work with core
pip install llama-index-core
pip install llama-index-llms-openai
pip install llama-index-llms-replicate
pip install llama-index-embeddings-huggingface

import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("YOUR_DATA_DIRECTORY").load_data()
index = VectorStoreIndex.from_documents(documents)

Query the index.

query_engine = index.as_query_engine()
response = query_engine.query("YOUR_QUESTION")
print(response)

For more information, please refer to their documentation.

LlamaIndex GIF

Explore the LlamaIndex repository ⭐


7. PyTorch Lightning - The deep learning framework

Knowing PyTorch Lightning will serve you well if you are into AI model development.

It’s a versatile framework built on top of PyTorch that helps organize and scale deep learning projects. It offers tools for training, testing, and deploying models across different domains.

Here are some advantages of using Lightning over plain PyTorch:

  • It makes PyTorch code easier to read, better organized, and more user-friendly.
  • It reduces repetitive code by providing built-in training loops and utilities.
  • It simplifies the process of training, experimenting, and deploying models with less unnecessary code.

You can get started with Lightning by installing it with pip:

pip install lightning

Define an autoencoder using a LightningModule.

import os
from torch import optim, nn, utils, Tensor
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor
import lightning as L

# define any number of nn.Modules (or use your current ones)
encoder = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 3))
decoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 28 * 28))

# define the LightningModule
class LitAutoEncoder(L.LightningModule):
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def training_step(self, batch, batch_idx):
        # training_step defines the train loop.
        # it is independent of forward
        x, _ = batch
        x = x.view(x.size(0), -1)
        z = self.encoder(x)
        x_hat = self.decoder(z)
        loss = nn.functional.mse_loss(x_hat, x)
        # Logging to TensorBoard (if installed) by default
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        optimizer = optim.Adam(self.parameters(), lr=1e-3)
        return optimizer

# init the autoencoder
autoencoder = LitAutoEncoder(encoder, decoder)


Load MNIST data.

# setup data
dataset = MNIST(os.getcwd(), download=True, transform=ToTensor())
train_loader = utils.data.DataLoader(dataset)


The Lightning Trainer “mixes” any LightningModule with any dataset and abstracts away all the engineering complexity needed for scale.

# train the model (hint: here are some helpful Trainer arguments for rapid idea iteration)
trainer = L.Trainer(limit_train_batches=100, max_epochs=1)
trainer.fit(model=autoencoder, train_dataloaders=train_loader)


For more on Lightning, check out the official documentation.

Lightning GIF

Explore the PyTorch Lightning repository ⭐


8. PostHog - Open-source product analytics platform

Building a modern application feels incomplete without product analytics, and PostHog is the leading open-source solution in that space. It offers tools to track user behaviour, measure engagement, and improve your application with actionable insights.

This is easily one of those tools you will reach for all the time. PostHog offers both cloud and self-hosted options.

Some key features of PostHog include:

  • Event Tracking: Track user interactions and behaviour in real-time.
  • Session Recordings: Replay user sessions to understand how they navigate your app.
  • Heatmaps: Visualize where users click and engage the most on your site.
  • Feature Flags: Enable or disable features for specific user groups without redeploying code.

For more, refer to the official documentation.
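
To give a concrete feel for the API, here is a minimal sketch using PostHog's Python SDK (pip install posthog); the project API key, host, user ID, and event names below are placeholders.

from posthog import Posthog

# Placeholder project API key and host from your PostHog project settings
posthog = Posthog(project_api_key="phc_YOUR_PROJECT_API_KEY", host="https://us.i.posthog.com")

# Track a custom event for a user
posthog.capture("user_123", "signed_up", {"plan": "free"})

# Check a feature flag before enabling a feature for this user
if posthog.feature_enabled("new-dashboard", "user_123"):
    print("Show the new dashboard")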

Posthog GIF

Explore the PostHog repository ⭐


9. Auth0 by Okta - Authentication and Authorization platform

Implementing authentication is essential for any application, and knowing how to do it well can easily make you stand out.

With Auth0, you can streamline the process, enabling secure login, user management, and multi-factor authentication with minimal effort.

Some of the crucial features of Auth0:

  • Single Sign-On (SSO): Seamless login across multiple applications with a single credential.
  • Multi-Factor Authentication (MFA): Adds extra security with multiple verification methods.
  • Role-Based Access Control (RBAC): Manage user permissions based on assigned roles for secure access control.
  • Social Login Integration: Easily integrate logins via Google, Facebook, and GitHub.

Auth0 SDKs are available for almost all platforms and languages.
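
To make this concrete, here is a small, hedged sketch of validating an Auth0-issued access token on a Python backend using the generic PyJWT library rather than an Auth0 SDK; the tenant domain, API audience, and token are placeholders.

import jwt  # PyJWT, installed with: pip install "pyjwt[crypto]"
from jwt import PyJWKClient

AUTH0_DOMAIN = "your-tenant.us.auth0.com"       # placeholder Auth0 tenant domain
API_AUDIENCE = "https://your-api.example.com"   # placeholder API identifier
token = "eyJ..."                                # access token received from Auth0

# Fetch the signing key that matches the token's key ID from Auth0's JWKS endpoint
jwks_client = PyJWKClient(f"https://{AUTH0_DOMAIN}/.well-known/jwks.json")
signing_key = jwks_client.get_signing_key_from_jwt(token)

# Verify the signature, audience, and issuer claims
claims = jwt.decode(
    token,
    signing_key.key,
    algorithms=["RS256"],
    audience=API_AUDIENCE,
    issuer=f"https://{AUTH0_DOMAIN}/",
)
print(claims["sub"])  # the authenticated user's ID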

Auth0 GIF

Explore the Auth0 repository ⭐


Thank you for reading the listicle.

Let me know in the comments if you know of other essential open-source tools. ✨

Top comments (6)

Nevo David: Great list!

Sunil Kumar Dash: Thank you, Nevo.

Tomas Stveracek: Very useful, I'll save it, thanks.

Sunil Kumar Dash: Thank you, Tomas; I am glad you liked it.

tim brandom: That's a nice list, Thanks

Sunil Kumar Dash: Thanks Tim.