Marine for Taipy

Python for AI : Cheatlist

TL;DR

Getting into AI and ML without using Python is very difficult, if not, might I say, impossible.
So here's a list of prominent Python libraries for your AI and ML models.
These libraries have been, and continue to be, institutions in the AI landscape.


I. Building an application for your AI

What is ML & AI if you can't share it with non-data scientists/programmers?
Let's start with two major frameworks that help showcase your results through GUIs.

1- Taipy

Taipy was created to give data scientists the tools to develop production-ready applications.

This open-source Python library is designed for easy development of both front ends (GUIs) and ML/data pipelines, without compromising on customization, performance, or scaling.

Star ⭐ the Taipy repository

Your support means a lot🌱, and helps us in many ways, like writing articles! 🙏

 

2- Gradio

This Python library facilitates the quick sharing of AI/ML models through easy-to-create basic applications. It’s a great way to showcase your model quickly.

Star ⭐ the Gradio repository


II. AI and ML frameworks

Now let's enter the main part of the article: the major, most important Python libraries that will help you get into ML/AI.


3- Scikit-learn

This might be one of Python’s top 3 most famous libraries, and rightfully so.
Sklearn is a reference in Machine Learning. It includes a wide range of algorithms, such as K-means clustering, regression, and classification.
It also excels in dimensionality reduction techniques.
Sklearn also provides data splitting, model selection, and validation functions. It's easy to learn and use, and it should be your go-to ML library during your data science journey.
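To make this concrete, here is a minimal sketch that chains a few of these pieces: a train/test split, scaling, dimensionality reduction with PCA, and a classifier. The dataset (Iris) and the choice of logistic regression are illustrative assumptions, not a recommendation.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for validation, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the features, reduce them to 2 components, then classify.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Because every step follows the same `fit`/`predict` interface, you can swap the classifier for another Sklearn estimator without touching the rest of the pipeline.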

Star ⭐ the Sklearn repository


4- TensorFlow

This library is a must-know for Neural Network modeling. It is well suited to tasks on unstructured data, such as image classification or NLP (Natural Language Processing). TensorFlow is widely used in research and industry, as it provides a complete API for designing and manipulating Neural Networks.

Star ⭐ the TensorFlow repository


5- PyTorch

PyTorch is known for its strong adoption in research, including natural language processing, and for its more Pythonic feel, which many find gentler than TensorFlow's steep learning curve.
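To give a feel for that Pythonic style, here is a minimal sketch of a training loop written as a plain `for` loop. It fits a single linear layer to synthetic data; the data, learning rate, and number of steps are arbitrary assumptions chosen for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic data: y = 3x + 2 plus a little noise.
x = torch.linspace(-1, 1, 100).unsqueeze(1)
y = 3 * x + 2 + 0.05 * torch.randn_like(x)

model = nn.Linear(1, 1)  # one weight, one bias
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

# The training loop is ordinary Python: forward pass, loss, backward pass, update.
for _ in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

weight = model.weight.item()
bias = model.bias.item()
print(f"Learned weight: {weight:.2f}, bias: {bias:.2f}")
```

The learned weight and bias should end up close to the 3 and 2 used to generate the data.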

Star ⭐ the PyTorch repository


6- Keras

Keras is a high-level API that runs on top of frameworks such as TensorFlow. If you are starting with Neural Networks, start with Keras. It simplifies building and training models, which makes it ideal for quick prototypes and the most beginner-friendly option for Neural Networks.

Star ⭐ the Keras repository


7- FastAI

This library simplifies training fast and accurate neural nets; it’s built on top of PyTorch.

Star ⭐ the FastAI repository


8- FLAML

This library is a game changer. It automatically searches for well-performing machine learning models and hyperparameters for your data and use cases.

Star ⭐ the FLAML repository


9- CatBoost

This library, whose name stands for Categorical Boosting, is the way to go if your dataset consists predominantly of categorical data. It handles categorical features natively, which removes the headache of preprocessing them with complex one-hot encoding. It can provide better accuracy than XGBoost when running with default parameters.

Star ⭐ the CatBoost repository


10- PyCaret

This is an excellent Machine Learning automation tool. It is a low-code library that automates much of your ML workflow.

Star ⭐ the PyCaret repository


11- H2O

H2O is a user-friendly, fast, and scalable machine-learning platform. It includes algorithms for Deep Learning, GLM, PCA, RuleFit, etc.

Star ⭐ the H2O repository


12- XGBoost

XGBoost is one of the most popular Machine Learning libraries.
This gradient-boosting library is widely used in real-life use cases, particularly for tabular data.
It is a favorite among Kaggle competition winners.
This library includes regression and classification algorithms, and it also exposes feature importance scores that help with feature selection.

Star ⭐ the XGBoost repository


13- TPOT

This library for AutoML will optimize your Machine Learning pipelines using genetic programming.

Star ⭐ the TPOT repository


14- ChatterBot

Quickly build your chatbot with this library. It can help you improve user engagement and interactions with your services.

Star ⭐ the ChatterBot repository


III. Natural Language Processing (NLP)


15- NLTK

NLTK is an essential toolkit for Natural Language Processing.
NLTK's key features include processing and manipulating text (tokenization, stemming, etc.) and classification for NLP tasks such as sentiment analysis.
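As a small illustration of tokenization and stemming, here is a minimal sketch. It deliberately uses `wordpunct_tokenize` and `PorterStemmer`, which work without downloading any extra NLTK data; the sample sentence is made up.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

text = "The cats are running faster than the dogs."

# Split the sentence into word and punctuation tokens (regex-based).
tokens = wordpunct_tokenize(text)

# Reduce each token to its stem, e.g. "running" becomes "run".
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

for token, stem in zip(tokens, stems):
    print(f"{token!r} -> {stem!r}")
```

Other NLTK tokenizers, such as `word_tokenize`, may give better results on real text but require downloading additional resources first.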

Star ⭐ the NLTK repository


16- spaCy

It is the newer kid on the block, focusing on making NLP more accessible and user-friendly.
The library is optimized for speed and efficiency.

Star ⭐ the spaCy repository


17- Gensim

This library specializes in topic modeling and document similarity—a good fit for your unsupervised text use cases and tasks.

Star ⭐ the Gensim repository


18- HF transformers

This is your tool for advanced NLP tasks. This library gives you access to state-of-the-art pretrained natural language processing models.

Star ⭐ the HF transformers repository


IV. Model Visualization & Evaluation


19- Matplotlib

Matplotlib is the main plotting library in Python, and for a good reason.
Matplotlib allows the plotting of 2D graphs with a wide range of chart types and also allows for significant customization.
The fine-grained control of the elements is a real advantage of this library.
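Here is a minimal sketch of that fine-grained control: each line, label, and grid setting is configured explicitly. The curves, colors, and output file name are arbitrary choices for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this runs without a display

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots(figsize=(6, 4))

# Each plotted element can be styled individually.
ax.plot(x, np.sin(x), label="sin(x)", color="tab:blue", linewidth=2)
ax.plot(x, np.cos(x), label="cos(x)", color="tab:orange", linestyle="--")

ax.set_xlabel("x (radians)")
ax.set_ylabel("value")
ax.set_title("Sine and cosine")
ax.grid(True, alpha=0.3)
ax.legend()

# Write the figure to disk; the file name is an arbitrary example.
fig.savefig("sine_cosine.png", dpi=150)
```

In a notebook or interactive session, you would typically drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving to a file.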

Star ⭐ the Matplotlib repository


20- imbalanced-learn

This library gives you tools for dealing with imbalanced data—a life-saver when your dataset is far from balanced.

Star ⭐ the imbalanced-learn repository


21- SHAP

This library helps generate explanations of your model’s output. It’s a great way to bring some interpretability to black-box or grey-box models.

Star ⭐ the SHAP repository


22- Missingno

It is a great solution to identify missing values in your data. Missingno helps you visualize them quickly, making this process simpler and more efficient.

Star ⭐ the Missingno repository


23- Lazy predict

It helps build baseline models to compare and evaluate without extensive code. A great tool for newbies in ML. It is low-code and shows which models work better on your data without any parameter tuning.

Star ⭐ the Lazy Predict repository


24- Category Encoders

This library will help you deal with categorical data. It provides a set of encoders (one-hot, ordinal, target encoding, etc.) that turn categorical variables into numeric features. This library is Sklearn compatible.

Star ⭐ the Category Encoders repository


V. Computer Vision


25- OpenCV

OpenCV provides a wide range of algorithms for real-time computer vision.
You can process images and video to detect objects, humans, and handwriting.

Star ⭐ the OpenCV repository


Have fun!

Exploring these libraries will allow you to handle most AI and ML use cases. Python libraries are more than tools: they contribute to the continuous innovation in the AI landscape, so make sure to support them!

Top comments (2)

AleaJactaEst

I will keep that in mind

tim brandom

Never heard of Missingo and Lazy predict, but most of the libraries have been settled in for years