COMMUNITY
Join us at one of our Unstructured Data Meetups! We had so much fun at the last event this Tuesday, once again at GitHub! Thanks to the awesome speakers and all the attendees!
SF Unstructured Data Meetup - November 14, 2023 - watch the video
- Mihail Eric, Founder, Storia.ai
- Jacob Marks, MLE/DevEvangelist, Voxel51
- Josh Reini, Data Scientist/DevRel, TruEra
The next one in San Francisco is on Jan 16, 2024. Please register early because these events fill up fast! We will be joined by:
- Jack Retterer, DevRel, Unstructured.io
- George Williams, Organizer, Big-ANN NeurIPS 2023
GETTING STARTED WITH VECTOR SEARCH
- What Is Semantic Search?
- Vector Distance
- What Are AI Hallucinations?
- Vector Database 101: Everything You Need to Know
ARTICLES
Natural Language Processing (NLP)
- An Introduction to Natural Language Processing
- Top 20 NLP Models to Empower Your ML Application
- Tokens, N-Grams, and Bag-of-Words Models
- Primer on Neural Networks and Embeddings for Language Models
RAG
- Do We Still Need Vector Databases for RAG with OpenAI's Releasing of Its Built-In Retrieval?
- Grounding Our Chat Towards Data Science Results
- How LangChain Implements Self Querying
TUTORIAL
Using AI to Find Your Celebrity Stylist. In this tutorial, you will learn how to use a fine-tuned model to segment clothing in images. You will then crop out each labeled article of clothing and resize the crops to the same size. Finally, you will store the embeddings generated from those images in Milvus, an open-source vector database.
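To give a feel for the crop-and-resize step, here is a minimal sketch in plain Python. It is not the tutorial's code: the image is a toy grid of grayscale values, the labeled boxes are hypothetical stand-ins for the output of a segmentation model, and nearest-neighbor resizing is just one simple way to bring every crop to the same size. Generating embeddings and inserting them into Milvus are left out.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]          # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle described by `box` out of `image`."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image: Image, width: int, height: int) -> Image:
    """Resize with nearest-neighbor sampling (an assumption, chosen for simplicity)."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def crop_and_resize(image: Image, labeled_boxes: Dict[str, Box], size: int) -> Dict[str, Image]:
    """Crop every labeled article and resize each crop to size x size."""
    return {
        label: resize_nearest(crop(image, box), size, size)
        for label, box in labeled_boxes.items()
    }


# Hypothetical input: a 4x6 image and two labeled boxes from a segmentation model.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
boxes = {"shirt": (0, 0, 4, 2), "shoes": (2, 2, 6, 4)}
crops = crop_and_resize(image, boxes, size=2)
```

Each entry in `crops` now has the same shape, so every crop can be passed to the same embedding model before the vectors are stored.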
VIDEOS
Frank's RedHot Takes
- OpenAI DevDay 2023 Retrieval API
- Vector databases are here to stay, even in the age of long-context models
GITHUB REPOS
Milvus Vector Database. Milvus is an open-source vector database used to store, index, and manage massive embedding vectors generated by deep neural networks and other machine learning (ML) models.
GPTCache. GPTCache is an open-source tool designed to improve the efficiency and speed of GPT-based applications by implementing a cache to store the responses generated by language models.
VectorDBBench. VectorDBBench is an open-source benchmarking tool to help you evaluate the performance of mainstream vector databases and cloud services with your specific use case.
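The idea behind caching language model responses, as in the GPTCache description above, can be illustrated with a small sketch. This is not GPTCache's API: `ResponseCache`, `ask`, and `fake_model` are hypothetical names, and the sketch only matches prompts exactly (after normalizing case and whitespace), which is an assumption made to keep the example short.

```python
import hashlib
from typing import Callable, Dict, List


class ResponseCache:
    """Hypothetical exact-match cache in front of a text-generation function."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate          # the (possibly slow or costly) model call
        self._store: Dict[str, str] = {}   # cache key -> stored response
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Normalize case and whitespace so trivially different prompts share a key.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ask(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._generate(prompt)
        self._store[key] = response
        return response


calls: List[str] = []


def fake_model(prompt: str) -> str:
    """Stand-in for a real language model call; records each invocation."""
    calls.append(prompt)
    return f"answer to: {prompt}"


cache = ResponseCache(fake_model)
first = cache.ask("What is Milvus?")
second = cache.ask("  what is   milvus? ")  # served from the cache, no model call
```

The second request never reaches `fake_model`, which is where the speed and cost savings of a response cache come from.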