In the last post, I gave a brief introduction to the AI topic, to myself, and to how I started working with it. Now it’s time to start looking at some AI concepts that may help us understand how it works.
Disclaimer:
I’m not an ML specialist. This is the shallowest of shallow descriptions of some AI-related concepts. This is NOT a guide or tutorial of any sort. There might be flaws, misconceptions, and incomplete information. This is a brief summary of my current understanding of these topics, and you should treat it as a starting point to think about them and look for more content from specialized sources. And, of course, I’m more than happy if you can share with me what you’ve learned differently, correct me if I say anything inaccurate, or add further details to any related topic.
That said, let’s get started.
Concepts
What’s Artificial Intelligence (AI)?
The concept of Artificial Intelligence was first brought up as a field of study in 1956 by John McCarthy at the Dartmouth Conference, and many believe this is AI’s birthdate. However, the core concept can be traced far back in time — and when I say “far” I do mean it.
If we understand the AI concept as a trained humanlike intelligence that resides outside a life form, then we can go back to somewhere around 700 B.C., when, according to historians, Hesiod and Homer wrote their famous Greek poems. In them, Hephaestus forged Talos from bronze, made women of gold, and built automata devices, and to some of them he gave intelligence — pretty much what we call Artificial Intelligence (or Artificial General Intelligence) nowadays. And, considering that from the mythological perspective even humankind was made of mud and then granted intelligence, could we also say we are god-made Artificial Intelligence? Well, that’s a topic for another philosophical discussion.
The point is that the AI concept is not something new; it has been in the social imaginary and collective unconscious for a very long time. Only recently, with advances in technology and research, have we been able to reach a new stage of AI usage, with machines accomplishing human tasks. And in 2022, OpenAI unlocked a myriad of new possibilities by releasing ChatGPT.
Well, as this is supposed to be a more technical view of AI concepts, I’ll try my best to keep social and philosophical topics out of this from now on. (Sorry about that.)
So, going straight to the point and answering the question “What is Artificial Intelligence?”, I could briefly say: AI is a concept that represents the ability of a machine to simulate humanlike intelligence and mimic human skills to learn, solve problems, and make decisions, based on pre-trained data and built on concepts and techniques such as Neural Networks, Machine Learning, Deep Learning, and Data Science.
What’s a Neural Network?
A Neural Network follows the same concept as our brain’s neural system, which is responsible for our learning process. A neural network is made of functions (neurons) built to approximate other functions. A neuron is, at its core, a linear function whose parameters (weights) and biases are adjusted over many attempts until it produces the expected output for a given input. This trial-and-error adjustment is driven by an algorithm known as Backpropagation. The target function may never be perfectly reproduced; instead, an approximation is found that is good enough to output the expected results.
Neurons also need an activation step in order to model anything beyond simple linear relationships. This is done by passing the output of the linear function through a non-linear (activation) function. And the total number of parameters used to train a network is what the “n b(illion)” means when we see Large Language Model names like “Llama-2-7b”, which has roughly 7 billion parameters. The parameter count alone doesn’t matter that much; what’s most important is how the neural network has been trained, so it can reach a faster and more precise approximation of the expected output for a given input.
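To make this a bit more concrete, here’s a minimal sketch in TypeScript of a single artificial neuron: it multiplies each input by a weight, adds a bias, and passes the result through a non-linear activation function (ReLU here). The inputs, weights, and bias are made-up values, purely for illustration.

```typescript
// A minimal sketch of a single artificial neuron (illustrative values only).
// output = activation(w1*x1 + w2*x2 + ... + bias)

// ReLU: a common non-linear activation function.
const relu = (x: number): number => Math.max(0, x);

function neuron(inputs: number[], weights: number[], bias: number): number {
  // Weighted sum of the inputs (the "linear" part of the neuron).
  const weightedSum = inputs.reduce((sum, x, i) => sum + x * weights[i], bias);
  // Applying a non-linear activation is what lets stacked neurons
  // approximate functions that a purely linear model cannot.
  return relu(weightedSum);
}

// Hypothetical inputs and parameters, just to show the mechanics.
console.log(neuron([0.5, -1.2, 3.0], [0.8, 0.1, 0.4], 0.2));
```

Training is essentially the process of nudging those weights and the bias, over and over, until the outputs get close to what we expect.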
Basically, a neural network is capable of approximating the solution to pretty much any computing problem, as long as its layers are deep enough to handle the complexity involved. Stacking many of these layers is what’s called Deep Learning.
All of this is done through mathematical calculations using vectors and matrices.
What are vectors and matrices?
Before explaining vectors and matrices, we first need to recall that everything in computing is represented by numbers: the characters you type on your keyboard, the pixels that form the images you see on your screen. Everything is numbers. This is where vectors come in.
A vector is a list of numbers; we can think of it as an array. A vector can, for example, contain the numerical representation of words. A single number is a scalar value and is easily processed by CPUs. Each number in a vector is stored with a certain precision, like 16 bits, 32 bits, and so on. We can think of this precision as the “resolution” of the vector: the more bits, the more detailed the data, but the more resources it needs.
A matrix, on the other hand, is a representation of a list of vectors (or an array of arrays). As humans, we can visualize only 3 dimensions (X, Y, and Z), but these structures can be extended to many dimensions (tensors) that we can only reason about through mathematics.
Processing matrices requires a huge amount of multiplication, and that’s why GPUs are needed: they are built to run large numbers of multiplications in parallel, efficiently. Originally, GPUs were designed for image processing, as images are represented as matrices of pixel values. But since LLMs also rely on matrix processing, GPUs became the best tool for the job.
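To give a rough idea of the kind of math involved, here’s a small TypeScript sketch of multiplying a matrix by a vector — the basic operation that GPUs are optimized to run in parallel at massive scale. The numbers are arbitrary.

```typescript
// Multiplying a matrix by a vector: each output element is a dot product
// of one matrix row with the vector. GPUs run millions of these in parallel.
function matVec(matrix: number[][], vector: number[]): number[] {
  return matrix.map((row) =>
    row.reduce((sum, value, i) => sum + value * vector[i], 0)
  );
}

// A 2x3 matrix times a 3-element vector gives a 2-element vector.
const matrix = [
  [1, 2, 3],
  [4, 5, 6],
];
const vector = [7, 8, 9];
console.log(matVec(matrix, vector)); // [50, 122]
```

A language model does this kind of operation on matrices with billions of values, which is where the GPU demand comes from.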
What’s Embedding?
A common term we see when working with Machine Learning and Large Language Models is “embedding”. An embedding model converts words (or whole sentences) into a vector representation (an array of numbers) that captures their meaning. These vectors can then be stored in a vector store and used for semantic similarity search.
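As a rough illustration of what “semantic similarity search” means, here’s a TypeScript sketch comparing tiny, made-up embedding vectors with cosine similarity — the higher the score, the closer the meanings. Real embeddings come out of a trained model and have hundreds or thousands of dimensions.

```typescript
// Cosine similarity between two embedding vectors: close to 1 means
// "pointing the same way" (similar meaning), close to 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return dot / (normA * normB);
}

// Toy 4-dimensional "embeddings" (invented values, just for illustration).
const cat = [0.9, 0.1, 0.3, 0.0];
const kitten = [0.85, 0.15, 0.35, 0.05];
const car = [0.1, 0.9, 0.0, 0.4];

console.log(cosineSimilarity(cat, kitten)); // high score: similar meaning
console.log(cosineSimilarity(cat, car));    // much lower score: different meaning
```

A vector store essentially runs this kind of comparison between your query’s embedding and the embeddings it has stored, returning the closest matches.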
What is GPT?
GPT, largely known from ChatGPT, stands for Generative Pre-trained Transformer. It’s basically an AI that uses techniques to, given a sequence of words, predict the next word according to its probability. Contrary to what people often believe, GPT does not create anything new; it just uses its training data to generate a sequence of words by predicting one after another.
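To illustrate the “predict the next word” idea, here’s a toy TypeScript sketch: given a probability distribution over candidate next words (which in a real model comes out of the transformer), it simply picks the most likely one. The words and probabilities here are invented for the example.

```typescript
// Toy next-word prediction. A real GPT computes these probabilities with a
// transformer over billions of parameters; here they are simply hard-coded.
const nextWordProbabilities: Record<string, number> = {
  intelligence: 0.62,
  assistant: 0.21,
  model: 0.16,
  banana: 0.01,
};

// Greedy decoding: pick the word with the highest probability.
function predictNextWord(probabilities: Record<string, number>): string {
  return Object.entries(probabilities).reduce((best, candidate) =>
    candidate[1] > best[1] ? candidate : best
  )[0];
}

console.log(predictNextWord(nextWordProbabilities)); // "intelligence"
```

Real models don’t always take the single most likely word; they often sample from the distribution, which is part of why answers vary between runs.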
What’s LLM?
LLM stands for Large Language Model. It’s an AI system trained to generate human-like text using NLP (Natural Language Processing), based on patterns learned from vast amounts of data and encoded in a huge number of parameters. GPT is considered an LLM. It can be used for a large variety of use cases like summarization, code generation, text generation, question answering, and more.
We can say LMMs, or Large Multimodal Models, are the evolution of LLMs. While LLMs are capable of receiving a single input and output type, namely text, LMMs can receive and produce multiple types like text, images, videos, etc. It means an LMM can receive a text and an image as input and output a video, or any other combination.
Some common LLMs are: GPT-3.5, Mistral, Llama2
Some common LMMs are: GPT-4, Llava, Gemini, Claude-3
How does all of this come together?
Now that we’ve covered the basics, let’s see how everything works together.
Let’s say that OpenAI used Neural Networks, Machine Learning, Deep Learning, and Data Science to train a GPT with 175 billion parameters. Then you open the ChatGPT website and type “what is AI?”. Your prompt is embedded into a vector representation, and the GPUs on OpenAI’s servers perform an enormous number of matrix calculations, multiplying the vector representation of your prompt against the model’s pre-trained parameters to find the word with the highest probability of coming next. A sequence of words is displayed, and it feels like someone is really typing them as they appear, but in fact the words are being calculated one by one in real time. You get your answer and are mesmerized by how amazing ChatGPT is, but it “just” auto-completed your question in a way that looks like an answer. It’s really fascinating how well it achieves these results and how many possibilities such technology opens up, even though the GPU consumption can be pretty high.
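Putting it all together, here’s a very rough TypeScript sketch of that word-by-word generation loop: the prompt is turned into numbers, the model predicts the next word, that word is appended, and the process repeats. The `embed` and `predictNextWord` functions below are crude stand-ins for the real (and far more complex) model components.

```typescript
// A toy generation loop. `embed` and `predictNextWord` are placeholders for
// the real embedding and transformer steps, which are vastly more complex.
function embed(text: string): number[] {
  // Stand-in: real embeddings come from a trained model.
  return text.split("").map((c) => c.charCodeAt(0) / 255);
}

function predictNextWord(contextVector: number[]): string {
  // Stand-in: a real model runs huge matrix multiplications here.
  const vocabulary = ["AI", "is", "a", "field", "of", "computer", "science", "."];
  const index = Math.floor(contextVector.reduce((s, x) => s + x, 0)) % vocabulary.length;
  return vocabulary[index];
}

function generate(prompt: string, maxWords: number): string {
  let text = prompt;
  for (let i = 0; i < maxWords; i++) {
    const nextWord = predictNextWord(embed(text)); // one word at a time
    text += " " + nextWord;
    if (nextWord === ".") break; // stop when the "model" decides it's done
  }
  return text;
}

console.log(generate("what is AI?", 10));
```

The streaming effect you see in ChatGPT is simply this loop showing each newly predicted word as soon as it’s computed.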
The 1-bit LLM
When dealing with vectors and matrices, there’s a technique called quantization, which means reducing the numerical precision (bit width) of the values, thus optimizing energy and storage consumption and increasing response speed. But it comes with a trade-off: reduced accuracy in the LLM’s responses.
Recently a Microsoft team released a paper introducing the 1-bit LLM concept. It suggests a different approach to dealing with vectors and matrices, reducing the precision of the model’s weights from 32 or 16 bits to 1.58 bits (in practical terms, about 2 bits per value). It means each weight can hold only the values -1, 0, or 1. Their benchmarks show really nice reductions in energy and storage consumption, as well as drastically faster responses. We could even question the need for GPUs for serving LLM responses, as the multiplications in the matrix calculations would be replaced by additions — and those can be handled well by CPUs, which are cheaper and demand less energy. This is a huge win when we think of the real costs that big tech companies are paying to let us use GPT nowadays, or for companies that are self-hosting their LLMs and investing tons of money in cloud services. However, some experts are really unsure whether the 1-bit LLM will work as well as the paper suggests, especially when we consider the decrease in accuracy it may bring. This is something for us all to figure out in the near future once they release the BitNet b1.58 model.
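As a rough illustration of why this helps, here’s a TypeScript sketch that quantizes weights to the values -1, 0, and 1, and then computes a dot product without any multiplication: each weight either adds the input, subtracts it, or skips it. This is only a simplified sketch of the idea behind the paper, not the actual BitNet implementation, and the threshold and values are invented.

```typescript
// Quantize each weight to -1, 0, or 1 (a simplified view of 1.58-bit weights).
function quantizeTernary(weights: number[], threshold = 0.3): number[] {
  return weights.map((w) => (w > threshold ? 1 : w < -threshold ? -1 : 0));
}

// With ternary weights, a dot product needs no multiplications:
// +1 adds the input, -1 subtracts it, 0 skips it entirely.
function ternaryDot(inputs: number[], ternaryWeights: number[]): number {
  return ternaryWeights.reduce((sum, w, i) => {
    if (w === 1) return sum + inputs[i];
    if (w === -1) return sum - inputs[i];
    return sum; // weight is 0: skip this input
  }, 0);
}

const weights = [0.9, -0.7, 0.1, -0.05]; // made-up full-precision weights
const inputs = [2.0, 3.0, 1.5, 4.0];

const qWeights = quantizeTernary(weights); // [1, -1, 0, 0]
console.log(ternaryDot(inputs, qWeights)); // 2.0 - 3.0 = -1
```

Additions and subtractions like these are exactly what CPUs handle cheaply, which is why the paper raises the question of whether GPUs are still required for inference.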
This approach, however, looks game-changing for serving LLMs, not for training them. Even if the demand for GPUs wouldn’t fall to zero, I wonder how it would impact NVIDIA’s market value if GPUs become less needed for the AI race, and how interested Microsoft might be in making that happen — but that’s a topic for a different discussion.
OpenAI and Anthropic directions
Recently, the CEOs of both companies leading the AI race nowadays, along with Microsoft and Meta, showed interest in making AI an even more immersive tool for daily tasks. Some people speculate that with GPT-5 and Claude-4 we might see some sort of feature that integrates with our computers, mobile devices, and gadgets.
This is not something totally new, as there are already projects moving in that direction. There’s an open-source project that really stands out in this space called Open Interpreter. It basically allows us to chat with our computer and ask it to complete tasks for us, from simple things like “move this file from folder A to folder B” to more complex things like scheduling a meeting and sending emails.
This is just a glimpse of how we’re only at the beginning of the adoption of AI in our daily routines. There are still a lot of fields to explore and solutions to be developed.
Conclusion
In this post, we’ve taken a look at the world of Artificial Intelligence, exploring its origins, its core concepts, and its current applications. We’ve discussed the concept of AI, Neural Networks, vectors and matrices, embedding, GPT, LLMs, the exciting new concept of 1-bit LLMs, and what the big companies seem to be interested in.
Remember, this is just a brief overview of these topics. I encourage you to continue your own research and deepen your understanding of these fascinating concepts. And, as always, I welcome your thoughts, corrections, and additions to this discussion.
For the next posts, we’ll start implementing solutions using mainly JavaScript/TypeScript (NextJS) and a bit of Python.
Stay tuned for the next part of the AI series!