DEV Community

# transformers

Posts

Chapter 12: Inference - Generating New Text

9 min read
Chapter 11: The Full GPT - Assembling the Model

10 min read
Chapter 9: Single-Head Attention - Tokens Looking at Each Other

9 min read
Chapter 8: RMS Normalisation and Residual Connections

4 min read
Beating Eager TurboQuant Was Not Enough: Why Dense GPU Attention Still Won

8 min read
Chapter 7: The Training Loop and Adam Optimiser

7 min read
Chapter 6: Embeddings, the Forward Pass, and the Loss Function

7 min read
Mamba vs. Transformers: Architecture Comparison

5 min read
Without Google's transformers, there is no GPT-ishs

6 min read
Chapter 5: Linear Transformation and Softmax

4 min read
Chapter 4: The Bigram Model - Simplest Possible Language Model

5 min read
Chapter 3: The Tokenizer - Text to Numbers and Back

2 min read
Chapter 2: Backward - Automatic Gradient Computation

7 min read
Chapter 1: The Value Class - Recording the Forward Pass

10 min read
Why AGI is not possible with the current LLMs and transformers

6 min read