DEV Community

# attention

Posts

TurboQuant: How a Simple Spin Saves Gigabytes of GPU Memory
6 min read
Attention Residuals: How Kimi Is Rethinking Transformer Depth
3 min read
Replacing Dot-Product Attention with RBF-Attention: Technical and Computational Challenges and Solutions
20 min read
Attention Refinery
8 min read
Anonymous User Claims Proof of d^2 Complexity for Attention Mechanisms, Challenging Transformer Optimization
10 min read
Loot Systems and the Illusion of Progress
4 min read
brain defrag: time away from screens (and from "one more" with ai)
4 min read
Identifying Early Warning Signs of Attention Mechanism Instability
5 min read
The Intricate Dance of Self-Attention: What Can Go Wrong?
5 min read