DEV Community

# attention

Posts

A Classic Efficiency Trick Just Moved Into a New Part of the AI

Comments 1
3 min read
DeepSeek's new open models give everyone a million-word memory by default

Comments
3 min read
The Major Disruptions and Optimizations MiniMax M3 Makes to the LLM Attention Mechanism

Comments
2 min read
A Looming Crisis of AI Generated Text

Comments
4 min read
TurboQuant: How a Simple Spin Saves Gigabytes of GPU Memory

Comments
6 min read
Attention Residuals: How Kimi Is Rethinking Transformer Depth

Comments
3 min read