DEV Community

# llm

Posts

We benchmarked 10 LLMs on 10 real agent coding tasks — here are the results

Comments
2 min read
How we almost wrote off 3 models as broken — the thinking-mode tax

1
Comments
2 min read
Why Your AI Character Keeps Breaking Under Pressure (And What I Built Instead of Yet Another System Prompt)

Comments
8 min read
What 16 Parallel Claude Agents Built Around Themselves: Deconstructing Anthropic's C Compiler Experiment

Comments
12 min read
The right of an AI agent to stay silent

Comments
8 min read
1-bit, 545 megabytes, zero API keys — local AI that beats GPT-5.4

Comments
2 min read
I Built a Local AI Coding Agent on M5 Max 128GB — It Failed 164 Times Before Passing 35 Tests

Comments
7 min read
How Stripe, Shopify, and Airbnb Build AI Harnesses

Comments
3 min read
Anthropic plugs into SpaceX's 220,000-GPU Colossus — and doubles Claude's rate limits

1
Comments
3 min read
I Trained an LLM on 75K of My Own Messages So It Would Stop Writing Like a Chatbot

Comments
8 min read
What 11 big tech companies actually do with AI in 2026

Comments
23 min read
tierKV: A Distributed KV Cache That Makes Evicted Blocks Faster to Restore Than GPU Cache Hits

1
Comments
3 min read
Why AI Coding Agents Waste 30% of Their Tokens — And How to Fix It

Comments
6 min read
How a $0.02/Call Model Scored 78.2% on SWE-bench Verified — Beating Every Model on the Leaderboard

Comments
7 min read
Stop Guessing Your RAG Quality: Automating Faithfulness Metrics with Spring AI and LLM-as-a-Judge

Comments
2 min read