DEV Community

jidonglab

1 project a week. Building and sharing the entire process — from idea to shipped product in 7 days. Currently: AI news automation.

Return Claude's Thinking Blocks or Your Agent Breaks

6 min read

Binary Quantized Embeddings: 32x Smaller Vectors, Recall Intact

7 min read
Prefill/Decode Disaggregation: Stop Serving LLMs on One GPU

6 min read
Chunked Prefill: Why One Long Prompt Freezes Your LLM Server

7 min read
How I Made Opus 4.8 Act Like Fable 5 (64% → 97%, Measured)

7 min read
FP8 KV Cache Quantization: The Memory Math and the Accuracy Cliff

6 min read
Why LLM Decoding Is Memory-Bound: Prefill vs Decode Roofline

6 min read
Min-p Sampling: Why Top-p Breaks at High Temperature

7 min read
Matryoshka Embeddings: Truncate Vectors 12x Without Losing Recall

6 min read
The Hidden Token Tax of Tool Use in LLM Agents

7 min read
Grouped-Query Attention: The KV Cache Math Behind Long Context

6 min read
RoPE Scaling: How LLMs Stretch From 8K to 128K Context

7 min read
Prompt Caching With Claude: Where the Cache Breakpoint Goes

6 min read
Attention Sinks: Why Evicting Your LLM's First Token Breaks It

7 min read
Why Your LLM-as-Judge Disagrees With Itself (And How to Fix It)

7 min read
Speculative Decoding: Why Two Models Decode Faster Than One

7 min read
Structured Output Isn't Free: The Constrained-Decoding Tax

6 min read
71,700 Stars and 60 Rust Crates: Inside OpenAI's Codex CLI Source

6 min read
Pentagon Blacklisted Anthropic From 8 Classified AI Deals

7 min read
Anthropic $900B: 2.4x in 90 Days, 48-Hour Window

5 min read
Symphony: Why OpenAI's PRs Jumped 500% in 3 Weeks

5 min read
GPT Image 2 Inside Codex: My New Frontend Workflow

6 min read
GPT-5.5-Codex vs 5.3: A 200-Task Bench Result

6 min read
Codex Is No Longer a CLI. Embed It in Your App.

6 min read
I Gave Codex My Mouse for a Day. Here's What Broke.

6 min read
OpenAI's Super App Play: Why Spud + Duct Tape Matter for Builders

8 min read
Why OpenAI Shipped GPT-5.5 Just 6 Weeks After 5.4

8 min read
OpenCode Hit 140K Stars. Why Terminal Agents Won 2026.

7 min read
How a Markdown File Hit 16K Stars: Skills in 2026

7 min read
MIT Tech Review Just Split Its "Breakthrough" List. That's the Story.

7 min read
Stellantis Just Outsourced Its AI Moat to Microsoft. Expect GM, Ford, and VW to Follow.

8 min read
44% of New Music on Deezer Is AI. Only 0.5% of Streams Are. Read That Twice.

7 min read
Adobe Just Made MCP an Enterprise Procurement Line Item

7 min read
Claude Opus 4.7 Hit 87.6% on SWE-bench. The Story Is What It Didn't Charge You.

7 min read
I Built an AI Newsletter for Myself. 11 Subscribers, 49 Posts, Zero Regrets.

6 min read
DataForge, Atropos, and a 30K-Token Guillotine: Reverse-Engineering Hermes 4's Training Stack

8 min read
The Honest Hermes 4 Production Checklist (April 2026 Edition)

7 min read
66.5% of My Claude Code Tokens Were Wasted. A 200-Line Wrapper Got Them Back.

7 min read
Hermes 4's Tool-Calling Is Trained as a Separate Skill. Here's Why Your Agent Cares.

6 min read
Claude Design Ships On Canva's Engine. Figma Has a Problem.

6 min read
Go Goroutine Crashes: 97% of the Output Is Noise

2 min read
The 96.3% Is a Trap: What Hermes 4 405B Actually Changed

8 min read
How I Built a 1,056-Test Rust CLI in 3 Weeks

2 min read
I read all 232 pages of the Opus 4.7 system card

8 min read
Opus 4.7 killed budget_tokens: what changed and how to migrate

7 min read
OpenAI's 'duct-tape' model appeared on Arena — then vanished

6 min read
The npm Deprecated Warning Nobody Reads (But Claude Does)

2 min read
Reddit's Biggest Coding Community Just Banned AI Content — The Developer Backlash Against AI Slop Begins

3 min read
Nature Report: Best AI Agents Still Score Half of Human Scientists — A Reality Check for the Agent Hype

4 min read
Meta Ditched Llama for a Closed Model Called Muse Spark — Open Source AI Just Lost Its Biggest Champion

4 min read
ASML Raises 2026 Guidance to €40B After Q1 Beat — AI Is Rewriting the Chip Supply Chain

4 min read
OpenAI Bought Its Second Fintech Startup in 6 Months. ChatGPT Is Coming for Your Wallet.

2 min read
Google Chrome Just Got AI Macros — Save Any Prompt, Run It Anywhere With One Click

2 min read
Revolut Trained an AI on 40 Billion Banking Events. The Results Are Wild.

2 min read
From $7.5B to $18B in 4 Months: The AI Infrastructure Gold Rush Nobody Saw Coming

2 min read
5 Seconds to Install. 60-90% Less Noise. Forever.

2 min read
Track Every Token You Save With contextzip gain

2 min read
Zero Config, Zero Overhead: The Invisible CLI Proxy

2 min read
Before/After: What Claude Code Actually Sees

2 min read
Why Your pip Install Output Doesn't Belong in Claude's Context

2 min read