DEV Community

# gpu

Posts

- Nvidia GreenBoost Lets You Fake More VRAM — And It Actually Kind of Works (4 min read)
- Boost Local LLMs: TurboQuant KV Cache, Fast Cold Starts, & Rust GPU Dev (4 min read)
- Fix Zombie VRAM: Clear GPU Memory Without Rebooting (4 min read)
- I shipped Google's TurboQuant as a vLLM plugin 72 hours after the paper — here's what nobody else tested (3 min read)
- Local LLM vs Claude for Coding: I Benchmarked a $500 GPU Against Cloud AI [2026] (8 min read)
- Local LLM Power-Ups: Voxtral TTS, TurboQuant, & Sub-Second Cold Starts (3 min read)
- Compressed VLM inference from a single Containerfile — turboquant-vllm v1.1 (2 min read)
- vLLM On-Demand Gateway: Zero-VRAM Standby for Local LLMs on Consumer GPUs (4 min read)
- Local LLM Unleashed: Faster Inference, Instant Starts, & Open TTS (4 min read)
- I Tried Speculative Decoding on RTX 4060 8GB — Every Config Was Slower Than Baseline (8 min read)
- Local LLM Security Criticals, Rust on GPU, & Deep Dive into PTX Optimization (3 min read)
- Building a Cost-Effective Local AI Server in 2026: Proxmox, PCIe Passthrough, and Surviving the GPU Shortage (4 min read)
- Introducing vMetal: Run Your GPU Data Center Like a Hyperscaler (4 min read)
- I Rented Out My GPU for Passive Income — Here's What Happened After My First Week (4 min read)
- Running a 4-Agent AI Fleet on a Single NVIDIA RTX 3060 Ti (6 min read)