DEV Community

# inference

Posts

The Inference Inversion

7 min read
Muse Spark beats Llama 4 with 10x less compute. Here's how.

7 min read
First Words: LLM Inference on RISC-V

9 min read
Gaussian Process Regression: The Bayesian Approach to Curve Fitting

13 min read
Google Dropped TurboQuant Two Weeks Ago. The Community Already Made It Usable.

6 min read
Hierarchical Bayesian Regression with PyMC: When Groups Share Strength

13 min read
From MLE to Bayesian Inference: Why Your Estimate Needs a Prior

15 min read
The EM Algorithm: An Intuitive Guide with the Coin Toss Example

10 min read
Maximum Likelihood Estimation from Scratch: From Coin Flips to Gaussians

13 min read
DGX Spark Inference Performance: Local LLM vs Cloud Benchmarks (2026)

5 min read
Estimating Operational Costs for CLIP-Based Image Search on 1 Million Images: Infrastructure Expenses Focused

12 min read
I built an Ollama alternative with TurboQuant, model groups, and multi-GPU support

4 min read
How to Optimize AI Agent Costs — Inference, API Calls, and Infrastructure

3 min read
Why Inference Compression Compounds for Modular Agents

4 min read
Model Serving Infrastructure: Building Scalable Inference

7 min read