DEV Community

# pytorch

Posts

Speculative decoding shifted our output distribution and evals missed it

4 min read
Winograd convolutions cost us 2 mAP and we didn't notice for a month

4 min read
Developer Take On: A High-Resolution Neural Cellular Automata

4 min read
A 9-point eval gain vanished when we deduped train against test

4 min read
SO-HMS: A Universal Optimization Framework for Complex Multi-Objective Systems

1 min read
PyTorch from Scratch — Part 1: Tensors, Gradients & Activations

5 min read
AI Coding Tools for Machine Learning Engineers in 2026: Jupyter, PyTorch, and the CUDA Trap

5 min read
Carbon-Aware Model Training: Scheduling GPU Workloads Around Electricity Carbon Intensity

7 min read
Why JAX Is a Much Better Backend for Quantum Circuit Simulation Than PyTorch

3 min read
Our event-camera detector lost 6 mAP to a badly chosen accumulation window

4 min read
From Bayesian to deep knowledge tracing — upgrading NumPath's student model with a PyTorch LSTM

5 min read
QAT vs PTQ on our edge vision model: 6 months of A/B data

4 min read
Structured channel pruning got our detector under 12ms on a Jetson

4 min read
Serving 40 LoRA adapters on one base model: the throughput we got

4 min read
torch.compile recompiled our SDXL UNet 38 times in production

4 min read