DEV Community

# cuda

Posts

ROCm RX 6700 XT Installation Guide on Ubuntu 24.04

Comments
2 min read
CUDA Kernel Execution Debugging Journey

1
Comments
3 min read
Evolution of GPU Programming

Comments
26 min read
Custom CUDA Kernels Outperforming cuBLAS: Deep Dive into GPU Memory Optimization for Small-Batch ML Workloads

Comments
9 min read
Single bash script to install CUDA 12.8 on Ubuntu

Comments
2 min read
Just finished my GGUF-Shard

Comments
1 min read
Demystifying GPUs: From Core Architecture to Scalable Systems

81
Comments 2
12 min read
"A wild goose never laid a tame egg" - I rebuild the Xerxes DDoS Tool

1
Comments 2
6 min read
#Day1 of My Journey to Google

Comments
1 min read
WSL2 TensorFlow GPU Setup – RTX 4060 + Ubuntu 22.04 + CUDA 12.2 + cuDNN

Comments
2 min read
CUDA Deep Dive: Demystifying Kernels, Thread Hierarchies, and the GPU Execution Model: P-1

Comments 2
8 min read
NVIDIA CUDA Toolkit 12.8

2
Comments
2 min read
Building a JS pytorch clone: Performance investigation

Comments
9 min read
CUDA Series (2/3)

Comments
5 min read
CUDA Series (1/3)

5
Comments
11 min read
Implementing DeepSeek-R1 Tool Calls with OpenWebUI and Llama.cpp for Local AI Workflows

Comments
2 min read
Accelerating OpenCV with CUDA on Jetson Orin NX: A Complete Build Guide

Comments
4 min read
Running Nvidia COSMOS on A100 80Gb

2
Comments
2 min read
Global vs Static in C++

Comments
1 min read
OpenMP Data-Sharing Clauses: Differences Explained

2
Comments
2 min read
"Learn HPC with me" kickoff

Comments
1 min read
Snooping on your GPU: Using eBPF to Build Zero-instrumentation CUDA Monitoring

7
Comments 1
15 min read
Qt error when opening ncu-ui

Comments
1 min read
Using Polars/Tensorflow with NVIDIA GPU (CUDA), on Windows using WSL2

3
Comments
4 min read
Lattice Generation using GPU computing in realtime

Comments
1 min read