how-to-optim-algorithm-in-cuda

CUDA, GPU kernel, and AI infrastructure optimization notes.

This repository collects hands-on CUDA kernels, CUTLASS/CuTe notes, Triton examples, PTX ISA notes, PyTorch internals notes, and LLM inference/training optimization material. It is one of my main public study and engineering notebooks for GPU systems work.

Repository Map

cuda-kernels/: handwritten CUDA kernels for reduce, softmax, elementwise, GEMV, indexing, atomic add, upsampling, and linear attention.
cuda-mode/: notes and code from the CUDA-MODE lecture series.
cutlass/: CUTLASS and CuTe DSL notes, including GEMM, TMA, WGMMA, swizzling, and instruction-level material.
triton/: Triton kernels, PyTorch interop examples, and meetup notes.
large-language-model/: LLM serving, training, and systems optimization notes.
pytorch/: PyTorch internals and CUDA-related notes.
papers/: GPU architecture and ML systems paper notes.
ptx-isa/: PTX ISA study notes.
tools/: small helper scripts.
deprecated/: older material kept for reference.

Related Repositories

CUDA and GPU optimization: this repository.
Deep learning compiler notes: https://github.com/BBuf/tvm_mlir_learn
Deep learning framework notes: https://github.com/BBuf/how-to-learn-deep-learning-framework

Status

This repository is actively curated, with a focus on CUDA kernels, LLM inference optimization, and AI infrastructure. Older Chinese-language notes are being consolidated or replaced with English entry points.
