Trending

See what the GitHub community is most excited about this month.

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda · 30,391 stars · 3,677 forks

368 stars this month

NVIDIA / cuopt

GPU accelerated decision optimization

Cuda · 961 stars · 203 forks

63 stars this month

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda · 3,501 stars · 299 forks

106 stars this month

mirage-project / mirage

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

Cuda · 2,347 stars · 226 forks

68 stars this month

NVIDIA / cudf-spark-jni

RAPIDS Accelerator JNI For Apache Spark

Cuda · 59 stars · 85 forks

4 stars this month

NVIDIA / cuCollections

Cuda · 652 stars · 112 forks

6 stars this month

NVIDIA / cuvs

cuVS - a library for vector search and clustering on the GPU

Cuda · 795 stars · 197 forks

28 stars this month

alibaba / rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

Cuda · 1,244 stars · 222 forks

99 stars this month

BBuf / how-to-optim-algorithm-in-cuda

How to optimize algorithms in CUDA.

Cuda · 3,116 stars · 283 forks

77 stars this month

NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more

Cuda · 17,457 stars · 2,066 forks

60 stars this month

thu-ml / SageAttention

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda · 3,451 stars · 437 forks

63 stars this month

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient BLAS kernel library on GPU

Cuda · 7,460 stars · 1,079 forks

162 stars this month

NVIDIA / nvbench

CUDA Kernel Benchmarking Library

Cuda · 880 stars · 109 forks

15 stars this month

princeton-vl / lietorch

Cuda · 840 stars · 102 forks

4 stars this month

rahul-goel / fused-ssim

Lightning fast differentiable SSIM.

Cuda · 231 stars · 81 forks

4 stars this month

NVIDIA / nccl-tests

NCCL Tests

Cuda · 1,570 stars · 385 forks

34 stars this month

NVIDIA / cub

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda · 1,834 stars · 462 forks

6 stars this month