sm121
Here are 11 public repositories matching this topic...
7.67× LoRA / 8.35× Full FT speedup for Qwen3.5 (0.8B–27B) on NVIDIA DGX Spark — wall-clock parity with rented H100. Lossless within BF16. Three-command interactive wizard handles model picker, data validator, training, and merge.
Updated May 19, 2026 - Python
Production runbook for Qwen3.5-122B hybrid INT4+FP8 on NVIDIA DGX Spark GB10 — optimization stack, PD firmware wedge diagnosis, bench results
Updated Jun 18, 2026
Patches + recipe to deploy festr2/MiMo-V2.5-Pro-NVFP4-MXFP8-attn-TP8 on 8-node DGX Spark sm_121 (Ray + vLLM, TP=8). Fixes the fused-qkv loader bug that mis-slotted Q values as K/V on 7 of 8 ranks.
Updated May 19, 2026 - Python
Empirical kernel scheduling characterization for NVIDIA GB10 (SM121a). Sweeps GEMM tile configurations, classifies PTX instruction paths, captures hardware telemetry
Updated May 10, 2026 - C++
Pre-built PyTorch wheels and build scripts for NVIDIA DGX Spark (GB10, sm_121, Blackwell, CUDA 13.0, ARM64)
Updated Jun 25, 2026 - Shell
DGX Spark (GB10/SM121) platform support for Meta's KernelAgent — auto-detect, hardware constraints, safe Triton configs
Updated Mar 14, 2026 - Python
OpenMP Offloading Validation & Verification Suite; official repository. We have migrated from Bitbucket. For documentation, results, publications, and presentations, please check out our website ->
Updated Jun 15, 2026 - C