:octocat:
Exploring AI
  • Nexa AI Inc
  • Bay Area, United States

17 stars written in C++

LLM inference in C/C++

C++ 90,641 13,892 Updated Dec 1, 2025

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

C++ 76,930 8,303 Updated May 27, 2025

MLX: An array framework for Apple silicon

C++ 22,941 1,410 Updated Nov 27, 2025

Tensor library for machine learning

C++ 13,643 1,413 Updated Nov 24, 2025

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…

C++ 13,607 2,116 Updated Nov 20, 2025

lightweight, standalone C++ inference engine for Google's Gemma models.

C++ 6,628 576 Updated Nov 28, 2025

Diffusion model (SD, Flux, Wan, Qwen Image, ...) inference in pure C/C++

C++ 4,633 448 Updated Nov 30, 2025

⚠️DirectML is in maintenance mode ⚠️ DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tas…

C++ 2,534 327 Updated Sep 23, 2025

General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for…

C++ 2,392 180 Updated Nov 9, 2025

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

C++ 1,555 122 Updated Mar 23, 2025

Low-bit LLM inference on CPU/NPU with lookup table

C++ 896 74 Updated Jun 5, 2025

Suno AI's Bark model in C/C++ for fast text-to-speech generation

C++ 848 80 Updated Nov 16, 2024

C++ implementation of Qwen-LM

C++ 609 60 Updated Dec 6, 2024

Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for the AMD NPUs.

C++ 479 22 Updated Dec 1, 2025

Swift library to work with llama and other large language models.

C++ 275 55 Updated Sep 19, 2025

Vulkan tutorial in Chinese | Self-written | Object-oriented

C++ 191 17 Updated Oct 5, 2025

LLM inference in C/C++

C++ 47 5 Updated Nov 30, 2025