A high-performance inference engine for LLMs, optimized for diverse AI accelerators.
Updated Oct 31, 2025 - C++
A hands-on project for new-graduate recruiting (campus, fall, and spring rounds) and internships: build an LLM inference framework from scratch that supports LLama2/3 and Qwen2.5.
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
A high-speed, easy-to-use LLM serving framework for local deployment.
A chatbot framework built on various LLMs, supporting multiple languages, voice wake-up, voice conversation, and local function execution. Supported APIs include OpenAI, Grok, Claude, iFlytek Spark, Stable Diffusion, ChatGLM, Tongyi Qianwen (Qwen), Tencent Hunyuan, 360 Zhinao, Baichuan AI, Volcano Ark (Volcengine), Ollama, Gemini, and more.
An AI-powered embedded system that captures real-time images, generates descriptive captions using Qwen, and reads them out loud to assist the visually impaired.
LLM inference in C/C++. Pre-built binaries for arm, armv7l, and Raspberry Pi can be downloaded from the Releases page.
👁️ Deploy YOLO11 for efficient computer vision on edge devices, optimized for the Horizon X5 RDK with a streamlined C++ codebase.