💭 I may be slow to respond.

Pinned

  1. Fast-llm (Public)

    Forked from ztxz16/fastllm

    fastllm is a high-performance large-model inference library implemented in C++ with a dependency-free backend (it requires only CUDA and does not depend on PyTorch). It can run inference for the DeepSeek R1 671B INT4 model on a single RTX 4090, reaching 20+ tps per stream.

    C++