lahmuller

Pinned

  1. flash-attention Public

    Forked from ROCm/flash-attention

    Fast and memory-efficient exact attention

    Python

  2. nano-vllm Public

    Forked from GeeeekExplorer/nano-vllm

    Nano vLLM

    Python

  3. snowflakedb/ArcticInference Public

    ArcticInference: vLLM plugin for high-throughput, low-latency inference

    Python · 452 stars · 64 forks

  4. sgl-project/sglang Public

    SGLang is a high-performance serving framework for large language models and multimodal models.

    Python · 29.9k stars · 6.8k forks