Gregory-Pereira
🚀
keeping everything running

Pinned

  1. llm-d/llm-d (Public)

    Achieve state of the art inference performance with modern accelerators on Kubernetes

    Shell · 3.6k stars · 569 forks

  2. kubernetes-sigs/gateway-api-inference-extension (Public)

    Gateway API Inference Extension

    Go · 701 stars · 293 forks

  3. vllm-project/vllm (Public)

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 85k stars · 18.7k forks

  4. deepseek-ai/DeepEP (Public)

    DeepEP: an efficient expert-parallel communication library

    Cuda · 9.8k stars · 1.3k forks

  5. llm-d/llm-d-latency-predictor (Public)

    Latency prediction service for ML-model based scoring with llm-d-inference-scheduler

    Python · 6 stars · 14 forks

  6. llm-d-router (Public)

    Forked from llm-d/llm-d-router

    Inference scheduler for llm-d

    Go 1