Pinned

  1. understand-r1-zero Public

    Understanding R1-Zero-Like Training: A Critical Perspective

    Python · 1.1k stars · 54 forks

  2. zero-bubble-pipeline-parallelism Public

    Forked from NVIDIA/Megatron-LM

    Zero Bubble Pipeline Parallelism

    Python · 433 stars · 32 forks

  3. lorahub Public

    [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

    Python · 656 stars · 40 forks

  4. oat Public

    🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

    Python · 551 stars · 47 forks

  5. stde Public

    Official implementation of Stochastic Taylor Derivative Estimator (STDE), NeurIPS 2024

    Python · 122 stars · 8 forks

  6. feedback-conditional-policy Public

    Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"

    Python · 51 stars

Repositories

Showing 10 of 98 repositories
  • sail-sg/Precision-RL Public

    Defeating the Training-Inference Mismatch via FP16

    Python · 64 stars · MIT · 7 forks · 1 open issue · 0 open pull requests · Updated Oct 31, 2025
  • sail-sg/Precision-RL-verl Public

    Forked from volcengine/verl

    Defeating the Training-Inference Mismatch via FP16

    Python · 1 star · Apache-2.0 · 2,415 forks · 0 open issues · 0 open pull requests · Updated Oct 31, 2025
  • sail-sg/oat Public

    🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

    Python · 551 stars · Apache-2.0 · 47 forks · 3 open issues · 1 open pull request · Updated Oct 31, 2025
  • sail-sg/jrystal Public

    A JAX-based Differentiable Density Functional Theory Framework for Materials

    Python · 38 stars · Apache-2.0 · 2 forks · 5 open issues · 0 open pull requests · Updated Oct 28, 2025
  • sail-sg/SkyLadder Public

    Forked from jzhang38/TinyLlama

    The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling

    Python · 37 stars · Apache-2.0 · 576 forks · 0 open issues · 0 open pull requests · Updated Oct 15, 2025
  • sail-sg/NDA Public

    Code for "Nonparametric Data Attribution for Diffusion Models"

    2 stars · 0 forks · 1 open issue · 0 open pull requests · Updated Oct 14, 2025
  • sail-sg/tty-use Public

    C · 11 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Oct 13, 2025
  • sail-sg/imperceptible-jailbreaks Public

    [ArXiv 2025] Imperceptible Jailbreaking against Large Language Models

    Python · 22 stars · 4 forks · 0 open issues · 0 open pull requests · Updated Oct 7, 2025
  • sail-sg/feedback-conditional-policy Public

    Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"

    Python · 51 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Sep 29, 2025
  • sail-sg/variational-reasoning Public

    Code for "Variational Reasoning for Language Models"

    Python · 51 stars · 1 fork · 0 open issues · 0 open pull requests · Updated Sep 29, 2025
