yihedeng9

Pinned

  1. rlhf-summary-notes Public

    A brief and partial summary of RLHF algorithms.

    135 3

  2. OpenVLThinker Public

    OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement

    Python 116 6

  3. DuoGuard Public

    DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

    Python 27 4

  4. STIC Public

    Enhancing Large Vision Language Models with Self-Training on Image Comprehension.

    Python 70 4

  5. uclaml/SPIN Public

    The official implementation of Self-Play Fine-Tuning (SPIN)

    Python 1.2k 102

  6. uclaml/PDE Public

    Official repo of Progressive Data Expansion: data, code and evaluation

    Jupyter Notebook 29 1