🧠
E-com real estate app | Spring Boot·Keycloak·PG·Redis·Docker·Next.js·Flutter
  • Illinois

Deep Reasoning Translation (DRT) Project

239 9 Updated Sep 1, 2025

✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models

632 58 Updated Jun 16, 2025

RewardBench: the first evaluation tool for reward models.

Python 661 90 Updated Jun 12, 2025

A resource repository for representation engineering in large language models

142 5 Updated Nov 14, 2024

A resource repository for machine unlearning in large language models

509 30 Updated Jul 20, 2025

Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research

Python 262 39 Updated Nov 30, 2025

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Python 3,280 209 Updated Mar 5, 2024

[NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"

Jupyter Notebook 104 7 Updated Nov 9, 2023

[CVPR 2024] Implementation for "GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation"

Python 282 8 Updated Jun 12, 2024

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Python 750 52 Updated Sep 27, 2024

[ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models

Python 79 4 Updated May 29, 2025

[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Python 229 5 Updated Nov 7, 2025

Source code for the paper "Eraser: Jailbreaking defense in large language models via unlearning harmful knowledge".

Python 9 2 Updated Jul 8, 2024

The fastai book, published as Jupyter Notebooks

Jupyter Notebook 24,105 9,295 Updated Aug 16, 2024

Convert TensorFlow, Keras, TensorFlow.js and TFLite models to ONNX

Jupyter Notebook 2,498 462 Updated Sep 12, 2025

[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners

Python 116 10 Updated Jun 28, 2025

[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Python 5,695 580 Updated Jan 16, 2025

PyTorch implementations of deep reinforcement learning algorithms and environments

Python 5,905 1,209 Updated Jul 25, 2024

First token cutoff sampling inference example

Python 31 1 Updated Jan 15, 2024

Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).

Python 242 25 Updated Oct 29, 2025

A simple and well styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.

Python 1,162 154 Updated Oct 1, 2024

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

Python 2,257 411 Updated Jul 9, 2024

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Python 12,196 2,003 Updated Nov 14, 2025

An implementation of Everything of Thoughts (XoT).

Python 155 15 Updated Feb 21, 2024

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda 356 38 Updated Jul 10, 2025

[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning

69 3 Updated Jul 13, 2025

B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Python 86 11 Updated May 21, 2025

[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*

Jupyter Notebook 119 7 Updated Dec 10, 2024

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,729 482 Updated Jan 8, 2024

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

Python 7,853 568 Updated Jul 11, 2025