Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
Survey: https://arxiv.org/pdf/2507.20198
A High-Efficiency System of Large Language Model Based Search Agents
Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
Official PyTorch implementation of the paper "Dataset Distillation via the Wasserstein Metric" (ICCV 2025).
TinyML and Efficient Deep Learning Computing | MIT 6.S965/6.5940
Dynamic Attention Mask (DAM) generates adaptive sparse attention masks per layer and head for Transformer models, enabling long-context inference with lower compute and memory overhead, without fine-tuning.
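A minimal sketch of the general idea of per-head adaptive sparse attention masking, assuming a score-gap rule with a hypothetical threshold `tau`; this is not the DAM repository's actual implementation.

```python
# Sketch: keep only attention scores close to each row's maximum, per head.
# `tau` is an illustrative hyperparameter, not from the DAM repo.
import torch
import torch.nn.functional as F

def adaptive_sparse_attention(q, k, v, tau=4.0):
    """q, k, v: (batch, heads, seq, dim). Masks low-scoring entries per head/row."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d**0.5           # (B, H, L, L)
    row_max = scores.amax(dim=-1, keepdim=True)
    keep = scores >= (row_max - tau)                     # adaptive mask, differs per head and row
    scores = scores.masked_fill(~keep, float("-inf"))
    attn = F.softmax(scores, dim=-1)
    return attn @ v

q = k = v = torch.randn(1, 8, 128, 64)
out = adaptive_sparse_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```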
Official PyTorch implementation of the paper "Towards Adversarially Robust Dataset Distillation by Curvature Regularization" (AAAI 2025).
A deep learning framework that implements early-exit strategies in Convolutional Neural Networks (CNNs) using Deep Q-Learning (DQN). The agent dynamically selects the optimal exit point per input, improving computational efficiency for image classification on CIFAR-10.
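For context, a minimal early-exit sketch, assuming a simple confidence-threshold policy in place of the DQN agent the project uses; layer sizes, class names, and the `threshold` value are illustrative.

```python
# Sketch of an early-exit CNN: a cheap intermediate classifier can stop inference
# early when it is confident. The repo learns this decision with a DQN instead.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitCNN(nn.Module):
    def __init__(self, num_classes=10, threshold=0.9):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.exit1 = nn.Linear(32 * 16 * 16, num_classes)   # early classifier
        self.block2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.exit2 = nn.Linear(64 * 8 * 8, num_classes)      # final classifier
        self.threshold = threshold

    def forward(self, x):
        h = self.block1(x)
        logits1 = self.exit1(h.flatten(1))
        # Exit early if the intermediate prediction is confident enough.
        if F.softmax(logits1, dim=-1).max() > self.threshold:
            return logits1, "exit1"
        h = self.block2(h)
        return self.exit2(h.flatten(1)), "exit2"

model = EarlyExitCNN()
logits, exit_taken = model(torch.randn(1, 3, 32, 32))   # CIFAR-10-sized input
print(exit_taken, logits.shape)
```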
Fast, concise, LLM-first Generative UI language
A Transformer (GPT) implemented from scratch in C++, with complete mathematical derivations and optimized tensor operations; runs on modest hardware.
Ground-Truthing AI Energy Consumption: Validating CodeCarbon Against External Measurements
Symbolic Transformers: 2.2 MB models for logical reasoning. Achieves 47% accuracy with 566K parameters, roughly 220× smaller than GPT-2, suggesting that data quality can outweigh model size for symbolic AI. 🔬 Novel base-625 symbolic encoding | 🚀 Edge-deployable | 📊 Open research
🔬 Curiosity-Driven Quantized Mixture of Experts
Welcome to my digital garden where I cultivate thoughts on Machine Learning, Generative AI, Trustworthy AI, AI Systems, Efficient AI, and Paper Reviews.
An open and practical guide to Edge Language
This repo explains quantization: the process of reducing the precision of a model's parameters and/or activations (e.g., from 32-bit floating point to 8-bit integers) to make neural networks smaller, faster, and more energy-efficient with minimal accuracy loss.
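A minimal sketch of the idea, assuming symmetric per-tensor int8 quantization of a single weight matrix; real toolchains add per-channel scales, calibration, and quantized kernels.

```python
# Sketch: symmetric post-training int8 quantization of a float32 weight tensor.
import numpy as np

def quantize_int8(w):
    """Map float32 weights to int8 with a single symmetric scale."""
    scale = np.abs(w).max() / 127.0                      # largest magnitude maps to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w - w_hat).max())         # bounded by scale / 2
print("bytes: fp32", w.nbytes, "-> int8", q.nbytes)      # 4x smaller
```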
"TRM (Tiny Recursive Model) integration architecture for Symbion.space ecosystem"
MOCA-Net: a neural architecture combining sparse mixture-of-experts (MoE), external memory, and budget-aware computation. Evaluated on Stanford SST-2 with O(L) complexity and 96.40% accuracy. Built for efficient sequence modeling.
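A minimal sketch of top-k sparse MoE routing with a per-token expert budget, using a hypothetical `SparseMoE` module; it illustrates only the sparse-MoE and budget ingredients, not MOCA-Net's external memory or actual code.

```python
# Sketch: each token is routed to at most k experts (the compute budget).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, dim=128, num_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.k = k                                        # budget: experts evaluated per token

    def forward(self, x):                                 # x: (tokens, dim)
        gate = F.softmax(self.router(x), dim=-1)          # (tokens, num_experts)
        topv, topi = gate.topk(self.k, dim=-1)            # only k experts run per token
        topv = topv / topv.sum(dim=-1, keepdim=True)      # renormalize kept gate weights
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topi[:, slot] == e                 # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += topv[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = SparseMoE()
y = moe(torch.randn(16, 128))
print(y.shape)   # torch.Size([16, 128])
```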