efrantar

Organizations

@IST-DASLab

Pinned

  1. IST-DASLab/gptq (Public)

    Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

    Python · 2.1k stars · 171 forks

  2. IST-DASLab/sparsegpt (Public)

    Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

    Python · 803 stars · 105 forks

  3. IST-DASLab/marlin (Public)

    FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

    Python · 835 stars · 67 forks

  4. IST-DASLab/qmoe (Public)

    Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

    Python · 276 stars · 21 forks

  5. IST-DASLab/OBC (Public)

    Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".

    Python · 120 stars · 15 forks

  6. rob-twophase (Public)

    The ultimate Rubik's Cube solving algorithm for high-speed axial robots.

    C++ · 132 stars · 11 forks