Pinned

  1. Efficient matrix multiplication

    # High-Performance Matrix Multiplication

    This is a short post that explains how to write a high-performance matrix
    multiplication program on modern processors. In this tutorial I will use a
    single core of the Skylake-client CPU with AVX2, but the principles in this post

  2. legday Public

    A compressor for ML weights

    C++ 12

  3. memset_benchmark Public

    This repository contains high-performance implementations of memset and memcpy in assembly.

    Assembly 330 16

  4. fast_log Public

    A fast implementation of log() and exp()

    C 53 3

  5. arpfloat Public

    An arbitrary-precision floating-point library in Rust

    Rust 45 6

  6. pgo_ml Public

    Source code for the paper "Profile Guided Optimization without Profiles: A Machine Learning Approach"

    C 24 9