    Repositories list

    • Daft

      Public
      High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
      Rust
      3765.1k31668 Updated Dec 31, 2025
    • Python
      0000 Updated Dec 18, 2025
    • Examples for using the Daft data engine
      Jupyter Notebook
      2922 Updated Dec 18, 2025
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      12k000 Updated Dec 12, 2025
    • This repository is for the active development of the Azure SDK for Rust. For consumers of the SDK, we recommend visiting Docs.rs and looking up the docs for any of the libraries in the SDK.
      Rust
      322000 Updated Oct 22, 2025
    • examples

      Public
      Daft and Ev platform examples
      0100 Updated Sep 3, 2025
    • Using Daft to generate new captions for the LAION400m HuggingFace image dataset!
      Jupyter Notebook
      0200 Updated Jun 19, 2025
    • Daft landing page
      3021 Updated Jun 12, 2025
    • HTML
      0001 Updated May 1, 2025
    • AiAiAi

      Public
      A collection of AI applications written on Daft
      Jupyter Notebook
      0200 Updated Apr 21, 2025
    • daft-cli

      Public
      A CLI for spinning up and managing Ray clusters for the Daft Query Engine.
      Rust
      31431 Updated Feb 15, 2025
    • Benchmarking of distributed query engines
      Python
      2800 Updated Jan 24, 2025
    • Jupyter Notebook
      0000 Updated Dec 19, 2024
    • JavaScript
      0000 Updated Aug 15, 2024
    • Building a simple Multimodal Data Warehouse: workflows to ingest, analyze, process & train models on multimodal data
      Jupyter Notebook
      0200 Updated Jul 23, 2024
    • Open, Multi-modal Catalog for Data & AI
      Java
      557000 Updated Jun 14, 2024
    • parquet2

      Public
      Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow
      Rust
      61001 Updated May 31, 2024
    • deltacat

      Public
      A Pythonic Data Catalog powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
      Python
      43000 Updated May 2, 2024
    • arrow2

      Public
      Transmute-free Rust library to work with the Arrow format
      Rust
      227100 Updated Apr 28, 2024
    • Convert sequences of Rust objects to Arrow tables
      Rust
      27000 Updated Apr 4, 2024
    • Rust
      10000 Updated Nov 29, 2023
    • ludwig

      Public
      Data-centric declarative deep learning framework
      Python
      1.2k001 Updated Oct 26, 2023
    • Code for generating tables of data and tabular files (CSV, JSON, Parquet etc) for testing
      Thrift
      0500 Updated Jul 25, 2023
    • Demonstration of Daft on Flyte
      0100 Updated Jul 7, 2023
    • icebridge

      Public
      A Python Bridge to Apache Iceberg using Py4J
      Python
      1000 Updated Sep 27, 2022
    • MNIST data in JSON format
      0200 Updated Sep 1, 2022
    • Kubernetes spawner for JupyterHub in the Eventual Hub
      Python
      306100 Updated Jul 23, 2022