A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Updated Apr 9, 2025 - C++
Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
CLIP inference in plain C/C++ with no extra dependencies
Multimodal routing, geocoding, and map tiles
LLaVA server (llama.cpp).
The simulation system for robotic general intelligence™
[ECCV 2022] Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation
ROS2 package that integrates L3CAM sensors using L3CAM SDK
Highway Driving (project 7 of 9 from Udacity Self-Driving Car Engineer Nanodegree)
ROS2 package for the visualization of the fusion of the L3Cam device sensors
Repository to document and advertise our McGill Capstone Group 22 Project