- HKUST
- Hong Kong
Stars
Godot template with a main menu, options menus, pause menu, credits, scene loader, extra tools, and an example game scene.
Bash is all you need - A nano Claude Code–like agent, built from 0 to 1
Elevate your AI research writing, no more tedious polishing ✨
[EMNLP 2025 Oral] MemoryOS is designed to provide a memory operating system for personalized AI agents.
Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.
Enable Opencode to authenticate against Antigravity (Google's IDE) via OAuth so you can use Antigravity rate limits and access models like gemini-3-pro and claude-opus-4-5-thinking with your Google…
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…
A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows
Dynamic context pruning plugin for OpenCode - intelligently manages conversation context to optimize token usage
A modern GUI client based on Tauri, designed to run in Windows, macOS and Linux for tailored proxy experience
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
A cross-platform desktop All-in-One assistant tool for Claude Code, Codex, OpenCode & Gemini CLI.
Get Spotify tracks in true FLAC from Tidal, Qobuz, Amazon Music & Deezer — no account required.
Multilingual Voice Understanding Model
An open-source AI agent that brings the power of Gemini directly into your terminal.
The official repo for SpaceVista: All-Scale Visual Spatial Reasoning from mm to km.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...…
Foundational Models for State-of-the-Art Speech and Text Translation
📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.
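🥢 Cook like Laoxiangji 🐔. The main part was completed in 2024; this is not an official Laoxiangji repository. The text comes from the "Laoxiangji Dish Traceability Report" and has been summarized, edited, and organized. CookLikeHOC.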
MiMo-Audio: Audio Language Models are Few-Shot Learners
Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.
Official inference code for UniSS: Unified Expressive Speech-to-Speech Translation with Your Voice.




