Starred repositories

Hybrid Schema-Guided Reasoning (SGR) agentic system design, created by the neuraldeep community

Python 911 161 Updated Dec 31, 2025

A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM …

Jupyter Notebook 9,041 1,400 Updated Dec 5, 2025

Your favorite self-hostable alternative to Google Timeline (Google Location History)

Ruby 7,535 237 Updated Dec 31, 2025

🔍 AI search engine - self-host with local or cloud LLMs

TypeScript 3,497 324 Updated Sep 27, 2024

High performance self-hosted photo and video management solution.

TypeScript 87,962 4,650 Updated Dec 31, 2025

Foundational model for human-like, expressive TTS

Python 4,197 694 Updated Jul 30, 2024

Cast Mac windows to visionOS

Swift 884 47 Updated Oct 27, 2025

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python 7,936 601 Updated Jul 17, 2024

This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.

Python 1,394 68 Updated Aug 4, 2025

A fast, local neural text to speech system

C++ 10,388 883 Updated Aug 26, 2025

An Open Source text-to-speech system built by inverting Whisper.

Jupyter Notebook 4,548 263 Updated Dec 14, 2025

Enhanced ChatGPT Clone: Features Agents, MCP, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message…

TypeScript 32,769 6,520 Updated Dec 31, 2025

Instant voice cloning by MIT and MyShell. Audio foundation model.

Python 35,728 3,980 Updated Apr 19, 2025

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 3,427 249 Updated Dec 3, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 6,106 648 Updated Aug 10, 2024

llama.cpp with the BakLLaVA model describes what it sees

Python 380 41 Updated Nov 8, 2023

A programming framework for agentic AI

Python 53,050 8,056 Updated Oct 8, 2025

A state-of-the-art open visual language model | multimodal pretrained model

Python 6,712 449 Updated May 29, 2024

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Python 5,012 400 Updated Jul 10, 2024

⏩ Ship faster with Continuous AI. Open-source CLI that can be used in TUI mode as a coding agent or Headless mode to run background agents

TypeScript 30,600 3,974 Updated Jan 1, 2026

AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.

Rust 3,430 298 Updated Dec 30, 2025

Python 3,377 146 Updated Feb 25, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,853 582 Updated May 3, 2024

Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.

Jupyter Notebook 4,983 580 Updated Sep 17, 2024

Toy Gaussian Splatting visualization in Unity

C# 3,009 401 Updated Oct 17, 2025

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Python 20,152 2,856 Updated Oct 17, 2025

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 19,380 2,078 Updated Oct 21, 2025

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,963 784 Updated Feb 11, 2024

Search images with a text or image query, using OpenAI's pretrained CLIP model.

Python 261 25 Updated Jan 15, 2022