Organizations

@CVI-SZU

Sierkinhane/README.md

Hi there, I'm Jinheng Xie. 👋

I am a third-year PhD student at Show Lab, National University of Singapore, working with Prof. Mike Shou. Before my PhD, I spent three years exploring label-efficient learning for scene understanding, focusing on weakly supervised object localization and semantic segmentation. In the first year of my PhD, I worked on visual prompt learning and controllable image synthesis. Currently, I am concentrating on unifying multimodal understanding and generation within a single native multimodal model. I have trained two unified multimodal models, Show-o and Show-o2, with up to 7 billion trainable parameters on billion-scale datasets.

Pinned

  1. showlab/Show-o (Public)

    [ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

    Python · 1.8k stars · 81 forks

  2. showlab/BoxDiff (Public)

    [ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

    Python · 273 stars · 18 forks

  3. CVI-SZU/CCAM (Public)

    [CVPR 2022] C2AM: Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation

    Python · 199 stars · 25 forks

  4. showlab/VisorGPT (Public)

    [NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT

    Python · 136 stars · 3 forks

  5. CVI-SZU/CLIMS (Public)

    [CVPR 2022] CLIMS: Cross Language Image Matching for Weakly Supervised Semantic Segmentation

    Python · 137 stars · 12 forks

  6. CRNN_Chinese_Characters_Rec (Public)

    (CRNN) Chinese Characters Recognition.

    Python · 1.9k stars · 537 forks