[Feature] - Lumina-DiMOO and MMaDA-Parallel #1004

@JohnLoveJoy

Feature Summary

Add support for Lumina-DiMOO and MMaDA-Parallel, two unified (omni) foundation models for multimodal generation and understanding.

Detailed Description

These new models are much more compelling than the massive, multi-billion-parameter models that only a tiny fraction of users can run.

https://github.com/Alpha-VLLM/Lumina-DiMOO

"We introduce Lumina-DiMOO, an open-source foundational model for seamless multimodal generation and understanding. Lumina-DiMOO sets itself apart from prior unified models by utilizing a fully discrete diffusion modeling to handle inputs and outputs across various modalities. This innovative approach allows Lumina-DiMOO to achieve higher sampling efficiency compared to previous autoregressive (AR) or hybrid AR-diffusion paradigms and adeptly support a broad spectrum of multimodal tasks, including text-to-image generation, image-to-image generation (e.g., image editing, subject-driven generation, and image inpainting, etc.), as well as image understanding."
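To illustrate the sampling-efficiency claim in the quote above, here is a minimal toy sketch of masked discrete-diffusion decoding: all positions start masked, and each step commits the most confident proposals in parallel. This is not Lumina-DiMOO's actual sampler. `toy_predict`, `parallel_unmask`, the even unmasking schedule, and the random confidences are all hypothetical stand-ins chosen for illustration.

```python
import math
import random

MASK = -1  # sentinel for a masked (not yet generated) token


def toy_predict(tokens, vocab_size, rng):
    """Hypothetical stand-in for a model: propose (token, confidence) per masked slot."""
    return {
        i: (rng.randrange(vocab_size), rng.random())
        for i, t in enumerate(tokens)
        if t == MASK
    }


def parallel_unmask(length, steps, vocab_size=16, seed=0):
    """Fill `length` masked tokens using at most `steps` calls to the toy model."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    calls = 0
    for step in range(steps):
        if MASK not in tokens:
            break  # everything is already decoded
        proposals = toy_predict(tokens, vocab_size, rng)
        calls += 1
        # Assumed schedule: spread the remaining masked slots evenly
        # over the remaining steps.
        k = math.ceil(len(proposals) / (steps - step))
        # Commit the k most confident proposals; the rest stay masked.
        ranked = sorted(proposals.items(), key=lambda kv: kv[1][1], reverse=True)
        for pos, (tok, _conf) in ranked[:k]:
            tokens[pos] = tok
    return tokens, calls


if __name__ == "__main__":
    tokens, calls = parallel_unmask(length=64, steps=8)
    print(f"decoded {len(tokens)} tokens in {calls} model calls")
```

In this toy setup, 64 tokens are decoded in 8 model calls, whereas a strictly autoregressive decoder would need one call per token. Real samplers differ in how they schedule unmasking and score confidence.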

https://github.com/tyfeld/MMaDA-Parallel

"While thinking-aware generation aims to improve performance on complex tasks, we identify a critical failure mode where existing sequential, autoregressive approaches can paradoxically degrade performance due to error propagation. To systematically analyze this issue, we propose ParaBench, a new benchmark designed to evaluate both text and image output modalities. Our analysis using ParaBench reveals that this performance degradation is strongly correlated with poor alignment between the generated reasoning and the final image. To resolve this, we propose a parallel multimodal diffusion framework that enables continuous, bidirectional interaction between text and images throughout the entire denoising trajectory."

Alternatives you considered

No response

Additional context

No response

Labels: enhancement (New feature or request)
