Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
Paper • 2506.18898
Tar
🚀Unified MLLM with Text-Aligned Representations