CalliReader: Contextualizing Chinese Calligraphy via an Embedding-Aligned Vision-Language Model Paper • 2503.06472 • Published Mar 9 • 8
MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning Paper • 2506.10963 • Published Jun 12 • 9
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering Paper • 2406.10208 • Published Jun 14, 2024 • 22
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Paper • 2502.18364 • Published Feb 25 • 37
openai/clip-vit-large-patch14 Zero-Shot Image Classification • 0.4B • Updated Sep 15, 2023 • 8.9M • 1.91k
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step Paper • 2406.04314 • Published Jun 6, 2024 • 30
diffusers/stable-diffusion-xl-1.0-inpainting-0.1 Text-to-Image • Updated Sep 3, 2023 • 135k • 350
stabilityai/stable-diffusion-xl-base-1.0 Text-to-Image • Updated Oct 30, 2023 • 2.75M • 7.18k