Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2409.11355

about 2 hours ago

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

For Content Creator

Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era

Paper • 2305.06131 • Published May 10, 2023 • 2
Perpetual Humanoid Control for Real-time Simulated Avatars

Paper • 2305.06456 • Published May 10, 2023 • 1
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

Paper • 2305.10973 • Published May 18, 2023 • 37
LDM3D: Latent Diffusion Model for 3D

Paper • 2305.10853 • Published May 18, 2023 • 11

Diffusion Models

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26, 2024 • 63
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 62
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model

Paper • 2409.01199 • Published Sep 2, 2024 • 14

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Paper • 2407.08083 • Published Jul 10, 2024 • 33
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 62
The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 43
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32

Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models

Paper • 2406.09416 • Published Jun 13, 2024 • 30
Wavelets Are All You Need for Autoregressive Image Generation

Paper • 2406.19997 • Published Jun 28, 2024 • 32
ViPer: Visual Personalization of Generative Models via Individual Preference Learning

Paper • 2407.17365 • Published Jul 24, 2024 • 13
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

Paper • 2408.11001 • Published Aug 20, 2024 • 12

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

Paper • 2402.10210 • Published Feb 15, 2024 • 36
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32
Fine-tuning Language Models for Factuality

Paper • 2311.08401 • Published Nov 14, 2023 • 30
3D Gaussian Splatting for Real-Time Radiance Field Rendering

Paper • 2308.04079 • Published Aug 8, 2023 • 188

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 141
Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs

Paper • 2409.14988 • Published Sep 23, 2024 • 24
LLM Pretraining with Continuous Concepts

Paper • 2502.08524 • Published Feb 12 • 29

AI Math: Diffusion

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 66
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Paper • 2408.12590 • Published Aug 22, 2024 • 37
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 17
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 62

Generative Model

SelfEval: Leveraging the discriminative nature of generative models for evaluation

Paper • 2311.10708 • Published Nov 17, 2023 • 17
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 116
NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published Sep 17, 2024 • 75
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32

Stable Diffusion

Zero-shot Image Editing with Reference Imitation

Paper • 2406.07547 • Published Jun 11, 2024 • 34
Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22, 2024 • 27
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32
Training-free Regional Prompting for Diffusion Transformers

Paper • 2411.02395 • Published Nov 4, 2024 • 26

about 2 hours ago

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

Paper • 2402.10210 • Published Feb 15, 2024 • 36
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32
Fine-tuning Language Models for Factuality

Paper • 2311.08401 • Published Nov 14, 2023 • 30
3D Gaussian Splatting for Real-Time Radiance Field Rendering

Paper • 2308.04079 • Published Aug 8, 2023 • 188

For Content Creator

Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era

Paper • 2305.06131 • Published May 10, 2023 • 2
Perpetual Humanoid Control for Real-time Simulated Avatars

Paper • 2305.06456 • Published May 10, 2023 • 1
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

Paper • 2305.10973 • Published May 18, 2023 • 37
LDM3D: Latent Diffusion Model for 3D

Paper • 2305.10853 • Published May 18, 2023 • 11

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 141
Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs

Paper • 2409.14988 • Published Sep 23, 2024 • 24
LLM Pretraining with Continuous Concepts

Paper • 2502.08524 • Published Feb 12 • 29

Diffusion Models

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26, 2024 • 63
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 62
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model

Paper • 2409.01199 • Published Sep 2, 2024 • 14

AI Math: Diffusion

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 66
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Paper • 2408.12590 • Published Aug 22, 2024 • 37
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 17
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 62

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Paper • 2407.08083 • Published Jul 10, 2024 • 33
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 62
The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 43
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32

Generative Model

SelfEval: Leveraging the discriminative nature of generative models for evaluation

Paper • 2311.10708 • Published Nov 17, 2023 • 17
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 116
NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published Sep 17, 2024 • 75
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32

Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models

Paper • 2406.09416 • Published Jun 13, 2024 • 30
Wavelets Are All You Need for Autoregressive Image Generation

Paper • 2406.19997 • Published Jun 28, 2024 • 32
ViPer: Visual Personalization of Generative Models via Individual Preference Learning

Paper • 2407.17365 • Published Jul 24, 2024 • 13
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

Paper • 2408.11001 • Published Aug 20, 2024 • 12

Stable Diffusion

Zero-shot Image Editing with Reference Imitation

Paper • 2406.07547 • Published Jun 11, 2024 • 34
Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22, 2024 • 27
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 32
Training-free Regional Prompting for Diffusion Transformers

Paper • 2411.02395 • Published Nov 4, 2024 • 26

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs