Jian Ren

alanspike

https://alanspike.github.io/

authored 7 papers 10 months ago

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

Paper • 2502.03639 • Published Feb 5, 2025 • 9

Efficient Training with Denoised Neural Weights

Paper • 2407.11966 • Published Jul 16, 2024 • 9

Scalable Ranked Preference Optimization for Text-to-Image Generation

Paper • 2410.18013 • Published Oct 23, 2024 • 15

AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation

Paper • 2411.04967 • Published Nov 7, 2024 • 1

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Paper • 2412.09619 • Published Dec 12, 2024 • 28

Wonderland: Navigating 3D Scenes from a Single Image

Paper • 2412.12091 • Published Dec 16, 2024 • 16

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

Paper • 2412.10494 • Published Dec 13, 2024 • 2

authored 13 papers over 1 year ago

SF-V: Single Forward Video Generation Model

Paper • 2406.04324 • Published Jun 6, 2024 • 25

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model

Paper • 2406.04333 • Published Jun 6, 2024 • 38

SINE: SINgle Image Editing with Text-to-Image Diffusion Models

Paper • 2212.04489 • Published Dec 8, 2022

TextCraftor: Your Text Encoder Can be Image Quality Controller

Paper • 2403.18978 • Published Mar 27, 2024 • 15

EfficientFormer: Vision Transformers at MobileNet Speed

Paper • 2206.01191 • Published Jun 2, 2022 • 1

Motion Representations for Articulated Animation

Paper • 2104.11280 • Published Apr 22, 2021

COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models

Paper • 2305.17235 • Published May 26, 2023 • 2

Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation

Paper • 2206.07771 • Published Jun 15, 2022

Real-Time Neural Light Field on Mobile Devices

Paper • 2212.08057 • Published Dec 15, 2022

iNVS: Repurposing Diffusion Inpainters for Novel View Synthesis

Paper • 2310.16167 • Published Oct 24, 2023 • 1

SPAD : Spatially Aware Multiview Diffusers

Paper • 2402.05235 • Published Feb 7, 2024 • 3

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Paper • 2402.19479 • Published Feb 29, 2024 • 35

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

Paper • 2402.14797 • Published Feb 22, 2024 • 21