Tinker: Diffusion's Gift to 3D--Multi-View Consistent Editing From Sparse Inputs without Per-Scene Optimization Paper • 2508.14811 • Published Aug 20 • 40
FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation Paper • 2508.11255 • Published Aug 15 • 10
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation Paper • 2508.07901 • Published Aug 11 • 39
Story2Board: A Training-Free Approach for Expressive Storyboard Generation Paper • 2508.09983 • Published Aug 13 • 67
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14 • 142
PUSA V1.0: Surpassing Wan-I2V with $500 Training Cost by Vectorized Timestep Adaptation Paper • 2507.16116 • Published Jul 22 • 10
Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation Paper • 2507.05963 • Published Jul 8 • 12
Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection Paper • 2507.07994 • Published Jul 10 • 2
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities Paper • 2507.06261 • Published Jul 7 • 60
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks Paper • 2507.11336 • Published Jul 15 • 4
AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning Paper • 2507.12841 • Published Jul 17 • 41
T-LoRA: Single Image Diffusion Model Customization Without Overfitting Paper • 2507.05964 • Published Jul 8 • 116
Seeing Voices: Generating A-Roll Video from Audio with Mirage Paper • 2506.08279 • Published Jun 9 • 28
Seedance 1.0: Exploring the Boundaries of Video Generation Models Paper • 2506.09113 • Published Jun 10 • 102
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16 • 263