hjkim

hojie11

AI & ML interests

Computer Vision, 3D Vision, Anomaly Detection

Organizations

None yet

upvoted 5 papers 1 day ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published 8 days ago • 96

Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Paper • 2512.03534 • Published 2 days ago • 16

PixelDiT: Pixel Diffusion Transformers for Image Generation

Paper • 2511.20645 • Published 9 days ago • 23

MultiShotMaster: A Controllable Multi-Shot Video Generation Framework

Paper • 2512.03041 • Published 2 days ago • 56

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published 8 days ago • 88

upvoted a paper 2 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 3 days ago • 155

upvoted 3 papers 3 days ago

DiP: Taming Diffusion Models in Pixel Space

Paper • 2511.18822 • Published 11 days ago • 24

Vision Bridge Transformer at Scale

Paper • 2511.23199 • Published 7 days ago • 41

Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout

Paper • 2511.20649 • Published 9 days ago • 43

upvoted 2 papers 14 days ago

SAM 3D: 3Dfy Anything in Images

Paper • 2511.16624 • Published 14 days ago • 106

First Frame Is the Place to Go for Video Content Customization

Paper • 2511.15700 • Published 15 days ago • 52

upvoted a paper 15 days ago

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 119

upvoted a paper 16 days ago

A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space

Paper • 2511.10555 • Published 21 days ago • 60

upvoted a paper 17 days ago

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Paper • 2511.09611 • Published 22 days ago • 68

upvoted 2 papers 18 days ago

One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models

Paper • 2511.10629 • Published 21 days ago • 120

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published 21 days ago • 92

upvoted 2 papers 22 days ago

Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising

Paper • 2511.08633 • Published 25 days ago • 53

TiDAR: Think in Diffusion, Talk in Autoregression

Paper • 2511.08923 • Published 23 days ago • 109

upvoted a paper 23 days ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published 26 days ago • 128

upvoted a paper 24 days ago

Visual Spatial Tuning

Paper • 2511.05491 • Published 27 days ago • 49
