
Sijie Zhu

Zilence006
https://jeff-zilence.github.io/

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers 3 months ago

SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing

Paper • 2505.02370 • Published May 5 • 14

Vidi: Large Multimodal Models for Video Understanding and Editing

Paper • 2504.15681 • Published Apr 22 • 15
upvoted 6 papers 4 months ago

Where do Large Vision-Language Models Look at when Answering Questions?

Paper • 2503.13891 • Published Mar 18 • 8

Visual Explanation for Deep Metric Learning

Paper • 1909.12977 • Published Sep 27, 2019 • 1

TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization

Paper • 2204.00097 • Published Mar 31, 2022 • 1

TopNet: Transformer-based Object Placement Network for Image Compositing

Paper • 2304.03372 • Published Apr 6, 2023 • 1

CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

Paper • 2405.05949 • Published May 9, 2024 • 3

Multi-Reward as Condition for Instruction-based Image Editing

Paper • 2411.04713 • Published Nov 6, 2024 • 1