An Yan

zzxslp

AI & ML interests

Vision and Language, text generation

upvoted 4 papers 3 months ago

Trust but Verify: Programmatic VLM Evaluation in the Wild

Paper • 2410.13121 • Published Oct 17, 2024 • 3

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

Paper • 2311.07562 • Published Nov 13, 2023 • 15

Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Paper • 2505.09343 • Published May 14 • 68

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5 • 78

upvoted a paper 9 months ago

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions

Paper • 2411.07461 • Published Nov 12, 2024 • 24

upvoted a paper 12 months ago

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16, 2024 • 101

upvoted a paper over 1 year ago

List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Paper • 2404.16375 • Published Apr 25, 2024 • 18