Open to Collab

Jian Chen

puar-playground

https://jian-chen.name/CV/

AI & ML interests

Audio AI, Document Intelligence, Large Language Models, Multimodal Models, Diffusion Models

Recent Activity

authored a paper 9 days ago

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

authored a paper 9 days ago

Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction

upvoted a paper 11 days ago

Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction

Organizations

None yet

authored 2 papers 9 days ago

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Paper • 2507.21391 • Published Jul 28, 2025

Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction

Paper • 2512.18880 • Published 13 days ago • 23

authored a paper 5 months ago

VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding

Paper • 2508.07493 • Published Aug 10, 2025 • 8

authored 3 papers 6 months ago

GUI Agents: A Survey

Paper • 2412.13501 • Published Dec 18, 2024 • 29

Towards Visual Text Grounding of Multimodal Large Language Model

Paper • 2504.04974 • Published Apr 7, 2025 • 17

MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models

Paper • 2506.23009 • Published Jun 28, 2025 • 11

authored 6 papers about 1 year ago

Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints

Paper • 2402.04754 • Published Feb 7, 2024 • 1

LLaVA-Read: Enhancing Reading Ability of Multimodal Language Models

Paper • 2407.19185 • Published Jul 27, 2024 • 2

MMR: Evaluating Reading Ability of Large Multimodal Models

Paper • 2408.14594 • Published Aug 26, 2024 • 1

TextLap: Customizing Language Models for Text-to-Layout Planning

Paper • 2410.12844 • Published Oct 9, 2024 • 1

A Survey of Small Language Models

Paper • 2410.20011 • Published Oct 25, 2024 • 46

LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding

Paper • 2411.01106 • Published Nov 2, 2024 • 4