xhluca (Xing Han Lù)

upvoted a paper 3 months ago

Grounding Computer Use Agents on Human Demonstrations

Paper • 2511.07332 • Published Nov 10, 2025 • 106

upvoted 2 papers 4 months ago

FocusAgent: Simple Yet Effective Ways of Trimming the Large Context of Web Agents

Paper • 2510.03204 • Published Oct 3, 2025 • 7

The Markovian Thinker

Paper • 2510.06557 • Published Oct 8, 2025 • 31

upvoted a paper 7 months ago

MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining

Paper • 2012.13978 • Published Dec 27, 2020 • 1

upvoted an article 7 months ago

How to Train Your LLM Web Agent: A Statistical Diagnosis

Article • Jul 8, 2025 • 15

upvoted 2 papers 7 months ago

How to Train Your LLM Web Agent: A Statistical Diagnosis

Paper • 2507.04103 • Published Jul 5, 2025 • 52

LineRetriever: Planning-Aware Observation Reduction for Web Agents

Paper • 2507.00210 • Published Jun 30, 2025 • 6

upvoted a paper 8 months ago

Build the web for agents, not agents for the web

Paper • 2506.10953 • Published Jun 12, 2025 • 21

upvoted an article 10 months ago

MIEB: The Benchmark That Stress-Tests Image-Text Embeddings Like Never Before

Article • Apr 24, 2025 • 17

upvoted 2 papers 10 months ago

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Paper • 2504.08942 • Published Apr 11, 2025 • 28

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2, 2025 • 87

upvoted 3 papers 11 months ago

Exploiting Instruction-Following Retrievers for Malicious Information Retrieval

SafeArena: Evaluating the Safety of Autonomous Web Agents

Societal Alignment Frameworks Can Improve LLM Alignment

upvoted a paper 12 months ago

How to Get Your LLM to Generate Challenging Problems for Evaluation

Paper • 2502.14678 • Published Feb 20, 2025 • 18

upvoted a collection 12 months ago

CHASE

Collection

Generate challenging synthetic data to evaluate LLMs • 5 items • Updated Feb 21, 2025 • 4

upvoted a paper 12 months ago

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published Feb 19, 2025 • 44

upvoted an article about 1 year ago

Train 400x faster Static Embedding Models with Sentence Transformers

Article • Jan 15, 2025 • 223

upvoted a paper about 1 year ago

The BrowserGym Ecosystem for Web Agent Research

Paper • 2412.05467 • Published Dec 6, 2024 • 24

upvoted a paper over 1 year ago

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

Paper • 2410.01679 • Published Oct 2, 2024 • 27
