VisCoder2: Building Multi-Language Visualization Coding Agents Paper • 2510.23642 • Published Oct 2025 • 20
BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions Paper • 2510.10666 • Published Oct 2025 • 27
A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports Paper • 2510.02190 • Published Oct 2025 • 18
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing Paper • 2509.26346 • Published Sep 30, 2025 • 18
VideoScore2: Think before You Score in Generative Video Evaluation Paper • 2509.22799 • Published Sep 26, 2025 • 24
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1, 2025 • 72
Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs Paper • 2410.15438 • Published Oct 20, 2024
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search Paper • 2503.10582 • Published Mar 13, 2025 • 24
An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering Paper • 2503.23415 • Published Mar 30, 2025 • 1
Breaking the Batch Barrier (B3) of Contrastive Learning via Smart Batch Mining Paper • 2505.11293 • Published May 16, 2025
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation Paper • 2505.14640 • Published May 20, 2025 • 16
Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem Paper • 2506.03295 • Published Jun 3, 2025 • 17
VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation Paper • 2506.03930 • Published Jun 4, 2025 • 26
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent Paper • 2508.06600 • Published Aug 8, 2025 • 40
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models Paper • 2404.05904 • Published Apr 8, 2024 • 9
DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding Paper • 2002.12591 • Published Feb 28, 2020
OPEN-MOE-LLM-LEADERBOARD 🔥 Space • Display and submit models for evaluation on an LLM leaderboard • 34