Sandro Pezzelle

sandropezzelle

https://sandropezzelle.github.io/

AI & ML interests

None yet

Recent Activity

authored 5 papers about 2 months ago

The LAMBADA dataset: Word prediction requiring a broad discourse context

Paper • 1606.06031 • Published Jun 20, 2016

LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks

Paper • 2406.18403 • Published Jun 26, 2024

From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions

Paper • 2502.13791 • Published Feb 19, 2025

GROOViST: A Metric for Grounding Objects in Visual Storytelling

Paper • 2310.17770 • Published Oct 26, 2023

Not (yet) the whole story: Evaluating Visual Storytelling Requires More than Measuring Coherence, Grounding, and Repetition

Paper • 2407.04559 • Published Jul 5, 2024