
Extended Overview of the CLEF-2023 LongEval Lab on Longitudinal Evaluation of Model Performance

Rabab Alkhalifa*, Iman Bilal, Hsuvas Borkakoty, Jose Camacho-Collados, Romain Deveaud*, Alaa El-Ebshihy, Luis Espinosa-Anke, Gabriela Gonzalez-Saez, Petra Galuščáková, Lorraine Goeuriot, Elena Kochkina, Maria Liakata, Daniel Loureiro, Philippe Mulhem, Florina Piroi, Martin Popel, Christophe Servan, Harish Tayyar Madabushi, Arkaitz Zubiaga

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

We describe the first edition of the LongEval CLEF 2023 shared task. This lab evaluates the temporal persistence of Information Retrieval (IR) systems and Text Classifiers. Task 1 requires IR systems to run on corpora acquired at several timestamps, and evaluates the drop in system quality (NDCG) along these timestamps. Task 2 tackles binary sentiment classification at different points in time, and evaluates the performance drop for different temporal gaps. Overall, 37 teams registered for Task 1 and 25 for Task 2. Ultimately, 14 and 4 teams participated in Task 1 and Task 2, respectively.
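To make the Task 1 metric concrete: the quality drop across timestamps is a difference of NDCG scores between a run on the training-period corpus and a run on a later snapshot. The following is a minimal sketch of that computation, not the lab's official scorer; the relevance lists and the cut-off `k=10` are illustrative assumptions.

```python
import math

def dcg(relevances):
    # Discounted cumulative gain over a ranked list of graded relevances.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_relevances, k=10):
    # NDCG@k: DCG of the system ranking, normalised by the ideal ranking.
    ideal_dcg = dcg(sorted(ranked_relevances, reverse=True)[:k])
    if ideal_dcg == 0:
        return 0.0
    return dcg(ranked_relevances[:k]) / ideal_dcg

# Hypothetical graded judgements for one query at two collection snapshots.
within_time = [3, 2, 3, 0, 1]  # system run on the training-period corpus
long_term = [1, 0, 3, 2, 0]    # same system on a later snapshot
drop = ndcg(within_time) - ndcg(long_term)
```

A positive `drop` indicates the system degraded on the later snapshot, which is the temporal-persistence signal Task 1 measures.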

Original language: English
Pages (from-to): 2181-2203
Number of pages: 23
Journal: CEUR Workshop Proceedings
Volume: 3497
State: Published - 2023
Event: 24th Working Notes of the Conference and Labs of the Evaluation Forum, CLEF-WN 2023 - Thessaloniki, Greece
Duration: 18 Sep 2023 - 21 Sep 2023

Keywords

  • Evaluation
  • Temporal Generalisability
  • Temporal Persistence
