-
2026
Hybrid Pooling with LLMs via Relevance Context Learning
-
2026
Can LLMs Evaluate What They Cannot Annotate? Revisiting LLM Reliability in Hate Speech Detection
LREC 2026
-
2026
LLM-Assisted Pseudo-Relevance Feedback
ECIR 2026
-
2025
How does depression talk on social media? Modeling depression language with relevance-based statistical language models
Online Social Networks and Media
-
2025
Limitations of Automatic Relevance Assessments with Large Language Models for Fair and Reliable Retrieval Evaluation
SIGIR 2025
-
2025
Towards Reliable Testing for Multiple Information Retrieval System Comparisons
ECIR 2025
-
2023
How Discriminative Are Your Qrels? How To Study the Statistical Significance of Document Adjudication Methods
CIKM 2023
-
2023
Relevance feedback for building pooled test collections
Journal of Information Science
-
2021
Building Cultural Heritage Reference Collections from Social Media through Pooling Strategies: The Case of 2020’s Tensions Over Race and Heritage
Journal on Computing and Cultural Heritage
-
2021
The wisdom of the rankers: a cost-effective method for building pooled test collections without participant systems
ACM SAC 2021
-
2020
Beaver: Efficiently Building Test Collections for Novel Tasks
CIRCLE 2020
-
2019
Building High-Quality Datasets for Information Retrieval Evaluation at a Reduced Cost
XoveTIC 2019
-
2019
Exploiting Pooling Methods for Building Datasets for Novel Tasks
FDIA 2019 (co-located with ESSIR 2019)