ReQue: A configurable workflow and dataset collection for query refinement

19/10/2020

Mahtab Tamannaee, Hossein Fani, Fattane Zarrinkalam, Jamil Samouh, Samad Paydar, Ebrahim Bagheri

Keywords: gold standard dataset, query refinement, reproducibility

Abstract: In this paper, we implement and publicly share a configurable software workflow and a collection of gold standard datasets for training and evaluating supervised query refinement methods. Existing datasets such as AOL and MS MARCO, which have been extensively used in the literature for this purpose, are based on the weak assumption that users’ input queries improve gradually within a search session, i.e., the last query, where the user ends her information-seeking session, is the best reconstructed version of her initial query. In practice, such an assumption is not necessarily accurate for a variety of reasons, e.g., topic drift. The objective of our work is to enable researchers to build gold standard query refinement datasets without having to rely on such weak assumptions. Our software workflow, which generates such gold standard query datasets, takes three inputs: (1) a dataset of queries along with their associated relevance judgements (e.g., TREC topics), (2) an information retrieval method (e.g., BM25), and (3) an evaluation metric (e.g., MAP), and outputs a gold standard dataset. The produced gold standard dataset includes a list of revised queries for each query in the input dataset, each of which improves the performance of the specified retrieval method in terms of the desired evaluation metric. Since our workflow can be used to generate gold standard datasets for any input query set, in this paper, we have generated and publicly shared gold standard datasets for TREC queries associated with Robust04, Gov2, ClueWeb09, and ClueWeb12. The source code of our software workflow, the generated gold datasets, and benchmark results for three state-of-the-art supervised query refinement methods over these datasets are made publicly available for reproducibility purposes.
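
The three-input workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `build_gold_standard`, `toy_ranker`, and `average_precision`, the toy corpus, and the externally supplied `candidates` lists are all hypothetical, and the abstract does not say how candidate refinements are produced. The sketch also assumes that a revised query "improves performance" when its score under the chosen metric is strictly higher than the score of the original query.

```python
from typing import Callable, Dict, List, Set, Tuple

Ranker = Callable[[str], List[str]]              # query text -> ranked doc ids
Metric = Callable[[List[str], Set[str]], float]  # (ranking, relevant ids) -> score


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list (MAP is its mean over queries)."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def build_gold_standard(
    queries: Dict[str, str],            # input 1a: query id -> query text
    qrels: Dict[str, Set[str]],         # input 1b: query id -> relevant doc ids
    candidates: Dict[str, List[str]],   # hypothetical: candidate refinements
    ranker: Ranker,                     # input 2: retrieval method
    metric: Metric,                     # input 3: evaluation metric
) -> Dict[str, List[Tuple[str, float]]]:
    """Keep, per query, the candidates that score strictly above the original."""
    gold: Dict[str, List[Tuple[str, float]]] = {}
    for qid, text in queries.items():
        relevant = qrels.get(qid, set())
        base = metric(ranker(text), relevant)
        scored = [(c, metric(ranker(c), relevant)) for c in candidates.get(qid, [])]
        better = [pair for pair in scored if pair[1] > base]
        gold[qid] = sorted(better, key=lambda pair: pair[1], reverse=True)
    return gold


# Toy stand-ins for a real collection and retrieval method (assumptions).
CORPUS = {
    "d1": "jaguar car speed",
    "d2": "jaguar animal habitat",
    "d3": "big cat habitat",
}


def toy_ranker(query: str) -> List[str]:
    """Rank documents by number of shared terms; ties broken by doc id."""
    terms = set(query.split())
    overlap = {d: len(terms & set(text.split())) for d, text in CORPUS.items()}
    matched = [d for d, n in overlap.items() if n > 0]
    return sorted(matched, key=lambda d: (-overlap[d], d))


gold = build_gold_standard(
    queries={"q1": "jaguar"},
    qrels={"q1": {"d2", "d3"}},
    candidates={"q1": ["jaguar habitat", "jaguar car"]},
    ranker=toy_ranker,
    metric=average_precision,
)
```

In this toy run, only "jaguar habitat" is kept for `q1`, because "jaguar car" does not score above the original query. Passing the ranker and the metric as plain callables mirrors the configurability described in the abstract: another retrieval method or metric can be swapped in without touching the selection loop.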

The video of this talk is available at:

https://dl.acm.org/doi/10.1145/3340531.3412775#sec-supp

The talk and the respective paper were published at the CIKM 2020 virtual conference.

Similar Papers

04/07/2020

A Transformer-based Approach for Source Code Summarization

Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Keywords: Source Summarization, summarization, ablation studies, Transformer-based Approach

6:14

16/11/2020

Improving AMR Parsing with Sequence-to-Sequence Pre-training

Dongqin Xu, Junhui Li, Muhua Zhu, Min Zhang, Guodong Zhou

Keywords: abstract parsing, amr parsing, sequence-to-sequence parsing, machine translation

11:42

12/07/2020

Improving Transformer Optimization Through Better Initialization

Xiao Shi Huang, Felipe Perez, Jimmy Ba, Maksims Volkovs

Keywords: Sequential, Network, and Time-Series Modeling

14:52

19/08/2021

AMEIR: Automatic Behavior Modeling, Interaction Exploration and MLP Investigation in the Recommender System

Pengyu Zhao, Kecheng Xiao, Yuanxing Zhang, Kaigui Bian, Wei Yan

Keywords: Knowledge Representation and Reasoning, Preference Modelling and Preference-Based Reasoning, Recommender Systems

15:05

23/06/2021

RbSyn: Type- and Effect-Guided Program Synthesis

Sankha Narayan Guria, Jeffrey S. Foster, David Van Horn

Keywords: program synthesis, type and effect systems, Ruby

12:40

16/11/2020

A Diagnostic Study of Explainability Techniques for Text Classification

Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

Keywords: downstream tasks, machine learning, explainability techniques, diverse techniques

11:24

16/11/2020

An Imitation Game for Learning Semantic Parsers from User Interaction

Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su

Keywords: bootstrapping, fine-tuning parsers, theoretical analysis, text-to-sql problem

11:49

14/06/2020

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang

Keywords: data augmentation, text recognition, joint training

0:59

29/06/2020

An empirical study on the impact of deimplicitization on comprehension in programs using application frameworks

Jürgen Cito, Jiasi Shen, Martin Rinard

4:27

29/06/2020

Capture the feature flag: Detecting feature flags in open-source

Jens Meinicke, Juan Hoyos, Bogdan Vasilescu, Christian Kästner

7:47

19/08/2021

Towards Generating Summaries for Lexically Confusing Code through Code Erosion

Fan Yan, Ming Li

Keywords: Multidisciplinary Topics and Applications, Knowledge-based Software Engineering, Mining Codebase and Software Repository

13:31

15/11/2020

Precise Inference of Expressive Units of Measurement Types

Tongtong Xiang, Jeff Y. Luo, Werner Dietl

Keywords: Scientific computing, Pluggable type system, Dimensional analysis, Units of measurements, Type inference

13:39

04/07/2020

Controlled Crowdsourcing for High-Quality QA-SRL Annotation

Paul Roit, Ayal Klein, Daniela Stepanov, Jonathan Mamou, Julian Michael, Gabriel Stanovsky, Luke Zettlemoyer, Ido Dagan

Keywords: High-Quality Annotation, Question-answer Labeling, complex annotation, training

6:55

29/06/2020

Can we use SE-specific sentiment analysis tools in a cross-platform setting?

Nicole Novielli, Fabio Calefato, Davide Dongiovanni, Daniela Girardi, Filippo Lanubile

Keywords: machine learning, Sentiment analysis, NLP, human factors, empirical software engineering

17:42

19/01/2020

Provenance-Guided Synthesis of Datalog Programs

Mukund Raghothaman, Jonathan Mendelson, David Zhao, Mayur Naik, Bernhard Scholz

Keywords: Syntax-Guided Synthesis (SyGuS), Counter-Example Guided Inductive Synthesis (CEGIS), Datalog, SAT solvers, data provenance, Program synthesis

21:19

16/11/2020

Evaluating the Factual Consistency of Abstractive Text Summarization

Wojciech Kryscinski, Bryan McCann, Caiming Xiong, Richard Socher

Keywords: assessing algorithms, natural inference, fact checking, auxiliary tasks

12:05

02/06/2020

Investigating Software Usage in the Social Sciences: A Knowledge Graph Approach

David Schindler, Benjamin Zapilko, Frank Krüger

27:13

02/02/2021

MetaAugment: Sample-Aware Data Augmentation Policy Learning

Fengwei Zhou, Jiawei Li, Chuanlong Xie, Fei Chen, Lanqing Hong, Rui Sun, Zhenguo Li

18:19

16/11/2020

Partially-Aligned Data-to-Text Generation with Distant Supervision

Zihao Fu, Bei Shi, Wai Lam, Lidong Bing, Zhiyuan Liu

Keywords: data-to-text task, generation task, dataset problem, over-generation problem

11:58

04/07/2020

Using Context in Neural Machine Translation Training Objectives

Danielle Saunders, Felix Stahlberg, Bill Byrne

Keywords: Neural training, NMT training, document-level training, NMT objective

6:48

04/07/2020

Logical Natural Language Generation from Open-Domain Tables

Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang

Keywords: Logical Generation, neural NLG, surface-level realizations, logical inference

11:48

14/06/2020

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

Xingjia Pan, Yuqiang Ren, Kekai Sheng, Weiming Dong, Haolei Yuan, Xiaowei Guo, Chongyang Ma, Changsheng Xu

Keywords: object detection, oriented, densely packed, sku110k, feature selection, dynamic, anchor-free

5:01

02/02/2021

Semantics Altering Modifications for Evaluating Comprehension in Machine Reading

Viktor Schlegel, Goran Nenadic, Riza Batista-Navarro

18:42

04/07/2020

Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study

Xinyu Xing, Xiaosheng Fan, Xiaojun Wan

Keywords: Automatic Texts, citation task, citation generation, automatically texts

10:01

06/12/2021

Robust and Decomposable Average Precision for Image Retrieval

Elias Ramzi, Nicolas THOME, Clément Rambour, Nicolas Audebert, Xavier Bitot

Keywords: deep learning

8:13

06/12/2021

Scalable Rule-Based Representation Learning for Interpretable Classification

Zhuo Wang, Wei Zhang, Ning Liu, Jianyong Wang

Keywords: optimization, machine learning, representation learning, interpretability

14:52

16/11/2020

DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks

Bosheng Ding, Linlin Liu, Lidong Bing, Canasai Kruengkrai, Thien Hai Nguyen, Shafiq Joty, Luo Si, Chunyan Miao

Keywords: machine learning, generalization, low-resource tasks, named recognition

11:09

01/07/2020

Simple Compounded-Label Training for Fact Extraction and Verification

Yixin Nie, Lisa Bauer, Mohit Bansal

9:59

06/12/2021

Sparse Steerable Convolutions: An Efficient Learning of SE(3)-Equivariant Features for Estimation and Tracking of Object Poses in 3D Space

Jiehong Lin, Hongyang Li, Ke Chen, Jiangbo Lu, Kui Jia

Keywords: vision

12:29

26/04/2020

Measuring the Reliability of Reinforcement Learning Algorithms

Stephanie C.Y. Chan, Samuel Fishman, Anoop Korattikara, John Canny, Sergio Guadarrama

Keywords: reinforcement learning, metrics, statistics, reliability

5:32

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords: Deep Learning - Generative Models and Autoencoders

17:06

05/01/2021

ChartOCR: Data Extraction From Charts Images via a Deep Hybrid Framework

Junyu Luo, Zekun Li, Jinpeng Wang, Chin-Yew Lin

4:58

26/04/2020

CLN2INV: Learning Loop Invariants with Continuous Logic Networks

Gabriel Ryan, Justin Wong, Jianan Yao, Ronghui Gu, Suman Jana

Keywords: loop invariants, deep learning, logic learning

5:12

19/08/2021

Generating Senses and RoLes: An End-to-End Model for Dependency- and Span-based Semantic Role Labeling

Rexhina Blloshmi, Simone Conia, Rocco Tripodi, Roberto Navigli

Keywords: Natural Language Processing, Natural Language Semantics, Natural Language Generation

15:18

19/04/2021

Expanding, retrieving and infilling: Diversifying cross-domain question generation with flexible templates

Xiaojing Yu, Anxiao Jiang

11:40

15/11/2020

Digging for Fold: Synthesis-Aided API Discovery for Haskell

Michael B. James, Zheng Guo, Ziteng Wang, Shivani Doshi, Hila Peleg, Ranjit Jhala, Nadia Polikarpova

Keywords: Program Synthesis, Type Inference, Human-Computer Interaction

16:01

18/07/2021

Meta-learning Hyperparameter Performance Prediction with Neural Processes

Ying WEI, Peilin Zhao, Junzhou Huang

Keywords: Algorithms, Supervised Learning

5:07

14/06/2020

AugFPN: Improving Multi-Scale Feature Learning for Object Detection

Chaoxu Guo, Bin Fan, Qian Zhang, Shiming Xiang, Chunhong Pan

Keywords: object detection, augfpn, consistent supervision, residual feature augmentation, soft roi selection

1:00

14/09/2020

Active Learning for Hierarchical Multi-Label Classification

Felipe Kenji Nakano, Ricardo Cerri, Vens Celin

15:42

03/05/2021

MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space

Tsz Him Cheung, Dit-Yan Yeung

Keywords: automated data augmentation, deep learning, data augmentation, latent space

5:11