ClarQ: A large-scale and diverse dataset for Clarification Question Generation

Abstract: Question answering and conversational systems are often baffled and need help clarifying certain ambiguities. However, limitations of existing datasets hinder the development of large-scale models capable of generating and utilising clarification questions. To overcome these limitations, we devise a novel bootstrapping framework (based on self-supervision) that assists in the creation of a diverse, large-scale dataset of clarification questions based on post-comment tuples extracted from Stack Exchange. The framework uses a neural network based architecture for classifying clarification questions. It is a two-step method in which the first step aims to increase the precision of the classifier and the second aims to increase its recall. We quantitatively demonstrate the utility of the newly created dataset by applying it to the downstream task of question answering. The final dataset, ClarQ, consists of ~2M examples distributed across 173 domains of Stack Exchange. We release this dataset to foster research into clarification question generation, with the larger goal of enhancing dialog and question answering systems.

02/02/2021

Vaibhav Kumar, Alan W Black

Similar Papers

Reinforced History Backtracking for Conversational Question Answering

Minghui Qiu, Xinjing Huang, Cen Chen, Feng Ji, Chen Qu, Wei Wei, Jun Huang, Yin Zhang

Curriculum-Meta Learning for Order-Robust Continual Relation Extraction

Tongtong Wu, Xuekai Li, Yuan-Fang Li, Gholamreza Haffari, Guilin Qi, Yujin Zhu, Guoqiang Xu

Learning to generate reformulation actions for scalable conversational query understanding

Zihan Xu, Jiangang Zhu, Ling Geng, Yang Yang, Bojia Lin, Daxin Jiang

contextual query reformulation, question answering, conversational query understanding

Logical Natural Language Generation from Open-Domain Tables

Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang

Logical Generation, neural NLG, surface-level realizations, logical inference

High Dimensional Level Set Estimation with Bayesian Neural Network

Huong Ha, Sunil Gupta, Santu Rana, Svetha Venkatesh

Entity Guided Question Generation with Contextual Structure and Sequence Information Capturing

Qingbao Huang, Mingyi Fu, Linzhang Mo, Yi Cai, Jingyun Xu, Pijian Li, Qing Li, Ho-fung Leung

Scalable Rule-Based Representation Learning for Interpretable Classification

Zhuo Wang, Wei Zhang, Ning Liu, Jianyong Wang

optimization, machine learning, representation learning, interpretability

Web table retrieval using multimodal deep learning

Roee Shraga, Haggai Roitman, Guy Feigenblat, Mustafa Cannim

experimentation, multimodal deep-learning, table retrieval

Neural Topic Modeling with Cycle-Consistent Adversarial Training

Xuemeng Hu, Rui Wang, Deyu Zhou, Yuxuan Xiong

neural modeling, deep models, adversarial-neural model, adversarially network

Using Context in Neural Machine Translation Training Objectives

Danielle Saunders, Felix Stahlberg, Bill Byrne

Neural training, NMT training, document-level training, NMT objective

Topographic VAEs learn Equivariant Capsules

T. Anderson Keller, Max Welling

deep learning, generative model, graph learning

Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant

Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning

Haotian Fu, Hongyao Tang, Jianye Hao, Chen Chen, Xidong Feng, Dong Li, Wulong Liu

Fact-based Text Editing

Hayate Iso, Chao Qiao, Hang Li

Fact-based Editing, text task, text editing, automatically dataset

Expanding, retrieving and infilling: Diversifying cross-domain question generation with flexible templates

Xiaojing Yu, Anxiao Jiang

Data Augmentation for Abstractive Query-Focused Multi-Document Summarization

Ramakanth Pasunuru, Asli Celikyilmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao

In Search for a SAT-friendly Binarized Neural Network Architecture

Nina Narodytska, Hongce Zhang, Aarti Gupta, Toby Walsh

verification, Boolean satisfiability, Binarized Neural Networks

Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension

Xiuying Chen, Zhi Cui, Jiayi Zhang, Chen Wei, Jianwei Cui, Bin Wang, Dongyan Zhao, Rui Yan

Improving Segmentation for Technical Support Problems

Kushal Chauhan, Abhirut Gupta

Segmentation, Technical Problems, attempted resolution, problem resolution

Data-Efficient Paraphrase Generation to Bootstrap Intent Classification and Slot Labeling for New Features in Task-Oriented Dialog Systems

Shailza Jolly, Tobias Falke, Caglar Tirkaz, Daniil Sorokin

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

Xingjia Pan, Yuqiang Ren, Kekai Sheng, Weiming Dong, Haolei Yuan, Xiaowei Guo, Chongyang Ma, Changsheng Xu

object detection, oriented, densely packed, sku110k, feature selection, dynamic, anchor-free

PathQG: Neural Question Generation from Facts

Siyuan Wang, Zhongyu Wei, Zhihao Fan, Zengfeng Huang, Weijian Sun, Qi Zhang, Xuanjing Huang
