02/06/2020

VQuAnDa: Verbalization QUestion ANswering DAtaset

Endri Kacupaj, Hamid Zafar, Jens Lehmann, Maria Maleshkova

Keywords:

Abstract: Question Answering (QA) systems over Knowledge Graphs (KGs) aim to provide a concise answer to a given natural language question. Despite the significant evolution of QA methods over the past years, there are still some core lines of work, which are lagging behind. This is especially true for methods and datasets that support the verbalization of answers in natural language. Specifically, to the best of our knowledge, none of the existing Question Answering datasets provide any verbalization data for the question-query pairs. Hence, we aim to fill this gap by providing the first QA dataset VQuAnDa that includes the verbalization of each answer. We base VQuAnDa on a commonly used large-scale QA dataset – LC-QuAD, in order to support compatibility and continuity of previous work. We complement the dataset with baseline scores for measuring future training and evaluation work, by using a set of standard sequence to sequence models and sharing the results of the experiments. This resource empowers researchers to train and evaluate a variety of models to generate answer verbalizations.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at ESWC 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd

Similar Papers