On the evaluation of vision-and-language navigation instructions

Abstract: Vision-and-Language Navigation wayfinding agents can be enhanced by exploiting automatically generated navigation instructions. However, existing instruction generators have not been comprehensively evaluated, and the automatic evaluation metrics used to develop them have not been validated. Using human wayfinders, we show that these generators perform on par with or only slightly better than a template-based generator and far worse than human instructors. Furthermore, we discover that BLEU, ROUGE, METEOR and CIDEr are ineffective for evaluating grounded navigation instructions. To improve instruction evaluation, we propose an instruction-trajectory compatibility model that operates without reference instructions. Our model shows the highest correlation with human wayfinding outcomes when scoring individual instructions. For ranking instruction generation systems, if reference instructions are available, we recommend using SPICE.

06/12/2021

04/07/2020

Ming Zhao, Peter Anderson, Vihan Jain, Su Wang, Alexander Ku, Jason Baldridge, Eugene Ie

Similar Papers

Supervising the Transfer of Reasoning Patterns in VQA

Corentin Kervadec, Christian Wolf, Grigory Antipov, Moez Baccouche, Madiha Nadri

theory, deep learning, vision

Improving Generalization in Reinforcement Learning with Mixture Regularization

Kaixin Wang, Bingyi Kang, Jie Shao, Jiashi Feng

Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Runfa Chen, Wenbing Huang, Binghui Huang, Fuchun Sun, Bin Fang

nice-gan, reusing discriminators for encoding, unsupervised image-to-image translation, decoupled training, multi-scale discriminators, adversarial loss, no independent component for encoding, shared layers, residual attention, cyclegan

Designing Precise and Robust Dialogue Response Evaluators

Tianyu Zhao, Divesh Lala, Tatsuya Kawahara

human evaluation, Precise Evaluators, Automatic evaluator, reference-free evaluator

Curriculum Learning for Vision-and-Language Navigation

Jiwen Zhang, Zhongyu Wei, Jianqing Fan, Jiajie Peng

Two Causal Principles for Improving Visual Dialog

Jiaxin Qi, Yulei Niu, Jianqiang Huang, Hanwang Zhang

visual dialog, vision and language, causality

Q-learning with Language Model for Edit-based Unsupervised Summarization

Ryosuke Kohita, Akifumi Wachi, Yang Zhao, Ryuki Tachibana

abstractive text summarization, unsupervised summarization, unsupervised summarizers, unsupervised methods

Unsupervised Adaptation of Question Answering Systems via Generative Self-training

Steven Rennie, Etienne Marcheret, Neil Mallinar, David Nahamoo, Vaibhava Goel

question-answering tasks, self-supervised tasks, word masking, sentence entailment

Learning Meta Face Recognition in Unseen Domains

Jianzhu Guo, Xiangyu Zhu, Chenxu Zhao, Dong Cao, Zhen Lei, Stan Z. Li

face recognition, meta learning, domain generalization, metric learning

Adversarial and Domain-Aware BERT for Cross-Domain Sentiment Analysis

Chunning Du, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao

Cross-Domain Analysis, Cross-domain classification, unsupervised adaptation, transferring knowledge

Systematic generalization on gSCAN with language conditioned embedding

Tong Gao, Qi Huang, Raymond Mooney

Automatic Data Augmentation for Generalization in Reinforcement Learning

Roberta Raileanu, Maxwell Goldstein, Denis Yarats, Ilya Kostrikov, Rob Fergus

reinforcement learning and planning, machine learning

Prophet Attention: Predicting Attention with Future Attention

Fenglin Liu, Xuancheng Ren, Xian Wu, Shen Ge, Wei Fan, Yuexian Zou, Xu Sun

Evaluating the Factual Consistency of Abstractive Text Summarization

Wojciech Kryscinski, Bryan McCann, Caiming Xiong, Richard Socher

assessing algorithms, natural inference, fact checking, auxiliary tasks

Auto-Navigator: Decoupled Neural Architecture Search for Visual Navigation

Tianqi Tang, Xin Yu, Xuanyi Dong, Yi Yang

Towards Faithfulness in Open Domain Table-to-text Generation from an Entity-centric View

Tianyu Liu, Xin Zheng, Baobao Chang, Zhifang Sui

Self-training pre-trained language models for zero- and few-shot multi-dialectal Arabic sequence labeling

Muhammad Khalifa, Muhammad Abdul-Mageed, Khaled Shaalan

ADA-AT/DT: An Adversarial Approach for Cross-Domain and Cross-Task Knowledge Transfer

Ruchika Chavhan, Ankit Jha, Biplab Banerjee, Subhasis Chaudhuri

Infusing Sequential Information into Conditional Masked Translation Model with Self-Review Mechanism

Pan Xie, Zhi Cui, Xiuying Chen, XiaoHui Hu, Jianwei Cui, Bin Wang

SparseBERT: Rethinking the Importance Analysis in Self-attention

Han Shi, Jiahui Gao, Xiaozhe Ren, Hang Xu, Xiaodan Liang, Zhenguo Li, James Kwok

Applications, Natural Language Processing

Accurate Word Alignment Induction from Neural Machine Translation

Yun Chen, Yang Liu, Guanhua Chen, Xin Jiang, Qun Liu

transformer, attention mechanism, word methods, shift-att

Auxiliary Training: Towards Accurate and Robust Models

Linfeng Zhang, Muzhou Yu, Tong Chen, Zuoqiang Shi, Chenglong Bao, Kaisheng Ma
