19/08/2021

Multimodal Transformer Networks for Pedestrian Trajectory Prediction

Ziyi Yin, Ruijin Liu, Zhiliang Xiong, Zejian Yuan

Keywords: Computer Vision, 2D and 3D Computer Vision, Structural and Model-Based Approaches, Knowledge Representation and Reasoning, Transportation

Abstract: We consider the problem of forecasting the future locations of pedestrians in the ego-centric view of a moving vehicle. Current CNN- or RNN-based approaches struggle to capture the highly dynamic motion between pedestrians and the ego-vehicle, and suffer from massive parameter usage due to their inefficiency in learning long-term temporal dependencies. To address these issues, we propose an efficient multimodal transformer network that aggregates trajectory and ego-vehicle speed variations at a coarse granularity and interacts with optical flow at a fine-grained level to account for highly dynamic motion. Specifically, a coarse-grained fusion stage fuses information between the trajectory and ego-vehicle speed modalities to capture general temporal consistency. Meanwhile, a fine-grained fusion stage merges optical flow from the image center area and the pedestrian area, which compensates for the highly dynamic motion of the ego-vehicle and the target pedestrian. Moreover, the whole network is purely attention-based and can efficiently model long-term sequences, better capturing temporal variations. Our multimodal transformer is validated on the PIE and JAAD datasets and achieves state-of-the-art performance with the most lightweight model size. The code is available at https://github.com/ericyinyzy/MTN_trajectory.
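To make the two-stage fusion concrete, below is a minimal PyTorch sketch of the architecture as the abstract describes it. All module names, feature dimensions, input encodings, and the 15-observed/45-predicted frame horizon are illustrative assumptions, not the authors' implementation (which lives in the linked repository).

```python
# Hypothetical sketch of coarse-then-fine multimodal fusion for
# pedestrian trajectory prediction. Shapes and encodings are assumptions.
import torch
import torch.nn as nn


class CrossModalAttention(nn.Module):
    """Queries from one modality attend to keys/values from another."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, q, kv):
        out, _ = self.attn(q, kv, kv)   # cross-attention between modalities
        return self.norm(q + out)       # residual connection + layer norm


class MultimodalTrajectoryTransformer(nn.Module):
    def __init__(self, dim=64, pred_len=45):
        super().__init__()
        self.traj_embed = nn.Linear(4, dim)     # bounding box (x1, y1, x2, y2)
        self.speed_embed = nn.Linear(1, dim)    # ego-vehicle speed per frame
        self.flow_embed = nn.Linear(4, dim)     # pooled flow: center + pedestrian area
        self.coarse = CrossModalAttention(dim)  # stage 1: trajectory x ego-speed
        self.fine = CrossModalAttention(dim)    # stage 2: fused features x optical flow
        self.temporal = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True),
            num_layers=2)                       # attention-only temporal modeling
        self.queries = nn.Parameter(torch.randn(pred_len, dim))
        self.decode = CrossModalAttention(dim)
        self.head = nn.Linear(dim, 4)           # future bounding box per frame

    def forward(self, traj, speed, flow):
        # traj: (B, T, 4), speed: (B, T, 1), flow: (B, T, 4)
        h = self.coarse(self.traj_embed(traj), self.speed_embed(speed))
        h = self.fine(h, self.flow_embed(flow))
        h = self.temporal(h)
        q = self.queries.unsqueeze(0).expand(traj.size(0), -1, -1)
        return self.head(self.decode(q, h))     # (B, pred_len, 4)


model = MultimodalTrajectoryTransformer()
pred = model(torch.randn(2, 15, 4), torch.randn(2, 15, 1), torch.randn(2, 15, 4))
print(pred.shape)  # torch.Size([2, 45, 4])
```

Decoding a fixed-length future in one shot via learned queries (rather than autoregressively) is one common transformer design choice; the abstract does not specify which decoding scheme the paper actually uses.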

Talk and paper published at the IJCAI 2021 virtual conference.
