What Can You Learn From Your Muscles? Learning Visual Representation from Human Interactions

03/05/2021

Kiana Ehsani, Daniel Gordon, Thomas H Nguyen, Roozbeh Mottaghi, Ali Farhadi

Keywords: computer vision, representation learning

Abstract: Learning effective representations of visual data that generalize to a variety of downstream tasks has been a long-standing quest in computer vision. Most representation learning approaches rely solely on visual data such as images or videos. In this paper, we explore a novel approach in which we use human interaction and attention cues to investigate whether we can learn better representations than visual-only ones. For this study, we collect a dataset of human interactions capturing body part movements and gaze in people's daily lives. Our experiments show that our "muscly-supervised" representation, which encodes interaction and attention cues, outperforms a visual-only state-of-the-art method, MoCo (He et al., 2020), on a variety of target tasks: scene classification (semantic), action recognition (temporal), depth estimation (geometric), dynamics prediction (physics), and walkable surface estimation (affordance). Our code and dataset are available at: https://github.com/ehsanik/muscleTorch.

The talk and the corresponding paper were published at the ICLR 2021 virtual conference.

Similar Papers

30/11/2020

Interpreting Video Features: A Comparison of 3D Convolutional Networks and Convolutional LSTM Networks

Joonatan Mänttäri, Sofia Broomé, John Folkesson, Hedvig Kjellström

9:52

14/06/2020

Learning to Observe: Approximating Human Perceptual Thresholds for Detection of Suprathreshold Image Transformations

Alan Dolhasz, Carlo Harvey, Ian Williams

Keywords: perception, jnd, vision, deep learning, image compositing, local distortions, subjective quality

1:01

05/01/2021

3D Human Pose and Shape Estimation Through Collaborative Learning and Multi-View Model-Fitting

Zhongguo Li, Magnus Oskarsson, Anders Heyden

5:13

22/11/2021

Grid Cell Path Integration For Movement-Based Visual Object Recognition

Niels Leadholm, Marcus Lewis, Subutai Ahmad

Keywords: biologically plausible, translation invariance, robustness, sequential vision, transsaccadic vision, grid cells, path integration, continual learning, predictive representations, Hebbian learning

11:21

14/06/2020

CRNet: Cross-Reference Networks for Few-Shot Segmentation

Weide Liu, Chi Zhang, Guosheng Lin, Fayao Liu

Keywords: few-shot learning, segmentation

1:01

06/12/2021

H-NeRF: Neural Radiance Fields for Rendering and Temporal Reconstruction of Humans in Motion

Hongyi Xu, Thiemo Alldieck, Cristian Sminchisescu

Keywords: robustness

8:39

22/11/2021

Hierarchical Contrastive Motion Learning for Video Action Recognition

Xitong Yang, Xiaodong Yang, Sifei Liu, Deqing Sun, Larry Davis, Jan Kautz

Keywords: action recognition, motion hierarchy, motion representation, contrastive learning

8:29

02/02/2021

Visual Tracking via Hierarchical Deep Reinforcement Learning

Dawei Zhang, Zhonglong Zheng, Riheng Jia, Minglu Li

15:04

26/04/2020

On the Relationship between Self-Attention and Convolutional Layers

Jean-Baptiste Cordonnier, Andreas Loukas, Martin Jaggi

Keywords: self-attention, attention, transformers, convolution, CNN, image, expressivity, capacity

5:18

25/04/2020

Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods

Arianna Yuan, Yang Li

Keywords: performance modeling, deep learning, scannability, convolutional neural network, webpage, visual attention

9:33

14/06/2020

Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

Aritra Bhowmik, Stefan Gumhold, Carsten Rother, Eric Brachmann

Keywords: sparse features, reinforcement learning, key point detection, feature description, feature matching, relative pose estimation, ransac, essential matrix, sift, superpoint

5:01

18/07/2021

Explore Visual Concept Formation for Image Classification

Shengzhou Xiong, Yihua Tan, Guoyou Wang

Keywords: Deep Learning

5:10

14/06/2020

Multi-Domain Learning for Accurate and Few-Shot Color Constancy

Jin Xiao, Shuhang Gu, Lei Zhang

Keywords: color constancy, multi-domain learning, few-shot

1:01

14/06/2020

AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation

Hyeongmin Lee, Taeoh Kim, Tae-young Chung, Daehyun Pak, Yuseok Ban, Sangyoun Lee

Keywords: video frame interpolation, video temporal super-resolution, frame rate up conversion, frame synthesis, motion estimation, motion compensation, frame warping

1:01

22/11/2021

StyleVideoGAN: A Temporal Generative Model using a Pretrained StyleGAN

Gereon Fox, Ayush Tewari, Mohamed Elgharib, Christian Theobalt

Keywords: video generation, StyleGAN, GAN, embedding, faces, hands, cars, RNN

8:07

03/05/2021

Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering

Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler

Keywords: GANs, inverse graphics, Differentiable rendering

10:15

02/02/2021

RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER

Lin Sun, Jiquan Wang, Kai Zhang, Yindu Su, Fangsheng Weng

17:21

06/12/2020

Self-Learning Transformations for Improving Gaze and Head Redirection

Yufeng Zheng, Seonwook Park, Xucong Zhang, Shalini De Mello, Otmar Hilliges

3:20

14/06/2020

Object-Occluded Human Shape and Pose Estimation From a Single Color Image

Tianshu Zhang, Buzhen Huang, Yangang Wang

Keywords: human shape and pose estimation, occlusion, 3d human dataset, representation for 3d human

4:54

05/01/2021

Multi-Frame Recurrent Adversarial Network for Moving Object Segmentation

Prashant W. Patil, Akshay Dudhane, Subrahmanyam Murala

5:00

07/09/2020

Attention Distillation for Learning Video Representations

Miao Liu, Xin Chen, Yun Zhang, Yin Li, James Rehg

Keywords: Action Recognition, Deep Learning, Representation Learning

9:50

14/06/2020

Few-Shot Video Classification via Temporal Alignment

Kaidi Cao, Jingwei Ji, Zhangjie Cao, Chien-Yi Chang, Juan Carlos Niebles

Keywords: video classification, few-shot learning, action recognition, temporal alignment

0:57

14/06/2020

Fantastic Answers and Where to Find Them: Immersive Question-Directed Visual Attention

Ming Jiang, Shi Chen, Jinhui Yang, Qi Zhao

Keywords: attention, task performance, immersive environment, eye-tracking, 360-degree video, visual question answering, dataset, modeling

1:01

14/06/2020

Single-View View Synthesis With Multiplane Images

Richard Tucker, Noah Snavely

Keywords: view synthesis, monocular, multiplane image, image-based rendering, 3d deep learning, scale invariance

1:01

16/11/2020

MELD: Meta-Reinforcement Learning from Images via Latent State Models

Zihao Zhao, Anusha Nagabandi, Kate Rakelly, Chelsea Finn, Sergey Levine

5:06

22/11/2021

Segmenting Invisible Moving Objects

Hala Lamdouar, Weidi Xie, Andrew Zisserman

Keywords: synthetic data generation, motion segmentation, amodal segmentation, video camouflage breaking, self-attention

3:05

14/06/2020

Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs

Jingwei Ji, Ranjay Krishna, Li Fei-Fei, Juan Carlos Niebles

Keywords: action recognition, scene graph, video understanding, relationships, composition, action, activity, video

1:01

14/06/2020

Distilled Semantics for Comprehensive Scene Understanding from Videos

Fabio Tosi, Filippo Aleotti, Pierluigi Zama Ramirez, Matteo Poggi, Samuele Salti, Luigi Di Stefano, Stefano Mattoccia

Keywords: monocular depth estimation, optical flow, semantic segmentation, motion segmentation, knowledge distillation

0:56

14/06/2020

Syntax-Aware Action Targeting for Video Captioning

Qi Zheng, Chaoyue Wang, Dacheng Tao

Keywords: video and language, video captioning, action predicting

1:01

16/11/2020

Model-Based Inverse Reinforcement Learning from Visual Demonstrations

Neha Das, Sarah Bechtle, Todor Davchev, Dinesh Jayaraman, Akshara Rai, Franziska Meier

5:03

19/08/2021

Information Bottleneck Approach to Spatial Attention Learning

Qiuxia Lai, Yu Li, Ailing Zeng, Minhao Liu, Hanqiu Sun, Qiang Xu

Keywords: Computer Vision, 2D and 3D Computer Vision, Classification, Deep Learning

14:42

26/04/2020

AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures

Michael S. Ryoo, AJ Piergiovanni, Mingxing Tan, Anelia Angelova

Keywords: video representation learning, video understanding, activity recognition, neural architecture search

5:02

14/06/2020

Explaining Knowledge Distillation by Quantifying the Knowledge

Xu Cheng, Zhefan Rao, Yilan Chen, Quanshi Zhang

Keywords: knowledge distillation, explainable ai

0:57

05/01/2021

Towards Contextual Learning in Few-Shot Object Classification

Mathieu Page Fortin, Brahim Chaib-draa

4:57

06/12/2021

Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering

Youngjoong Kwon, Dahun Kim, Duygu Ceylan, Henry Fuchs

Keywords: transformers, vision

11:55

14/06/2020

Predicting Goal-Directed Human Attention Using Inverse Reinforcement Learning

Zhibo Yang, Lihan Huang, Yupei Chen, Zijun Wei, Seoyoung Ahn, Gregory Zelinsky, Dimitris Samaras, Minh Hoai

Keywords: scanpath prediction, inverse reinforcement learning, dataset, visual search, contextual belief, attention, goal-directed, adversarial training

5:00

14/06/2020

Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning

Yuan Yao, Chang Liu, Dezhao Luo, Yu Zhou, Qixiang Ye

Keywords: self-supervised spatio-temporal representation learning, multi-temporal resolution characteristic, playback rate perception, motion attention mechanism

1:01

03/08/2020

TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP

Nils Rethmeier, Vageesh Kumar Saxena, Isabelle Augenstein

7:32

03/05/2021

VA-RED$^2$: Video Adaptive Redundancy Reduction

Bowen Pan, Rameswar Panda, Camilo L Fosco, Chung-Ching Lin, Alex J Andonian, Yue Meng, Kate Saenko, Aude Oliva, Rogerio Feris

5:02

30/11/2020

MTNAS: Search Multi-Task Networks for Autonomous Driving

Hao Liu, Dong Li, JinZhang Peng, Qingjie Zhao, Lu Tian, Yi Shan

9:06