Abstract

Deep video representation learning has recently attained state-of-the-art performance in video action recognition (VAR). However, when applied to video clips captured from varied perspectives, the performance of these models degrades significantly. The representations learned by existing VAR models frequently entangle view information with action attributes, making it difficult to learn a view-invariant representation. Therefore, to study the attributes of multiview representations, we collected a large-scale, time-synchronized multiview video dataset from 10 subjects in both indoor and outdoor settings performing 10 different actions from three horizontal and vertical viewpoints, using a smartphone, an action camera, and a drone camera. We provide the multiview video dataset with various metadata to facilitate further research on robust VAR systems.

Instructions:

This is a partial dataset. The full dataset, including a data loader, will be uploaded soon.

Funding Agency:

DEVCOM Army Research Laboratory (ARL) and U.S. Army

Grant Number:

Grant No. W911NF21-20076

Dataset Files

subject_01_1.zip (3.88 GB)
subject_01_2.zip (4.39 GB)
subject_01_3.zip (4.69 GB)
subject_01_4.zip (5.23 GB)
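
Until the official data loader is released, the archives listed above can be unpacked with a few lines of Python. The following is only a minimal sketch: the function name `extract_archives` and the directory arguments are hypothetical, and it assumes nothing about the folder layout inside each archive, which is not documented on this page.

```python
import zipfile
from pathlib import Path


def extract_archives(download_dir, output_dir):
    """Extract every subject_*.zip found in download_dir.

    Each archive is unpacked into its own subfolder of output_dir,
    named after the archive (e.g. subject_01_1). The internal layout
    of the archives is NOT assumed; files are extracted as-is.
    """
    download_dir = Path(download_dir)
    output_dir = Path(output_dir)
    extracted = []
    # Sort so that archives are processed in a stable, predictable order.
    for archive in sorted(download_dir.glob("subject_*.zip")):
        target = output_dir / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

Calling `extract_archives("downloads", "MPSC_MV")` would place the contents of `subject_01_1.zip` under `MPSC_MV/subject_01_1`, and so on for the other parts. Both directory names here are placeholders.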

Documentation

Attachment	Size
MV_dataset.pdf	2.68 MB

Datasets

Standard Dataset

MPSC_MV
