computer vision

Visual storytelling is the task of describing a set of images rather than a single image, also known as multi-image captioning. The Visual Storytelling Task (VST) takes a set of images as input and aims to generate a coherent story relevant to them. To help bridge this gap between single-image captioning and expressive, coherent story creation, we present the Sequential Storytelling Image Dataset (SSID), consisting of open-source video frames accompanied by story-like annotations.
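
The input/output contract of VST can be sketched as a simple data structure. The field names below are illustrative assumptions, not the actual SSID schema:

```python
from dataclasses import dataclass

# Hypothetical sample layout for a visual-storytelling dataset: an ordered
# sequence of frames paired with one coherent story covering all of them.
# Field names are assumptions, not the actual SSID annotation format.
@dataclass
class StorySample:
    frames: list  # ordered image filenames, e.g. ["frame_001.jpg", ...]
    story: str    # human-written story spanning the whole sequence

sample = StorySample(
    frames=["frame_001.jpg", "frame_002.jpg", "frame_003.jpg"],
    story="A child builds a sandcastle, the tide comes in, and they rebuild together.",
)

# A storytelling model consumes all frames jointly, not one at a time:
# the multi-image input is what distinguishes VST from single-image captioning.
assert len(sample.frames) > 1
```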

This is an image illustrating the work of the proposed visual SLAM algorithm.

The dataset consists of two videos, one recorded with a blindfold on and the other without, captured using a 1080p Intel RealSense depth camera. It contains the raw videos, images extracted using ffmpeg, and a processed video with skipped frames, also created using ffmpeg. The dataset is intended for machine vision purposes, allowing for tasks such as instance segmentation. A camera fixed to a hat on the head of a blindfolded person was used to record walking activities.
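
The two ffmpeg steps described above (frame extraction and frame skipping) might look like the following. The filenames, the 1 fps extraction rate, and the keep-every-second-frame factor are illustrative assumptions; the commands are only constructed here, not executed:

```python
# Sketch of the two ffmpeg steps mentioned above. Paths, the fps value,
# and the frame-skip factor are illustrative assumptions.

def extract_frames_cmd(video, out_pattern, fps=1):
    # Dump frames as numbered images at a fixed rate.
    return ["ffmpeg", "-i", video, "-vf", f"fps={fps}", out_pattern]

def skip_frames_cmd(video, out_video, keep_every=2):
    # Keep every Nth frame, then regenerate timestamps so the output
    # plays back smoothly. The comma inside the filtergraph is escaped.
    vf = f"select=not(mod(n\\,{keep_every})),setpts=N/FRAME_RATE/TB"
    return ["ffmpeg", "-i", video, "-vf", vf, out_video]

cmd = extract_frames_cmd("blindfold.mp4", "frames/%04d.jpg")
# → ['ffmpeg', '-i', 'blindfold.mp4', '-vf', 'fps=1', 'frames/%04d.jpg']
```

In practice these lists would be passed to `subprocess.run`, which avoids shell quoting issues with the filter expression.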

This dataset was collected to provide researchers with access to hundreds of images for efficient classification of plant attributes and for multi-instance plant localisation and detection. There are two folders, Side View and Top View. Each folder includes image files in .jpg format and label files in .txt format. Images of 30 plants of three species (Petunia, Pansy, and Calendula), grown in 5 hydroponic systems, were collected over 66 days for the purpose of image collection and analysis.
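
With paired .jpg/.txt files, a common first step is matching each image to its label file by filename stem. This pairing convention is an assumption about the folder layout, not something the description confirms:

```python
import os

# Hypothetical pairing of images with label files by filename stem,
# assuming each "plant_001.jpg" has a matching "plant_001.txt".
# The actual folder layout of this dataset may differ.
def pair_images_labels(filenames):
    images = {os.path.splitext(f)[0]: f for f in filenames if f.endswith(".jpg")}
    labels = {os.path.splitext(f)[0]: f for f in filenames if f.endswith(".txt")}
    # Keep only images that actually have a label file.
    return [(images[s], labels[s]) for s in sorted(images) if s in labels]

pairs = pair_images_labels(["plant_001.jpg", "plant_001.txt", "plant_002.jpg"])
# → [('plant_001.jpg', 'plant_001.txt')]; plant_002 has no label file
```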

The dataset has been consolidated for the task of Human Posture Apprehension. It consists of two postures:

  1. Sitting
  2. Standing

There are images for each of the postures listed above. The image dimensions range from 53×160 to 1845×4608 pixels.
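
Because the image dimensions vary so widely (53×160 up to 1845×4608), a posture classifier typically needs them rescaled to a fixed input size first. A minimal sketch of an aspect-preserving fit; the 224×224 target is a common classifier input size assumed here, not part of the dataset:

```python
# Sketch: compute a resize that fits each image into a fixed square input
# while preserving aspect ratio (letterboxing). The 224x224 target is an
# assumption common in image classifiers, not part of the dataset.
def fit_within(width, height, target=224):
    scale = min(target / width, target / height)
    return max(1, round(width * scale)), max(1, round(height * scale))

print(fit_within(53, 160))     # smallest images are scaled up
print(fit_within(1845, 4608))  # largest images shrink the most
```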

Visual perception can improve transitions between different locomotion mode controllers (e.g., level-ground walking to stair ascent) by sensing the walking environment prior to physical interactions. Here we developed the "StairNet" dataset to support the development of vision-based stair recognition systems. The dataset builds on ExoNet – the largest open-source dataset of egocentric images of real-world walking environments.

Retail Gaze is a dataset for remote gaze estimation in real-world retail environments. It is composed of 3,922 images of individuals looking at products in a retail environment, captured from 12 camera angles.

Each image captures a third-person view of the customer and shelves. The location of the gaze point, the bounding box of the person's head, and segmentation masks of the gazed-at product areas are provided as annotations.
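
The three annotation types can be grouped into one record per image. The field names, coordinate conventions, and polygon mask representation below are illustrative assumptions, not the dataset's actual schema:

```python
# Hypothetical per-image annotation record for a gaze dataset. Field names,
# coordinate order, and the polygon mask format are illustrative assumptions.
annotation = {
    "image": "retail_0001.jpg",
    "gaze_point": (412, 233),                  # (x, y) pixel location
    "head_bbox": (350, 80, 470, 210),          # (x1, y1, x2, y2)
    "product_mask": [(500, 200), (560, 200), (560, 260), (500, 260)],
}

def gaze_inside_bbox(point, bbox):
    # Simple containment check: does a point fall inside a box?
    x, y = point
    x1, y1, x2, y2 = bbox
    return x1 <= x <= x2 and y1 <= y <= y2

# Here the gaze point lies below the head box, as expected for a person
# looking down at a shelf.
```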

This dataset was prepared to aid in the creation of a machine learning algorithm to classify the white blood cells in thin blood smears of juvenile Visayan warty pigs. Its creation was deemed imperative because blood smear images of this critically endangered species are scarcely available online. The dataset contains 3,457 images of various types of white blood cells (JPEG) with accompanying cell type labels (XLSX).
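
Pairing the JPEG images with their XLSX labels might look like the following. The filenames, label names, and column order are illustrative assumptions, and the spreadsheet rows are inlined here rather than read from disk:

```python
# Sketch: map white-blood-cell image filenames to cell-type labels.
# In practice the rows would come from the XLSX file (e.g. via openpyxl
# or pandas.read_excel); here they are inlined. Filenames, column order,
# and label names are illustrative assumptions.
rows = [
    ("wbc_0001.jpg", "neutrophil"),
    ("wbc_0002.jpg", "lymphocyte"),
    ("wbc_0003.jpg", "monocyte"),
]

labels = {filename: cell_type for filename, cell_type in rows}

def label_for(filename):
    # Flag unlabelled images rather than silently skipping them.
    return labels.get(filename, "UNLABELLED")
```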
