computer vision
Visual storytelling, also known as multi-image captioning, describes a set of images rather than a single image. The Visual Storytelling Task (VST) takes a set of images as input and aims to generate a coherent story relevant to those images. We present the Sequential Storytelling Image Dataset (SSID), a new dataset for expressive and coherent story creation, consisting of open-source video frames accompanied by story-like annotations.
- Categories:
The dataset consists of two videos recorded using a 1080p Intel RealSense depth camera, one with a blindfold on and the other without. It contains the videos, images extracted from them using ffmpeg, and a processed video with skipped frames, also created using ffmpeg. A hat fixed on the head of a blindfolded person was used to record walking activities. The dataset is intended for machine vision tasks such as instance segmentation.
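The exact ffmpeg commands used to build this dataset are not given. As a minimal sketch of how frame extraction and frame skipping are commonly done with ffmpeg, the helpers below only build the command lines; the file names, the sampling rate, and the skip interval are hypothetical placeholders, not values taken from the dataset.

```python
from pathlib import Path


def extract_frames_cmd(video: str, out_dir: str, fps: int = 1) -> list[str]:
    """Build an ffmpeg command that writes `fps` frames per second as JPEGs."""
    # frame_%05d.jpg is a hypothetical naming pattern (frame_00001.jpg, ...).
    pattern = str(Path(out_dir) / "frame_%05d.jpg")
    return ["ffmpeg", "-i", video, "-vf", f"fps={fps}", pattern]


def skip_frames_cmd(video: str, out_video: str, keep_every: int = 5) -> list[str]:
    """Build an ffmpeg command that keeps every `keep_every`-th frame."""
    # select keeps frames whose index n is a multiple of keep_every;
    # setpts rewrites timestamps so the kept frames play back contiguously.
    vf = f"select='not(mod(n\\,{keep_every}))',setpts=N/FRAME_RATE/TB"
    return ["ffmpeg", "-i", video, "-vf", vf, out_video]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed and the output directory already exists.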
- Categories:
This dataset was collected to give researchers access to hundreds of images for efficient classification of plant attributes and for multi-instance plant localisation and detection. There are two folders, Side view and Top View. Each folder includes image files in .jpg format and label files in .txt format. Images of 30 plants of three species (Petunia, Pansy and Calendula), grown in 5 hydroponic systems, were collected over 66 days.
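A minimal sketch of how the images in one view folder might be matched to their labels is shown below. It assumes that each label file shares its file stem with its image (for example `plant_01.jpg` and `plant_01.txt`), which the description does not confirm; the function name and that naming convention are hypothetical.

```python
from pathlib import Path


def pair_images_with_labels(view_dir: str) -> list[tuple[Path, Path]]:
    """Return (image, label) pairs that share a file stem inside one folder."""
    root = Path(view_dir)
    pairs = []
    for image in sorted(root.glob("*.jpg")):
        label = image.with_suffix(".txt")  # assumed: same stem, .txt suffix
        if label.exists():  # skip images that have no label file
            pairs.append((image, label))
    return pairs
```

Calling the function once per folder (Side view, Top View) would give two lists of pairs that can be fed to a classification or detection pipeline.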
- Categories:
TO BE ADDED AFTER PUBLICATION.
- Categories:
Visual perception can improve transitions between different locomotion mode controllers (e.g., level-ground walking to stair ascent) by sensing the walking environment prior to physical interactions. Here we developed the "StairNet" dataset to support the development of vision-based stair recognition systems. The dataset builds on ExoNet – the largest open-source dataset of egocentric images of real-world walking environments.
- Categories:
Retail Gaze is a dataset for remote gaze estimation in real-world retail environments. It is composed of 3,922 images of individuals looking at products in a retail environment, captured from 12 camera angles.
Each image captures a third-person view of the customer and the shelves. The location of the gaze point, the bounding box of the person's head, and segmentation masks of the gazed-at product areas are provided as annotations.
- Categories:
This dataset was prepared to aid in the creation of a machine learning algorithm that classifies white blood cells in thin blood smears of juvenile Visayan warty pigs. Creating the dataset was deemed imperative because few blood smear images from this critically endangered species are available on the internet. The dataset contains 3,457 images of various types of white blood cells (JPEG) with accompanying cell type labels (XLSX).
- Categories: