Variable-length File Fragment Dataset (VFF-16)

Citation Author(s):
YI WANG
Submitted by:
YI WANG
Last updated:
Mon, 03/06/2023 - 20:30
DOI:
10.21227/qb7g-g653

Abstract 

The variable-length file fragment dataset (VFF-16) covers 16 file types and is designed to reflect file system fragmentation. The sequential memory sectors contain contextual information about file fragments. The 16 file types are ‘jpg’, ‘gif’, ‘doc’, ‘xls’, ‘ppt’, ‘html’, ‘text’, ‘pdf’, ‘rtf’, ‘png’, ‘log’, ‘csv’, ‘gz’, ‘swf’, ‘eps’, and ‘ps’. We split the dataset into training and test sets at a ratio of about 4:1. At a sector size of 512 bytes, there are 1,310,918 training samples and 328,599 test samples; at a sector size of 4,096 bytes, there are 167,564 training samples and 41,993 test samples.
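
As a quick sanity check on the figures above, the following Python sketch recomputes the train/test ratio for each sector size from the quoted sample counts. The names FILE_TYPES, SPLITS, train_test_ratio, and held_out_fraction are hypothetical and chosen for illustration only; they are not taken from the dataset or its readme.md, and the sketch does not read any dataset files.

```python
# The 16 file types listed in the abstract (list order here is not a label encoding).
FILE_TYPES = [
    "jpg", "gif", "doc", "xls", "ppt", "html", "text", "pdf",
    "rtf", "png", "log", "csv", "gz", "swf", "eps", "ps",
]

# Sample counts quoted in the abstract, keyed by sector size in bytes.
SPLITS = {
    512: {"train": 1_310_918, "test": 328_599},
    4096: {"train": 167_564, "test": 41_993},
}


def train_test_ratio(train: int, test: int) -> float:
    """Return the number of training samples per test sample."""
    return train / test


def held_out_fraction(train: int, test: int) -> float:
    """Return the share of all samples that fall in the test set."""
    return test / (train + test)


for sector_size, counts in SPLITS.items():
    ratio = train_test_ratio(counts["train"], counts["test"])
    share = held_out_fraction(counts["train"], counts["test"])
    total = counts["train"] + counts["test"]
    print(
        f"{sector_size}-byte sectors: {total} samples, "
        f"train:test = {ratio:.2f}:1, test share = {share:.1%}"
    )
```

Both sector sizes come out just under 4:1, which matches the "about 4:1" split stated above.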

Instructions: 

See readme.md for details.