Dataset for Machine Learning-Based Classification of White Blood Cells of the Juvenile Visayan Warty Pig

Citation Author(s):
Jacqueline Rose Alipo-on, University of St. La Salle
Francesca Isabelle Escobar, University of St. La Salle
Jemima Loise Novia, University of St. La Salle
Monica Marie Atienza, Talarak Foundation, Inc.
Sonny Mana-ay, ACM Diagnostic Laboratory and Medical Clinic
Myles Joshua Tan, University of St. La Salle
Nouar AlDahoul, Multimedia University
Evan Yu, Cornell University
Submitted by: Myles Joshua Tan
Last updated: Sat, 02/26/2022 - 08:58
DOI: 10.21227/3qsb-d447

Abstract 

This dataset was prepared to support the development of a machine learning algorithm that classifies the white blood cells in thin blood smears of juvenile Visayan warty pigs. It was created because few blood smear images from this critically endangered species are available on the internet. The dataset contains 3,457 images of various types of white blood cells (JPEG) with accompanying cell type labels (XLSX).

Instructions: 

------------------------------

GENERAL INFORMATION

-----------------------------

Title of Dataset: Dataset for Machine Learning-Based Classification of White Blood Cells of the Juvenile Visayan Warty Pig

Available at: https://drive.google.com/drive/folders/1CsDoL448kvAtFVd5jowVJGKjFLv3qjz4...

Creators:

Jacqueline Rose Alipo-on, University of St. La Salle, s1821459@usls.edu.ph, https://orcid.org/0000-0001-7948-9512

Francesca Isabelle Escobar, University of St. La Salle, s1822133@usls.edu.ph, https://orcid.org/0000-0001-6174-890X

Jemima Loise Novia, University of St. La Salle, s1820906@usls.edu.ph, https://orcid.org/0000-0001-7046-3973

 

Contributor(s):

Monica Marie Atienza, DVM

 

Correspondence and Project Advising:

Myles Joshua Tan, MS, MIEEE, MInstP, MIMA; mj.tan@usls.edu.ph; mylestan7996@gmail.com

Nouar AlDahoul, PhD; nouar.aldahoul@live.iium.edu.my; nouar.aldahoul@gmail.com

Evan Yu, PhD; emy24@cornell.edu 

 

Date of data collection: 2021-06 to 2021-11

 

Geographic location of data collection: Bacolod City, Negros Occidental, Philippines

 

Keywords: peripheral blood smear, microscope, white blood cell, leukocyte, basophil, eosinophil, lymphocyte, neutrophil, image processing, image augmentation, machine learning, feature extraction, classification, juvenile Visayan warty pig, Philippines

 

------------------------------

DATA & FILE OVERVIEW

------------------------------

File List: 

 

The dataset contains 3539 images in total: 667 raw images, 1464 augmented images, and 1408 cropped, classified images.

 

The “Not Cropped” folder contains all 667 raw, unclassified images.

 

The “Cropped Classified” folder contains five subfolders, one for each WBC type: “01 Neutrophil” (319 images), “02 Lymphocyte” (905 images), “03 Monocyte” (82 images), “04 Eosinophil” (82 images), and “05 Basophil” (20 images).

 

The “Augmented images” folder contains four subfolders of augmented WBC images: “Basophil” (447 images), “Eosinophil” (405 images), “Monocyte” (418 images), and “Neutrophil” (194 images).

 

“Image Processing Features Augmented.xlsx” (1328 rows x 53 columns)

 

“Image Processing Features for Cropped.xlsx” (1408 rows x 53 columns)
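
The folder counts listed above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the numbers are copied from this file list, while the dictionary names and the helper function are hypothetical and not part of the dataset.

```python
# Image counts copied from the file list in this README.
RAW_IMAGES = 667
CROPPED = {
    "Neutrophil": 319,
    "Lymphocyte": 905,
    "Monocyte": 82,
    "Eosinophil": 82,
    "Basophil": 20,
}
AUGMENTED = {
    "Basophil": 447,
    "Eosinophil": 405,
    "Monocyte": 418,
    "Neutrophil": 194,
}


def combined_counts(cropped, augmented):
    """Add the augmented images (if any) to the cropped count of each class."""
    return {name: n + augmented.get(name, 0) for name, n in cropped.items()}


total_cropped = sum(CROPPED.values())
total_augmented = sum(AUGMENTED.values())
total_images = RAW_IMAGES + total_cropped + total_augmented
per_class = combined_counts(CROPPED, AUGMENTED)

for name, count in per_class.items():
    print(f"{name:<11}{CROPPED[name]:>5} cropped{count:>6} with augmentation")
print("total images:", total_images)
```

With these counts, the four augmented classes end up with between 467 and 513 images each, while the lymphocyte class, which has no augmented subfolder, stays at 905.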

 


 

------------------------------

SHARING/ACCESS INFORMATION

------------------------------

 

Licenses/restrictions placed on the data: 

You are free to share (copy, distribute, and use the dataset), create (produce works from the dataset), and adapt (modify, transform, and build upon the dataset) as long as you attribute use and works produced from the dataset. For any use or redistribution of the dataset, or works produced from it, you must make clear to others the license of the dataset and keep intact any notices on the original dataset.

 

------------------------------

METHODOLOGICAL INFORMATION

------------------------------

 

Description of methods used for collection/generation of data: 

 

Sample Collection

A smartphone was used to capture images of the peripheral blood smears of juvenile Visayan warty pigs, viewed under a 100x oil immersion lens.

 

Methods for processing the data: 

 

Each image was manually classified by WBC type, and a licensed medical technologist was consulted to verify the classifications. The images were then augmented using Keras preprocessing layers to generate a larger dataset. Image processing was used to extract the features that serve as input to the chosen machine learning algorithm for automated classification.
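
As an illustration of the augmentation step, the following is a minimal sketch using Keras preprocessing layers. This README does not state which layers or parameters were used, so the choice of `RandomFlip`, `RandomRotation`, and `RandomZoom`, their settings, and the helper name `augment` are all assumptions made for the example.

```python
import numpy as np
import tensorflow as tf

# Assumed pipeline: the README names Keras preprocessing layers but not
# the specific layers or parameters that were applied.
augmenter = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomRotation(0.25),  # up to a quarter turn either way
    tf.keras.layers.RandomZoom(0.1),       # zoom in or out by up to 10%
])


def augment(images, copies):
    """Return `copies` randomly transformed versions of every image in the batch."""
    batch = tf.convert_to_tensor(images, dtype=tf.float32)
    # training=True is required; otherwise the random layers pass images through unchanged.
    outputs = [augmenter(batch, training=True) for _ in range(copies)]
    return tf.concat(outputs, axis=0).numpy()


# Random pixels stand in for two cropped 64x64 RGB cell images.
demo_images = np.random.randint(0, 256, size=(2, 64, 64, 3)).astype("float32")
augmented = augment(demo_images, copies=3)
print(augmented.shape)
```

In practice the input batch would be loaded from one of the “Cropped Classified” subfolders, and the number of copies per class would be chosen to offset the class imbalance.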

 

People involved with sample collection, processing, analysis and/or submission: 

Monica Marie Atienza, DVM 

Sonny Mana-ay, RMT

 

------------------------------

DATA SPECIFIC INFORMATION FOR: Image Processing Features for Cropped.xlsx

------------------------------

 

Number of variables: 53

 

Number of cases/rows: 1408

 

Column Headings:

Column 1 (X) - Image filename

Column 2 (Y) - WBC class (1 - Neutrophil, 2 - Lymphocyte, 3 - Monocyte, 4 - Eosinophil, 5 - Basophil)

Columns 3-53 - Features extracted with image processing

 

------------------------------

DATA SPECIFIC INFORMATION FOR: Image Processing Features Augmented.xlsx

------------------------------

 

Number of variables: 53

 

Number of cases/rows: 1328

 

Column Headings:

Column 1 (X) - Image filename

Column 2 (Y) - WBC class (1 - Neutrophil, 2 - Lymphocyte, 3 - Monocyte, 4 - Eosinophil, 5 - Basophil)

Columns 3-53 - Features extracted with image processing
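
Both feature spreadsheets are described with the same 53-column layout, so one helper can split either of them into filenames, class labels, and a feature matrix. The sketch below assumes that the filename is the first column, the WBC class the second, and the 51 features the rest; the function name `split_feature_table` and the synthetic demo table are hypothetical.

```python
import pandas as pd

# Class codes as listed in this README.
CLASS_NAMES = {
    1: "Neutrophil",
    2: "Lymphocyte",
    3: "Monocyte",
    4: "Eosinophil",
    5: "Basophil",
}


def split_feature_table(table):
    """Split a 53-column table into filenames, class names, and features.

    Assumes column order: filename, class code, then 51 feature columns.
    """
    if table.shape[1] != 53:
        raise ValueError(f"expected 53 columns, got {table.shape[1]}")
    filenames = table.iloc[:, 0].astype(str)
    labels = table.iloc[:, 1].astype(int).map(CLASS_NAMES)
    features = table.iloc[:, 2:].astype(float)
    return filenames, labels, features


# Small synthetic table standing in for one of the spreadsheets.
demo = pd.DataFrame(
    [[f"img_{i}.jpg", (i % 5) + 1] + [0.1 * i] * 51 for i in range(5)]
)
filenames, labels, features = split_feature_table(demo)
print(labels.tolist())
print(features.shape)
```

With the real files, the table could be loaded with `pd.read_excel("Image Processing Features for Cropped.xlsx")`, which requires the `openpyxl` package and assumes the first spreadsheet row holds the column headings.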

 
