CRAWDAD dartmouth/bluetooth-hci

Citation Author(s):
Travis Peters
Dartmouth
Submitted by:
CRAWDAD Team
Last updated:
Sun, 03/28/2021 - 20:00
DOI:
10.15783/tjt0-b278

Abstract 

Bluetooth HCI traces collected on smartphones (btsnoop)

This dataset consists of Bluetooth HCI traces captured on a smartphone while the smartphone and a smart device communicated. `00_raw` contains the raw HCI traces (btsnoop files pulled from an Android smartphone); each subfolder contains the traces captured during communication between a specific device and its companion smartphone app. `01_processed` contains CSV-formatted files, which are parsed versions of the raw Bluetooth traces. The first row of each file contains the column labels.
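As a minimal sketch of working with the processed files, the Python snippet below reads one CSV-formatted trace and takes the column labels from its first row. The function name `load_processed_trace` is hypothetical, and the sketch assumes the files are comma-delimited and UTF-8 encoded; neither detail is confirmed by the dataset description.

```python
import csv


def load_processed_trace(path):
    """Return (column_labels, rows) for one parsed trace file.

    Assumption: comma-delimited, UTF-8 text with the labels in row 1.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)   # first row becomes the field names
        rows = list(reader)               # one dict per packet row
        return reader.fieldnames or [], rows
```

Because the labels are read from the file itself, the sketch does not need to hard-code any column names.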

We collected a large number of network traces (more than 300 in total) that captured interactions between 20 distinct devices and 13 different smartphone apps. Here, a **trace** refers to a packet capture consisting of all packets that are observed between the time that a smartphone and peripheral device establish and terminate a connection; an **interaction** refers to a semantically meaningful exchange of packets between the smartphone and peripheral.

This dataset focuses on two broad categories of smart devices that are common in Wireless Personal Area Network (WPAN) scenarios: **smart-health devices** and **smart-home devices**. We selected devices so that our testbed was diverse in terms of device functions; we also included similar devices to evaluate the limitations of our approach to modeling and the ability of our models to differentiate between similar devices. We label devices according to their type, make, and model:

- **type** (which refers to a device's functionality and purpose)

- **make** (which refers to the manufacturer of the device)

- **model** (which refers to an identifier, such as a name or number, that is used to distinguish among devices made by the same manufacturer)

At the time this work was conducted, our testbed consisted of 20 Bluetooth-enabled smart-health and smart-home devices: two weight and body composition scales, each from a different manufacturer; five blood-pressure monitors, from three different manufacturers; three heart-rate monitors (two of which have identical make and model), from two different manufacturers; one pulse oximeter; one TENS unit (a unit is technically made up of two distinct Bluetooth devices); two glucose monitors, from two different manufacturers; two thermometers (same make, different models); two smart locks, each from a different manufacturer; and two identical smart environment sensors (same make and model).

Bluetooth networks are ad hoc networks. The dataset was collected by connecting various Bluetooth devices to a smartphone (i.e., a smartphone ad hoc network) and capturing the communication between the devices and their companion apps running atop the smartphone.

The smartphone apps used in data collection were installed on a Nexus 5 smartphone running Android 6.0.1 (Marshmallow), API level 23, kernel version 3.4.0. Along with executing the apps, the smartphone also served as our primary device for data collection. To capture HCI traces, we enabled the **Bluetooth HCI snoop log** developer option. (This feature is a common developer option introduced in Android 4.4. Notably, using it does not require rooting the phone.)

The HCI snoop log captures all Bluetooth HCI packets to a binary-encoded file, which it writes to an SD card; the log format resembles the Snoop Version 2 Packet Capture File Format described in RFC 1761 (btsnoop). 
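For readers who want to inspect the raw files directly, the following Python sketch walks the records of a btsnoop capture. It follows the publicly documented btsnoop layout (a 16-byte file header, then a 24-byte big-endian header before each packet) and is not the parser used to produce this dataset. The function name `read_btsnoop` and the dictionary keys are hypothetical.

```python
import struct

BTSNOOP_MAGIC = b"btsnoop\x00"
FILE_HEADER = struct.Struct(">8sII")      # magic, version, datalink type
RECORD_HEADER = struct.Struct(">IIIIq")   # orig len, incl len, flags, drops, time (us)


def read_btsnoop(data):
    """Yield one dict per packet record found in a btsnoop byte string."""
    magic, version, datalink = FILE_HEADER.unpack_from(data, 0)
    if magic != BTSNOOP_MAGIC:
        raise ValueError("not a btsnoop file")
    offset = FILE_HEADER.size
    while offset + RECORD_HEADER.size <= len(data):
        orig_len, incl_len, flags, drops, ts = RECORD_HEADER.unpack_from(data, offset)
        offset += RECORD_HEADER.size
        payload = data[offset:offset + incl_len]
        if len(payload) < incl_len:
            break                          # truncated final record; stop quietly
        offset += incl_len
        yield {
            "received": bool(flags & 0x1),          # bit 0: 0 = sent, 1 = received
            "command_or_event": bool(flags & 0x2),  # bit 1: 0 = data, 1 = command/event
            "timestamp_us": ts,                     # raw btsnoop timestamp, unconverted
            "original_length": orig_len,
            "payload": payload,
        }
```

A typical use would be `records = list(read_btsnoop(open(path, "rb").read()))`, where `path` points at one of the raw trace files.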

Each trace captured interactions between one app-device pair. Specifically, each trace captured all communications observed at the HCI layer (and therefore all protocol layers above the HCI layer). For each app-device pair we collected at least 10 traces, each of which included 3-10 minutes of network activity.

We gathered HCI traces by manually using the apps and devices in our testbed to emulate a wide variety of normal app-device interactions. The actions we performed consisted of navigating the official smartphone app and exercising features that trigger network communication with the corresponding device, as well as acting upon the devices in ways that trigger communication with their corresponding smartphone apps.

After collecting each HCI trace (a btsnoop file), we moved the raw file from the smartphone to a local VM using the Android Debug Bridge command-line tool (adb). We reset the HCI snoop file on the smartphone between traces so that each HCI snoop file contained only packets belonging to interactions between a particular app-device combination; we refer to this as an app-device session.

An example of a filename for a trace: `raw.init.bpmonitor.choice.ua.00.2019.06.21.16.33.59.btsnoop.log`

Fields are separated by a '.':

- Field 1: 'raw' = raw data (raw btsnoop files collected from the smartphone)

- Field 2: trace context

    - 'init' = trace from the pairing ('init'ialization) procedure

    - 'using' = trace from typical usage (i.e., connect, use the app/device, disconnect)

- Fields 3-5: device label specified as `type.make.model-instance`. Note: the `-instance` suffix is an identifier used to distinguish between multiple instances of otherwise similar devices.

- Field 6: repetition counter (trial number) - i.e., the data collection procedure was repeated multiple times

- Fields 7-12: timestamp (`year.month.day.hour.minute.second`) 

- Fields 13-14: standard file extension for btsnoop files (logs) captured on the smartphone. 

An example of a filename for a parsed version of a raw trace: `out.raw.init.bpmonitor.choice.ua.00.2019.06.21.16.33.59.btsnoop.txt`

The fields are mostly similar except that:

- **New Prefix:** the parsed output file has the word 'out' prepended to it

- **Modified Suffix:** the suffix has been changed to `.txt` to indicate that it is stored in a human-readable format.
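The naming scheme above can be decoded mechanically. The Python sketch below splits a raw or parsed filename into its labeled fields. The function name `parse_trace_filename` and the dictionary keys are hypothetical, and the sketch assumes that no field contains an extra '.' and that a model name contains no '-' other than the one introducing the instance identifier.

```python
from datetime import datetime


def parse_trace_filename(name):
    """Split a trace filename into its documented fields."""
    parts = name.split(".")
    parsed = parts[0] == "out"             # parsed files carry an 'out' prefix
    if parsed:
        parts = parts[1:]
    if len(parts) != 14 or parts[0] != "raw":
        raise ValueError(f"unexpected filename layout: {name!r}")
    # Assumption: '-' only appears in field 5 to introduce the instance id.
    model, _, instance = parts[4].partition("-")
    return {
        "parsed": parsed,
        "context": parts[1],               # 'init' or 'using'
        "type": parts[2],
        "make": parts[3],
        "model": model,
        "instance": instance or None,
        "trial": int(parts[5]),            # repetition counter
        "timestamp": datetime(*map(int, parts[6:12])),
        "extension": ".".join(parts[12:]),
    }
```

For the raw example name, this yields context `init`, type `bpmonitor`, make `choice`, model `ua`, and trial `0`.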

keywords: Bluetooth, Packet Trace

measurement purposes: Computer Malware (Worms) Investigation, Human Behavior Modeling, Network Diagnosis, Network Security

Data sanitization method: None needed.

Start date: 2019-06-21

End date: 2019-07-01

Methodology limitations: It was not our intention to discover and exercise every functional feature (and thus every BLE service or characteristic) of a particular app/device. Rather, it was our intention to observe typical features and interactions between devices and their official app, which could be used to construct normality models suitable for performing verification in future app-device interactions.

Trace note: Raw Data vs. Parsed Data. This dataset includes both the raw traces and parsed versions of the files to make analysis easier (e.g., if you are not familiar with parsing Bluetooth/BLE packets). If you have your own Bluetooth protocol parser, the raw files should be sufficient. Otherwise, you may prefer to work with the processed files.

To parse HCI traces, we extended two open-source projects: https://github.com/traviswpeters/btsnoop and https://github.com/traviswpeters/bluepy. Our extensions broaden the parsing of the HCI protocol and of the protocols that HCI encapsulates; namely, we extract features for each packet within the HCI traces, such as packet types, lengths, endpoint identifiers, protocol semantics, and segmented headers and payloads belonging to higher-level Bluetooth protocols (e.g., the Attribute Protocol (ATT), the Security Management Protocol (SMP), the Signaling Protocol); and, because each trace captured a single app-device session, we labeled each packet according to the device it was sent to or received from. These features and labels were written to CSV-formatted files for subsequent (offline) analysis.
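To illustrate the kind of per-packet features described above, here is a small Python sketch that derives the HCI packet type, length, connection handle, and higher-level protocol from one packet payload. It is not the extended btsnoop/bluepy code used for this dataset. It assumes each payload starts with a one-byte UART (H4) packet indicator, and the function name `describe_packet` and the feature keys are hypothetical.

```python
import struct

# HCI UART (H4) packet indicators and fixed L2CAP channel IDs.
H4_TYPES = {0x01: "command", 0x02: "acl", 0x03: "sco", 0x04: "event"}
L2CAP_CHANNELS = {0x0004: "ATT", 0x0005: "LE signaling", 0x0006: "SMP"}


def describe_packet(payload):
    """Return a small feature dict for one HCI packet (H4 framing assumed)."""
    if not payload:
        raise ValueError("empty packet")
    features = {
        "hci_type": H4_TYPES.get(payload[0], "unknown"),
        "length": len(payload),
    }
    # ACL data: 4-byte HCI ACL header, then a 4-byte L2CAP header (little-endian).
    if payload[0] == 0x02 and len(payload) >= 9:
        handle_flags, _acl_len, _l2cap_len, cid = struct.unpack_from("<HHHH", payload, 1)
        features["handle"] = handle_flags & 0x0FFF      # connection handle
        boundary = (handle_flags >> 12) & 0x3           # packet boundary flag
        features["first_fragment"] = boundary != 0x1    # 0b01 marks a continuation
        if features["first_fragment"]:
            # Only the first fragment carries the L2CAP header.
            features["protocol"] = L2CAP_CHANNELS.get(cid, "other")
    return features
```

Continuation fragments are reported without a protocol label because they carry no L2CAP header; a fuller parser would reassemble them first.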

Instructions: 

The files in this directory are a CRAWDAD dataset hosted by IEEE DataPort. 

About CRAWDAD: the Community Resource for Archiving Wireless Data At Dartmouth is a data resource for the research community interested in wireless networks and mobile computing. 

CRAWDAD was founded at Dartmouth College in 2004, led by Tristan Henderson, David Kotz, and Chris McDonald. CRAWDAD datasets are hosted by IEEE DataPort as of November 2022. 

Note: Please use the Data in an ethical and responsible way with the aim of doing no harm to any person or entity for the benefit of society at large. Please respect the privacy of any human subjects whose wireless-network activity is captured by the Data and comply with all applicable laws, including without limitation such applicable laws pertaining to the protection of personal information, security of data, and data breaches. Please do not apply, adapt or develop algorithms for the extraction of the true identity of users and other information of a personal nature, which might constitute personally identifiable information or protected health information under any such applicable laws. Do not publish or otherwise disclose to any other person or entity any information that constitutes personally identifiable information or protected health information under any such applicable laws derived from the Data through manual or automated techniques. 

Please acknowledge the source of the Data in any publications or presentations reporting use of this Data. 

Citation:

Travis Peters, dartmouth/bluetooth-hci, https://doi.org/10.15783/tjt0-b278 , Date: 20210329

Documentation

- dartmouth-bluetooth-hci-readme.txt (1.57 KB)

These datasets are part of the Community Resource for Archiving Wireless Data (CRAWDAD). CRAWDAD began in 2004 at Dartmouth College as a place to share wireless network data with the research community. Its purpose was to enable access to data from real networks and real mobile users at a time when collecting such data was challenging and expensive. The archive has continued to grow since its inception and, starting in summer 2022, is housed on IEEE DataPort.