Data Collection

This section describes the purpose, design, and implementation of the data collection phase of our study.

Purpose


The purpose of the data collection phase was to gather keystroke data from a random sample of participants. The resulting dataset was later used to test the validity of our models in verifying or rejecting users. Participants typed the same ten-character phrase used in a study completed at Carnegie Mellon. The data collection software was built with a combination of HTML, CSS, and JavaScript.

Design


Requirements for Data Collection

There were four main goals considered in designing the software for collecting and storing keystroke data.
  1. For each participant, collect the maximum amount of keystroke data without exhausting the user.
  2. Collect data that reflects a real-life scenario of password set-up and entry.
  3. Collect data that is comparable to the data for the password (.tie5Roanl) used in the study at Carnegie Mellon.
  4. Avoid bias from users artificially altering their keystroke data.
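
To make the kind of data these goals refer to concrete, the following is a minimal sketch of how keystroke timings could be captured and summarized in JavaScript. It is an illustration only, not the collection code used in the study. The function name extractTimings, the record shape { code, type, time }, and the choice of features (key hold time and keydown-to-keydown latency) are all assumptions introduced here.

```javascript
// Hypothetical helper (not the study's code): turns raw key events into
// per-keystroke timing features.
// Each event is assumed to be { code, type: "keydown" | "keyup", time },
// with time in milliseconds.
function extractTimings(events) {
  const held = new Map(); // code -> stroke record for keys currently pressed
  const strokes = [];     // one record per physical key press, in press order

  for (const ev of events) {
    if (ev.type === "keydown") {
      if (held.has(ev.code)) continue; // skip auto-repeat while a key is held
      const stroke = { code: ev.code, down: ev.time, up: null };
      held.set(ev.code, stroke);
      strokes.push(stroke);
    } else if (ev.type === "keyup" && held.has(ev.code)) {
      held.get(ev.code).up = ev.time;
      held.delete(ev.code);
    }
  }

  return strokes.map((s, i) => ({
    code: s.code,
    // Time the key was held down; null if no keyup was seen.
    hold: s.up === null ? null : s.up - s.down,
    // Time from this keydown to the next keydown; null for the last key.
    downDown: i + 1 < strokes.length ? strokes[i + 1].down - s.down : null,
  }));
}

// Browser-only wiring (assumed setup): record events as the user types.
if (typeof document !== "undefined") {
  const raw = [];
  const record = (ev) =>
    raw.push({ code: ev.code, type: ev.type, time: performance.now() });
  document.addEventListener("keydown", record);
  document.addEventListener("keyup", record);
}
```

The sketch matches key presses to releases by the physical key code rather than the produced character, because a password such as .tie5Roanl requires Shift and the produced character can differ between the press and the release of the same key. Modifier keys such as Shift appear as their own entries in the output.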

Results



Figure 1: Graphic of data collection process.

Each user entered the ten-character password thirty times. None of the 51 participants in the study had previous access to the earlier implementations of the data collection phase. Each respondent's input was divided into three groups, each of which served a different role in the data interpretation and analysis for our models.
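
As an illustration of the grouping step, the sketch below splits one participant's thirty entries into three consecutive groups. The helper name splitIntoGroups and the group sizes are assumptions. The equal split of ten entries per group in the usage example is only a placeholder.

```javascript
// Hypothetical helper: splits an ordered list of entries into consecutive
// groups of the given sizes. The sizes must add up to the number of entries.
function splitIntoGroups(entries, sizes) {
  const total = sizes.reduce((sum, n) => sum + n, 0);
  if (total !== entries.length) {
    throw new RangeError(
      `group sizes sum to ${total}, but there are ${entries.length} entries`
    );
  }
  const groups = [];
  let start = 0;
  for (const size of sizes) {
    groups.push(entries.slice(start, start + size));
    start += size;
  }
  return groups;
}

// Usage: thirty placeholder entries, split 10/10/10 (assumed sizes).
const entries = Array.from({ length: 30 }, (_, i) => ({ repetition: i + 1 }));
const [groupA, groupB, groupC] = splitIntoGroups(entries, [10, 10, 10]);
```

Splitting consecutively keeps the entries in the order they were typed, which matters if typing rhythm changes as a participant becomes familiar with the password.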