The first studies on the effectiveness of keystroke characteristics as personal identifiers were conducted in 1977 and 1980 (for a fuller treatment of work prior to 1990, see Joyce and Gupta). Over the years, many different classifiers, ranging from statistical analysis to neural networks, have been evaluated in an effort to improve the recognition capabilities of keystroke biometrics. It is beyond the scope of this chapter to delve into the details of each approach. In general, each classifier measures the similarity between an input keystroke timing pattern and a reference model of the legitimate user's typing pattern. The model is built from training samples previously provided by each user, and the characteristics it maintains vary depending on the classifier. The time required to generate each model also varies according to the classifier, with neural networks generally taking significantly longer than other approaches.
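
The general scheme described above (build a reference model from a user's training samples, then score an input pattern against it) can be illustrated with a minimal sketch. It is not the method of any particular study: the scaled distance, the `threshold` value, the function names, and the timing data are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def build_reference_model(training_samples):
    """Return a (mean, standard deviation) pair for each timing feature.

    Each training sample is a list of timings in milliseconds, one entry
    per feature (for example, one entry per digraph latency).
    """
    columns = list(zip(*training_samples))
    return [(mean(col), stdev(col)) for col in columns]

def distance(model, sample):
    """Average absolute deviation from the model mean, in standard deviations."""
    total = 0.0
    for (mu, sigma), value in zip(model, sample):
        # Guard against a zero standard deviation in tiny training sets.
        total += abs(value - mu) / max(sigma, 1e-9)
    return total / len(model)

def verify(model, sample, threshold=2.0):
    """Accept the sample if it is close enough to the reference model.

    The threshold is a hypothetical value; in practice it would be tuned
    to balance false acceptances against false rejections.
    """
    return distance(model, sample) <= threshold

# Hypothetical digraph latencies (ms) from four enrollment attempts.
training = [
    [110, 95, 140, 120],
    [115, 90, 150, 125],
    [105, 100, 145, 118],
    [112, 92, 138, 122],
]
model = build_reference_model(training)

genuine = [111, 94, 143, 121]   # close to the enrolled rhythm
impostor = [180, 60, 220, 70]   # same keys, very different rhythm

print(verify(model, genuine))
print(verify(model, impostor))
```

Scaling each deviation by the user's own standard deviation means that a consistent typist is held to a tighter tolerance than an erratic one, which is one simple way to let the model reflect each user's typing variance.
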
| Authors/Year | Input Data | Design | Features | Preprocessing | Classifiers | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| Gaines, Lisowski, Press, and Shapiro<sup>a</sup>; 1980 | Three 300- to 400-character passages | Seven professional secretaries typed two samples each with a delay of four months between samples | Interkey delays | Used only the 87 digraphs that had at least 10 replications per sample and per user; eliminated outliers; took logarithm of values | Two-sample t-test on whether the means of each value were the same, assuming that variances were the same | Identified five core digraphs that discriminated perfectly: in, io, no, on, and ul |
| Umphress and Williams<sup>b</sup>; 1985 | Fixed 1,400-character reference input, 300-character test input | 17 programmers typed samples with a delay of at least one month; errors allowed | Interkey delays | Single low-pass temporal filter to remove outliers | Closeness between test value and corresponding reference value, measured according to a standard deviation threshold and a passing ratio | |
| Leggett and Williams<sup>c</sup>; 1988 | Two samples of fixed 537-character input | 36 individuals typed samples with a delay of at least one month; errors allowed | Interkey delays; mean of delays | Various; resulted in 12 different subsets of feature vectors to analyze | Closeness measure as in Umphress and Williams | Found that means of delays do not further discriminate between users; using all lowercase digraphs yielded best results |
| Joyce and Gupta<sup>d</sup>; 1990 | Username, password, first name, last name | 33 users typed all samples in a single session | Key press delays | None | Minimum distance from reference model, with verification threshold according to each user's typing variance | Found that more experienced users were more difficult for imposters to replicate |
| Bleha, Slavinski, and Hussein<sup>e</sup>; 1990 | Username and fixed 32-character phrase | 32 users typed samples over a period of weeks | Digraph latencies | Combined two samples into one; dimension reduction to reduce size of feature vector | Normalized minimum distance; normalized Bayesian | Applied different fixed thresholds for authentication |
| Leggett, Williams, et al.<sup>f</sup>; 1991 | Same as 1988 | Same as 1988 | Interkey delays | N/A | N/A | Introduced dynamic characterization of users by their typing patterns |
| Bleha, Knopp, and Obaidat<sup>g</sup>; 1992 | Fixed 32-character phrase | Users typed the sample at least once per day for five weeks | Digraph latencies | None | Linear perceptron | |
| Brown and Rogers<sup>h</sup>; 1993 | First and last name | 25 users typed on a single keyboard | Digraph latencies | Removed outliers | Minimum distance; back-propagation neural network; partially connected back-propagation neural network | Found that partially connected back-propagation network performed the best |
| Obaidat<sup>i</sup>; 1995 | Username and password | 15 users typed on a single keyboard over 8 weeks | Durations; digraph latencies | None | Various pattern recognition (k-means, cosine measure, minimum distance, Bayesian, potential function); various neural networks (BP, SOM, ART-2, RBFN, LVQ, RNN, SOP, HSOP) | Potential function and Bayesian performed the best, while cosine measure performed the worst; using only durations was more successful than using only latencies |
| Lin<sup>j</sup>; 1997 | Password | 90 valid users and 61 invalid users logged into system | Durations; key press delays | Derived invalid vectors by extending valid vector with random numbers and multiplying by a factor | Three-layer back-propagation neural network | |
| de Ru and Eloff<sup>k</sup>; 1997 | Password | 30 users typed on single keyboard; used assembler code to produce time intervals in clock cycles | Interkey delays; category indicating typing difficulty of password | Related precise delays to four time interval categories (a value can belong to more than one category through probabilistic assignment) | Fuzzy logic with four categories and five rules | Found typing difficulty to be less discriminating than timing interval |
| Song, Venable, and Perrig<sup>l</sup>; 1997 | Continuous monitoring of keystrokes | Several hours of keystroke data gathered for each user; coarse timing granularity of 10 ms due to X server implementation | Digraph, trigraph, and wordgraph key events for each incoming keystroke | Measured closeness of incoming key events to the respective digraph, trigraph, and wordgraph models for that user | Final probabilistic prediction based on a weighted sum of the incoming keystroke's closeness measurement and the previous keystroke's closeness measurement | Empirical observations on a single user showed promise, but quantitative results were lacking |
| Robinson et al.<sup>m</sup>; 1998 | Username | 140 students routinely logged into campus network; replaced standard login module with one that collected keystrokes | Digraph latencies | Randomly selected 10 usernames for training and 10 usernames for testing; discarded 24% of samples due to typing errors | Minimum distance; nonlinear measure similar to Umphress and Williams; inductive learning based on nonparametric density estimation | Found that inductive learning classifier using both duration and latencies performed the best; using duration time alone was better than latencies |
| Monrose, Reiter, and Wetzel<sup>n</sup>; 1999 | Fixed eight-character password | 20 users logged into server at least five times over six months; Java applet recorded keystrokes | Durations; digraph latencies | Selected distinguishing features based on mean and standard deviation, and thresholds | Binary classification (slow and fast) for each distinguishing feature | Attempted to demonstrate how passwords can be more securely stored on servers, and did not seek to minimize FAR |
| Monrose and Rubin<sup>o</sup>; 2000 | N/A | 63 users typed on local Sun workstations at their convenience over 11 months | N/A | Selected most significant features | Minimum distance; weighted and nonweighted probability; Bayesian | Bayesian classifier performed the best |
| Peacock<sup>p</sup>; 2000 | Username, password, fixed nine-character word | 11 users typed samples from own machines in one session; Java applet recorded keystrokes | Durations; digraph latencies | None | K-nearest neighbor | |
| Cho, Han, Han, and Kim<sup>q</sup>; 2000 | Seven-character password | 25 users typed samples over several days | Durations; digraph latencies | Removed two users; 6%-50% of training data discarded for every user | Minimum distance; autoassociative neural network | Neural network performed the best |
| Haider, Abbas, and Zaidi<sup>r</sup>; 2000 | Seven-character password | Users typed samples into DOS-based application | Interkey delays | None | Fuzzy logic with five categories; three-layer neural network; statistical confidence interval; combinations thereof | A combination of approaches performed the best |
| Changshui and Yanhua<sup>s</sup>; 2000 | Fixed 1,100-character text | 24 users typed sample 18 times | Durations; key press delays | Removed outliers | Autoregressive model with coefficients by the Yule-Walker and Burg methods | Low accuracy relative to previous results |
| Wong et al.<sup>t</sup>; 2001 | User-selected password | 10 users typed on 2 dedicated machines; 100 unauthorized attempts | Interkey delay | Removed outliers | Single-layer perceptron network; minimum distance | Tradeoff between FRR and FAR for the two classifiers used, with the neural network having a high FAR |
| Bergadano, Gunetti, and Picardi<sup>u</sup>; 2002 | Fixed 683-character text | 44 users typed sample over one month, with no two samples from a user collected on the same day; errors allowed | Trigraph durations | None | Disorder between arrays of sorted trigraph durations | The method was also tested on digraphs, 4-graphs, and 6-graphs, but trigraphs performed the best |
| Clarke et al.<sup>v</sup>; 2002 | Four-digit number, fixed phone number, varying phone numbers | 16 users typed on mobile handset | N/A | N/A | Back-propagation neural network | |
| Kacholia and Pandit<sup>w</sup>; 2003 | Username and password | 20 users typed on a single machine | N/A | N/A | Clustering to produce reference models; threshold deviation for classification | |
| Yu and Cho<sup>x</sup>; 2003 | Seven-character password | 25 users typed samples over several days (data from same experiment as Cho et al., 2000) | Durations; digraph latencies | Various, with the best results after performing feature selection using a wrapper based on a genetic algorithm and SVM | Support Vector Machine (SVM) novelty detector models | SVM approach is about 1,000 times more efficient than multilayer perceptrons but has the same degree of accuracy; large training sample needed to attain most accurate results |
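
The Bergadano, Gunetti, and Picardi entry lists "disorder between arrays of sorted trigraph durations" as its classifier. One plausible reading of that measure is sketched below: sort the trigraphs of each sample by duration, then sum how far each trigraph moves between the two orderings. The restriction to shared trigraphs, the normalization by the maximum possible disorder, the function name, and the sample data are assumptions of this sketch rather than details taken from the table.

```python
def disorder(reference, test):
    """Normalized disorder between two typing samples.

    Each sample maps a trigraph to its duration in milliseconds. Only
    trigraphs present in both samples are compared. The result is 0.0
    when both samples rank the trigraphs identically and 1.0 when one
    ranking is the exact reverse of the other.
    """
    shared = sorted(set(reference) & set(test))
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared trigraphs")

    # Order the shared trigraphs from fastest to slowest in each sample.
    ref_order = sorted(shared, key=lambda g: reference[g])
    test_order = sorted(shared, key=lambda g: test[g])
    test_rank = {g: i for i, g in enumerate(test_order)}

    # Sum of the distances between each trigraph's two positions.
    raw = sum(abs(i - test_rank[g]) for i, g in enumerate(ref_order))

    # Largest possible sum for n items: n^2/2 (n even) or (n^2 - 1)/2 (n odd).
    max_raw = (n * n) // 2
    return raw / max_raw

# Hypothetical trigraph durations (ms).
reference = {"the": 250, "ing": 310, "and": 280, "ion": 400}
same_user = {"the": 240, "ing": 330, "and": 300, "ion": 390}
other_user = {"the": 420, "ing": 260, "and": 350, "ion": 230}

print(disorder(reference, same_user))   # identical ordering
print(disorder(reference, other_user))  # fully reversed ordering
```

Because the measure depends only on the relative ordering of durations and not on their absolute values, a user who types uniformly faster or slower on a given day still produces a low disorder against their own reference sample.
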