The objective of automated scoring algorithms for polygraph data is to create reliable and statistically valid classification schemes that minimize both false positive and false negative rates. With increasing computing power and well-developed statistical methods for modeling and classification, we often launch analyses without much consideration for the quality of the datasets and the underlying assumptions of the data collection. In this paper we assess the validity of logistic regression when faced with a highly variable but small dataset.
We evaluate 149 real-life specific-incident polygraph cases and review current automated scoring algorithms. The data exhibit enormous variability in the subject of investigation, format, structure, and administration, making them hard to standardize within an individual and across individuals. This makes it difficult to develop generalizable statistical procedures. We outline the steps and detailed decisions required to convert continuous polygraph readings into a set of features. With a relatively simple approach we obtain accuracy rates comparable to those currently reported for other algorithms and for manual scoring. The complexity that underlies the assessment and classification of an examinee's deceptiveness is evident in the number of models that account for different predictors yet give similar results, typically "overfitting" as the number of features increases. While computerized systems have the potential to reduce examiner variability and bias, the evidence that they have achieved this potential is meager at best.