Double Data Entry


The R program provided here automates comparisons for double data entry. Data may be entered in SPSS, Excel, or comma-separated-values (CSV) format. By convention, the two files are called the "master" and the "slave". The two files need not be in the same file format, but the variable names must match. In addition to detecting cell differences, the program provides column summaries.


Once the program is started, the files are chosen interactively.

You are asked if you want to compare by line number or ID column. If the chosen ID column does not have unique values, comparison will be done by line number. Any IDs found only in one file are noted.
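
The row-pairing rule described above can be sketched in a few lines of R. This is only an illustration, not the code in dataCompare.R: the function name pairRows and its arguments are hypothetical, and the choice to compare only the first min(n) rows in line-number mode is an assumption.

```r
# Hypothetical helper (not taken from dataCompare.R): decide how rows are paired.
# `master` and `slave` are data frames; `idCol` is the chosen ID column name,
# or NULL to compare by line number.
pairRows <- function(master, slave, idCol = NULL) {
  useId <- !is.null(idCol) &&
    idCol %in% names(master) && idCol %in% names(slave) &&
    anyDuplicated(master[[idCol]]) == 0 &&
    anyDuplicated(slave[[idCol]]) == 0

  if (!useId) {
    # Fall back to line numbers: row i of the master is paired with row i of the slave.
    # Assumption: only the rows present in both files are paired.
    n <- min(nrow(master), nrow(slave))
    return(list(method = "line",
                masterRows = seq_len(n), slaveRows = seq_len(n),
                onlyInMaster = character(0), onlyInSlave = character(0)))
  }

  mId <- as.character(master[[idCol]])
  sId <- as.character(slave[[idCol]])
  common <- intersect(mId, sId)

  list(method = "id",
       masterRows = match(common, mId),   # row positions of shared IDs in the master
       slaveRows = match(common, sId),    # row positions of the same IDs in the slave
       onlyInMaster = setdiff(mId, sId),  # IDs found only in the master
       onlyInSlave = setdiff(sId, mId))   # IDs found only in the slave
}
```

Calling pairRows(master, slave, "id") returns the row positions to compare plus the IDs that appear in only one file. If either ID column contains duplicates, the result has method "line".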

After the initial analysis, you are asked whether the full column-by-column analysis should go to a file or the screen. (The file name is created automatically and any existing "compare" file is overwritten.)

Columns are compared by name. Case differences in the name are noted and ignored. Unmatched columns in either file are noted and ignored. Strings that differ only by upper vs. lower case or by leading/trailing spaces are still marked as different, but they are flagged with a "*" in a column called "onlyCaseSpaceDiff" in case you want to consider them equal.
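
As a rough illustration of these rules, the sketch below pairs column names without regard to case and flags string differences that come down to case or surrounding spaces. The function names matchColumns and compareStrings are hypothetical and the actual program may work differently. Only the "onlyCaseSpaceDiff" column name comes from the description above.

```r
# Hypothetical helper: pair columns by name, ignoring case.
# Columns without a partner in the other file are simply left out here.
matchColumns <- function(masterNames, slaveNames) {
  idx <- match(tolower(masterNames), tolower(slaveNames))
  found <- !is.na(idx)
  data.frame(master = masterNames[found],
             slave = slaveNames[idx[found]],
             caseDiffers = masterNames[found] != slaveNames[idx[found]],
             stringsAsFactors = FALSE)
}

# Hypothetical helper: compare two character vectors cell by cell.
compareStrings <- function(a, b) {
  bothPresent <- !is.na(a) & !is.na(b)
  # A cell differs if exactly one value is missing or the two strings are not identical.
  differs <- (is.na(a) != is.na(b)) | (bothPresent & a != b)
  # Normalize by trimming leading/trailing spaces and lower-casing.
  norm <- function(x) tolower(trimws(x))
  onlyCaseSpace <- differs & bothPresent & norm(a) == norm(b)
  data.frame(master = a, slave = b, differs = differs,
             onlyCaseSpaceDiff = ifelse(onlyCaseSpace, "*", ""),
             stringsAsFactors = FALSE)
}
```

For example, compareStrings(c("Smith", "Lee"), c("smith ", "Li")) marks both cells as different but puts a "*" only on the first, since "Smith" and "smith " match once case and spacing are ignored.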

One-time setup

The program requires R to be installed on your computer and has only been tested on Windows.

Download dataCompare.R, start R, change the working directory (e.g., using the R menu) to the location of dataCompare.R, and source it (e.g., using the R menu). The program will start immediately. (If you do not see the file dialog to load the master file, try using alt-tab to cycle through the windows.)
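
If you prefer typing commands at the R console to using the menus, the same steps can be done roughly as follows. The folder path is a placeholder and must be replaced with the location where you saved dataCompare.R. The existence check is an addition for safety, not something the program requires.

```r
# Hypothetical location; replace with the folder where you saved dataCompare.R.
scriptDir <- "C:/Users/me/Documents/dataCompare"
scriptFile <- file.path(scriptDir, "dataCompare.R")

if (file.exists(scriptFile)) {
  setwd(scriptDir)         # same effect as changing the directory from the R menu
  source("dataCompare.R")  # the program starts as soon as the file is sourced
} else {
  message("dataCompare.R not found in ", scriptDir)
}
```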

Optionally, create a shortcut to the program for convenience. Right-click on the R icon and choose "Copy". Go to the location where you want the shortcut (the Desktop is fine). Right-click and choose "Paste shortcut". Right-click on the new icon and choose "Properties". Under "Target", add " --quiet" (without the quotes) to the end. Change "Start in" to the location of the dataCompare.R file.

To test the program, download one or more file pairs from testFiles to any location on your computer.

Technical note: The program automatically ends R. If you want to remain in R when you quit the program, create an empty file called "debug.txt" in the directory containing dataCompare.R.
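
The empty "debug.txt" file can be created from R itself. The helper below is a small sketch; the name createDebugFlag is hypothetical, and you pass it the folder that contains dataCompare.R.

```r
# Hypothetical helper: create an empty "debug.txt" in the given folder
# (the folder that holds dataCompare.R) unless it already exists.
createDebugFlag <- function(scriptDir) {
  flag <- file.path(scriptDir, "debug.txt")
  if (!file.exists(flag)) file.create(flag)
  invisible(flag)  # return the path without printing it
}

# Example (run from the folder containing dataCompare.R):
# createDebugFlag(getwd())
```

To restore the default behavior, in which the program ends R when it quits, delete the file again, for example with file.remove("debug.txt").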

