You know that "autocomplete" feature on your smart phone or tablet that occasionally (or, in my case, frequently) turns into an "autocorrupt" feature? I just ran into it in an R script.
I wrote a web-based application for a colleague that lets students upload data, run a regression, ponder various outputs and, if they wish, export (download) selected results. In the server script, I created an empty list named "export". As users generated various outputs, they would be added to the list for possible download (to avoid having to regenerate them at download time). For instance, if the user generated a histogram of the residuals, then the plot would be stored in export$hist. Similarly, if the user looked at the adjusted R-squared, it would be parked in export$adjr2.
All was well until, in beta testing, I bumped into a bug involving the p-value for the F test of overall fit (you know, the test where failure to reject the null hypothesis would signal that your model contended for the worst regression model in the history of statistics). Rather than getting a single number between 0 and 1, in one test it printed out as a vector of numbers well outside that range. Huh???
I beat my head against an assortment of flat surfaces before I found the bug. The following chunk of demonstration code sums it up.
export <- list() # create an empty export list
print(export$f) # result: NULL
export$fitted <- c(2, 3, 1, 7) # (simulated) fitted values
print(export$f) # result: [1] 2 3 1 7
The intent was to store the p-value of the test of overall fit in export$f, and the fitted values in export$fitted. If the user never checked the F test, I wanted export$f to be NULL, which would signal the export subroutine to skip it. Instead, when the export subroutine asked for export$f (which did not exist), R autocompleted it to export$fitted (which did exist), and out came the mystery vector. There are multiple ways to avoid the bug, the simplest being to rename export$f to something like export$fprob, where "fprob" is not a leading substring (prefix) of the name of any other entry of export.
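Another of the "multiple ways" is to avoid the $ operator when probing for an entry that may not exist. The sketch below reuses the toy export list from the demonstration code (the variable names via_dollar and via_brackets are made up for illustration). It relies on the fact that double-bracket indexing with a character name matches exactly by default, whereas $ on a list accepts a unique partial match.

```r
# Toy export list, as in the demonstration above
export <- list()
export$fitted <- c(2, 3, 1, 7)  # (simulated) fitted values

# `$` on a list accepts a unique partial (prefix) match,
# so "f" is silently completed to "fitted"
via_dollar <- export$f

# `[[` with a character name matches exactly by default (exact = TRUE),
# so a missing entry comes back as NULL
via_brackets <- export[["f"]]

print(via_dollar)             # the fitted values, not NULL
print(is.null(via_brackets))  # TRUE: safe to use as a "skip this" signal
```

With this approach the entry could keep the name "f", although renaming it is still the change that protects any code that continues to use $.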
I do my R coding inside RStudio, which provides autocompletion suggestions. Somewhere along the line, I think I came across the fact that the R interpreter autocompletes some things. It never occurred to me that this would happen when a script ran. When running commands interactively, I suppose the autocomplete feature saves some keystrokes. That's not generally an issue when running scripts, so I don't know why autocomplete is not turned off when "sourcing" a script.
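For what it's worth, the completion here is part of how the $ operator works on lists, so it behaves the same whether a command is typed at the console or run from a script. R cannot be told to stop doing it, as far as I know, but it can be told to complain. The sketch below assumes you are willing to set a global option near the top of the script. The option warnPartialMatchDollar is a documented R option, while the names old_opts and caught are just illustrative.

```r
# Ask R to warn whenever `$` resolves a name by partial matching.
# options() returns the previous settings so they can be restored later.
old_opts <- options(warnPartialMatchDollar = TRUE)

export <- list(fitted = c(2, 3, 1, 7))  # no entry named "f"

# Capture the warning text instead of letting it scroll past
caught <- tryCatch(
  {
    export$f          # partial match to export$fitted triggers a warning
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)
print(caught)

# Restore the original option values
options(old_opts)
```

A warning will not stop the wrong value from being returned, so this is a way to discover the problem during testing, not a substitute for exact matching or a safer name.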
At any rate, let the betting commence on how long it will take me to forget this (and trip over it again).