Since I teach stepwise in my seminar, I would like to demonstrate it in R (not to mention some of my students are learning R while doing their homework, which includes a stepwise problem). The catch is that R seems to lack any library routines to do stepwise as it is normally taught. There is a function (leaps::regsubsets) that does both best subsets regression and a form of stepwise regression, but it uses AIC or BIC to select models. That's fine for best subsets, but stepwise (at least as I've seen it in every book or paper where I've encountered it) uses nested model F tests to make decisions. Again, if you search around, you can find some (fairly old) posts on help forums by people searching (fruitlessly, it appears) for a stepwise implementation in R.
So I finally forced myself to write a stepwise function for R. My knowledge of R coding is very limited, so I made no attempt to tack on many bells and whistles, let alone make it a library package. Unlike most R routines, it does not create an object; it just merrily writes to the standard output stream. There are a number of limitations (listed in the comments), and I've only tested it on a few data sets. All that said, I'm going to post it below, in case someone else is desperate to do conventional stepwise regression in R.
=== code follows ===
#
# This is an R function to perform stepwise regression based on a "nested model" F test for inclusion/exclusion
# of a predictor. To keep it simple, I made no provision for forcing certain variables to be included in
# all models, did not allow for specification of a data frame, and skipped some consistency checks (such as whether
# the initial model is a subset of the full model).
#
# One other note: since the code uses R's drop1 and add1 functions, it respects hierarchy in models. That is,
# regardless of p values, it will not attempt to drop a term while retaining a higher order interaction
# involving that term, nor will it add an interaction term if the lower order components are not all present.
# (You can of course defeat this by putting interactions into new variables and feeding it what looks like
# a first-order model.)
#
# Consider this to be "beta" code (and feel free to improve it). I've done very limited testing on it.
#
# Author: Paul A. Rubin (rubin@msu.edu)
#
stepwise <- function(full.model, initial.model, alpha.to.enter, alpha.to.leave) {
  # full.model is the model containing all possible terms
  # initial.model is the first model to consider
  # alpha.to.enter is the significance level above which a variable may enter the model
  # alpha.to.leave is the significance level below which a variable may be deleted from the model
  # (Useful things for someone to add: specification of a data frame; a list of variables that must be included)
  full <- lm(full.model);  # fit the full model
  msef <- (summary(full)$sigma)^2;  # MSE of full model
  n <- length(full$residuals);  # sample size
  allvars <- attr(full$terms, "predvars");  # this gets a list of all predictor variables
  current <- lm(initial.model);  # this is the current model
  while (TRUE) {  # process each model until we break out of the loop
    temp <- summary(current);  # summary output for the current model
    rnames <- rownames(temp$coefficients);  # list of terms in the current model
    print(temp$coefficients);  # write the model description
    p <- dim(temp$coefficients)[1];  # current model's size
    mse <- (temp$sigma)^2;  # MSE for current model
    cp <- (n - p)*mse/msef - (n - 2*p);  # Mallow's cp
    fit <- sprintf("\nS = %f, R-sq = %f, R-sq(adj) = %f, C-p = %f",
                   temp$sigma, temp$r.squared, temp$adj.r.squared, cp);
    write(fit, file = "");  # show the fit
    write("=====", file = "");  # print a separator
    if (p > 1) {  # don't try to drop a term if only one is left
      d <- drop1(current, test = "F");  # looks for significance of terms based on F tests
      pmax <- max(d[-1, 6]);  # maximum p-value of any term (have to skip the intercept to avoid an NA value)
      if (pmax > alpha.to.leave) {
        # we have a candidate for deletion
        var <- rownames(d)[d[, 6] == pmax];  # name of variable to delete
        if (length(var) > 1) {
          # if an intercept is present, it will be the first name in the list
          # there also could be ties for worst p-value
          # taking the second entry if there is more than one is a safe solution to both issues
          var <- var[2];
        }
        write(paste("--- Dropping", var, "\n"), file = "");  # print out the variable to be dropped
        f <- formula(current);  # current formula
        f <- as.formula(paste(f[2], "~", paste(f[3], var, sep = " - ")));  # modify the formula to drop the chosen variable (by subtracting it)
        current <- lm(f);  # fit the modified model
        next;  # return to the top of the loop
      }
    }
    # if we get here, we failed to drop a term; try adding one
    # note: add1 throws an error if nothing can be added (current == full), which we trap with tryCatch
    a <- tryCatch(add1(current, scope = full, test = "F"), error = function(e) NULL);  # looks for significance of possible additions based on F tests
    if (is.null(a)) {
      break;  # there are no unused variables (or something went splat), so we bail out
    }
    pmin <- min(a[-1, 6]);  # minimum p-value of any term (skipping the intercept again)
    if (pmin < alpha.to.enter) {
      # we have a candidate for addition to the model
      var <- rownames(a)[a[, 6] == pmin];  # name of variable to add
      if (length(var) > 1) {
        # same issue with ties, intercept as above
        var <- var[2];
      }
      write(paste("+++ Adding", var, "\n"), file = "");  # print the variable being added
      f <- formula(current);  # current formula
      f <- as.formula(paste(f[2], "~", paste(f[3], var, sep = " + ")));  # modify the formula to add the chosen variable
      current <- lm(f);  # fit the modified model
      next;  # return to the top of the loop
    }
    # if we get here, we failed to make any changes to the model; time to punt
    break;
  }
}
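The function leans entirely on R's built-in drop1 and add1 for the nested-model F tests, so it may help to see what those tables look like on their own. Here is a minimal sketch (my example, not from the function above; the mtcars model and scope are illustrative assumptions) showing that the sixth column of each table, Pr(>F), is exactly what gets compared against alpha.to.leave and alpha.to.enter:

```r
# Illustrative example (not part of the original post) of the drop1/add1
# tables the stepwise function consults, using R's built-in mtcars data.
fit <- lm(mpg ~ cyl + wt, data = mtcars)             # a small working model
d <- drop1(fit, test = "F")                          # F test for deleting each current term
a <- add1(fit, scope = ~ cyl + wt + hp, test = "F")  # F test for adding hp to the model
print(d)  # column 6, Pr(>F), is what gets compared to alpha.to.leave
print(a)  # column 6, Pr(>F), is what gets compared to alpha.to.enter
```

With the function sourced, a call might look like stepwise(mpg ~ cyl + disp + hp + wt, mpg ~ 1, 0.05, 0.10). Since there is no data frame argument, the variables have to be visible in the calling environment (e.g., via attach); the 0.05/0.10 thresholds here are illustrative, not a recommendation.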
Update (08/21/17): I've posted an updated/improved (?) version of the code, including a demonstration, in an R notebook.