step(object, scope, scale=0, direction=c("both", "backward", "forward"),
trace=1, keep=NULL, steps=1000, k=2, ...)
object
| an object representing a model of an appropriate class. This is used as the initial model in the stepwise search. |
scope
| defines the range of models examined in the stepwise search. |
scale
|
used in the definition of the AIC statistic for selecting the models,
currently only for lm, aov and
glm models.
|
direction
|
the mode of stepwise search, can be one of "both", "backward",
or "forward", with a default of "both".
If the scope argument is missing,
the default for direction is "backward".
|
trace
|
if positive, information is printed during the running of step.
|
keep
|
a filter function whose input is a fitted model object and the
associated AIC statistic, and whose output is arbitrary.
Typically keep will select a subset of the components of
the object and return them. The default is not to keep anything.
|
steps
| the maximum number of steps to be considered. The default is 1000 (essentially as many as required). It is typically used to stop the process early. |
k
|
the multiple of the number of degrees of freedom used for the penalty.
Only k = 2 gives the genuine AIC; k = log(n) is sometimes
referred to as BIC or SBC.
|
...
|
any additional arguments to extractAIC.
|
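The interaction of scope, direction and k can be sketched as follows. This is only an illustration: the object names null_fit, full_fit and fwd are made up here, and the list form of scope (with lower and upper components) is taken from the current R documentation for step rather than from the argument description above.

```r
# Sketch: forward search from the intercept-only model with a BIC-type penalty.
# The swiss data set ships with R; null_fit, full_fit and fwd are illustrative names.
null_fit <- lm(Fertility ~ 1, data = swiss)
full_fit <- lm(Fertility ~ ., data = swiss)

fwd <- step(null_fit,
            scope = list(lower = formula(null_fit),   # smallest model allowed
                         upper = formula(full_fit)),  # largest model allowed
            direction = "forward",
            trace = 0,                # suppress progress output
            k = log(nrow(swiss)))     # k = log(n): BIC/SBC rather than AIC

formula(fwd)   # the model the search stopped at
fwd$anova      # one row per step taken
```

With direction = "forward" the search can only add terms, so the starting model should be the small one and scope must name the larger candidates.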
step uses add1 and drop1
repeatedly; it will work for any method for which they work, and that
is determined by having a valid method for extractAIC.
When the additive constant can be chosen so that AIC is equal to
Mallows' Cp, this is done and the tables are labelled appropriately.
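As a rough check of the criterion that step compares, the value returned by extractAIC for an lm fit can be reproduced by hand. The sketch below assumes the usual extractAIC.lm definition, n * log(RSS/n) + k * edf when scale = 0 and RSS/scale - n + k * edf when scale > 0 (the Mallows' Cp form); the variable names are illustrative.

```r
# Sketch: reproduce extractAIC() for a linear model by hand.
fit <- lm(Fertility ~ ., data = swiss)
n   <- nrow(swiss)
rss <- sum(residuals(fit)^2)
edf <- n - df.residual(fit)          # equivalent degrees of freedom

# scale = 0 (default): AIC up to an additive constant
aic_by_hand <- n * log(rss / n) + 2 * edf
all.equal(extractAIC(fit)[2], aic_by_hand)

# scale > 0: the criterion becomes Mallows' Cp
s2 <- summary(fit)$sigma^2           # residual variance of this same fit
cp_by_hand <- rss / s2 - n + 2 * edf
all.equal(extractAIC(fit, scale = s2)[2], cp_by_hand)
```

When scale is the residual variance of the model being assessed, Cp reduces to edf, which is a convenient sanity check.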
There is a potential problem in using glm fits with a variable
scale, as in that case the deviance is not simply related to the
maximized log-likelihood. The function extractAIC.glm makes the
appropriate adjustment for a gaussian family, but may need to be
amended for other cases. (The binomial and poisson
families have fixed scale by default and do not correspond
to a particular maximum-likelihood problem for variable scale.)
The stepwise-selected model is returned, with up to two additional
components: an "anova" component corresponding to the
steps taken in the search, and a "keep" component if the
keep= argument was supplied in the call. The
"Resid. Dev" column of the analysis of deviance table refers
to a constant minus twice the maximized log likelihood: it will be a
deviance only in cases where a saturated model is well-defined
(thus excluding lm, aov and survreg fits, for example).
See also: add1, drop1
example(lm)
step(lm.D9)
data(swiss)
summary(lm1 <- lm(Fertility ~ ., data = swiss))
slm1 <- step(lm1)
summary(slm1)
slm1$anova
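A further sketch shows how the keep argument and the "keep" component might be used. The filter function keep_summary and the object names are hypothetical; the function simply receives each fitted model and its AIC, as described for the keep argument, and returns two numbers.

```r
# Sketch: record the AIC and the number of coefficients of every model visited.
# keep_summary is a hypothetical filter; any function of (model, aic) will do.
keep_summary <- function(model, aic) {
  c(AIC = aic, n_coef = length(coef(model)))
}

full_fit <- lm(Fertility ~ ., data = swiss)
res <- step(full_fit, trace = 0, keep = keep_summary)

res$keep    # one column per model visited, one row per value returned by the filter
res$anova   # the steps taken in the search
```

Keeping only a few summary values per model avoids storing every intermediate fit in full.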