FAQ
Why include insignificant terms in a model?
Original question from a Principal Scientist:
“Why does Stat-Ease software include factor B in my model when only A and AB are significant at p<0.05?”
Answer:
Terms such as factor B, though insignificant, must be included to maintain hierarchy and thus avoid creating a predictive model that does not convert properly from the coded to the actual equation. It’s a math thing.* Without the insignificant terms present as placeholders, some cross products from the conversion fall away, creating what’s called an “ill-formulated polynomial model” that is not invariant to coding. “Consequently, measures of goodness of fit [such as R-squared!] of a not-well-formulated model may be affected by coding transformations.”**
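To see why the conversion matters, here is a minimal Python sketch, not the Stat-Ease implementation. It assumes a made-up replicated two-level design, invented response values, and hypothetical actual ranges (A = 150 ± 10, B = 30 ± 5). With those ranges the coded cross product expands to ab = (AB - 30A - 150B + 4500)/50, so a term in B appears in actual units whether or not the coded model contains one. The helper names `r_squared` and `design` are illustrative only.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an ordinary least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def design(x1, x2, hierarchical):
    """Model matrix: intercept, x1, (x2 only if hierarchical), x1*x2."""
    cols = [np.ones_like(x1), x1]
    if hierarchical:
        cols.append(x2)
    cols.append(x1 * x2)
    return np.column_stack(cols)

# Made-up replicated 2x2 factorial in coded units (-1/+1); data are invented.
a = np.array([-1, 1, -1, 1, -1, 1, -1, 1], dtype=float)
b = np.array([-1, -1, 1, 1, -1, -1, 1, 1], dtype=float)
y = np.array([48.7, 50.4, 41.6, 59.3, 48.3, 50.6, 41.4, 59.7])

# Assumed actual units: A = 150 +/- 10, B = 30 +/- 5.
A = 150.0 + 10.0 * a
B = 30.0 + 5.0 * b

# Fit each model form twice: once in coded units, once in actual units.
r2 = {}
for label, hier in [("hierarchical", True), ("non-hierarchical", False)]:
    r2[label] = (r_squared(design(a, b, hier), y),
                 r_squared(design(A, B, hier), y))
    print(f"{label}: coded R2 = {r2[label][0]:.4f}, actual R2 = {r2[label][1]:.4f}")
```

In this sketch the hierarchical model (A, B, AB) gives the same R-squared in coded and actual units, while the model that drops B does not, which is the lack of invariance described above.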
If that is not scary enough, consider the implications of saying that a factor like B is not significant when it really creates an impact in combination with factor A. That can be very misleading. From my perspective as a process development and certified quality engineer who has many a time defended results from designed experiments, this may be the most compelling reason to include model terms needed to maintain hierarchy.
While being trained by George Box, I heard him say that “if you are going to do something, you may as well do it right.” So, my advice when the software warns you about a non-hierarchical model and asks whether to correct it is to just say “Yes.”
*See our program-Help advanced topic detailing how to convert a coded response surface model to actual.
**“A Property of Well-Formulated Polynomial Regression Models,” Julio L. Peixoto, The American Statistician, Vol. 44, No. 1 (Feb, 1990), pp. 26-30.
PS: My consulting colleague Joe provides this helpful detail: “To maintain hierarchy, our software starts by testing highest-order effects and then proceeds to the lower-order ones. That is, it goes from bottom to top on the ANOVA table. If a higher-order term is significant, the software (unless you say ‘No’ to hierarchy, which I do not recommend) retains any lower-order terms that compose it.”
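The retention rule Joe describes can be sketched in a few lines of Python. This is an illustration only, not the software's actual code: the functions `parents` and `enforce_hierarchy` are hypothetical, and terms are assumed to be written as strings of factor letters (for example "AB" for the A-by-B interaction, "AA" for A squared).

```python
from itertools import combinations

def parents(term):
    """Lower-order terms that compose `term`, e.g. 'AB' -> {'A', 'B'}."""
    letters = sorted(term)
    return {
        "".join(combo)
        for order in range(1, len(letters))
        for combo in combinations(letters, order)
    }

def enforce_hierarchy(significant):
    """Add every lower-order parent of each significant term to the model."""
    model = set(significant)
    for term in significant:
        model |= parents(term)
    # Sort by order first (main effects, then two-factor terms, and so on).
    return sorted(model, key=lambda t: (len(t), t))

# The case from the question: only A and AB are significant.
print(enforce_hierarchy(["A", "AB"]))  # ['A', 'B', 'AB']
```

Factor B is added back even though it was not in the significant list, because AB cannot stay in a hierarchical model without it.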
(Learn more about predictive modeling by enrolling in the next Modern DOE for Process Optimization public workshop.)