Simple intercepts, simple slopes, and regions of significance in LCA 3-way interactions
Kristopher J. Preacher (Vanderbilt University)
Patrick J. Curran (University of North Carolina at Chapel Hill)
Daniel J. Bauer (University of North Carolina at Chapel Hill)
The code generated by this utility can be pasted directly into an R console window. R (a free, open-source statistical computing environment) may be obtained here: http://cran.r-project.org/.
This web page calculates simple intercepts, simple slopes, and the region of significance to facilitate the testing and probing of three-way interactions estimated in latent curve analysis (LCA) models. In LCA, repeated measures of a variable y are modeled as functions of latent factors representing aspects of change or latent curves, typically an intercept factor and one or more slope factors. We use standard structural equation modeling (SEM) notation to define equations, and we assume that the user is knowledgeable both in general SEM and in the testing, probing, and interpretation of interactions in multiple linear regression (e.g., Aiken & West, 1991). The following material is intended to facilitate the calculations for the methods presented in Curran, Bauer, and Willoughby (2004), and we recommend consulting this paper for further details, as well as our companion web page on 2-way interactions in LCA.
Let $y_{it}$ represent repeated measures of variable y for i = 1, 2, ..., N individuals at t = 1, 2, ..., T occasions (all of these techniques generalize to times varying over i, but for simplicity we assume that all individuals are measured at the same occasions; see Curran et al., 2004, p. 222 for details). In matrix notation, the general form of an LCA measurement model is
$$\mathbf{y}_i = \boldsymbol{\Lambda}\boldsymbol{\eta}_i + \boldsymbol{\varepsilon}_i \tag{1}$$
where $\mathbf{y}_i$ is a T x 1 vector of repeated measures for individual i, $\boldsymbol{\Lambda}$ is a T x k matrix of factor loadings (where k is the number of latent curve factors, here 2 to define a linear trajectory), $\boldsymbol{\eta}_i$ is a k x 1 vector of latent curve factors, and $\boldsymbol{\varepsilon}_i$ is a T x 1 vector of time-specific residuals.
In most applications of LCA, the elements of $\boldsymbol{\Lambda}$ are constrained to reflect linear growth, e.g.:
$$\boldsymbol{\Lambda} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ \vdots & \vdots \\ 1 & T-1 \end{bmatrix} \tag{2}$$
The first column of $\boldsymbol{\Lambda}$ contains loadings on the intercept factor. In LCA models, time is not explicitly included as a variable, but rather is incorporated in the model as elements of the second column of $\boldsymbol{\Lambda}$. The variance of the slope factor, in turn, represents individual differences in the slope of the latent trajectory. For more detail see Curran et al. (2004).
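As a minimal sketch of the loading structure just described, the following Python builds the T x 2 loading matrix for a linear trajectory. It assumes time is coded 0, 1, ..., T - 1, which is one common coding rather than a requirement, and the function name `linear_loadings` is hypothetical.

```python
def linear_loadings(T):
    """Return a T x 2 loading matrix as a list of rows.

    Column 1 is fixed to 1 (intercept factor); column 2 holds the
    time codes 0, 1, ..., T - 1 (slope factor). This coding of time
    is an assumption made for illustration.
    """
    return [[1, t] for t in range(T)]


Lambda = linear_loadings(4)
for row in Lambda:
    print(row)
```

Recoding the second column (for example, centering it) changes where the intercept factor is located in time but not the linear form of the trajectory.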
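An expression for the latent curve factors is:
$$\boldsymbol{\eta}_i = \boldsymbol{\mu}_\eta + \boldsymbol{\zeta}_i \tag{3}$$
where $\boldsymbol{\mu}_\eta$ is a k x 1 vector of latent curve factor means and $\boldsymbol{\zeta}_i$ is a k x 1 vector of residuals. Scalar expressions for the elements of $\boldsymbol{\eta}_i$ (the intercept factor $\alpha_i$ and the slope factor $\beta_i$) with no exogenous predictors are:
$$\alpha_i = \mu_\alpha + \zeta_{\alpha i}, \qquad \beta_i = \mu_\beta + \zeta_{\beta i} \tag{4}$$
A typical element of y is:
$$y_{it} = \alpha_i + \lambda_t \beta_i + \varepsilon_{it} = (\mu_\alpha + \lambda_t \mu_\beta) + (\zeta_{\alpha i} + \lambda_t \zeta_{\beta i} + \varepsilon_{it}) \tag{5}$$
where $\lambda_t$ is the element in row t of the second column of $\boldsymbol{\Lambda}$, i.e., the coded value of time at occasion t.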
One of the primary advantages of the LCA framework is that the factors representing intercept and slope can serve as endogenous (dependent) variables in other model equations. The figure above represents just such a conditional LCA model, in which the intercept and slope representing the latent trajectory of the repeated measures of y are modeled as dependent variables regressed on x1, x2, and the product of x1 and x2 to represent the interactive effect of two exogenous predictors on the latent curve factors. In such cases, the latent curve factors may be expressed as functions of the exogenous predictors x1, x2, and x1x2:
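$$\boldsymbol{\eta}_i = \boldsymbol{\mu}_\eta + \boldsymbol{\Gamma}\mathbf{x}_i + \boldsymbol{\zeta}_i \tag{6}$$
where $\boldsymbol{\Gamma}$ is a k x p matrix of regression parameters linking the k latent curve factors to the p exogenous predictors and $\mathbf{x}_i$ is a p x 1 vector of exogenous predictors x1, x2, and x1x2. Substituting Equation 6 into Equation 1 yields a reduced form equation for y:
$$\mathbf{y}_i = \boldsymbol{\Lambda}(\boldsymbol{\mu}_\eta + \boldsymbol{\Gamma}\mathbf{x}_i + \boldsymbol{\zeta}_i) + \boldsymbol{\varepsilon}_i \tag{7}$$
$$\mathbf{y}_i = (\boldsymbol{\Lambda}\boldsymbol{\mu}_\eta + \boldsymbol{\Lambda}\boldsymbol{\Gamma}\mathbf{x}_i) + (\boldsymbol{\Lambda}\boldsymbol{\zeta}_i + \boldsymbol{\varepsilon}_i) \tag{8}$$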
The first parenthetical term in Equation 8 is referred to as the fixed component and the second parenthetical term as the random component.
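The split into a fixed and a random component can be checked numerically. The sketch below, in plain Python, uses made-up values for every quantity (loadings, factor means, regression parameters, predictors, and residuals are all hypothetical) and confirms that the structural form and the reduced form give the same y.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def add(u, v):
    """Add two vectors elementwise."""
    return [a + b for a, b in zip(u, v)]


# All values below are hypothetical and chosen only for illustration.
Lambda = [[1, 0], [1, 1], [1, 2], [1, 3]]   # T = 4, linear time coding
mu = [2.0, 0.5]                             # intercept and slope factor means
Gamma = [[0.30, -0.20, 0.10],               # intercept factor on x1, x2, x1*x2
         [0.05, 0.04, -0.02]]               # slope factor on x1, x2, x1*x2
x1, x2 = 1.5, -0.5
x = [x1, x2, x1 * x2]
zeta = [0.4, -0.1]                          # factor residuals
eps = [0.2, -0.3, 0.1, 0.0]                 # time-specific residuals

# Structural form: y = Lambda (mu + Gamma x + zeta) + eps
eta = add(add(mu, matvec(Gamma, x)), zeta)
y_structural = add(matvec(Lambda, eta), eps)

# Reduced form: y = (Lambda mu + Lambda Gamma x) + (Lambda zeta + eps)
fixed = add(matvec(Lambda, mu), matvec(Lambda, matvec(Gamma, x)))
random_part = add(matvec(Lambda, zeta), eps)
y_reduced = add(fixed, random_part)

same = all(abs(a - b) < 1e-12 for a, b in zip(y_structural, y_reduced))
print("forms agree:", same)
```

Only the fixed component enters the simple intercepts and simple slopes discussed later; the random component has expectation zero.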
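The prediction of $\boldsymbol{\eta}$ with time-invariant predictors x represents a three-way interaction with time. To see why this is so, consider the scalar expressions for elements in $\boldsymbol{\eta}$ when the exogenous predictors in x include x1, x2, and x1x2:
$$\alpha_i = \mu_\alpha + \gamma_1 x_{1i} + \gamma_2 x_{2i} + \gamma_3 x_{1i}x_{2i} + \zeta_{\alpha i}, \qquad \beta_i = \mu_\beta + \gamma_4 x_{1i} + \gamma_5 x_{2i} + \gamma_6 x_{1i}x_{2i} + \zeta_{\beta i} \tag{9}$$
With $\lambda_t$ denoting the coded value of time at occasion t (the second-column loading), a typical element of y is:
$$y_{it} = (\mu_\alpha + \lambda_t\mu_\beta + \gamma_1 x_{1i} + \gamma_2 x_{2i} + \gamma_3 x_{1i}x_{2i} + \gamma_4 \lambda_t x_{1i} + \gamma_5 \lambda_t x_{2i} + \gamma_6 \lambda_t x_{1i}x_{2i}) + (\zeta_{\alpha i} + \lambda_t\zeta_{\beta i} + \varepsilon_{it}) \tag{10}$$
The fixed component of Equation 10 can be seen to contain an intercept term (i.e., $\mu_\alpha$), conditional main effects for time (i.e., $\mu_\beta$), x1 (i.e., $\gamma_1$), and x2 (i.e., $\gamma_2$), conditional two-way interaction effects for x1x2 (i.e., $\gamma_3$), time and x1 (i.e., $\gamma_4$), and time and x2 (i.e., $\gamma_5$), and the three-way interaction of time, x1, and x2 (i.e., $\gamma_6$). Thus, the effect of time on y depends in part on the levels of x1 and x2. Given this, we can draw upon classical techniques for testing and plotting conditional effects in multiple regression. See our supporting material for probing interactions in standard regression.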
$y_t$ on $\lambda_t$ regressions at x1 and x2. The regression of y on time for specific values of x1 and x2 we term $y_t$ on $\lambda_t$ regressions at x1 and x2. Taking the expectation of Equation 10 and rearranging clarifies the role of x1 and x2 when they moderate the magnitude of the regression of y on time:
$$E[y_t \mid x_1, x_2] = (\mu_\alpha + \gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_1x_2) + (\mu_\beta + \gamma_4 x_1 + \gamma_5 x_2 + \gamma_6 x_1x_2)\lambda_t \tag{11}$$
Note that Equation 11 has the form of a simple regression of y on $\lambda_t$ where the first parenthetical term is the intercept of the simple regression and the second parenthetical term is the slope of the simple regression. We will refer to the first parenthetical term as the simple intercept and the second term as the simple slope. It can be seen that the simple intercept and simple slope are compound coefficients that result from the linear combination of other parameters. To further explicate this, we can re-express Equation 11 in terms of sample estimates such that
$$\hat{E}[y_t \mid x_1, x_2] = \hat{\omega}_0 + \hat{\omega}_1\lambda_t \tag{12}$$
where
$$\hat{\omega}_0 = \hat{\mu}_\alpha + \hat{\gamma}_1 x_1 + \hat{\gamma}_2 x_2 + \hat{\gamma}_3 x_1x_2, \qquad \hat{\omega}_1 = \hat{\mu}_\beta + \hat{\gamma}_4 x_1 + \hat{\gamma}_5 x_2 + \hat{\gamma}_6 x_1x_2 \tag{13}$$
These general expressions for the simple intercept ($\hat{\omega}_0$) and simple slope ($\hat{\omega}_1$) define the conditional regression of y on $\lambda_t$ as a function of x1 and x2. Because these are sample estimates, we must compute standard errors to conduct inferential tests of these effects. The computation of these standard errors is one of the key purposes of our calculator.
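To make the source of these standard errors concrete, recall the standard result for the variance of a linear combination of estimates. The simple slope is a weighted sum of the slope-equation intercept and the three slope-equation regression coefficients, with weights 1, x1, x2, and x1x2. The labels $\mathbf{a}$ and $\hat{\boldsymbol{\Sigma}}_\beta$ below are introduced here for illustration; $\hat{\boldsymbol{\Sigma}}_\beta$ stands for the 4 x 4 asymptotic covariance matrix of those four estimates. This is offered as a sketch of the general principle, not as a description of the calculator's internals.

$$\widehat{\operatorname{Var}}(\hat{\omega}_1) = \mathbf{a}'\,\hat{\boldsymbol{\Sigma}}_\beta\,\mathbf{a}, \qquad \mathbf{a} = (1,\; x_1,\; x_2,\; x_1x_2)', \qquad SE(\hat{\omega}_1) = \sqrt{\mathbf{a}'\,\hat{\boldsymbol{\Sigma}}_\beta\,\mathbf{a}}$$

The same form applies to the simple intercept, using the covariance matrix of the four intercept-equation estimates in place of $\hat{\boldsymbol{\Sigma}}_\beta$. The ratio of a simple slope to its standard error can then be referred to a critical value in the usual way.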
The preceding material addresses the strategy of probing the three-way interaction of time, x1, and x2 such that conditional trajectories are examined for chosen values of x1 and x2. Alternatively, the effect of x1 on y can be seen to depend on time and x2, and the effect of x2 on y can be seen to depend on time and x1. Although tests of these effects can be highly informative, our primary interest is likely to be in conditional trajectories calculated at specific levels of x1 and x2. Consequently, we do not explore these alternative expressions here.
We are primarily interested in the estimation of the simple intercept ($\hat{\omega}_0$) and the simple slope ($\hat{\omega}_1$) of the conditional regression of outcome y on time as a function of the moderators x1 and x2. We have developed a calculator for $y_t$ on $\lambda_t$ regressions at x1 and x2 (see Curran, Bauer, & Willoughby, 2004 for details). We now turn to a brief description of the values that can be calculated using our table below.
The first available output is the region of significance of the simple slope describing the relation between the outcome y and time as a function of moderators x1 and x2. We do not provide the region of significance for the simple intercept given that this is rarely of interest in practice. The region of significance defines the specific values of the moderator at which the regression of y on time transitions from non-significance to significance. Although this region can be easily obtained when testing a two-way interaction, it is much more complex to compute for a three-way interaction (see Curran, Bauer, & Willoughby, 2004 for further details). As proposed in Curran et al. (2004, p. 227), the table allows for the calculation of the region of significance of the regression of y on time across values of x1 at a particular value of x2. This approach melds the simple slopes and region of significance approaches.
The region has a lower and an upper bound. In many cases, the regression of y on time is significant at values of the moderator that are less than the lower bound and greater than the upper bound, and the regression is non-significant at values of the moderator falling within the region. However, there are some cases in which the opposite holds (e.g., the significant slopes fall within the region). Consequently, the output will explicitly denote how the region should be defined in terms of the significance and non-significance of the simple slopes. There are also instances in which the region cannot be mathematically obtained, and an error is displayed if this occurs for a given application.
Because the region is calculated for a specific conditional value of x2, it can be re-calculated at several different conditional values of x2 (e.g., ±1 SD) to gain a better understanding of the structure of the three-way interaction. By default, the region is calculated at $\alpha$ = .05, but this may be changed by the user.
Finally, the point estimates and standard errors of both the simple intercepts and the simple slopes are automatically calculated precisely at the lower and upper bounds of the region.
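The region bounds can be sketched as the roots of a quadratic in x1. At a fixed x2 the simple slope is linear in x1, so setting its squared test statistic equal to the squared critical value gives a quadratic equation. The Python below is a minimal illustration under stated assumptions: it uses a normal (z) critical value, all estimates and covariances are hypothetical, and the function name `region_of_significance` is not part of the utility.

```python
import math
from statistics import NormalDist


def region_of_significance(est, acov, x2, alpha=0.05):
    """Bounds on x1 at which the simple slope's |z| equals the critical value.

    est  = [slope-equation intercept, coefficient for x1, for x2, for x1*x2]
    acov = 4 x 4 asymptotic covariance matrix of est, in the same order
    Returns (lower, upper, where), with where = "outside" if the slope is
    significant outside the bounds and "inside" otherwise, or None when
    the quadratic has no real roots (or is not a quadratic).
    """
    crit2 = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    # At fixed x2 the simple slope is b0 + b1 * x1.
    b0 = est[0] + est[2] * x2
    b1 = est[1] + est[3] * x2
    v00 = acov[0][0] + 2 * x2 * acov[0][2] + x2 ** 2 * acov[2][2]
    v11 = acov[1][1] + 2 * x2 * acov[1][3] + x2 ** 2 * acov[3][3]
    v01 = acov[0][1] + x2 * (acov[0][3] + acov[1][2]) + x2 ** 2 * acov[2][3]
    # Solve (b0 + b1*x1)^2 = crit2 * (v00 + 2*x1*v01 + x1^2*v11) for x1.
    a = b1 ** 2 - crit2 * v11
    b = 2 * (b0 * b1 - crit2 * v01)
    c = b0 ** 2 - crit2 * v00
    disc = b ** 2 - 4 * a * c
    if a == 0 or disc < 0:
        return None
    roots = sorted([(-b - math.sqrt(disc)) / (2 * a),
                    (-b + math.sqrt(disc)) / (2 * a)])
    return roots[0], roots[1], "outside" if a > 0 else "inside"


# Hypothetical estimates and asymptotic covariance matrix.
est = [0.10, 0.20, 0.05, 0.10]
acov = [[0.0040, 0.0005, 0.0, 0.0],
        [0.0005, 0.0020, 0.0, 0.0],
        [0.0, 0.0, 0.0030, 0.0],
        [0.0, 0.0, 0.0, 0.0010]]

print(region_of_significance(est, acov, x2=1.0))
```

A return value of `None` corresponds to the situation in which the region cannot be obtained; the sign of the leading coefficient determines whether significant slopes lie outside or inside the bounds.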
Simple Intercepts and Simple Slopes
The second available output is the calculation of point estimates and standard errors for up to two simple intercepts and simple slopes of the regression of y on time at specific levels of the moderators. In the table we refer to these specific values as conditional values. We can choose from a variety of potential conditional values of x1 and x2 for the computation of the simple intercepts and slopes. If x1 or x2 is dichotomous, we could select conditional values of 0 and 1 to compute the regression of y on time within group 0 and group 1. If x1 or x2 is continuous, we might select conditional values that are one standard deviation above the mean of x1 or x2 and one standard deviation below the mean of x1 or x2. Whatever the conditional values chosen, these specific values are entered in the sections labeled "Conditional Values of x1" and "Conditional Values of x2," and this will provide the corresponding simple slopes of y on time at those values of x1 and x2. The calculation of simple intercepts and slopes at specific values of the moderator is optional; the user may leave any or all of the conditional value fields blank.
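The calculation at chosen conditional values can be sketched in a few lines of Python. Everything numeric here is hypothetical: the estimates, the two asymptotic covariance matrices, a dichotomous x1 coded 0 and 1, and a continuous x2 with an assumed mean of 0 and standard deviation of 2. The helper name `combo` is also made up for this illustration.

```python
import math


def combo(w, est, acov):
    """Point estimate and standard error of the weighted sum w' est."""
    value = sum(wi * b for wi, b in zip(w, est))
    var = sum(w[i] * acov[i][j] * w[j]
              for i in range(len(w)) for j in range(len(w)))
    return value, math.sqrt(var)


# Hypothetical estimates: [equation intercept, x1, x2, x1*x2 coefficients]
int_est = [2.00, 0.30, -0.20, 0.10]      # intercept-factor equation
slope_est = [0.50, 0.10, -0.20, 0.05]    # slope-factor equation
int_acov = [[0.04, 0.0, 0.0, 0.0],
            [0.0, 0.02, 0.0, 0.0],
            [0.0, 0.0, 0.03, 0.0],
            [0.0, 0.0, 0.0, 0.01]]
slope_acov = [[0.0100, 0.0010, 0.0005, 0.0002],
              [0.0010, 0.0040, 0.0003, 0.0001],
              [0.0005, 0.0003, 0.0090, 0.0004],
              [0.0002, 0.0001, 0.0004, 0.0025]]

x1_values = [0.0, 1.0]                        # dichotomous moderator
x2_mean, x2_sd = 0.0, 2.0                     # assumed sample statistics
x2_values = [x2_mean - x2_sd, x2_mean + x2_sd]

results = {}
for x1 in x1_values:
    for x2 in x2_values:
        w = [1.0, x1, x2, x1 * x2]
        results[(x1, x2)] = (combo(w, int_est, int_acov),
                             combo(w, slope_est, slope_acov))
        (i_est, i_se), (s_est, s_se) = results[(x1, x2)]
        print(f"x1={x1:+.1f} x2={x2:+.1f}  "
              f"intercept={i_est:.3f} (SE {i_se:.3f})  "
              f"slope={s_est:.3f} (SE {s_se:.3f})")
```

Each pair of conditional values yields one simple intercept and one simple slope, so two values of x1 crossed with two values of x2 give four conditional trajectories.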
Simple intercepts, simple slopes, and the region of significance can be obtained by following these five steps. Use as many significant digits as possible for optimal precision.
Once all of the necessary information is entered into the table, click "Calculate." The status box will identify any errors that might have been encountered. If no errors are found, the results will be presented in the output window. Although the results in the output window cannot be saved, the contents can be copied and pasted into any word processor for printing.
Below the output window are two additional windows. If conditional values of t (Points to Plot) and x1, as well as at least one conditional value of x2, are entered, clicking on "Calculate" will also generate R code for producing a plot of the interaction between time and x1 at the lowest value of x2 (R is a statistical computing language). This R code can be submitted to a remote Rweb server by clicking on "Submit above to Rweb." A new window will open containing a plot of the interaction effect. The user may make any desired changes to the generated code before submitting, but changes are not necessary to obtain a basic plot. Indeed, this window can be used as an all-purpose interface for R.
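For readers who want to see what such a plot contains, the sketch below computes the plotted points in Python rather than R. It is not the code the utility generates; the estimates, the time points, and the conditional values are all hypothetical, and `simple_line` is a made-up helper name.

```python
def simple_line(int_est, slope_est, x1, x2):
    """Return (simple intercept, simple slope) at the given x1 and x2."""
    w = [1.0, x1, x2, x1 * x2]
    omega0 = sum(wi * b for wi, b in zip(w, int_est))
    omega1 = sum(wi * b for wi, b in zip(w, slope_est))
    return omega0, omega1


# Hypothetical estimates: [equation intercept, x1, x2, x1*x2 coefficients]
int_est = [2.00, 0.30, -0.20, 0.10]
slope_est = [0.50, 0.10, -0.20, 0.05]

time_points = [0.0, 3.0]       # two time values define each straight line
x1_values = [-1.0, 0.0, 1.0]   # conditional values of x1
x2_low = -1.0                  # lowest conditional value of x2

lines = {}
for x1 in x1_values:
    omega0, omega1 = simple_line(int_est, slope_est, x1, x2_low)
    # Predicted y at each time point along this conditional trajectory
    lines[x1] = [(t, omega0 + omega1 * t) for t in time_points]
    print(x1, lines[x1])
```

Each entry of `lines` holds the endpoints of one conditional trajectory; passing them to any plotting tool gives one line per value of x1.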
Assuming enough information is entered into the interactive table, the second output window below the table will include R syntax for generating confidence bands. The user is expected to supply lower and upper values for the moderator x1 (-10 and +10 by default). As above, this R code can be submitted to a remote Rweb server by clicking on "Submit above to Rweb." A new window will open containing a plot of confidence bands.
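The idea behind the confidence bands can be illustrated in Python (again, not the R syntax the utility produces). The sketch assumes pointwise bands based on a normal critical value, evaluates the simple slope over x1 from -10 to +10 at one fixed value of x2, and uses hypothetical estimates and covariances; `slope_band` is a made-up name.

```python
import math
from statistics import NormalDist


def slope_band(est, acov, x1, x2, alpha=0.05):
    """Return (lower, slope, upper) for the simple slope at (x1, x2)."""
    w = [1.0, x1, x2, x1 * x2]
    slope = sum(wi * b for wi, b in zip(w, est))
    var = sum(w[i] * acov[i][j] * w[j] for i in range(4) for j in range(4))
    half = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(var)
    return slope - half, slope, slope + half


# Hypothetical slope-equation estimates and asymptotic covariance matrix.
est = [0.10, 0.20, 0.05, 0.10]
acov = [[0.0040, 0.0005, 0.0, 0.0],
        [0.0005, 0.0020, 0.0, 0.0],
        [0.0, 0.0, 0.0030, 0.0],
        [0.0, 0.0, 0.0, 0.0010]]

x2 = 1.0
grid = [-10 + 0.5 * k for k in range(41)]   # x1 from -10 to +10
band = [(x1, *slope_band(est, acov, x1, x2)) for x1 in grid]

for x1, lower, slope, upper in band[::10]:
    print(f"x1={x1:+6.1f}  slope={slope:+.3f}  [{lower:+.3f}, {upper:+.3f}]")
```

Values of x1 at which the band excludes zero are those at which the simple slope differs significantly from zero, which ties the bands back to the region of significance.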
Curran, P. J., Bauer, D. J., & Willoughby, M. T. (2004). Testing main effects and interactions in latent curve analysis. Psychological Methods, 9, 220-237.
Preacher, K. J., Curran, P. J., & Bauer, D. J. (2006). Computational tools for probing interaction effects in multiple linear regression, multilevel modeling, and latent curve analysis. Journal of Educational and Behavioral Statistics, 31, 437-448.
Original version posted September, 2003. Free JavaScripts provided by The JavaScript Source and John C. Pezzullo.