Nested Data

The final topic we will consider is the way we account for nesting within longitudinal data. While technically all of longitudinal modeling involves nested data (i.e., multiple observations nested within the same individual), here we focus more on between-person nesting groups (e.g., classroom, family, data collection sites). As we will see, there are several forms of the models that we have already used that deal with nesting. The achieve dataset we will work with here are from a 4-wave school-based assessment of math and science achievement across ages \(12\) - \(17\). Schools were drawn from \(5\) different metropolitan areas and assessments were conducted by separate research teams in each city. Here we will mostly focus on the science achievement data, but may draw examples from math achievement if they differ in interesting ways.

achieve <- read.csv("data/achieve.csv", header = TRUE)

achieve.wide <- achieve %>%
  pivot_wider(id_cols = c("site", "school", "id", "male"), 
              names_from = "wave",
              values_from = c("sci", "math"),
              names_sep = ".",
              values_fill = NA)

As always, we can fit an unconditional growth model to give us a baseline of expectations for the effects we will in subsequent models. Here we will fit the MLM and LCM versions as they are the simplest representations.

Unconditional Model

unc.mlm <- lmer(sci ~ 1 + wave + (1 + wave | id),
                na.action = na.omit,
                REML = TRUE,
                data = achieve)

summary(unc.mlm, correlation = FALSE)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + (1 + wave | id)
##    Data: achieve
## 
## REML criterion at convergence: 24100
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.5673 -0.4152  0.0084  0.4532  4.2100 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  id       (Intercept) 94.998   9.747         
##           wave         4.035   2.009    -0.09
##  Residual             12.367   3.517         
## Number of obs: 3643, groups:  id, 1200
## 
## Fixed effects:
##              Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept) 5.881e+01  3.180e-01 1.133e+03   185.0   <2e-16 ***
## wave        1.987e+00  8.716e-02 8.628e+02    22.8   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

unc.lcm <- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
            slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4"

unc.lcm.fit <- growth(unc.lcm,
                      data = achieve.wide,
                      estimator = "ML",
                      missing = "FIML")

summary(unc.lcm.fit, fit.measures = TRUE, estimates = TRUE, 
        standardize = TRUE, rsquare = TRUE)

## lavaan 0.6.13 ended normally after 138 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         9
## 
##                                                   Used       Total
##   Number of observations                          1200        1237
##   Number of missing patterns                        15            
## 
## Model Test User Model:
##                                                       
##   Test statistic                                55.402
##   Degrees of freedom                                 5
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                              3806.133
##   Degrees of freedom                                 6
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.987
##   Tucker-Lewis Index (TLI)                       0.984
##                                                       
##   Robust Comparative Fit Index (CFI)             0.984
##   Robust Tucker-Lewis Index (TLI)                0.980
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -12038.652
##   Loglikelihood unrestricted model (H1)     -12010.951
##                                                       
##   Akaike (AIC)                               24095.304
##   Bayesian (BIC)                             24141.115
##   Sample-size adjusted Bayesian (SABIC)      24112.527
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.092
##   90 Percent confidence interval - lower         0.071
##   90 Percent confidence interval - upper         0.114
##   P-value H_0: RMSEA <= 0.050                    0.001
##   P-value H_0: RMSEA >= 0.080                    0.829
##                                                       
##   Robust RMSEA                                   0.127
##   90 Percent confidence interval - lower         0.097
##   90 Percent confidence interval - upper         0.160
##   P-value H_0: Robust RMSEA <= 0.050             0.000
##   P-value H_0: Robust RMSEA >= 0.080             0.994
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.031
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int =~                                                                
##     sci.1             1.000                               9.722    0.934
##     sci.2             1.000                               9.722    0.898
##     sci.3             1.000                               9.722    0.851
##     sci.4             1.000                               9.722    0.757
##   slp =~                                                                
##     sci.1             0.000                               0.000    0.000
##     sci.2             1.000                               1.910    0.176
##     sci.3             2.000                               3.820    0.334
##     sci.4             3.000                               5.730    0.446
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int ~~                                                                
##     slp               3.184    1.033    3.083    0.002    0.172    0.172
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
##     int              60.782    0.299  203.517    0.000    6.252    6.252
##     slp               2.048    0.090   22.857    0.000    1.072    1.072
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1            13.812    1.437    9.610    0.000   13.812    0.128
##    .sci.2            12.641    0.948   13.341    0.000   12.641    0.108
##    .sci.3             8.580    0.950    9.028    0.000    8.580    0.066
##    .sci.4            18.517    1.984    9.331    0.000   18.517    0.112
##     int              94.510    4.430   21.333    0.000    1.000    1.000
##     slp               3.648    0.442    8.256    0.000    1.000    1.000
## 
## R-Square:
##                    Estimate
##     sci.1             0.872
##     sci.2             0.892
##     sci.3             0.934
##     sci.4             0.888

As a general takeaway, we have an average science achievement of around \(60\) (these scores are arbitrarily scaled so that isn’t incredibly informative) but more of interest, science achievement is increasing over time approximately \(2\) points per year of school.

Categorical Predictors

Nesting induces correlations between observations because members of the same group are more similar to one another than members of other groups. The practical upshot of these models is that we obtain different values of parameters across group (although the methods for this will vary). Perhaps the simplest method for doing so, and therefore accounting for group differences, is to predict the mean of the outcome variable using a categorical predictor. In longitudinal modeling, this most often occurs at the level of the individual as a TIC. As such, we already know how to fit this kind of model from the last chapter. We can show this syntax again below. Here we predict the intercept and slope random effects with male to examine differences between self-reported males and females. In the MLM syntax, we simply include male in the regression equation if we want to predict mean differences in science achievement.

cat.mlm.1 <- lmer(sci ~ 1 + wave + male + (1 + wave | id),
                na.action = na.omit,
                REML = TRUE,
                data = achieve)

summary(cat.mlm.1, correlation = FALSE)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + male + (1 + wave | id)
##    Data: achieve
## 
## REML criterion at convergence: 24092
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.5819 -0.4169  0.0084  0.4561  4.1991 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  id       (Intercept) 94.728   9.733         
##           wave         4.041   2.010    -0.09
##  Residual             12.364   3.516         
## Number of obs: 3643, groups:  id, 1200
## 
## Fixed effects:
##              Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)   58.0466     0.4255 1235.3629 136.425  < 2e-16 ***
## wave           1.9866     0.0872  862.4361  22.783  < 2e-16 ***
## male           1.5965     0.5918 1190.5962   2.698  0.00708 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here we see that males show \(1.6\) units higher science achievement compared with females (the reference category). It is possible to predict other parameters in the model, although the options in mixed-effects models tend to be more limited compared with SEM approaches. For instance, if we thought error variance might differ across groups, we could include this in our MLM syntax through the variance identity (varIdent) function. The form equation, ~ 1 | male, mimics the regression equation where we have a separate “intercept” (here just read this as estimate) nested within group (male). We can then go through the procedure we covered in Chapter 3 to recover the different estimates for males and females (here in standard deviations, but we can easily square these values to obtain variances).

cat.mlm.2 <- lme(sci ~ 1 + wave + male,
                 random = ~ 1 + wave | id,
                 weights = varIdent(form = ~ 1 | male),
                 na.action = na.omit,
                 method = "REML",
                 data = achieve)

summary(cat.mlm.2)

## Linear mixed-effects model fit by REML
##   Data: achieve 
##        AIC      BIC    logLik
##   24090.94 24140.54 -12037.47
## 
## Random effects:
##  Formula: ~1 + wave | id
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev   Corr  
## (Intercept) 9.705310 (Intr)
## wave        2.004751 -0.091
## Residual    3.278687       
## 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | male 
##  Parameter estimates:
##        0        1 
## 1.000000 1.153432 
## Fixed effects:  sci ~ 1 + wave + male 
##                Value Std.Error   DF   t-value p-value
## (Intercept) 58.06283 0.4214427 2442 137.77158  0.0000
## wave         1.97570 0.0869694 2442  22.71723  0.0000
## male         1.60921 0.5911793 1198   2.72203  0.0066
##  Correlation: 
##      (Intr) wave  
## wave -0.260       
## male -0.660 -0.019
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -4.3578043 -0.4226075  0.0073624  0.4638169  3.9981046 
## 
## Number of Observations: 3643
## Number of Groups: 1200

c(cat.mlm.2$sigma, cat.mlm.2$sigma * exp(as.numeric(cat.mlm.2$modelStruct$varStruct)))

## [1] 3.278687 3.781742

These results show that males appear to have increased variability in their science achievement compared to females in addition to a higher average. Note that we are not getting a direct effect estimate here, but rather estimating values separately for each group. There are not clear ways of implementing that kind of effect in lme4, so we will not go too much further except to mention that within the mixed-effect world of models, you can directly model predictors of variability with location-scale models, although those generally require more intensive longitudinal kinds of data. As such, we will not consider this further.

Because these categorical variables enter the model at the person level, in the mixed-effects models they predict the random effects rather than the individual time-specific observations. However, in the SEM, we can additionally predict one (or more) of the individual repeated measures. We can see this approach in a version of a MIMIC model, where we predict both the latent factors and the first repeated measure (sci.1).

cat.lcm <- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
            slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4

            int ~ male
            slp ~ male
            
            sci.1 ~ male"

cat.lcm.fit <- growth(cat.lcm,
                      data = achieve.wide,
                      estimator = "ML",
                      missing = "FIML")

summary(cat.lcm.fit, fit.measures = FALSE, estimates = TRUE, 
        standardize = TRUE, rsquare = FALSE)

## lavaan 0.6.13 ended normally after 149 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        12
## 
##   Number of observations                          1237
##   Number of missing patterns                        16
## 
## Model Test User Model:
##                                                       
##   Test statistic                                37.324
##   Degrees of freedom                                 6
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int =~                                                                
##     sci.1             1.000                               9.812    0.944
##     sci.2             1.000                               9.812    0.904
##     sci.3             1.000                               9.812    0.859
##     sci.4             1.000                               9.812    0.768
##   slp =~                                                                
##     sci.1             0.000                               0.000    0.000
##     sci.2             1.000                               1.941    0.179
##     sci.3             2.000                               3.883    0.340
##     sci.4             3.000                               5.824    0.456
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int ~                                                                 
##     male              2.659    0.649    4.095    0.000    0.271    0.135
##   slp ~                                                                 
##     male             -0.288    0.217   -1.324    0.186   -0.148   -0.074
##   sci.1 ~                                                               
##     male             -1.511    0.336   -4.503    0.000   -1.511   -0.073
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .int ~~                                                                
##    .slp               2.789    1.036    2.693    0.007    0.148    0.148
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
##    .int              60.057    0.412  145.793    0.000    6.121    6.121
##    .slp               1.907    0.121   15.788    0.000    0.982    0.982
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1            13.139    1.423    9.234    0.000   13.139    0.122
##    .sci.2            12.579    0.946   13.301    0.000   12.579    0.107
##    .sci.3             8.827    0.954    9.255    0.000    8.827    0.068
##    .sci.4            17.512    1.947    8.994    0.000   17.512    0.107
##    .int              94.515    4.423   21.370    0.000    0.982    0.982
##    .slp               3.749    0.440    8.517    0.000    0.995    0.995

These approaches are relatively simple ways to incorporate nesting information into the model, but highlight the continuity with more complex approaches we will discuss as we move forward in this chapter.

Multiple Groups Models

While the categorical predictor allows us to vary some parameters across groups, a unique power of the SEM is the multiple group model, where we can estimate unique parameters of essentially any type across any number of groups (although the number of groups tends to be small in practice). The reason for a small number of groups tends to be due to sample size requirements within-group. SEM is still a large-sample estimator in the multiple groups model, so we have to be careful we aren’t under-powering the model at the group level, even if the overall sample size is large. Below we can see how the multiple-group LCM is specified. Nothing changes about the syntax object mult.g1 from what we have seen before, however, now we include the argument group = "male" to indicate that we want to obtain unique estimates for females and males. Typically, these models require setting some small set of parameters equal in the model just to identify the model, so here we conveniently achieve this through having the same set factor loadings for the growth model. We can see the results below.

mult.g1 <- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
            slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4"

mult.g1.fit <- growth(mult.g1,
                      data = achieve.wide,
                      group = "male",
                      estimator = "ML",
                      missing = "FIML")

summary(mult.g1.fit, fit.measures = TRUE, estimates = TRUE, 
        standardize = TRUE, rsquare = FALSE)

## lavaan 0.6.13 ended normally after 226 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        18
## 
##   Number of observations per group:               Used       Total
##     1                                              576         600
##     0                                              624         637
##   Number of missing patterns per group:                           
##     1                                               15            
##     0                                               15            
## 
## Model Test User Model:
##                                                       
##   Test statistic                                61.451
##   Degrees of freedom                                10
##   P-value (Chi-square)                           0.000
##   Test statistic for each group:
##     1                                           30.756
##     0                                           30.695
## 
## Model Test Baseline Model:
## 
##   Test statistic                              3772.258
##   Degrees of freedom                                12
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.986
##   Tucker-Lewis Index (TLI)                       0.984
##                                                       
##   Robust Comparative Fit Index (CFI)             0.983
##   Robust Tucker-Lewis Index (TLI)                0.979
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -12013.745
##   Loglikelihood unrestricted model (H1)     -11983.019
##                                                       
##   Akaike (AIC)                               24063.489
##   Bayesian (BIC)                             24155.111
##   Sample-size adjusted Bayesian (SABIC)      24097.936
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.093
##   90 Percent confidence interval - lower         0.071
##   90 Percent confidence interval - upper         0.115
##   P-value H_0: RMSEA <= 0.050                    0.001
##   P-value H_0: RMSEA >= 0.080                    0.841
##                                                       
##   Robust RMSEA                                   0.130
##   90 Percent confidence interval - lower         0.098
##   90 Percent confidence interval - upper         0.164
##   P-value H_0: Robust RMSEA <= 0.050             0.000
##   P-value H_0: Robust RMSEA >= 0.080             0.994
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.036
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## 
## Group 1 [1]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int =~                                                                
##     sci.1             1.000                              10.624    0.933
##     sci.2             1.000                              10.624    0.903
##     sci.3             1.000                              10.624    0.859
##     sci.4             1.000                              10.624    0.765
##   slp =~                                                                
##     sci.1             0.000                               0.000    0.000
##     sci.2             1.000                               1.824    0.155
##     sci.3             2.000                               3.648    0.295
##     sci.4             3.000                               5.472    0.394
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int ~~                                                                
##     slp               4.373    1.704    2.567    0.010    0.226    0.226
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
##     int              61.564    0.472  130.490    0.000    5.795    5.795
##     slp               2.217    0.136   16.342    0.000    1.216    1.216
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1            16.927    2.377    7.121    0.000   16.927    0.130
##    .sci.2            13.629    1.534    8.882    0.000   13.629    0.098
##    .sci.3             9.128    1.557    5.861    0.000    9.128    0.060
##    .sci.4            24.031    3.370    7.131    0.000   24.031    0.124
##     int             112.875    7.647   14.760    0.000    1.000    1.000
##     slp               3.327    0.713    4.665    0.000    1.000    1.000
## 
## 
## Group 2 [0]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int =~                                                                
##     sci.1             1.000                               8.752    0.936
##     sci.2             1.000                               8.752    0.894
##     sci.3             1.000                               8.752    0.844
##     sci.4             1.000                               8.752    0.750
##   slp =~                                                                
##     sci.1             0.000                               0.000    0.000
##     sci.2             1.000                               1.967    0.201
##     sci.3             2.000                               3.934    0.379
##     sci.4             3.000                               5.901    0.506
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int ~~                                                                
##     slp               1.811    1.234    1.467    0.142    0.105    0.105
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
##     int              60.051    0.372  161.225    0.000    6.861    6.861
##     slp               1.902    0.118   16.087    0.000    0.967    0.967
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1            10.906    1.731    6.301    0.000   10.906    0.125
##    .sci.2            11.783    1.175   10.028    0.000   11.783    0.123
##    .sci.3             8.222    1.177    6.984    0.000    8.222    0.076
##    .sci.4            13.792    2.349    5.870    0.000   13.792    0.101
##     int              76.605    4.979   15.386    0.000    1.000    1.000
##     slp               3.870    0.547    7.069    0.000    1.000    1.000

We can see that we now have separate test statistics and sample sizes that reflect the fact that we are - to a large extent - estimating separate models. We also can see that our parameter estimates are broken up into Group 1 [1] and Group 2 [0]. The value in brackets represents the value of the grouping variable male, so here Group 1 is males and Group 2 is female (you could change this variable into a labeled factor and then those labels would be more informative). It’s important to note the ordering of the groups for future steps.

Here we can qualitatively say that among other things, males seem to start higher (\(61.6\) vs. \(60.1\)) and increase more rapidly (\(2.22\) vs. \(1.90\)) in their achievement, and have a higher correlation between starting point and rate of change (\(r = 0.226\) vs. \(r = .105\)), compared with females. However, a natural question is whether these differences are significant. We can explicitly test differences across groups by introducing group-specific labels into our model syntax and building composite parameters where we get a formal inferential test on the difference. A really nice feature of maximum likelihood parameters is that linear operations on the parameter are themselves maximum likelihood parameters themselves. To achieve this, we simply pre-multiply a given parameter by a vector of labels (e.g., c(Mi, Fi)) and then define the composite parameter using the := operator. Here we will test the difference between the intercept (c(Mi, Fi)) and slope factor means (c(Ms, Fs)), and the covariance between factors (c(Mc, Fc)). This is where keeping track of the group order becomes important so we don’t mis-label our parameters of interest and reverse our inference. The D* composite parameters are then created by simply subtracting F* from the M* parameters.

mult.g2 <- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
            slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4
            
            int ~ c(Mi, Fi)*1
            slp ~ c(Ms, Fs)*1
            
            int ~~ c(Mc, Fc)*slp
            
            Di := Mi - Fi
            Ds := Ms - Fs
            Dc := Mc - Fc"

mult.g2.fit <- growth(mult.g2,
                      data = achieve.wide,
                      group = "male",
                      estimator = "ML",
                      missing = "FIML")

summary(mult.g2.fit, fit.measures = FALSE, estimates = TRUE, 
        standardize = TRUE, rsquare = FALSE)

## lavaan 0.6.13 ended normally after 215 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        18
## 
##   Number of observations per group:               Used       Total
##     1                                              576         600
##     0                                              624         637
##   Number of missing patterns per group:                           
##     1                                               15            
##     0                                               15            
## 
## Model Test User Model:
##                                                       
##   Test statistic                                61.451
##   Degrees of freedom                                10
##   P-value (Chi-square)                           0.000
##   Test statistic for each group:
##     1                                           30.756
##     0                                           30.695
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## 
## Group 1 [1]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int =~                                                                
##     sci.1             1.000                              10.624    0.933
##     sci.2             1.000                              10.624    0.903
##     sci.3             1.000                              10.624    0.859
##     sci.4             1.000                              10.624    0.765
##   slp =~                                                                
##     sci.1             0.000                               0.000    0.000
##     sci.2             1.000                               1.824    0.155
##     sci.3             2.000                               3.648    0.295
##     sci.4             3.000                               5.472    0.394
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int ~~                                                                
##     slp       (Mc)    4.373    1.704    2.567    0.010    0.226    0.226
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     int       (Mi)   61.564    0.472  130.490    0.000    5.795    5.795
##     slp       (Ms)    2.217    0.136   16.342    0.000    1.216    1.216
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1            16.927    2.377    7.121    0.000   16.927    0.130
##    .sci.2            13.629    1.534    8.882    0.000   13.629    0.098
##    .sci.3             9.128    1.557    5.861    0.000    9.128    0.060
##    .sci.4            24.031    3.370    7.131    0.000   24.031    0.124
##     int             112.875    7.647   14.760    0.000    1.000    1.000
##     slp               3.327    0.713    4.665    0.000    1.000    1.000
## 
## 
## Group 2 [0]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int =~                                                                
##     sci.1             1.000                               8.752    0.936
##     sci.2             1.000                               8.752    0.894
##     sci.3             1.000                               8.752    0.844
##     sci.4             1.000                               8.752    0.750
##   slp =~                                                                
##     sci.1             0.000                               0.000    0.000
##     sci.2             1.000                               1.967    0.201
##     sci.3             2.000                               3.934    0.379
##     sci.4             3.000                               5.901    0.506
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int ~~                                                                
##     slp       (Fc)    1.811    1.234    1.467    0.142    0.105    0.105
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     int       (Fi)   60.051    0.372  161.225    0.000    6.861    6.861
##     slp       (Fs)    1.902    0.118   16.087    0.000    0.967    0.967
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1            10.906    1.731    6.301    0.000   10.906    0.125
##    .sci.2            11.783    1.175   10.028    0.000   11.783    0.123
##    .sci.3             8.222    1.177    6.984    0.000    8.222    0.076
##    .sci.4            13.792    2.349    5.870    0.000   13.792    0.101
##     int              76.605    4.979   15.386    0.000    1.000    1.000
##     slp               3.870    0.547    7.069    0.000    1.000    1.000
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     Di                1.513    0.601    2.517    0.012   -1.066   -1.066
##     Ds                0.316    0.180    1.753    0.080    0.249    0.249
##     Dc                2.562    2.104    1.217    0.223    0.120    0.120

We will only focus here on the Defined Parameters section, where we can see that while males have higher absolute estimates on all of these parameters we have tested, only the mean of the intercept factor significantly differs across groups. These are only a subset we chose for demonstration purposes, and we could explore others if we chose.

However, this model allows maximally different estimates, which belies one of the real strengths of the multiple-groups model, which is to selectively set a subset of parameters equal across groups. This allows us to bridge the gap between fitting the model in each group uniquely and fitting a single-group model where we ignore nesting altogether. The parameters we set equal benefit from increased power because it is not diluted by the smaller sub-samples and reduce the number of parameters we need to interpret. One common approach is to start with the type of model we first fit, with all parameters unequal, and then progressively build in equality constraints and test whether they reduce the fit of the model. We can do this in two ways, either through the syntax (changing labels like c(Mi, Fi) to be the same like c(i, i)) or through the argument group.equal in the model fitting function. The first approach gives us more control, but can be tedious in large models, while the latter is less precise but quick. For instance, if we want to constrain the factor means equal across groups, we can simply include means in the group.equal argument. We can see the results below.

mult.g3 <- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
           slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4"

mult.g3.fit <- growth(mult.g3,
                      data = achieve.wide,
                      group = "male",
                      group.equal = c("means"),
                      estimator = "ML",
                      missing = "FIML")

summary(mult.g3.fit, fit.measures = TRUE, estimates = TRUE, 
        standardize = TRUE, rsquare = FALSE)

## lavaan 0.6.13 ended normally after 163 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        18
##   Number of equality constraints                     2
## 
##   Number of observations per group:               Used       Total
##     1                                              576         600
##     0                                              624         637
##   Number of missing patterns per group:                           
##     1                                               15            
##     0                                               15            
## 
## Model Test User Model:
##                                                       
##   Test statistic                                71.847
##   Degrees of freedom                                12
##   P-value (Chi-square)                           0.000
##   Test statistic for each group:
##     1                                           37.058
##     0                                           34.789
## 
## Model Test Baseline Model:
## 
##   Test statistic                              3772.258
##   Degrees of freedom                                12
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.984
##   Tucker-Lewis Index (TLI)                       0.984
##                                                       
##   Robust Comparative Fit Index (CFI)             0.981
##   Robust Tucker-Lewis Index (TLI)                0.981
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -12018.943
##   Loglikelihood unrestricted model (H1)     -11983.019
##                                                       
##   Akaike (AIC)                               24069.886
##   Bayesian (BIC)                             24151.327
##   Sample-size adjusted Bayesian (SABIC)      24100.505
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.091
##   90 Percent confidence interval - lower         0.072
##   90 Percent confidence interval - upper         0.112
##   P-value H_0: RMSEA <= 0.050                    0.000
##   P-value H_0: RMSEA >= 0.080                    0.832
##                                                       
##   Robust RMSEA                                   0.123
##   90 Percent confidence interval - lower         0.095
##   90 Percent confidence interval - upper         0.152
##   P-value H_0: Robust RMSEA <= 0.050             0.000
##   P-value H_0: Robust RMSEA >= 0.080             0.993
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.058
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## 
## Group 1 [1]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int =~                                                                
##     sci.1             1.000                              10.658    0.933
##     sci.2             1.000                              10.658    0.901
##     sci.3             1.000                              10.658    0.857
##     sci.4             1.000                              10.658    0.764
##   slp =~                                                                
##     sci.1             0.000                               0.000    0.000
##     sci.2             1.000                               1.822    0.154
##     sci.3             2.000                               3.643    0.293
##     sci.4             3.000                               5.465    0.392
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int ~~                                                                
##     slp               4.638    1.697    2.733    0.006    0.239    0.239
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
##     int     (.20.)   60.622    0.295  205.481    0.000    5.688    5.688
##     slp     (.21.)    2.037    0.090   22.661    0.000    1.118    1.118
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1            17.021    2.367    7.190    0.000   17.021    0.130
##    .sci.2            13.673    1.541    8.872    0.000   13.673    0.098
##    .sci.3             9.399    1.557    6.035    0.000    9.399    0.061
##    .sci.4            23.359    3.312    7.054    0.000   23.359    0.120
##     int             113.598    7.692   14.768    0.000    1.000    1.000
##     slp               3.318    0.710    4.674    0.000    1.000    1.000
## 
## 
## Group 2 [0]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int =~                                                                
##     sci.1             1.000                               8.773    0.936
##     sci.2             1.000                               8.773    0.894
##     sci.3             1.000                               8.773    0.844
##     sci.4             1.000                               8.773    0.749
##   slp =~                                                                
##     sci.1             0.000                               0.000    0.000
##     sci.2             1.000                               1.975    0.201
##     sci.3             2.000                               3.949    0.380
##     sci.4             3.000                               5.924    0.505
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int ~~                                                                
##     slp               1.853    1.241    1.494    0.135    0.107    0.107
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
##     int     (.20.)   60.622    0.295  205.481    0.000    6.910    6.910
##     slp     (.21.)    2.037    0.090   22.661    0.000    1.031    1.031
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1            10.894    1.725    6.315    0.000   10.894    0.124
##    .sci.2            11.772    1.171   10.053    0.000   11.772    0.122
##    .sci.3             8.040    1.165    6.902    0.000    8.040    0.074
##    .sci.4            14.177    2.357    6.015    0.000   14.177    0.103
##     int              76.967    5.013   15.354    0.000    1.000    1.000
##     slp               3.899    0.550    7.086    0.000    1.000    1.000

Now note that the factor means (labels .20. and .21.) are equal across males and females. We could mix and match syntax and argument constraints if we like; like many things in SEM, we can have full control as long as we keep things organized. We can test whether this more constrained model fits significantly worse than the maximally-different model using a likelihood ratio test.

lavTestLRT(mult.g1.fit, mult.g3.fit)

## 
## Chi-Squared Difference Test
## 
##             Df   AIC   BIC  Chisq Chisq diff    RMSEA Df diff Pr(>Chisq)   
## mult.g1.fit 10 24064 24155 61.451                                          
## mult.g3.fit 12 24070 24151 71.847     10.397 0.083649       2   0.005526 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here it appears that constraining the factor means does significantly decrease model fit, so we wouldn’t conclude that this constraint is appropriate. We could do this kind of testing with other parameters if we chose, although we won’t here for space and simplicity. The complexity and utility of the multiple groups models is not something we can explore fully here, and interested readers can find a much larger literature on these kinds of models if they wish to implement them in their own research (see the manuscript for a general starting point).

Fixed Effect Approach

The fixed effect approach is (rarely for quantitative methods) exactly what it sounds like in the context of mixed-effect models, we simply include our grouping variable as a fixed effect in the model. This has the convenience of account for unmeasured or otherwise unmodeled differences between groups that might bias the effects. In the most common use of the fixed-effects approach, we account for intercept differences by including a main effect of the group. In the context of our data example, we will model the effect of site on science achievement using a fixed approach. Within the lmer() syntax, we can use the short-cute of factor(site) to generate a series of dummy-codes, where site == 1 is the reference category. We can see the results of this below.

fixed.mlm1 <- lmer(sci ~ 1 + wave + factor(site) + (1 + wave | id),
                na.action = na.omit,
                REML = TRUE,
                data = achieve)
summary(fixed.mlm1, correlation = FALSE)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + factor(site) + (1 + wave | id)
##    Data: achieve
## 
## REML criterion at convergence: 24082.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.5642 -0.4218  0.0130  0.4559  4.2084 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  id       (Intercept) 94.706   9.732         
##           wave         4.048   2.012    -0.10
##  Residual             12.355   3.515         
## Number of obs: 3643, groups:  id, 1200
## 
## Fixed effects:
##                 Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)     58.59757    0.56691 1247.30166 103.363   <2e-16 ***
## wave             1.98858    0.08722  861.42604  22.800   <2e-16 ***
## factor(site)2    1.58400    0.86061 1183.63741   1.841   0.0659 .  
## factor(site)3    0.40383    0.89404 1189.48204   0.452   0.6516    
## factor(site)4    0.25571    0.83030 1186.72499   0.308   0.7582    
## factor(site)5   -2.21401    1.05705 1193.40188  -2.095   0.0364 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that the (Intercept) term is the intercept for Site 1 and each of the effects factor(site)* indicate the relative difference from that term for the other sites. If we wished to use an absolute coding scheme, we change the 1 to a 0 (indicating that we don’t wish to estimate a reference intercept) and re-estimate the model.

fixed.mlm2 <- lmer(sci ~ 0 + wave + factor(site) + (1 + wave | id),
                na.action = na.omit,
                REML = TRUE,
                data = achieve)
summary(fixed.mlm2, correlation = FALSE)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 0 + wave + factor(site) + (1 + wave | id)
##    Data: achieve
## 
## REML criterion at convergence: 24082.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.5642 -0.4218  0.0130  0.4559  4.2084 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  id       (Intercept) 94.706   9.732         
##           wave         4.048   2.012    -0.10
##  Residual             12.355   3.515         
## Number of obs: 3643, groups:  id, 1200
## 
## Fixed effects:
##                Estimate Std. Error        df t value Pr(>|t|)    
## wave          1.989e+00  8.722e-02 8.614e+02   22.80   <2e-16 ***
## factor(site)1 5.860e+01  5.669e-01 1.247e+03  103.36   <2e-16 ***
## factor(site)2 6.018e+01  6.675e-01 1.226e+03   90.16   <2e-16 ***
## factor(site)3 5.900e+01  7.105e-01 1.240e+03   83.04   <2e-16 ***
## factor(site)4 5.885e+01  6.287e-01 1.248e+03   93.61   <2e-16 ***
## factor(site)5 5.638e+01  9.073e-01 1.231e+03   62.14   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

These results are equivalent to the first coding scheme but we get the intercept estimate for each site instead of computing differences from a chosen site (note the change in what the inference test tells us).

Like mentioned above, the fixed effect approach is most commonly used for intercept differences, but can logically be extended to account for effect differences across sites (or another grouping variable). We can accomplish this by modeling the interaction of site with our time predictor like below.

fixed.mlm3 <- lmer(sci ~ 1 + wave + factor(site) + wave:factor(site) + (1 + wave | id),
                na.action = na.omit,
                REML = TRUE,
                data = achieve)
summary(fixed.mlm3, correlation = FALSE)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + factor(site) + wave:factor(site) + (1 + wave |  
##     id)
##    Data: achieve
## 
## REML criterion at convergence: 24076.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.5317 -0.4176  0.0117  0.4573  4.2531 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  id       (Intercept) 94.674   9.730         
##           wave         3.997   1.999    -0.09
##  Residual             12.365   3.516         
## Number of obs: 3643, groups:  id, 1200
## 
## Fixed effects:
##                      Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)          58.40288    0.59914 1154.84398  97.477   <2e-16 ***
## wave                  2.13593    0.17030  972.01926  12.542   <2e-16 ***
## factor(site)2         1.74603    0.92295 1120.17539   1.892   0.0588 .  
## factor(site)3         1.10856    0.96191 1134.56798   1.152   0.2494    
## factor(site)4         0.03638    0.89457 1141.90715   0.041   0.9676    
## factor(site)5        -1.45252    1.13802 1139.05117  -1.276   0.2021    
## wave:factor(site)2   -0.12248    0.25311  860.24892  -0.484   0.6286    
## wave:factor(site)3   -0.52808    0.26582  877.36603  -1.987   0.0473 *  
## wave:factor(site)4    0.15811    0.24806  912.54273   0.637   0.5240    
## wave:factor(site)5   -0.56471    0.31249  860.82676  -1.807   0.0711 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here we return to the reference coding to test whether the effect of wave differs across sites from the reference. We can see how parameter do tend to multiply rapidly, however, with a relatively small set of grouping levels (here we only have 5 site), this is manageable. With larger numbers of groups (e.g., families), this approach would become less tenable. Note that we still have our random effect across individuals, but we’ve removed site-level differences from those estimates through the use of the fixed effects approach. Very often we will not interpret the associated parameters, but include them to de-confound our estimates of interest. Another thing to mention is that this isn’t a “pure” form of the fixed-effects approach, since we still include the random effect of individual. The purely fixed-effect approach is more common/appropriate for group-based clustered (e.g., site, country) rather than individual-based clustering (as we always have in longitudinal data) since we cannot include individual level fixed-effects without breaking the model (it isn’t identified).

Random Effect Approach

If we want to move into accounting for grouping variables with many levels, the random effects framework provides a nice and ready solution. Indeed, we have been using it for the nesting of repeated measures within individuals this whole time. Within MEMs, we can include higher levels of nesting by expanding the structure of the random effects in our model syntax. Here we expand our classic (1 + wave | id) to include the higher level of nesting individuals within schools (1 + wave | school/id).

rand.mlm <- lmer(sci ~ 1 + wave + (1 + wave | school/id),
                na.action = na.omit,
                REML = TRUE,
                data = achieve)

summary(rand.mlm, correlation = FALSE)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + (1 + wave | school/id)
##    Data: achieve
## 
## REML criterion at convergence: 23997.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.4223 -0.4199  0.0095  0.4600  4.3843 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev. Corr 
##  id:school (Intercept) 84.8326  9.2105        
##            wave         3.5757  1.8909   -0.10
##  school    (Intercept) 12.8283  3.5817        
##            wave         0.5724  0.7566   -0.14
##  Residual              12.3154  3.5093        
## Number of obs: 3643, groups:  id:school, 1200; school, 41
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  58.4281     0.6514 35.6571   89.70  < 2e-16 ***
## wave          1.9598     0.1516 34.9520   12.93 7.07e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The ordering of school/id is important otherwise we have schools nested within individual which will make the model sad (it’s true). We can check that we’ve done this in the model summary where we have \(1200\) individuals nested within schools (id:school) and \(41\) schools. If we compare this model to the model where we ignore nesting within school, we will notice that the fixed effects have not changed that much, but that our estimate of individual-level variance has been broken up into individual- and school-level variability. This situation is fairly common (although not guaranteed), with fixed effects that are relatively robust to omitting higher levels of nesting, but mis-attributed variance estimates at lower levels of nesting that over-estimate inter-individual variance.

Compared with the fairly natural extension of the MEMs to higher levels of nesting, multilevel SEM estimation is much more challenging and the current options in R are fairly limited. For instance, lavaan currently only allows complete case continuous data with a random intercept at the higher level of nesting (see the current tutorial). This is an active area of improvement so additional options are likely to be available soon. We provide Mplus examples of the full model that more closely resembles the MLM above. However, for completeness, we fit the model we can in lavaan below.

mlsem <- "
           level: 1
           int.w =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
           slp.w =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4
           
           sci.1 ~~ a*sci.1
           sci.2 ~~ a*sci.2
           sci.3 ~~ a*sci.3
           sci.4 ~~ a*sci.4
           
           level: 2
           int.b =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
           
           sci.1 ~~ b*sci.1
           sci.2 ~~ b*sci.2
           sci.3 ~~ b*sci.3
           sci.4 ~~ b*sci.4"



mlsem.fit <- growth(mlsem,
                    data = achieve.wide,
                    cluster = "school",
                    estimator = "ML",
                    missing = "listwise",
                    optim.method = "em")

summary(mlsem.fit, fit.measures = TRUE, estimates = TRUE, 
        standardize = TRUE, rsquare = TRUE)

## lavaan 0.6.13 ended normally after 41 iterations
## 
##   Estimator                                         ML
##   Optimization method                               EM
##   Number of model parameters                        15
##   Number of equality constraints                     6
## 
##                                                   Used       Total
##   Number of observations                           482        1237
##   Number of clusters [school]                       35            
## 
## Model Test User Model:
##                                                       
##   Test statistic                                68.230
##   Degrees of freedom                                15
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                              2526.704
##   Degrees of freedom                                12
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.979
##   Tucker-Lewis Index (TLI)                       0.983
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5969.695
##   Loglikelihood unrestricted model (H1)      -5935.580
##                                                       
##   Akaike (AIC)                               11957.390
##   Bayesian (BIC)                             11994.991
##   Sample-size adjusted Bayesian (SABIC)      11966.426
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.086
##   90 Percent confidence interval - lower         0.066
##   90 Percent confidence interval - upper         0.107
##   P-value H_0: RMSEA <= 0.050                    0.002
##   P-value H_0: RMSEA >= 0.080                    0.700
## 
## Standardized Root Mean Square Residual (corr metric):
## 
##   SRMR (within covariance matrix)                0.012
##   SRMR (between covariance matrix)               0.025
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Observed
##   Observed information based on                Hessian
## 
## 
## Level 1 [within]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int.w =~                                                              
##     sci.1             1.000                               8.513    0.947
##     sci.2             1.000                               8.513    0.915
##     sci.3             1.000                               8.513    0.858
##     sci.4             1.000                               8.513    0.789
##   slp.w =~                                                              
##     sci.1             0.000                               0.000    0.000
##     sci.2             1.000                               1.750    0.188
##     sci.3             2.000                               3.501    0.353
##     sci.4             3.000                               5.251    0.487
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int.w ~~                                                              
##     slp.w             1.317    0.922    1.428    0.153    0.088    0.088
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
##     int.w             3.402    0.427    7.965    0.000    0.400    0.400
##     slp.w             2.098    0.134   15.668    0.000    1.198    1.198
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1      (a)    8.373    0.395   21.224    0.000    8.373    0.104
##    .sci.2      (a)    8.373    0.395   21.224    0.000    8.373    0.097
##    .sci.3      (a)    8.373    0.395   21.224    0.000    8.373    0.085
##    .sci.4      (a)    8.373    0.395   21.224    0.000    8.373    0.072
##     int.w            72.471    5.263   13.771    0.000    1.000    1.000
##     slp.w             3.063    0.324    9.441    0.000    1.000    1.000
## 
## R-Square:
##                    Estimate
##     sci.1             0.896
##     sci.2             0.903
##     sci.3             0.915
##     sci.4             0.928
## 
## 
## Level 2 [school]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   int.b =~                                                              
##     sci.1             1.000                               4.083    0.969
##     sci.2             1.000                               4.083    0.969
##     sci.3             1.000                               4.083    0.969
##     sci.4             1.000                               4.083    0.969
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1             0.000                               0.000    0.000
##    .sci.2             0.000                               0.000    0.000
##    .sci.3             0.000                               0.000    0.000
##    .sci.4             0.000                               0.000    0.000
##     int.b            59.767    0.427  139.929    0.000   14.639   14.639
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .sci.1      (b)    1.090    0.302    3.609    0.000    1.090    0.061
##    .sci.2      (b)    1.090    0.302    3.609    0.000    1.090    0.061
##    .sci.3      (b)    1.090    0.302    3.609    0.000    1.090    0.061
##    .sci.4      (b)    1.090    0.302    3.609    0.000    1.090    0.061
##     int.b            16.668    6.440    2.588    0.010    1.000    1.000
## 
## R-Square:
##                    Estimate
##     sci.1             0.939
##     sci.2             0.939
##     sci.3             0.939
##     sci.4             0.939

Here we get a similar layout for the results as the multiple groups model, with the within estimates first, and then the school-level estimates next. We needed to set the residuals to be homoscedastic to achieve convergence here, but in principle these can be allowed to vary.

Cluster Correction

The fixed effects point estimates within the model tend to be robust to the omission of higher levels of nesting, however, their standard errors do tend to be inaccurate. As such, sometimes we might wish to simply not interpret the variance components and focus on regression coefficients or other parts of our model. This might be less frequent in growth modeling, but for completeness we will consider this type of application. In this instance, we can simply implement a cluster-correction to our standard errors and then proceed as usual to interpret the model parameters. To our knowledge, this is not currently an option in R, so we provide the Mplus syntax to do so below.