Nested Data
The final topic we will consider is the way we account for nesting within longitudinal data. While technically all of longitudinal modeling involves nested data (i.e., multiple observations nested within the same individual), here we focus more on between-person nesting groups (e.g., classroom, family, data collection sites). As we will see, there are several forms of the models that we have already used that deal with nesting. The achieve
dataset we will work with here are from a 4-wave school-based assessment of math and science achievement across ages \(12\) - \(17\). Schools were drawn from \(5\) different metropolitan areas and assessments were conducted by separate research teams in each city. Here we will mostly focus on the science achievement data, but may draw examples from math achievement if they differ in interesting ways.
<- read.csv("data/achieve.csv", header = TRUE)
achieve
<- achieve %>%
achieve.wide pivot_wider(id_cols = c("site", "school", "id", "male"),
names_from = "wave",
values_from = c("sci", "math"),
names_sep = ".",
values_fill = NA)
As always, we can fit an unconditional growth model to give us a baseline of expectations for the effects we will in subsequent models. Here we will fit the MLM and LCM versions as they are the simplest representations.
Unconditional Model
<- lmer(sci ~ 1 + wave + (1 + wave | id),
unc.mlm na.action = na.omit,
REML = TRUE,
data = achieve)
summary(unc.mlm, correlation = FALSE)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + (1 + wave | id)
## Data: achieve
##
## REML criterion at convergence: 24100
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.5673 -0.4152 0.0084 0.4532 4.2100
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 94.998 9.747
## wave 4.035 2.009 -0.09
## Residual 12.367 3.517
## Number of obs: 3643, groups: id, 1200
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 5.881e+01 3.180e-01 1.133e+03 185.0 <2e-16 ***
## wave 1.987e+00 8.716e-02 8.628e+02 22.8 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
<- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
unc.lcm slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4"
<- growth(unc.lcm,
unc.lcm.fit data = achieve.wide,
estimator = "ML",
missing = "FIML")
summary(unc.lcm.fit, fit.measures = TRUE, estimates = TRUE,
standardize = TRUE, rsquare = TRUE)
## lavaan 0.6.13 ended normally after 138 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 9
##
## Used Total
## Number of observations 1200 1237
## Number of missing patterns 15
##
## Model Test User Model:
##
## Test statistic 55.402
## Degrees of freedom 5
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 3806.133
## Degrees of freedom 6
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.987
## Tucker-Lewis Index (TLI) 0.984
##
## Robust Comparative Fit Index (CFI) 0.984
## Robust Tucker-Lewis Index (TLI) 0.980
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -12038.652
## Loglikelihood unrestricted model (H1) -12010.951
##
## Akaike (AIC) 24095.304
## Bayesian (BIC) 24141.115
## Sample-size adjusted Bayesian (SABIC) 24112.527
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.092
## 90 Percent confidence interval - lower 0.071
## 90 Percent confidence interval - upper 0.114
## P-value H_0: RMSEA <= 0.050 0.001
## P-value H_0: RMSEA >= 0.080 0.829
##
## Robust RMSEA 0.127
## 90 Percent confidence interval - lower 0.097
## 90 Percent confidence interval - upper 0.160
## P-value H_0: Robust RMSEA <= 0.050 0.000
## P-value H_0: Robust RMSEA >= 0.080 0.994
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.031
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int =~
## sci.1 1.000 9.722 0.934
## sci.2 1.000 9.722 0.898
## sci.3 1.000 9.722 0.851
## sci.4 1.000 9.722 0.757
## slp =~
## sci.1 0.000 0.000 0.000
## sci.2 1.000 1.910 0.176
## sci.3 2.000 3.820 0.334
## sci.4 3.000 5.730 0.446
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~~
## slp 3.184 1.033 3.083 0.002 0.172 0.172
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
## int 60.782 0.299 203.517 0.000 6.252 6.252
## slp 2.048 0.090 22.857 0.000 1.072 1.072
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 13.812 1.437 9.610 0.000 13.812 0.128
## .sci.2 12.641 0.948 13.341 0.000 12.641 0.108
## .sci.3 8.580 0.950 9.028 0.000 8.580 0.066
## .sci.4 18.517 1.984 9.331 0.000 18.517 0.112
## int 94.510 4.430 21.333 0.000 1.000 1.000
## slp 3.648 0.442 8.256 0.000 1.000 1.000
##
## R-Square:
## Estimate
## sci.1 0.872
## sci.2 0.892
## sci.3 0.934
## sci.4 0.888
As a general takeaway, we have an average science achievement of around \(60\) (these scores are arbitrarily scaled so that isn’t incredibly informative) but more of interest, science achievement is increasing over time approximately \(2\) points per year of school.
Categorical Predictors
Nesting induces correlations between observations because members of the same group are more similar to one another than members of other groups. The practical upshot of these models is that we obtain different values of parameters across group (although the methods for this will vary). Perhaps the simplest method for doing so, and therefore accounting for group differences, is to predict the mean of the outcome variable using a categorical predictor. In longitudinal modeling, this most often occurs at the level of the individual as a TIC. As such, we already know how to fit this kind of model from the last chapter. We can show this syntax again below. Here we predict the intercept and slope random effects with male
to examine differences between self-reported males and females. In the MLM syntax, we simply include male
in the regression equation if we want to predict mean differences in science achievement.
.1 <- lmer(sci ~ 1 + wave + male + (1 + wave | id),
cat.mlmna.action = na.omit,
REML = TRUE,
data = achieve)
summary(cat.mlm.1, correlation = FALSE)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + male + (1 + wave | id)
## Data: achieve
##
## REML criterion at convergence: 24092
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.5819 -0.4169 0.0084 0.4561 4.1991
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 94.728 9.733
## wave 4.041 2.010 -0.09
## Residual 12.364 3.516
## Number of obs: 3643, groups: id, 1200
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 58.0466 0.4255 1235.3629 136.425 < 2e-16 ***
## wave 1.9866 0.0872 862.4361 22.783 < 2e-16 ***
## male 1.5965 0.5918 1190.5962 2.698 0.00708 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here we see that males show \(1.6\) units higher science achievement compared with females (the reference category). It is possible to predict other parameters in the model, although the options in mixed-effects models tend to be more limited compared with SEM approaches. For instance, if we thought error variance might differ across groups, we could include this in our MLM syntax through the variance identity (varIdent
) function. The form
equation, ~ 1 | male
, mimics the regression equation where we have a separate “intercept” (here just read this as estimate) nested within group (male
). We can then go through the procedure we covered in Chapter 3 to recover the different estimates for males and females (here in standard deviations, but we can easily square these values to obtain variances).
.2 <- lme(sci ~ 1 + wave + male,
cat.mlmrandom = ~ 1 + wave | id,
weights = varIdent(form = ~ 1 | male),
na.action = na.omit,
method = "REML",
data = achieve)
summary(cat.mlm.2)
## Linear mixed-effects model fit by REML
## Data: achieve
## AIC BIC logLik
## 24090.94 24140.54 -12037.47
##
## Random effects:
## Formula: ~1 + wave | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 9.705310 (Intr)
## wave 2.004751 -0.091
## Residual 3.278687
##
## Variance function:
## Structure: Different standard deviations per stratum
## Formula: ~1 | male
## Parameter estimates:
## 0 1
## 1.000000 1.153432
## Fixed effects: sci ~ 1 + wave + male
## Value Std.Error DF t-value p-value
## (Intercept) 58.06283 0.4214427 2442 137.77158 0.0000
## wave 1.97570 0.0869694 2442 22.71723 0.0000
## male 1.60921 0.5911793 1198 2.72203 0.0066
## Correlation:
## (Intr) wave
## wave -0.260
## male -0.660 -0.019
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.3578043 -0.4226075 0.0073624 0.4638169 3.9981046
##
## Number of Observations: 3643
## Number of Groups: 1200
c(cat.mlm.2$sigma, cat.mlm.2$sigma * exp(as.numeric(cat.mlm.2$modelStruct$varStruct)))
## [1] 3.278687 3.781742
These results show that males appear to have increased variability in their science achievement compared to females in addition to a higher average. Note that we are not getting a direct effect estimate here, but rather estimating values separately for each group. There are not clear ways of implementing that kind of effect in lme4, so we will not go too much further except to mention that within the mixed-effect world of models, you can directly model predictors of variability with location-scale models, although those generally require more intensive longitudinal kinds of data. As such, we will not consider this further.
Because these categorical variables enter the model at the person level, in the mixed-effects models they predict the random effects rather than the individual time-specific observations. However, in the SEM, we can additionally predict one (or more) of the individual repeated measures. We can see this approach in a version of a MIMIC model, where we predict both the latent factors and the first repeated measure (sci.1
).
<- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
cat.lcm slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4
int ~ male
slp ~ male
sci.1 ~ male"
<- growth(cat.lcm,
cat.lcm.fit data = achieve.wide,
estimator = "ML",
missing = "FIML")
summary(cat.lcm.fit, fit.measures = FALSE, estimates = TRUE,
standardize = TRUE, rsquare = FALSE)
## lavaan 0.6.13 ended normally after 149 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 12
##
## Number of observations 1237
## Number of missing patterns 16
##
## Model Test User Model:
##
## Test statistic 37.324
## Degrees of freedom 6
## P-value (Chi-square) 0.000
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int =~
## sci.1 1.000 9.812 0.944
## sci.2 1.000 9.812 0.904
## sci.3 1.000 9.812 0.859
## sci.4 1.000 9.812 0.768
## slp =~
## sci.1 0.000 0.000 0.000
## sci.2 1.000 1.941 0.179
## sci.3 2.000 3.883 0.340
## sci.4 3.000 5.824 0.456
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~
## male 2.659 0.649 4.095 0.000 0.271 0.135
## slp ~
## male -0.288 0.217 -1.324 0.186 -0.148 -0.074
## sci.1 ~
## male -1.511 0.336 -4.503 0.000 -1.511 -0.073
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .int ~~
## .slp 2.789 1.036 2.693 0.007 0.148 0.148
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
## .int 60.057 0.412 145.793 0.000 6.121 6.121
## .slp 1.907 0.121 15.788 0.000 0.982 0.982
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 13.139 1.423 9.234 0.000 13.139 0.122
## .sci.2 12.579 0.946 13.301 0.000 12.579 0.107
## .sci.3 8.827 0.954 9.255 0.000 8.827 0.068
## .sci.4 17.512 1.947 8.994 0.000 17.512 0.107
## .int 94.515 4.423 21.370 0.000 0.982 0.982
## .slp 3.749 0.440 8.517 0.000 0.995 0.995
These approaches are relatively simple ways to incorporate nesting information into the model, but highlight the continuity with more complex approaches we will discuss as we move forward in this chapter.
Multiple Groups Models
While the categorical predictor allows us to vary some parameters across groups, a unique power of the SEM is the multiple group model, where we can estimate unique parameters of essentially any type across any number of groups (although the number of groups tends to be small in practice). The reason for a small number of groups tends to be due to sample size requirements within-group. SEM is still a large-sample estimator in the multiple groups model, so we have to be careful we aren’t under-powering the model at the group level, even if the overall sample size is large. Below we can see how the multiple-group LCM is specified. Nothing changes about the syntax object mult.g1
from what we have seen before, however, now we include the argument group = "male"
to indicate that we want to obtain unique estimates for females and males. Typically, these models require setting some small set of parameters equal in the model just to identify the model, so here we conveniently achieve this through having the same set factor loadings for the growth model. We can see the results below.
<- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
mult.g1 slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4"
<- growth(mult.g1,
mult.g1.fit data = achieve.wide,
group = "male",
estimator = "ML",
missing = "FIML")
summary(mult.g1.fit, fit.measures = TRUE, estimates = TRUE,
standardize = TRUE, rsquare = FALSE)
## lavaan 0.6.13 ended normally after 226 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 18
##
## Number of observations per group: Used Total
## 1 576 600
## 0 624 637
## Number of missing patterns per group:
## 1 15
## 0 15
##
## Model Test User Model:
##
## Test statistic 61.451
## Degrees of freedom 10
## P-value (Chi-square) 0.000
## Test statistic for each group:
## 1 30.756
## 0 30.695
##
## Model Test Baseline Model:
##
## Test statistic 3772.258
## Degrees of freedom 12
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.986
## Tucker-Lewis Index (TLI) 0.984
##
## Robust Comparative Fit Index (CFI) 0.983
## Robust Tucker-Lewis Index (TLI) 0.979
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -12013.745
## Loglikelihood unrestricted model (H1) -11983.019
##
## Akaike (AIC) 24063.489
## Bayesian (BIC) 24155.111
## Sample-size adjusted Bayesian (SABIC) 24097.936
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.093
## 90 Percent confidence interval - lower 0.071
## 90 Percent confidence interval - upper 0.115
## P-value H_0: RMSEA <= 0.050 0.001
## P-value H_0: RMSEA >= 0.080 0.841
##
## Robust RMSEA 0.130
## 90 Percent confidence interval - lower 0.098
## 90 Percent confidence interval - upper 0.164
## P-value H_0: Robust RMSEA <= 0.050 0.000
## P-value H_0: Robust RMSEA >= 0.080 0.994
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.036
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
##
## Group 1 [1]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int =~
## sci.1 1.000 10.624 0.933
## sci.2 1.000 10.624 0.903
## sci.3 1.000 10.624 0.859
## sci.4 1.000 10.624 0.765
## slp =~
## sci.1 0.000 0.000 0.000
## sci.2 1.000 1.824 0.155
## sci.3 2.000 3.648 0.295
## sci.4 3.000 5.472 0.394
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~~
## slp 4.373 1.704 2.567 0.010 0.226 0.226
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
## int 61.564 0.472 130.490 0.000 5.795 5.795
## slp 2.217 0.136 16.342 0.000 1.216 1.216
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 16.927 2.377 7.121 0.000 16.927 0.130
## .sci.2 13.629 1.534 8.882 0.000 13.629 0.098
## .sci.3 9.128 1.557 5.861 0.000 9.128 0.060
## .sci.4 24.031 3.370 7.131 0.000 24.031 0.124
## int 112.875 7.647 14.760 0.000 1.000 1.000
## slp 3.327 0.713 4.665 0.000 1.000 1.000
##
##
## Group 2 [0]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int =~
## sci.1 1.000 8.752 0.936
## sci.2 1.000 8.752 0.894
## sci.3 1.000 8.752 0.844
## sci.4 1.000 8.752 0.750
## slp =~
## sci.1 0.000 0.000 0.000
## sci.2 1.000 1.967 0.201
## sci.3 2.000 3.934 0.379
## sci.4 3.000 5.901 0.506
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~~
## slp 1.811 1.234 1.467 0.142 0.105 0.105
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
## int 60.051 0.372 161.225 0.000 6.861 6.861
## slp 1.902 0.118 16.087 0.000 0.967 0.967
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 10.906 1.731 6.301 0.000 10.906 0.125
## .sci.2 11.783 1.175 10.028 0.000 11.783 0.123
## .sci.3 8.222 1.177 6.984 0.000 8.222 0.076
## .sci.4 13.792 2.349 5.870 0.000 13.792 0.101
## int 76.605 4.979 15.386 0.000 1.000 1.000
## slp 3.870 0.547 7.069 0.000 1.000 1.000
We can see that we now have separate test statistics and sample sizes that reflect the fact that we are - to a large extent - estimating separate models. We also can see that our parameter estimates are broken up into Group 1 [1]
and Group 2 [0]
. The value in brackets represents the value of the grouping variable male
, so here Group 1 is males and Group 2 is female (you could change this variable into a labeled factor and then those labels would be more informative). It’s important to note the ordering of the groups for future steps.
Here we can qualitatively say that among other things, males seem to start higher (\(61.6\) vs. \(60.1\)) and increase more rapidly (\(2.22\) vs. \(1.90\)) in their achievement, and have a higher correlation between starting point and rate of change (\(r = 0.226\) vs. \(r = .105\)), compared with females. However, a natural question is whether these differences are significant. We can explicitly test differences across groups by introducing group-specific labels into our model syntax and building composite parameters where we get a formal inferential test on the difference. A really nice feature of maximum likelihood parameters is that linear operations on the parameter are themselves maximum likelihood parameters themselves. To achieve this, we simply pre-multiply a given parameter by a vector of labels (e.g., c(Mi, Fi)
) and then define the composite parameter using the :=
operator. Here we will test the difference between the intercept (c(Mi, Fi)
) and slope factor means (c(Ms, Fs)
), and the covariance between factors (c(Mc, Fc)
). This is where keeping track of the group order becomes important so we don’t mis-label our parameters of interest and reverse our inference. The D*
composite parameters are then created by simply subtracting F*
from the M*
parameters.
<- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
mult.g2 slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4
int ~ c(Mi, Fi)*1
slp ~ c(Ms, Fs)*1
int ~~ c(Mc, Fc)*slp
Di := Mi - Fi
Ds := Ms - Fs
Dc := Mc - Fc"
<- growth(mult.g2,
mult.g2.fit data = achieve.wide,
group = "male",
estimator = "ML",
missing = "FIML")
summary(mult.g2.fit, fit.measures = FALSE, estimates = TRUE,
standardize = TRUE, rsquare = FALSE)
## lavaan 0.6.13 ended normally after 215 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 18
##
## Number of observations per group: Used Total
## 1 576 600
## 0 624 637
## Number of missing patterns per group:
## 1 15
## 0 15
##
## Model Test User Model:
##
## Test statistic 61.451
## Degrees of freedom 10
## P-value (Chi-square) 0.000
## Test statistic for each group:
## 1 30.756
## 0 30.695
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
##
## Group 1 [1]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int =~
## sci.1 1.000 10.624 0.933
## sci.2 1.000 10.624 0.903
## sci.3 1.000 10.624 0.859
## sci.4 1.000 10.624 0.765
## slp =~
## sci.1 0.000 0.000 0.000
## sci.2 1.000 1.824 0.155
## sci.3 2.000 3.648 0.295
## sci.4 3.000 5.472 0.394
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~~
## slp (Mc) 4.373 1.704 2.567 0.010 0.226 0.226
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int (Mi) 61.564 0.472 130.490 0.000 5.795 5.795
## slp (Ms) 2.217 0.136 16.342 0.000 1.216 1.216
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 16.927 2.377 7.121 0.000 16.927 0.130
## .sci.2 13.629 1.534 8.882 0.000 13.629 0.098
## .sci.3 9.128 1.557 5.861 0.000 9.128 0.060
## .sci.4 24.031 3.370 7.131 0.000 24.031 0.124
## int 112.875 7.647 14.760 0.000 1.000 1.000
## slp 3.327 0.713 4.665 0.000 1.000 1.000
##
##
## Group 2 [0]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int =~
## sci.1 1.000 8.752 0.936
## sci.2 1.000 8.752 0.894
## sci.3 1.000 8.752 0.844
## sci.4 1.000 8.752 0.750
## slp =~
## sci.1 0.000 0.000 0.000
## sci.2 1.000 1.967 0.201
## sci.3 2.000 3.934 0.379
## sci.4 3.000 5.901 0.506
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~~
## slp (Fc) 1.811 1.234 1.467 0.142 0.105 0.105
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int (Fi) 60.051 0.372 161.225 0.000 6.861 6.861
## slp (Fs) 1.902 0.118 16.087 0.000 0.967 0.967
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 10.906 1.731 6.301 0.000 10.906 0.125
## .sci.2 11.783 1.175 10.028 0.000 11.783 0.123
## .sci.3 8.222 1.177 6.984 0.000 8.222 0.076
## .sci.4 13.792 2.349 5.870 0.000 13.792 0.101
## int 76.605 4.979 15.386 0.000 1.000 1.000
## slp 3.870 0.547 7.069 0.000 1.000 1.000
##
## Defined Parameters:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## Di 1.513 0.601 2.517 0.012 -1.066 -1.066
## Ds 0.316 0.180 1.753 0.080 0.249 0.249
## Dc 2.562 2.104 1.217 0.223 0.120 0.120
We will only focus here on the Defined Parameters
section, where we can see that while males have higher absolute estimates on all of these parameters we have tested, only the mean of the intercept factor significantly differs across groups. These are only a subset we chose for demonstration purposes, and we could explore others if we chose.
However, this model allows maximally different estimates, which belies one of the real strengths of the multiple-groups model, which is to selectively set a subset of parameters equal across groups. This allows us to bridge the gap between fitting the model in each group uniquely and fitting a single-group model where we ignore nesting altogether. The parameters we set equal benefit from increased power because it is not diluted by the smaller sub-samples and reduce the number of parameters we need to interpret. One common approach is to start with the type of model we first fit, with all parameters unequal, and then progressively build in equality constraints and test whether they reduce the fit of the model. We can do this in two ways, either through the syntax (changing labels like c(Mi, Fi)
to be the same like c(i, i)
) or through the argument group.equal
in the model fitting function. The first approach gives us more control, but can be tedious in large models, while the latter is less precise but quick. For instance, if we want to constrain the factor means equal across groups, we can simply include means
in the group.equal
argument. We can see the results below.
<- "int =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
mult.g3 slp =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4"
<- growth(mult.g3,
mult.g3.fit data = achieve.wide,
group = "male",
group.equal = c("means"),
estimator = "ML",
missing = "FIML")
summary(mult.g3.fit, fit.measures = TRUE, estimates = TRUE,
standardize = TRUE, rsquare = FALSE)
## lavaan 0.6.13 ended normally after 163 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 18
## Number of equality constraints 2
##
## Number of observations per group: Used Total
## 1 576 600
## 0 624 637
## Number of missing patterns per group:
## 1 15
## 0 15
##
## Model Test User Model:
##
## Test statistic 71.847
## Degrees of freedom 12
## P-value (Chi-square) 0.000
## Test statistic for each group:
## 1 37.058
## 0 34.789
##
## Model Test Baseline Model:
##
## Test statistic 3772.258
## Degrees of freedom 12
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.984
## Tucker-Lewis Index (TLI) 0.984
##
## Robust Comparative Fit Index (CFI) 0.981
## Robust Tucker-Lewis Index (TLI) 0.981
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -12018.943
## Loglikelihood unrestricted model (H1) -11983.019
##
## Akaike (AIC) 24069.886
## Bayesian (BIC) 24151.327
## Sample-size adjusted Bayesian (SABIC) 24100.505
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.091
## 90 Percent confidence interval - lower 0.072
## 90 Percent confidence interval - upper 0.112
## P-value H_0: RMSEA <= 0.050 0.000
## P-value H_0: RMSEA >= 0.080 0.832
##
## Robust RMSEA 0.123
## 90 Percent confidence interval - lower 0.095
## 90 Percent confidence interval - upper 0.152
## P-value H_0: Robust RMSEA <= 0.050 0.000
## P-value H_0: Robust RMSEA >= 0.080 0.993
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.058
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
##
## Group 1 [1]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int =~
## sci.1 1.000 10.658 0.933
## sci.2 1.000 10.658 0.901
## sci.3 1.000 10.658 0.857
## sci.4 1.000 10.658 0.764
## slp =~
## sci.1 0.000 0.000 0.000
## sci.2 1.000 1.822 0.154
## sci.3 2.000 3.643 0.293
## sci.4 3.000 5.465 0.392
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~~
## slp 4.638 1.697 2.733 0.006 0.239 0.239
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
## int (.20.) 60.622 0.295 205.481 0.000 5.688 5.688
## slp (.21.) 2.037 0.090 22.661 0.000 1.118 1.118
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 17.021 2.367 7.190 0.000 17.021 0.130
## .sci.2 13.673 1.541 8.872 0.000 13.673 0.098
## .sci.3 9.399 1.557 6.035 0.000 9.399 0.061
## .sci.4 23.359 3.312 7.054 0.000 23.359 0.120
## int 113.598 7.692 14.768 0.000 1.000 1.000
## slp 3.318 0.710 4.674 0.000 1.000 1.000
##
##
## Group 2 [0]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int =~
## sci.1 1.000 8.773 0.936
## sci.2 1.000 8.773 0.894
## sci.3 1.000 8.773 0.844
## sci.4 1.000 8.773 0.749
## slp =~
## sci.1 0.000 0.000 0.000
## sci.2 1.000 1.975 0.201
## sci.3 2.000 3.949 0.380
## sci.4 3.000 5.924 0.505
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int ~~
## slp 1.853 1.241 1.494 0.135 0.107 0.107
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
## int (.20.) 60.622 0.295 205.481 0.000 6.910 6.910
## slp (.21.) 2.037 0.090 22.661 0.000 1.031 1.031
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 10.894 1.725 6.315 0.000 10.894 0.124
## .sci.2 11.772 1.171 10.053 0.000 11.772 0.122
## .sci.3 8.040 1.165 6.902 0.000 8.040 0.074
## .sci.4 14.177 2.357 6.015 0.000 14.177 0.103
## int 76.967 5.013 15.354 0.000 1.000 1.000
## slp 3.899 0.550 7.086 0.000 1.000 1.000
Now note that the factor means (labels .20.
and .21.
) are equal across males and females. We could mix and match syntax and argument constraints if we like; like many things in SEM, we can have full control as long as we keep things organized. We can test whether this more constrained model fits significantly worse than the maximally-different model using a likelihood ratio test.
lavTestLRT(mult.g1.fit, mult.g3.fit)
##
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff RMSEA Df diff Pr(>Chisq)
## mult.g1.fit 10 24064 24155 61.451
## mult.g3.fit 12 24070 24151 71.847 10.397 0.083649 2 0.005526 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here it appears that constraining the factor means does significantly decrease model fit, so we wouldn’t conclude that this constraint is appropriate. We could do this kind of testing with other parameters if we chose, although we won’t here for space and simplicity. The complexity and utility of the multiple groups models is not something we can explore fully here, and interested readers can find a much larger literature on these kinds of models if they wish to implement them in their own research (see the manuscript for a general starting point).
Fixed Effect Approach
The fixed effect approach is (rarely for quantitative methods) exactly what it sounds like in the context of mixed-effect models, we simply include our grouping variable as a fixed effect in the model. This has the convenience of account for unmeasured or otherwise unmodeled differences between groups that might bias the effects. In the most common use of the fixed-effects approach, we account for intercept differences by including a main effect of the group. In the context of our data example, we will model the effect of site
on science achievement using a fixed approach. Within the lmer()
syntax, we can use the short-cute of factor(site)
to generate a series of dummy-codes, where site == 1
is the reference category. We can see the results of this below.
<- lmer(sci ~ 1 + wave + factor(site) + (1 + wave | id),
fixed.mlm1 na.action = na.omit,
REML = TRUE,
data = achieve)
summary(fixed.mlm1, correlation = FALSE)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + factor(site) + (1 + wave | id)
## Data: achieve
##
## REML criterion at convergence: 24082.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.5642 -0.4218 0.0130 0.4559 4.2084
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 94.706 9.732
## wave 4.048 2.012 -0.10
## Residual 12.355 3.515
## Number of obs: 3643, groups: id, 1200
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 58.59757 0.56691 1247.30166 103.363 <2e-16 ***
## wave 1.98858 0.08722 861.42604 22.800 <2e-16 ***
## factor(site)2 1.58400 0.86061 1183.63741 1.841 0.0659 .
## factor(site)3 0.40383 0.89404 1189.48204 0.452 0.6516
## factor(site)4 0.25571 0.83030 1186.72499 0.308 0.7582
## factor(site)5 -2.21401 1.05705 1193.40188 -2.095 0.0364 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note that the (Intercept)
term is the intercept for Site 1 and each of the effects factor(site)*
indicate the relative difference from that term for the other sites. If we wished to use an absolute coding scheme, we change the 1
to a 0
(indicating that we don’t wish to estimate a reference intercept) and re-estimate the model.
<- lmer(sci ~ 0 + wave + factor(site) + (1 + wave | id),
fixed.mlm2 na.action = na.omit,
REML = TRUE,
data = achieve)
summary(fixed.mlm2, correlation = FALSE)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 0 + wave + factor(site) + (1 + wave | id)
## Data: achieve
##
## REML criterion at convergence: 24082.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.5642 -0.4218 0.0130 0.4559 4.2084
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 94.706 9.732
## wave 4.048 2.012 -0.10
## Residual 12.355 3.515
## Number of obs: 3643, groups: id, 1200
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## wave 1.989e+00 8.722e-02 8.614e+02 22.80 <2e-16 ***
## factor(site)1 5.860e+01 5.669e-01 1.247e+03 103.36 <2e-16 ***
## factor(site)2 6.018e+01 6.675e-01 1.226e+03 90.16 <2e-16 ***
## factor(site)3 5.900e+01 7.105e-01 1.240e+03 83.04 <2e-16 ***
## factor(site)4 5.885e+01 6.287e-01 1.248e+03 93.61 <2e-16 ***
## factor(site)5 5.638e+01 9.073e-01 1.231e+03 62.14 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
These results are equivalent to the first coding scheme but we get the intercept estimate for each site instead of computing differences from a chosen site (note the change in what the inference test tells us).
Like mentioned above, the fixed effect approach is most commonly used for intercept differences, but can logically be extended to account for effect differences across sites (or another grouping variable). We can accomplish this by modeling the interaction of site with our time predictor like below.
<- lmer(sci ~ 1 + wave + factor(site) + wave:factor(site) + (1 + wave | id),
fixed.mlm3 na.action = na.omit,
REML = TRUE,
data = achieve)
summary(fixed.mlm3, correlation = FALSE)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + factor(site) + wave:factor(site) + (1 + wave |
## id)
## Data: achieve
##
## REML criterion at convergence: 24076.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.5317 -0.4176 0.0117 0.4573 4.2531
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 94.674 9.730
## wave 3.997 1.999 -0.09
## Residual 12.365 3.516
## Number of obs: 3643, groups: id, 1200
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 58.40288 0.59914 1154.84398 97.477 <2e-16 ***
## wave 2.13593 0.17030 972.01926 12.542 <2e-16 ***
## factor(site)2 1.74603 0.92295 1120.17539 1.892 0.0588 .
## factor(site)3 1.10856 0.96191 1134.56798 1.152 0.2494
## factor(site)4 0.03638 0.89457 1141.90715 0.041 0.9676
## factor(site)5 -1.45252 1.13802 1139.05117 -1.276 0.2021
## wave:factor(site)2 -0.12248 0.25311 860.24892 -0.484 0.6286
## wave:factor(site)3 -0.52808 0.26582 877.36603 -1.987 0.0473 *
## wave:factor(site)4 0.15811 0.24806 912.54273 0.637 0.5240
## wave:factor(site)5 -0.56471 0.31249 860.82676 -1.807 0.0711 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here we return to the reference coding to test whether the effect of wave differs across sites from the reference. We can see how parameter do tend to multiply rapidly, however, with a relatively small set of grouping levels (here we only have 5 site), this is manageable. With larger numbers of groups (e.g., families), this approach would become less tenable. Note that we still have our random effect across individuals, but we’ve removed site-level differences from those estimates through the use of the fixed effects approach. Very often we will not interpret the associated parameters, but include them to de-confound our estimates of interest. Another thing to mention is that this isn’t a “pure” form of the fixed-effects approach, since we still include the random effect of individual. The purely fixed-effect approach is more common/appropriate for group-based clustered (e.g., site, country) rather than individual-based clustering (as we always have in longitudinal data) since we cannot include individual level fixed-effects without breaking the model (it isn’t identified).
Random Effect Approach
If we want to move into accounting for grouping variables with many levels, the random effects framework provides a nice and ready solution. Indeed, we have been using it for the nesting of repeated measures within individuals this whole time. Within MEMs, we can include higher levels of nesting by expanding the structure of the random effects in our model syntax. Here we expand our classic (1 + wave | id)
to include the higher level of nesting individuals within schools (1 + wave | school/id)
.
<- lmer(sci ~ 1 + wave + (1 + wave | school/id),
rand.mlm na.action = na.omit,
REML = TRUE,
data = achieve)
summary(rand.mlm, correlation = FALSE)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: sci ~ 1 + wave + (1 + wave | school/id)
## Data: achieve
##
## REML criterion at convergence: 23997.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.4223 -0.4199 0.0095 0.4600 4.3843
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id:school (Intercept) 84.8326 9.2105
## wave 3.5757 1.8909 -0.10
## school (Intercept) 12.8283 3.5817
## wave 0.5724 0.7566 -0.14
## Residual 12.3154 3.5093
## Number of obs: 3643, groups: id:school, 1200; school, 41
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 58.4281 0.6514 35.6571 89.70 < 2e-16 ***
## wave 1.9598 0.1516 34.9520 12.93 7.07e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The ordering of school/id
is important otherwise we have schools nested within individual which will make the model sad (it’s true). We can check that we’ve done this in the model summary where we have \(1200\) individuals nested within schools (id:school
) and \(41\) schools. If we compare this model to the model where we ignore nesting within school, we will notice that the fixed effects have not changed that much, but that our estimate of individual-level variance has been broken up into individual- and school-level variability. This situation is fairly common (although not guaranteed), with fixed effects that are relatively robust to omitting higher levels of nesting, but mis-attributed variance estimates at lower levels of nesting that over-estimate inter-individual variance.
Compared with the fairly natural extension of the MEMs to higher levels of nesting, multilevel SEM estimation is much more challenging and the current options in R are fairly limited. For instance, lavaan currently only allows complete case continuous data with a random intercept at the higher level of nesting (see the current tutorial). This is an active area of improvement so additional options are likely to be available soon. We provide Mplus examples of the full model that more closely resembles the MLM above. However, for completeness, we fit the model we can in lavaan below.
<- "
mlsem level: 1
int.w =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
slp.w =~ 0*sci.1 + 1*sci.2 + 2*sci.3 + 3*sci.4
sci.1 ~~ a*sci.1
sci.2 ~~ a*sci.2
sci.3 ~~ a*sci.3
sci.4 ~~ a*sci.4
level: 2
int.b =~ 1*sci.1 + 1*sci.2 + 1*sci.3 + 1*sci.4
sci.1 ~~ b*sci.1
sci.2 ~~ b*sci.2
sci.3 ~~ b*sci.3
sci.4 ~~ b*sci.4"
<- growth(mlsem,
mlsem.fit data = achieve.wide,
cluster = "school",
estimator = "ML",
missing = "listwise",
optim.method = "em")
summary(mlsem.fit, fit.measures = TRUE, estimates = TRUE,
standardize = TRUE, rsquare = TRUE)
## lavaan 0.6.13 ended normally after 41 iterations
##
## Estimator ML
## Optimization method EM
## Number of model parameters 15
## Number of equality constraints 6
##
## Used Total
## Number of observations 482 1237
## Number of clusters [school] 35
##
## Model Test User Model:
##
## Test statistic 68.230
## Degrees of freedom 15
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 2526.704
## Degrees of freedom 12
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.979
## Tucker-Lewis Index (TLI) 0.983
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -5969.695
## Loglikelihood unrestricted model (H1) -5935.580
##
## Akaike (AIC) 11957.390
## Bayesian (BIC) 11994.991
## Sample-size adjusted Bayesian (SABIC) 11966.426
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.086
## 90 Percent confidence interval - lower 0.066
## 90 Percent confidence interval - upper 0.107
## P-value H_0: RMSEA <= 0.050 0.002
## P-value H_0: RMSEA >= 0.080 0.700
##
## Standardized Root Mean Square Residual (corr metric):
##
## SRMR (within covariance matrix) 0.012
## SRMR (between covariance matrix) 0.025
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Observed
## Observed information based on Hessian
##
##
## Level 1 [within]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int.w =~
## sci.1 1.000 8.513 0.947
## sci.2 1.000 8.513 0.915
## sci.3 1.000 8.513 0.858
## sci.4 1.000 8.513 0.789
## slp.w =~
## sci.1 0.000 0.000 0.000
## sci.2 1.000 1.750 0.188
## sci.3 2.000 3.501 0.353
## sci.4 3.000 5.251 0.487
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int.w ~~
## slp.w 1.317 0.922 1.428 0.153 0.088 0.088
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
## int.w 3.402 0.427 7.965 0.000 0.400 0.400
## slp.w 2.098 0.134 15.668 0.000 1.198 1.198
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 (a) 8.373 0.395 21.224 0.000 8.373 0.104
## .sci.2 (a) 8.373 0.395 21.224 0.000 8.373 0.097
## .sci.3 (a) 8.373 0.395 21.224 0.000 8.373 0.085
## .sci.4 (a) 8.373 0.395 21.224 0.000 8.373 0.072
## int.w 72.471 5.263 13.771 0.000 1.000 1.000
## slp.w 3.063 0.324 9.441 0.000 1.000 1.000
##
## R-Square:
## Estimate
## sci.1 0.896
## sci.2 0.903
## sci.3 0.915
## sci.4 0.928
##
##
## Level 2 [school]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## int.b =~
## sci.1 1.000 4.083 0.969
## sci.2 1.000 4.083 0.969
## sci.3 1.000 4.083 0.969
## sci.4 1.000 4.083 0.969
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 0.000 0.000 0.000
## .sci.2 0.000 0.000 0.000
## .sci.3 0.000 0.000 0.000
## .sci.4 0.000 0.000 0.000
## int.b 59.767 0.427 139.929 0.000 14.639 14.639
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .sci.1 (b) 1.090 0.302 3.609 0.000 1.090 0.061
## .sci.2 (b) 1.090 0.302 3.609 0.000 1.090 0.061
## .sci.3 (b) 1.090 0.302 3.609 0.000 1.090 0.061
## .sci.4 (b) 1.090 0.302 3.609 0.000 1.090 0.061
## int.b 16.668 6.440 2.588 0.010 1.000 1.000
##
## R-Square:
## Estimate
## sci.1 0.939
## sci.2 0.939
## sci.3 0.939
## sci.4 0.939
Here we get a similar layout for the results as the multiple groups model, with the within estimates first, and then the school-level estimates next. We needed to set the residuals to be homoscedastic to achieve convergence here, but in principle these can be allowed to vary.
Cluster Correction
The fixed effects point estimates within the model tend to be robust to the omission of higher levels of nesting, however, their standard errors do tend to be inaccurate. As such, sometimes we might wish to simply not interpret the variance components and focus on regression coefficients or other parts of our model. This might be less frequent in growth modeling, but for completeness we will consider this type of application. In this instance, we can simply implement a cluster-correction to our standard errors and then proceed as usual to interpret the model parameters. To our knowledge, this is not currently an option in R, so we provide the Mplus syntax to do so below.