HW #12: SEM II

Load Packages

# Projects
library(here)

# R Markdown
library(knitr)
library(rbbt)
library(kableExtra)

# Statistics
library(psych)
library(lavaan)

# Tidyverse
library(haven)
library(tidyverse)
library(magrittr)

1. Intercorrelated Measurement Model

Modeling in R

First, I need to import the covariance matrix for the items. I will also clean it up a bit.

Note: I need to take the absolute value of the data because lavaan does not accept negative values in its sample covariance matrices. I believe this should not affect the results by destroying information.

tbl.1 <- read_dta(here('data', 'ambcori.dta'))

tbl.1 <- tbl.1[3:length(tbl.1)] %>%
  filter(!is.na(amb1)) %>%
  abs

I believe the SEM measurement model is equivalent to a CFA, so I may need to use the cfa() function in place of sem() to reproduce the LISREL/Stata analysis.¹

I will also standardize the factors to a mean of zero and a variance of one. That’s my understanding of what PH=ST in LISREL and var(AMB@1 CD@1 TP@1 HI@1 ANX@1) in Stata are doing. It also allows lavaan to estimate loadings for the first item in each factor instead of using those items as scales.

mtx.1 <- tbl.1 %>%
  as.matrix %>%
  set_rownames(colnames(.))

chr.1 <- '
    # measurement
    amb =~ amb1 + amb2 + amb3 + amb4 + amb5
    cd =~ cd1 + cd2 + cd3 + cd4
    tp =~ tp1 + tp2 + tp3 + tp4 + tp5
    hi =~ hi1 + hi2 + hi3 + hi4 + hi5 + hi6
    anx =~ anx1 + anx2 + anx3 + anx4 + anx5 + anx6 + anx7 + anx8
'

fit.1 <- cfa(chr.1, sample.cov = mtx.1, sample.nobs = 211, std.lv = T)

As a spot test, I’m going to check whether the \(\chi^2\) for the R model matches that of the LISREL model.

fitMeasures(fit.1, "chisq")

##   chisq 
## 570.988

Oy, that does not match the \(\chi^2 \approx 605\) in the LISREL output. But it’s kind of close. I’ll check the RMSEA.

fitMeasures(fit.1, "rmsea")

## rmsea 
## 0.057

That’s also close, but not exactly right. OK, here’s what I’m going to do. I will proceed with the analysis using the results in R but continue to check them against the LISREL output for now.

Item Loadings in Measurement-Only Model

tbl.1.a <- fit.1 %>%
  parameterEstimates(ci = F)

tbl.1.a %>%
  filter(op == "=~") %>%
  kable

lhs	op	rhs	est	se	z	pvalue
amb	=~	amb1	0.5951850	0.0707325	8.414587	0.0000000
amb	=~	amb2	0.7046465	0.0686909	10.258218	0.0000000
amb	=~	amb3	0.4263819	0.0739855	5.763046	0.0000000
amb	=~	amb4	0.5036952	0.0725881	6.939086	0.0000000
amb	=~	amb5	0.4373524	0.0737996	5.926214	0.0000000
cd	=~	cd1	0.6861771	0.0635125	10.803818	0.0000000
cd	=~	cd2	0.4299105	0.0695283	6.183247	0.0000000
cd	=~	cd3	0.8402813	0.0589320	14.258482	0.0000000
cd	=~	cd4	0.8598589	0.0583342	14.740230	0.0000000
tp	=~	tp1	0.7062261	0.0683542	10.331857	0.0000000
tp	=~	tp2	0.6847743	0.0687352	9.962498	0.0000000
tp	=~	tp3	0.5045142	0.0724502	6.963603	0.0000000
tp	=~	tp4	0.4895761	0.0727391	6.730574	0.0000000
tp	=~	tp5	0.1815471	0.0768394	2.362684	0.0181432
hi	=~	hi1	0.6055980	0.0709772	8.532294	0.0000000
hi	=~	hi2	0.3839978	0.0747851	5.134682	0.0000003
hi	=~	hi3	0.3524860	0.0752262	4.685682	0.0000028
hi	=~	hi4	0.4123199	0.0743594	5.544960	0.0000000
hi	=~	hi5	0.4933564	0.0730056	6.757787	0.0000000
hi	=~	hi6	0.5680924	0.0716443	7.929346	0.0000000
anx	=~	anx1	0.5471883	0.0705601	7.754931	0.0000000
anx	=~	anx2	0.2610233	0.0752670	3.467963	0.0005244
anx	=~	anx3	0.5657732	0.0701435	8.065940	0.0000000
anx	=~	anx4	0.6092491	0.0691282	8.813328	0.0000000
anx	=~	anx5	0.5326560	0.0708775	7.515160	0.0000000
anx	=~	anx6	0.7269644	0.0662256	10.977091	0.0000000
anx	=~	anx7	0.5513192	0.0704685	7.823630	0.0000000
anx	=~	anx8	0.3548766	0.0740934	4.789581	0.0000017

First, the item loadings are approximately the same to those in LISREL, albeit not identical.

Second, all the item loadings are significant across the board.

Third, however, some of the loadings are worryingly low. TP5 loads on TP with just .18 and ANX2 loads on ANX with just .26.

TP5 is “Faced with daily deadlines at work”, while most of the other TP items refer to time pressure outside of work. There may be an issue with that.

ANX2 is “Felt as though I might faint”, which is both in a different tense than the other items (past vs. present) and is less a core symptom of anxiety than the other items (cf. “upset”, “restless”, “panicky”). Thus, it’s also possible to reason why there might be an issue with that item.

Overall, frankly, few factor loadings exceed the common .7 rule-of-thumb. All these factors and loadings should therefore be interpreted with a grain of salt.

Factor Correlations

chr.1.a <- c("amb", "cd", "tp", "hi", "anx")

tbl.1.a %>%
  filter(op == "~~", lhs %in% chr.1.a, lhs != rhs) %>%
  kable

lhs	op	rhs	est	se	z	pvalue
amb	~~	cd	0.7840136	0.0501677	15.627861	0.0000000
amb	~~	tp	0.3900178	0.0872893	4.468107	0.0000079
amb	~~	hi	0.4744290	0.0878285	5.401766	0.0000001
amb	~~	anx	0.2118842	0.0897756	2.360154	0.0182674
cd	~~	tp	0.3871620	0.0773331	5.006421	0.0000006
cd	~~	hi	0.4267790	0.0797100	5.354146	0.0000001
cd	~~	anx	0.1406183	0.0820861	1.713059	0.0867016
tp	~~	hi	0.8360994	0.0602762	13.871137	0.0000000
tp	~~	anx	0.5161111	0.0743602	6.940695	0.0000000
hi	~~	anx	0.6026375	0.0733199	8.219288	0.0000000

Again, the estimates produced by R are similar, but not identical to, the estimates produced by LISREL. At this point, I will assume that trend continues and focus on the R output from here on.

Our model supposes that, whether through direct or indirect effects, every variable is correlated with every other. The estimates of factor correlations mostly bear that out. With the possible exception of CD and ANX (\(\phi_{CD, ANX} \approx .14, p < .1\)), every pair of factors is significantly correlated. In addition, AMB and ANX are only weakly correlated (\(\phi_{AMB, ANX}\approx.21, p < .05\)). But AMB and CD are theoretically most distal from ANX in the model, so this is not really a problem.

We will have to see what happens when these factors are formally introduced to each other in the structural model.

Modification Indices

Modification indices (MIs) are ‘what if’ tests: “an MI reflects the improvement in model fit that would result if a previously omitted parameter were to be added and freely estimated” (Curran-Bauer Analytics, 2019). There are as many MIs for a model as excluded parameters. In our case, the MIs are mostly loadings on other factors—for instance, AMB3’s loading on HI. What’s interesting, then, are the items that, when loaded on a different factor, improve the model significantly. MIs represent potential changes in \(\chi^2\), so larger values imply a larger \(\chi^2\) and improved model fit.

There are nearly 500 modification indices, so I truncated the full results to just a list of the 10 largest factor loadings in rank order.

fit.1 %>%
  modificationIndices(sort. = T, op = "=~") %>%
  head(10) %>%
  kable

	lhs	op	rhs	mi	epc	sepc.lv	sepc.all	sepc.nox
128	tp	=~	hi1	21.64422	1.0926219	1.0926219	1.0952206	1.0952206
130	tp	=~	hi3	17.06270	-0.9228530	-0.9228530	-0.9250480	-0.9250480
181	anx	=~	hi4	14.59955	0.4028095	0.4028095	0.4037675	0.4037675
81	amb	=~	hi1	12.87624	-0.3275390	-0.3275390	-0.3283181	-0.3283181
139	tp	=~	anx6	11.39455	0.2856008	0.2856008	0.2862805	0.2862805
72	amb	=~	cd1	11.29562	0.4562679	0.4562679	0.4573527	0.4573527
82	amb	=~	hi2	11.12923	0.3063939	0.3063939	0.3071226	0.3071226
106	cd	=~	hi2	10.88502	0.2713766	0.2713766	0.2720220	0.2720220
144	hi	=~	amb3	10.62096	0.2945618	0.2945618	0.2952622	0.2952622
121	tp	=~	amb3	10.01354	0.2689161	0.2689161	0.2695555	0.2695555

HI1, HI3, HI4 appear to not belong in HI! Let’s take a look at the items:

HI1: Always rushed
HI3: Act immediately under pressure
HI4: Always hurry

The MIs show that HI1 may belong TP. HI1 conceptually does seem to fit in TP because it’s in the form of a participle adjective, whereas HI is conceived as behavior.

HI3 may also belong in TP for the obvious reason that it says “under PRESSURE”, which not-so-gently reminds me of “time PRESSURE”.

HI4 may belong in ANX according to its MI, but I don’t see an obvious reason why. This may be a case where the MI suggests a change that’s not supported by theory.

Residuals

mtx.1.a <- resid(fit.1)$cov

mtx.1.a %>%
  kable(digits = 3, row.names = T) %>%
  kable_styling(font_size = 13) %>%
  scroll_box(width = "100%", height = "500px",
             fixed_thead = list(enabled = T, background = "#63a0e1"))

	amb1	amb2	amb3	amb4	amb5	cd1	cd2	cd3	cd4	tp1	tp2	tp3	tp4	tp5	hi1	hi2	hi3	hi4	hi5	hi6	anx1	anx2	anx3	anx4	anx5	anx6	anx7	anx8
amb1	0.000	0.056	-0.159	0.052	-0.080	0.156	-0.041	0.024	-0.032	-0.051	-0.141	-0.004	0.063	0.104	-0.164	0.138	-0.029	-0.061	0.028	0.055	0.004	0.122	-0.057	-0.075	-0.007	-0.056	-0.065	-0.017
amb2	0.056	0.000	-0.055	-0.063	-0.010	0.126	0.013	0.029	-0.004	0.010	-0.133	-0.033	0.013	0.072	-0.201	0.164	-0.046	-0.125	0.030	-0.005	0.011	0.109	-0.072	-0.086	0.006	0.026	0.010	0.006
amb3	-0.159	-0.055	0.000	0.075	0.187	-0.044	0.020	-0.034	0.053	0.121	0.097	0.186	0.041	0.049	0.060	0.095	0.035	0.039	0.144	0.139	0.120	0.007	0.002	-0.026	0.100	0.118	0.120	0.094
amb4	0.052	-0.063	0.075	0.000	0.108	0.005	-0.148	-0.018	-0.072	0.035	0.036	-0.038	0.002	0.149	0.009	0.039	0.040	-0.040	0.033	0.176	-0.057	0.195	-0.058	-0.028	-0.003	-0.073	-0.039	-0.015
amb5	-0.080	-0.010	0.187	0.108	0.000	0.008	0.058	-0.083	-0.074	0.056	0.039	0.077	0.000	-0.016	0.033	-0.030	-0.038	-0.062	-0.031	0.092	0.022	0.143	-0.045	-0.029	0.108	0.138	0.025	0.113
cd1	0.156	0.126	-0.044	0.005	0.008	0.000	0.042	-0.019	-0.027	0.027	-0.092	0.052	0.063	0.007	-0.115	0.176	0.073	-0.071	0.108	-0.004	-0.036	0.076	0.000	-0.038	0.001	0.053	-0.026	-0.023
cd2	-0.041	0.013	0.020	-0.148	0.058	0.042	0.000	-0.041	0.030	0.033	-0.038	0.000	-0.008	0.094	-0.110	0.171	0.024	0.008	0.113	-0.051	0.103	0.065	0.021	0.054	0.006	0.048	0.107	0.053
cd3	0.024	0.029	-0.034	-0.018	-0.083	-0.019	-0.041	0.000	0.017	-0.026	-0.067	-0.006	0.004	0.090	-0.069	0.125	0.009	-0.128	0.041	0.053	-0.019	0.063	0.008	-0.059	0.033	-0.003	-0.006	-0.022
cd4	-0.032	-0.004	0.053	-0.072	-0.074	-0.027	0.030	0.017	0.000	0.025	-0.028	0.089	0.035	0.056	-0.094	0.146	-0.031	-0.069	0.099	-0.011	0.111	0.007	-0.029	-0.064	0.014	-0.005	-0.018	-0.037
tp1	-0.051	0.010	0.121	0.035	0.056	0.027	0.033	-0.026	0.025	0.000	0.053	-0.069	-0.028	-0.012	0.146	-0.100	-0.127	-0.083	-0.113	0.096	-0.011	-0.018	-0.114	-0.035	-0.046	0.082	-0.073	0.018
tp2	-0.141	-0.133	0.097	0.036	0.039	-0.092	-0.038	-0.067	-0.028	0.053	0.000	0.036	-0.024	-0.092	0.136	-0.073	-0.200	-0.063	-0.048	-0.044	-0.011	0.030	-0.051	-0.050	-0.085	0.083	0.029	-0.070
tp3	-0.004	-0.033	0.186	-0.038	0.077	0.052	0.000	-0.006	0.089	-0.069	0.036	0.000	0.020	0.002	0.027	-0.085	-0.025	-0.010	-0.004	0.026	0.040	0.067	-0.040	-0.004	0.074	0.110	0.063	-0.086
tp4	0.063	0.013	0.041	0.002	0.000	0.063	-0.008	0.004	0.035	-0.028	-0.024	0.020	0.000	0.010	-0.052	-0.004	0.003	0.042	-0.037	0.162	-0.110	0.080	0.084	-0.067	-0.015	0.117	-0.067	-0.007
tp5	0.104	0.072	0.049	0.149	-0.016	0.007	0.094	0.090	0.056	-0.012	-0.092	0.002	0.010	0.000	-0.089	0.040	-0.013	0.039	0.137	0.241	0.110	-0.014	-0.030	-0.049	-0.039	-0.063	-0.003	-0.019
hi1	-0.164	-0.201	0.060	0.009	0.033	-0.115	-0.110	-0.069	-0.094	0.146	0.136	0.027	-0.052	-0.089	0.000	0.001	0.012	-0.090	-0.068	-0.002	-0.051	-0.046	-0.087	0.045	0.075	0.055	-0.074	0.062
hi2	0.138	0.164	0.095	0.039	-0.030	0.176	0.171	0.125	0.146	-0.100	-0.073	-0.085	-0.004	0.040	0.001	0.000	0.211	-0.079	0.145	-0.037	-0.095	-0.027	-0.093	-0.120	-0.107	-0.005	-0.071	-0.070
hi3	-0.029	-0.046	0.035	0.040	-0.038	0.073	0.024	0.009	-0.031	-0.127	-0.200	-0.025	0.003	-0.013	0.012	0.211	0.000	0.122	0.032	0.001	-0.021	-0.029	0.024	0.030	0.054	0.059	-0.034	0.038
hi4	-0.061	-0.125	0.039	-0.040	-0.062	-0.071	0.008	-0.128	-0.069	-0.083	-0.063	-0.010	0.042	0.039	-0.090	-0.079	0.122	0.000	0.152	-0.025	0.139	0.068	0.194	0.082	0.049	0.091	0.179	0.097
hi5	0.028	0.030	0.144	0.033	-0.031	0.108	0.113	0.041	0.099	-0.113	-0.048	-0.004	-0.037	0.137	-0.068	0.145	0.032	0.152	0.000	-0.074	0.043	0.073	-0.001	-0.048	-0.018	0.032	0.103	0.024
hi6	0.055	-0.005	0.139	0.176	0.092	-0.004	-0.051	0.053	-0.011	0.096	-0.044	0.026	0.162	0.241	-0.002	-0.037	0.001	-0.025	-0.074	0.000	-0.148	-0.067	-0.029	-0.049	-0.071	-0.044	-0.086	0.007
anx1	0.004	0.011	0.120	-0.057	0.022	-0.036	0.103	-0.019	0.111	-0.011	-0.011	0.040	-0.110	0.110	-0.051	-0.095	-0.021	0.139	0.043	-0.148	0.000	0.032	0.029	-0.029	0.032	-0.013	0.065	-0.056
anx2	0.122	0.109	0.007	0.195	0.143	0.076	0.065	0.063	0.007	-0.018	0.030	0.067	0.080	-0.014	-0.046	-0.027	-0.029	0.068	0.073	-0.067	0.032	0.000	-0.001	-0.010	-0.028	-0.050	0.036	0.117
anx3	-0.057	-0.072	0.002	-0.058	-0.045	0.000	0.021	0.008	-0.029	-0.114	-0.051	-0.040	0.084	-0.030	-0.087	-0.093	0.024	0.194	-0.001	-0.029	0.029	-0.001	0.000	0.039	0.006	-0.015	0.022	-0.054
anx4	-0.075	-0.086	-0.026	-0.028	-0.029	-0.038	0.054	-0.059	-0.064	-0.035	-0.050	-0.004	-0.067	-0.049	0.045	-0.120	0.030	0.082	-0.048	-0.049	-0.029	-0.010	0.039	0.000	0.004	0.005	0.002	0.012
anx5	-0.007	0.006	0.100	-0.003	0.108	0.001	0.006	0.033	0.014	-0.046	-0.085	0.074	-0.015	-0.039	0.075	-0.107	0.054	0.049	-0.018	-0.071	0.032	-0.028	0.006	0.004	0.000	0.015	-0.044	0.004
anx6	-0.056	0.026	0.118	-0.073	0.138	0.053	0.048	-0.003	-0.005	0.082	0.083	0.110	0.117	-0.063	0.055	-0.005	0.059	0.091	0.032	-0.044	-0.013	-0.050	-0.015	0.005	0.015	0.000	-0.042	-0.022
anx7	-0.065	0.010	0.120	-0.039	0.025	-0.026	0.107	-0.006	-0.018	-0.073	0.029	0.063	-0.067	-0.003	-0.074	-0.071	-0.034	0.179	0.103	-0.086	0.065	0.036	0.022	0.002	-0.044	-0.042	0.000	0.075
anx8	-0.017	0.006	0.094	-0.015	0.113	-0.023	0.053	-0.022	-0.037	0.018	-0.070	-0.086	-0.007	-0.019	0.062	-0.070	0.038	0.097	0.024	0.007	-0.056	0.117	-0.054	0.012	0.004	-0.022	0.075	0.000

What I would be curious to see here is whether any of the residuals represent statistically significant changes. The largest residual is for the correlation between HI6 and TP5. I’m just going to do a quick paired-r test using Fisher r-to-z transformations on the observed and predicted values for this correlation.

fun.1 <- function(var1, var2) {
  r <- mtx.1[var1, var2]
  t <- paired.r(r, r + mtx.1.a[var1, var2], n = 211)
  print(str_glue("z: {round(t$z, 3)}, p: {round(t$p, 3)}"))
}

fun.1("tp5", "hi6")

## z: 3.117, p: 0.002

Right, so that’s a significant change, as expected for a residual over .2. TP5 is “Faced with daily deadlines at work” and HI6 is “Hurry more than others”. A high residual means that the factors these items are loading on are more correlated than the items themselves. And I think that’s not surprising in this case; time pressure and hurried/impatient behavior seem conceptually related to me while deadlines at work and hurrying more than others are not closely related.

Why don’t we look at one more before moving on? I’ll take a low residual to see how that translates into fit between loadings and factors. CD2 and ANX3 have a residual of just .021. I doubt that’s significantly significant, but, oh, what the heck.

fun.1("cd2", "anx3")

## z: 0.216, p: 0.829

Nope. Just nope.

Now let’s look at the items. CD2 is “Hard-driving and competitive when younger”, and ANX3 is “Feel uneasy and restless”. The items are not closely related, and the correlation between them reflects that. And we already know that the correlation between CD and ANX is relatively low. Two low correlations produce a low residual. Makes sense to me.

However, I do wonder if there is a floor effect in play. What I mean is, are two low correlations going to have a smaller difference than two high correlations? Since correlations are coefficients in linear equations with (assumed) constant variance, I really don’t think it should be an issue.

Measurement Model Fit

Before we put this model away for a bit, let’s look at its overall fit. As with the models in the previous assignment, I’ll pull up the \(\chi^2\), RMSEA, and TLI.

fitMeasures(fit.1, fit.measures = c("chisq", "df", "pvalue", "rmsea", "tli")) %>%
  round(3) %>%
  kable(align = "l")

	x
chisq	570.988
df	340.000
pvalue	0.000
rmsea	0.057
tli	0.821

Good \(\chi^2\), so-so RMSEA, and an unacceptably low TLI. I’m going to call it a poor model. My hopes were not very high after seeing some of the issues above, so I’m not surprised. Oh well.

2. Orthogonal Measurement Model

fit.2 <- cfa(chr.1, sample.cov = mtx.1, sample.nobs = 211, std.lv = T, orthogonal = T)

I checked this model against the results in LISREL and found that they were similar, but not identical. I will proceed nonetheless, but expect some differences on the order of rounding error.

As usual, I will check the \(\chi^2\), RMSEA, and TLI.

fitMeasures(fit.2, fit.measures = c("chisq", "df", "pvalue", "rmsea", "tli")) %>%
  round(3) %>%
  kable(align = "l")

	x
chisq	825.213
df	350.000
pvalue	0.000
rmsea	0.080
tli	0.642

The \(\chi^2\) statistic is still showing as significant, but the RMSEA and TLI statistics have degraded considerably.

Now I will take a look at the \(\chi^2\) difference test. This test should tell us whether there is sufficient improvement in fit to justify a loss of parsimony (i.e., more parameters). The test is simple: take the difference of the two models’ \(chi^2\) statistics and the difference in their degrees of freedom (Pavlov et al., 2020). Then simply refer to a \(\chi^2\) distribution.

# M0 is nested within M1
fun.2 <- function(M0, M1) {
  D <- fitMeasures(M0, "chisq") - fitMeasures(M1, "chisq")
  df <- fitMeasures(M0, "df") - fitMeasures(M1, "df")
  p <- pchisq(D, df, lower.tail = F) %>%
    set_names("p")
  map_dbl(c(D, df, p), ~ round(., 3))
}

fun.2(fit.2, fit.1)

##   chisq      df       p 
## 254.225  10.000   0.000

There is a significant difference between the fit of the models, with the intercorrelated model (from section 1) outperforming the orthogonal model. This tells us that there are indeed meaningful correlations between the factors, which bodes well for the structural model to follow.

3. Structural Equation Model with Latent Variables

This is the most complete model, incorporating both latent variables with multiple items and structural relationships.

Estimation in R

I will reproduce the analysis in LISREL by combining the previously specified measurement and structural models into a single model. I will check the \(\chi^2\) statistic as a quick spot check to compare it with the LISREL output.

chr.3 <- '
    # measurement model
    amb =~ amb1 + amb2 + amb3 + amb4 + amb5
    cd =~ cd1 + cd2 + cd3 + cd4
    tp =~ tp1 + tp2 + tp3 + tp4 + tp5
    hi =~ hi1 + hi2 + hi3 + hi4 + hi5 + hi6
    anx =~ anx1 + anx2 + anx3 + anx4 + anx5 + anx6 + anx7 + anx8
    
    # structural model
    cd ~ amb
    tp ~ amb + cd
    hi ~ tp + cd
    anx ~ tp + hi
'

fit.3 <- sem(chr.3, sample.cov = mtx.1, sample.nobs = 211, std.lv = T)

fitMeasures(fit.3, "chisq")

##   chisq 
## 574.787

It’s within a similar range to previous models, so I will proceed.

Comparison to Measurement Models

First, I need to compare the measurement component of the model with the models from sections 1 and 2 above.

Item Loadings in Full Model

If I understand the difference between “standardized” and “completely standardized” correctly, the former standardizes only the latent variables while the latter standardizes both the latent variables and their indicators. lavaan offers the “completely” standardized solution by default.

Now, simply adding a structural model should not change the measurement model underpinning it, so I will be most curious to see whether there has been any change in the item loadings from the model in section 1.

standardizedSolution(fit.3, se = F, zstat = F, pvalue = F, ci = F) %>%
  rename(full = "est.std") %>%
  left_join(
    standardizedSolution(fit.1, se = F, zstat = F, pvalue = F, ci = F) %>%
              rename(measure = "est.std")
    ) %>%
  filter(op == "=~") %>%
  mutate(dif = round(full - measure, 3)) %>%
  kable

lhs	op	rhs	full	measure	dif
amb	=~	amb1	0.5965389	0.5966004	0.000
amb	=~	amb2	0.7104231	0.7063214	0.004
amb	=~	amb3	0.4236382	0.4273957	-0.004
amb	=~	amb4	0.5017753	0.5048930	-0.003
amb	=~	amb5	0.4384411	0.4383923	0.000
cd	=~	cd1	0.6883756	0.6878085	0.001
cd	=~	cd2	0.4328564	0.4309327	0.002
cd	=~	cd3	0.8427348	0.8422791	0.000
cd	=~	cd4	0.8599982	0.8619032	-0.002
tp	=~	tp1	0.7099354	0.7079060	0.002
tp	=~	tp2	0.6819569	0.6864029	-0.004
tp	=~	tp3	0.5025391	0.5057141	-0.003
tp	=~	tp4	0.4899837	0.4907405	-0.001
tp	=~	tp5	0.1854855	0.1819788	0.004
hi	=~	hi1	0.6256745	0.6070384	0.019
hi	=~	hi2	0.3749693	0.3849111	-0.010
hi	=~	hi3	0.3524110	0.3533244	-0.001
hi	=~	hi4	0.4134445	0.4133005	0.000
hi	=~	hi5	0.4859191	0.4945298	-0.009
hi	=~	hi6	0.5681531	0.5694435	-0.001
anx	=~	anx1	0.5522502	0.5484899	0.004
anx	=~	anx2	0.2622109	0.2616440	0.001
anx	=~	anx3	0.5686377	0.5671190	0.002
anx	=~	anx4	0.6094987	0.6106984	-0.001
anx	=~	anx5	0.5356453	0.5339230	0.002
anx	=~	anx6	0.7259092	0.7286943	-0.003
anx	=~	anx7	0.5527196	0.5526306	0.000
anx	=~	anx8	0.3539843	0.3557205	-0.002

The column on the right is simply the difference between the loadings in the measurement model and in the full model. Overall, the items look stable, so the only issues are the same as before (e.g., TP5, ANX2).

Full Model Fit

The \(\chi^2\)-statistic, RMSEA, and TLI are below:

fitMeasures(fit.3, c("chisq", "df", "pvalue", "rmsea", "tli")) %>%
  round(3) %>%
  kable(align = "l")

	x
chisq	574.787
df	343.000
pvalue	0.000
rmsea	0.057
tli	0.822

Not great, but not terrible for the models we’ve been running. The RMSEA is almost good, but the TLI is very poor.

\(\chi^2\) Difference Tests

I’ll start by comparing this model to the the less restrictive orthogonal model.

fun.2(fit.2, fit.3)

##   chisq      df       p 
## 250.426   7.000   0.000

Good start. It definitely performs better than the orthogonal model.

Next, I will compare it to the intercorrelated measurement model.

fun.2(fit.3, fit.1)

## chisq    df     p 
## 3.799 3.000 0.284

It did not perform better, so it’s not an improvement in terms of explaining variance in the data. However, introducing a full structural model allows us to make causal arguments about the relationships between variables, so I think it is still worth proceeding with the analysis of this model.

Comparison to Structural Models

I will first look at this model’s structural parameters on their own, then compare them with the structural parameters in the three models from the previous assignment.

Structural Parameters

parameterEstimates(fit.3, ci = F) %>%
  filter(op == "~") %>%
  kable

lhs	op	rhs	est	se	z	pvalue
cd	~	amb	1.2694818	0.2110196	6.0159422	0.0000000
tp	~	amb	0.3122643	0.2064213	1.5127520	0.1303427
tp	~	cd	0.1098519	0.1208939	0.9086640	0.3635275
hi	~	tp	1.3982499	0.3579335	3.9064516	0.0000937
hi	~	cd	0.1008006	0.0969279	1.0399540	0.2983613
anx	~	tp	0.0550525	0.2963488	0.1857692	0.8526257
anx	~	hi	0.3569672	0.1775020	2.0110598	0.0443191

Unlike in previous models, multiple structural parameters are non-significant—in fact, more than half. Only the paths between AMB and CD, TP and HI, and HI and ANX are supported by this analysis.

Comparisons to Previous Structural Models

I’ll pull a table with all the parameter coefficients side-by-side.

load(here('data', 'hw11data.RData'))

fit.3 %>%
parameterEstimates(se = F, zstat = F, pvalue = F, ci = F) %>%
  rename(full = "est") %>%
  left_join(
    parameterEstimates(fit.reliabilities, se = F, zstat = F, pvalue = F, ci = F) %>%
      rename(incorporated = "est") %>%
      mutate_at(1:3, ~ str_remove(., "_.*"))
    ) %>%
  left_join(
    parameterEstimates(fit.disattenuated, se = F, zstat = F, pvalue = F, ci = F) %>%
      rename(disattenuated = "est")
    ) %>%
  left_join(
    parameterEstimates(fit.observed, se = F, zstat = F, pvalue = F, ci = F) %>%
      rename(path_analysis = "est")
    ) %>%
  filter(op == "~") %>%
  mutate_at(4:7, ~ round(., 3)) %>%
  kable

lhs	op	rhs	full	incorporated	disattenuated	path_analysis
cd	~	amb	1.269	0.763	0.760	0.551
tp	~	amb	0.312	0.397	0.394	0.217
tp	~	cd	0.110	0.115	0.129	0.183
hi	~	tp	1.398	0.659	0.633	0.385
hi	~	cd	0.101	0.115	0.157	0.186
anx	~	tp	0.055	0.149	0.183	0.210
anx	~	hi	0.357	0.471	0.466	0.314

There are some dramatic differences here. The effect of AMB on CD and TP on HI are much stronger in the full model than in any others. They’re so strong, in fact, that I recall my earlier complaint in the path analysis assignment that AMB/CD and TP/HI seem more like facets of the same construct than different variables.

If we consider that the present model is the ‘best’ test of these hypothesized paths because it most thoroughly incorporates the raw measurement data, then it is fair to say that only the AMB-CD and TP-HI paths are strongly supported. The HI-ANX path tested as significant in this model, but it’s a weak effect that just barely meets the bar for significance.

I would have to say the other paths are not supported. They have only very weak effects that do not test significant. That they appear larger in the other models diminishes my opinion of those models but does not convince me that the full structural model is incorrect.

Conclusions

The steady drip of bad news for Theoretical Model A continues.

The measures have low reliability (possibly due to inconsistent items), and the factor model is unconvincing (especially HI).

Without a solid factor model, it’s difficult to make substantive interpretations of the structural model. But were we to do so anyway, we would find that the hypothesized model does not correspond to the data. Five of seven paths are insignificant or weak, and the remaining two paths’ strength can be attributed to conceptual similarities rather than causal relationships.

That said, there clearly are some true relationships in the data. Perhaps a respecified model (informed by theory) could produce cleaner factors and significant structural parameters.

References

bbt_write_bib("biblio.json", bbt_detect_citations("hw12.Rmd"), overwrite = TRUE)

Curran-Bauer Analytics. (2019, June 10). What are modification indices and should I use them when fitting SEMs to my own data? Curran-Bauer Analytics. https://curranbauer.org/what-are-modification-indices-and-should-i-use-them-when-fitting-sems-to-my-own-data/

Pavlov, G., Shi, D., & Maydeu-Olivares, A. (2020). Chi-square Difference Tests for Comparing Nested Models: An Evaluation with Non-normal Data. Structural Equation Modeling: A Multidisciplinary Journal, 0(0), 1–10. https://doi.org/10.1080/10705511.2020.1717957