I now have all the data collected (unless I have to throw some out and replace it), so let’s look at some descriptive statistics!
First, import packages.
# import packages
library(corrr)
library(tidyverse)
library(flextable)
library(psych)
library(lavaan)
library(semTools)
Import my R objects from previous documents.
formatAsTable <- readRDS("format.rds")
adhd.data <- readRDS(file.path("..", "data", "quality-data.rds"))
The manipulations should have been randomized 50/50 by Qualtrics such that we end up with about 100 per cell, but let’s check.
adhd.data %>%
  count(interdep, disclose) %>%
  formatAsTable
| interdep | disclose | n   |
|----------|----------|-----|
| FALSE    | FALSE    | 107 |
| FALSE    | TRUE     | 115 |
| TRUE     | FALSE    | 115 |
| TRUE     | TRUE     | 107 |
Sweet, that looks just about right.
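If I wanted a formal check that the four cells don't deviate from an even split, a chi-square goodness-of-fit test on the counts would do it. A minimal sketch (not part of the original checks):

# sketch: goodness-of-fit test of the cell counts against equal proportions
adhd.data %>%
  count(interdep, disclose) %>%
  with(chisq.test(n))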
And I’ve already looked at the manipulation checks in a previous document, but I’ll port that analysis in here.
adhd.data <- adhd.data %>%
  mutate(across(where(is.factor),
                ~ as.integer(.),
                .names = "{.col}.int"))
adhd.data %>%
  count(interdep, intcheck) %>%
  arrange(interdep, intcheck) %>%
  formatAsTable %>%
  bg(bg = "#074005",
     i = ~ ifelse(interdep == 0,
                  as.integer(intcheck) < 3,
                  as.integer(intcheck) > 3))
| interdep | intcheck                   | n   |
|----------|----------------------------|-----|
| FALSE    | Strongly disagree          | 133 |
| FALSE    | Somewhat disagree          | 35  |
| FALSE    | Neither agree nor disagree | 24  |
| FALSE    | Somewhat agree             | 20  |
| FALSE    | Strongly agree             | 10  |
| TRUE     | Strongly disagree          | 3   |
| TRUE     | Somewhat disagree          | 8   |
| TRUE     | Neither agree nor disagree | 21  |
| TRUE     | Somewhat agree             | 77  |
| TRUE     | Strongly agree             | 113 |
adhd.data %>%
  count(disclose, discheck) %>%
  arrange(disclose, discheck) %>%
  formatAsTable %>%
  bg(bg = "#074005",
     i = ~ ifelse(disclose == 0,
                  as.integer(discheck) < 3,
                  as.integer(discheck) > 3))
| disclose | discheck                   | n   |
|----------|----------------------------|-----|
| FALSE    | Strongly disagree          | 197 |
| FALSE    | Somewhat disagree          | 10  |
| FALSE    | Neither agree nor disagree | 4   |
| FALSE    | Somewhat agree             | 7   |
| FALSE    | Strongly agree             | 4   |
| TRUE     | Somewhat disagree          | 2   |
| TRUE     | Neither agree nor disagree | 4   |
| TRUE     | Somewhat agree             | 17  |
| TRUE     | Strongly agree             | 199 |
# correlation matrix with variances on the diagonal
adhd.data %>%
  select(c(interdep, intcheck.int,
           disclose, discheck.int)) %>%
  correlate(diagonal = map_dbl(., ~ var(.))) %>%
  shave %>%
  fashion(leading_zeros = F) %>%
  formatAsTable
| term         | interdep | intcheck.int | disclose | discheck.int |
|--------------|----------|--------------|----------|--------------|
| interdep     | .25      |              |          |              |
| intcheck.int | .76      | 2.65         |          |              |
| disclose     | -.04     | .04          | .25      |              |
| discheck.int | -.05     | .07          | .94      | 3.69         |
Clearly, both manipulations were effective, but disclosure was much more effective than interdep. To be honest, I'm not sure why. My two best guesses are that interdep came first, so participants had forgotten what they read by the time they reached the manipulation check, or that either the vignette or the manipulation check was somehow confusing. I do think interdependence is a more difficult concept to grok than disclosure: disclosure is just "Alex told me," while interdependence requires making sense of your relationship with another person. All in all, I think it's good enough.
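Another way to gauge the difference in effectiveness would be to compare mean check ratings across conditions. A quick sketch using the integer-coded checks created above (not something I ran as part of the original checks):

# sketch: mean manipulation-check ratings by condition
adhd.data %>%
  group_by(interdep) %>%
  summarise(mean.check = mean(intcheck.int),
            sd.check = sd(intcheck.int))

adhd.data %>%
  group_by(disclose) %>%
  summarise(mean.check = mean(discheck.int),
            sd.check = sd(discheck.int))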
adhd.data %>%
  select(matches("\\w{3}\\d.int")) %>%
  describe(fast = T) %>%
  as_tibble(rownames = "var") %>%
  select(-vars) %>%
  mutate(across(c(n, min, max, range), as.integer)) %>%
  formatAsTable
| var      | n   | mean | sd   | min | max | range | se   |
|----------|-----|------|------|-----|-----|-------|------|
| aff1.int | 444 | 3.42 | 1.22 | 1   | 5   | 4     | 0.06 |
| aff2.int | 444 | 3.65 | 1.04 | 1   | 5   | 4     | 0.05 |
| aff3.int | 444 | 3.53 | 1.07 | 1   | 5   | 4     | 0.05 |
| aff4.int | 444 | 3.56 | 1.01 | 1   | 5   | 4     | 0.05 |
| cog1.int | 444 | 3.78 | 0.90 | 1   | 5   | 4     | 0.04 |
| cog2.int | 444 | 3.66 | 1.10 | 1   | 5   | 4     | 0.05 |
| cog3.int | 444 | 3.27 | 1.12 | 1   | 5   | 4     | 0.05 |
| cog4.int | 444 | 3.26 | 1.13 | 1   | 5   | 4     | 0.05 |
| lik1.int | 444 | 3.79 | 0.84 | 1   | 5   | 4     | 0.04 |
| lik2.int | 444 | 3.72 | 0.89 | 1   | 5   | 4     | 0.04 |
| lik3.int | 444 | 2.16 | 1.06 | 1   | 5   | 4     | 0.05 |
| lik4.int | 444 | 1.71 | 0.88 | 1   | 5   | 4     | 0.04 |
Most of the means are hovering around the middle of the scale, which isn’t bad to see. And I can tell at a glance that most people said they liked Alex, which is kind of interesting. Standard deviations are somewhat low; hopefully that doesn’t become a problem when computing regressions.
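If I wanted to see the "most people liked Alex" pattern directly rather than inferring it from the means, counting the raw responses would show it. A quick sketch (not run above):

# sketch: raw response distributions for the liking items
adhd.data %>%
  select(matches("lik\\d.int")) %>%
  pivot_longer(everything(), names_to = "item", values_to = "response") %>%
  count(item, response)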
What are the reliabilities of the measures? Using the lavaan package, I will do a confirmatory factor analysis of the twelve items. I'm planning to use McDonald's \(\omega\) in addition to Cronbach's \(\alpha\) because it generally performs better, and it is preferable especially when the item distributions are skewed.
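For reference, the usual formula for \(\omega\) for a one-factor measure with uncorrelated residuals, in terms of the factor loadings \(\lambda_i\) and residual variances \(\theta_{ii}\), is

\[
\omega = \frac{\left(\sum_{i} \lambda_i\right)^{2}}{\left(\sum_{i} \lambda_i\right)^{2} + \sum_{i} \theta_{ii}}
\]

Unlike \(\alpha\), it does not assume all items load equally on the factor, which is part of why it tends to hold up better.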
cfa.model <- 'aff =~ aff1 + aff2 + aff3 + aff4
              cog =~ cog1 + cog2 + cog3 + cog4
              lik =~ lik1 + lik2 + lik3 + lik4'

cfa.fit <- cfa(cfa.model, adhd.data, effect.coding = T)
cfa.fit %>%
  reliability %>%
  as_tibble(rownames = "stat") %>%
  formatAsTable
| stat   | aff  | cog  | lik   |
|--------|------|------|-------|
| alpha  | 0.91 | 0.44 | -0.49 |
| omega  | 0.88 | 0.55 | 0.07  |
| omega2 | 0.88 | 0.55 | 0.07  |
| omega3 | 0.88 | 0.60 | 0.06  |
| avevar | 0.71 | 0.41 | 0.50  |
The reliabilities for cog and lik are low because some of their items are reverse-coded. I will try again.
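A quick way to confirm which items are the culprits before refitting: reverse-worded items should correlate negatively with the rest of their scale. A sketch on the cog items, using the integer versions created earlier:

# sketch: item intercorrelations for the cog items; a reverse-worded item
# should correlate negatively with the others
adhd.data %>%
  select(matches("cog\\d.int")) %>%
  correlate() %>%
  shave() %>%
  fashion(leading_zeros = F)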
adhd.data <- adhd.data %>%
  mutate(across(c(cog4, lik3, lik4),
                fct_rev,
                .names = "{.col}.r"),
         lik4.r = fct_collapse(lik4.r,
                               agree = c("Strongly agree",
                                         "Somewhat agree")))

cfa.model <- 'aff =~ aff1 + aff2 + aff3 + aff4
              cog =~ cog1 + cog2 + cog3 + cog4.r
              lik =~ lik1 + lik2 + lik3.r + lik4.r'

cfa.fit <- cfa(cfa.model, adhd.data, effect.coding = T)
cfa.fit %>%
  reliability %>%
  as_tibble(rownames = "stat") %>%
  formatAsTable
| stat   | aff  | cog  | lik  |
|--------|------|------|------|
| alpha  | 0.91 | 0.66 | 0.80 |
| omega  | 0.88 | 0.61 | 0.72 |
| omega2 | 0.88 | 0.61 | 0.72 |
| omega3 | 0.87 | 0.56 | 0.70 |
| avevar | 0.71 | 0.40 | 0.51 |
Ouch, cog was not a terribly reliable measure.
Anyway, as long as I have the CFA model, I might as well look at the loadings and fit statistics.
# loadings
cfa.fit %>%
  parameterEstimates %>%
  filter(op == "=~") %>%
  formatAsTable
| lhs | op | rhs    | est  | se   | z     | pvalue | ci.lower | ci.upper |
|-----|----|--------|------|------|-------|--------|----------|----------|
| aff | =~ | aff1   | 0.94 | 0.02 | 47.15 | 0.00   | 0.90     | 0.98     |
| aff | =~ | aff2   | 0.99 | 0.01 | 65.95 | 0.00   | 0.96     | 1.02     |
| aff | =~ | aff3   | 1.06 | 0.01 | 71.92 | 0.00   | 1.03     | 1.09     |
| aff | =~ | aff4   | 1.01 | 0.01 | 71.29 | 0.00   | 0.99     | 1.04     |
| cog | =~ | cog1   | 1.67 | 0.06 | 26.25 | 0.00   | 1.55     | 1.80     |
| cog | =~ | cog2   | 1.16 | 0.05 | 23.05 | 0.00   | 1.06     | 1.25     |
| cog | =~ | cog3   | 0.98 | 0.06 | 16.39 | 0.00   | 0.87     | 1.10     |
| cog | =~ | cog4.r | 0.19 | 0.08 | 2.28  | 0.02   | 0.03     | 0.35     |
| lik | =~ | lik1   | 1.29 | 0.04 | 36.86 | 0.00   | 1.22     | 1.36     |
| lik | =~ | lik2   | 1.08 | 0.03 | 35.47 | 0.00   | 1.02     | 1.14     |
| lik | =~ | lik3.r | 0.65 | 0.04 | 15.50 | 0.00   | 0.57     | 0.74     |
| lik | =~ | lik4.r | 0.97 | 0.04 | 27.67 | 0.00   | 0.90     | 1.04     |
# fit statistics
m <- c("chisq", "df", "pvalue", "rmsea", "tli")
cfa.fit %>%
  fitMeasures(fit.measures = m) %>%
  round(3) %>%
  as_tibble(rownames = 'stat') %>%
  formatAsTable
| stat   | value  |
|--------|--------|
| chisq  | 290.53 |
| df     | 51.00  |
| pvalue | 0.00   |
| rmsea  | 0.10   |
| tli    | 0.98   |
Interesting. It does look like cog4 deviates from the other cog items; its loading is much weaker than the rest, though still significant.
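To put the loadings on a common scale and see just how weak cog4.r is relative to the other items, standardized estimates would help. A minimal sketch (not part of the analysis above):

# sketch: standardized loadings for all twelve items
cfa.fit %>%
  standardizedSolution() %>%
  filter(op == "=~") %>%
  select(lhs, rhs, est.std)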
I’m curious how this would all look if we dropped the potentially problematic observations I previously identified.
adhd.data.hq <- adhd.data %>%
  # exclude low-effort participants
  filter(!if_any(c(random_clicker, nonpart, ftl)))

cfa.fit.hq <- cfa.model %>%
  cfa(adhd.data.hq, effect.coding = T)
cfa.fit.hq %>%
  reliability %>%
  as_tibble(rownames = "stat") %>%
  formatAsTable
| stat   | aff  | cog  | lik  |
|--------|------|------|------|
| alpha  | 0.91 | 0.67 | 0.80 |
| omega  | 0.88 | 0.61 | 0.73 |
| omega2 | 0.88 | 0.61 | 0.73 |
| omega3 | 0.88 | 0.57 | 0.71 |
| avevar | 0.71 | 0.41 | 0.51 |
cfa.fit.hq %>%
  parameterEstimates %>%
  filter(op == "=~") %>%
  formatAsTable
| lhs | op | rhs    | est  | se   | z     | pvalue | ci.lower | ci.upper |
|-----|----|--------|------|------|-------|--------|----------|----------|
| aff | =~ | aff1   | 0.93 | 0.02 | 44.16 | 0.00   | 0.89     | 0.97     |
| aff | =~ | aff2   | 0.99 | 0.02 | 64.73 | 0.00   | 0.96     | 1.02     |
| aff | =~ | aff3   | 1.06 | 0.02 | 69.58 | 0.00   | 1.03     | 1.09     |
| aff | =~ | aff4   | 1.01 | 0.01 | 69.31 | 0.00   | 0.98     | 1.04     |
| cog | =~ | cog1   | 1.67 | 0.07 | 25.64 | 0.00   | 1.54     | 1.80     |
| cog | =~ | cog2   | 1.16 | 0.05 | 21.70 | 0.00   | 1.05     | 1.26     |
| cog | =~ | cog3   | 0.98 | 0.06 | 15.37 | 0.00   | 0.85     | 1.10     |
| cog | =~ | cog4.r | 0.20 | 0.08 | 2.39  | 0.02   | 0.04     | 0.36     |
| lik | =~ | lik1   | 1.30 | 0.04 | 35.51 | 0.00   | 1.23     | 1.37     |
| lik | =~ | lik2   | 1.08 | 0.03 | 32.75 | 0.00   | 1.01     | 1.14     |
| lik | =~ | lik3.r | 0.63 | 0.05 | 13.82 | 0.00   | 0.54     | 0.72     |
| lik | =~ | lik4.r | 0.99 | 0.04 | 25.89 | 0.00   | 0.92     | 1.07     |
Looks like cog4 improved marginally, and the improvement shows up in both its loading and the cog reliability. I will run it one more time without cog4 and see whether dropping the item improves things.
cfa.fit.cog <- cfa.model %>%
  str_remove(fixed(" + cog4.r")) %>%
  cfa(adhd.data.hq, effect.coding = T)
cfa.fit.cog %>%
  reliability %>%
  as_tibble(rownames = "stat") %>%
  formatAsTable
| stat   | aff  | cog  | lik  |
|--------|------|------|------|
| alpha  | 0.91 | 0.75 | 0.80 |
| omega  | 0.88 | 0.70 | 0.73 |
| omega2 | 0.88 | 0.70 | 0.73 |
| omega3 | 0.88 | 0.69 | 0.71 |
| avevar | 0.71 | 0.54 | 0.51 |
Nice. The McDonald's \(\omega\) for cog increased to 0.70!
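Out of curiosity, the same fit measures for the trimmed model would make an easy comparison against the full model. A quick sketch reusing the m vector from above:

# sketch: fit statistics for the model without cog4.r
cfa.fit.cog %>%
  fitMeasures(fit.measures = m) %>%
  round(3)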
Save data:
adhd.data %>%
  saveRDS(file.path("..", "data", "adhd-data.rds"))
adhd.data %>%
  write_csv(file.path("..", "data", "adhd-data.csv"))
adhd.data.hq %>%
  saveRDS(file.path("..", "data", "hq-data.rds"))
adhd.data.hq %>%
  write_csv(file.path("..", "data", "hq-data.csv"))
Output document:
options(knitr.duplicate.label = "allow")
rmarkdown::render("descriptives.Rmd",
                  output_dir = file.path("..", "github", "thesis"))