Main Effects
We start with a first look at the main effects of (a) AMAS level, (b) CSI score and (c) the approach taken on overall correctness.
Looking at the first column, it becomes visible that:
- The numeric group makes more accurate decisions than the symbolic group – comparing the blue versus red lines.
- AMAS level appears to correlate negatively with accuracy for both groups.
- CSI score appears to have a very mild negative correlation with accuracy.
- There is an interaction between Approach and Representation (symbolic vs. numeric) with respect to accuracy, and also with respect to AMAS and CSI score.
A close-up can be seen below.
We can attempt to fit a first linear model as follows, noting that “Group” refers to the type of model representation.
library(car) # provides Anova()
full <- lm(Correct ~ Group + amas_score + csi_score + Approach, data = dd)
Anova(full, type = 3) # Type 3 because cells are unbalanced
## Anova Table (Type III tests)
##
## Response: Correct
## Sum Sq Df F value Pr(>F)
## (Intercept) 209.11 1 44.0934 1.812e-09 ***
## Group 342.31 1 72.1802 2.345e-13 ***
## amas_score 26.81 1 5.6539 0.01938 *
## csi_score 15.49 1 3.2662 0.07382 .
## Approach 26.38 1 5.5625 0.02035 *
## Residuals 460.02 97
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We observe that the group difference (symbolic vs. numeric) is indeed highly unusual \(p<0.001\) if we assume that the groups are equally accurate. Furthermore, the effect of AMAS score is also unusual \(p<0.05\) under the null hypothesis that no such effect exists. The effect of CSI score is relatively more likely under the null hypothesis \(p<0.1\) but still raises some suspicion. Finally, the approach the participants follow seems to have an effect, with working methodically generally leading to higher accuracy \(p<0.05\). However, approach seems to interact with other variables (and is also heteroscedastic), so we return to it below.
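As an aid to reading the table above, each F value is the term's mean square divided by the residual mean square, and Pr(>F) is the upper tail of the corresponding F distribution. The following base-R sketch recomputes both from the sums of squares as printed, so the results agree with the table only up to rounding. The variable names are illustrative and not part of the analysis code.

```r
# Sums of squares copied from the Anova table above (rounded as printed,
# so the recomputed values match the table only approximately).
ss <- c(Group = 342.31, amas_score = 26.81, csi_score = 15.49, Approach = 26.38)
df_term <- 1       # every term in the table has 1 degree of freedom
ss_res  <- 460.02  # residual sum of squares
df_res  <- 97      # residual degrees of freedom

ms_res <- ss_res / df_res          # residual mean square
f_val  <- (ss / df_term) / ms_res  # F statistic for each term
p_val  <- pf(f_val, df_term, df_res, lower.tail = FALSE)  # Pr(>F)

print(round(cbind(F = f_val, p = p_val), 4))
```

Comparing the `p` column against the usual thresholds shows which statements of the form \(p<0.05\) or \(p<0.1\) each term supports.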
Effect sizes are calculated and displayed below with respect to a discretization of the AMAS and CSI scores into Low and High based on their relationship to the population averages described above:
library(WRS2) # provides akp.effect()
gd1 <- effsize::cohen.d(Correct ~ Group, data = dd)
gd2 <- effsize::cohen.d(Correct ~ amas_level, data = dd)
gd3 <- effsize::cohen.d(Correct ~ csi_level, data = dd)
# For approach we need a robust effect size due to violation of assumptions:
gd4 <- akp.effect(Correct ~ Approach, data = dd, EQVAR = FALSE)
- Representation (numeric vs. symbolic) explains a mean difference of \(3.53\) correct decisions out of a maximum of \(12\), meaning that those who used the symbolic models made \(3.53\) more mistakes on average than those who used the numeric models \((d=-1.51)\) – which is a large effect.
- AMAS Level explains a mean difference of \(0.96\) correct decisions \((d=-0.33)\) which is a small effect.
- CSI Score explains a mean difference of \(0.93\) correct decisions \((d=0.32)\) which is a small effect.
- Approach explains a mean difference of \(1.32\) correct answers – robust \(d\) between \(-0.4\) and \(-0.86\).
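To make explicit what these numbers measure, here is a minimal base-R sketch of a Low/High split and of Cohen's \(d\) in its pooled-standard-deviation form. It is an illustration under stated assumptions, not the analysis code: the helper names `split_level` and `cohen_d_pooled` and the `reference` argument are hypothetical, the rule that scores equal to the reference count as Low is an assumption, and the inputs are toy numbers rather than study data.

```r
# Split a numeric score into Low/High around a reference value.
# `reference` stands in for the population average; its actual value is not
# given in this section, so it must be supplied by the caller.
# Assumption: scores equal to the reference are labelled "Low".
split_level <- function(score, reference) {
  factor(ifelse(score > reference, "High", "Low"), levels = c("Low", "High"))
}

# Cohen's d with a pooled standard deviation: (mean(x) - mean(y)) / s_pooled.
cohen_d_pooled <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  s_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  (mean(x) - mean(y)) / s_pooled
}

# Toy numbers, not study data: group means 2 and 3, both variances 1.
d_toy <- cohen_d_pooled(c(1, 2, 3), c(2, 3, 4))
print(d_toy)

# Toy scores split around a made-up reference of 18.
lv <- split_level(c(10, 25, 18), reference = 18)
print(lv)
```

The sign of \(d\) depends only on which group is entered first, so the magnitude is what determines whether an effect is read as small or large. Read this way, a mean difference of \(3.53\) with \(|d|=1.51\) corresponds to a pooled standard deviation of roughly \(3.53/1.51 \approx 2.3\) correct decisions.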
The following are some diagnostics of the main effects model, which do not seem to raise red flags except for a possible deviation from normality. The ANOVA models we develop here are generally considered to be fairly robust to slight deviations from normality, especially with large sample sizes like ours (\(102\)).
leveneTest(dd$Correct,dd$Group,center=median)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 1.4319 0.2343
## 100
leveneTest(dd$Correct,dd$csi_level,center=median)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 0.024 0.8773
## 100
leveneTest(dd$Correct,dd$amas_level,center=median)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 0.0635 0.8016
## 100
leveneTest(dd$Correct,dd$Approach,center=median) # Problem, hence the robust tests...
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 1 8.7893 0.003788 **
## 100
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
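The possible deviation from normality mentioned above concerns the model residuals, which the Levene tests do not address. One common way to inspect it is a Shapiro-Wilk test on the residuals, sketched below. The study data frame `dd` is not available here, so the built-in `mtcars` data and the model formula are stand-ins chosen only to make the example runnable.

```r
# Stand-in data: the built-in mtcars set replaces the study data frame `dd`.
# Swap in the fitted model of interest (for example the main effects model).
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)

# Shapiro-Wilk test: a small p-value suggests the residuals deviate from
# a normal distribution.
sw <- shapiro.test(res)
print(sw)
```

A visual check with `qqnorm(res)` followed by `qqline(res)` is a useful complement, since a formal test can flag trivial deviations in larger samples.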
To eliminate any suspicion of issues due to lack of normality, the following are some robust tests for representation and AMAS level. Both effects remain significant at the \(0.05\) level in both tests, and the representation effect holds even under a Type I error correction to \(p \le 0.001\); the AMAS level effect (\(p=0.0248\) and \(p=0.010\)) does not meet that stricter threshold. The CSI level effect is not significant in these tests.
## Call:
## pbad2way(formula = Correct ~ Group + amas_level, data = dd, est = "median",
## nboot = 5000)
##
## p.value
## Group 0.0000
## amas_level 0.0248
## Group:amas_level 0.8600
## Call:
## t2way(formula = Correct ~ Group + amas_level, data = dd)
##
## value p.value
## Group 57.2130 0.001
## amas_level 7.2435 0.010
## Group:amas_level 0.0861 0.771
## Call:
## pbad2way(formula = Correct ~ Group + csi_level, data = dd, est = "median",
## nboot = 5000)
##
## p.value
## Group 0.0000
## csi_level 0.5998
## Group:csi_level 0.3510
## Call:
## t2way(formula = Correct ~ Group + csi_level, data = dd)
##
## value p.value
## Group 52.3929 0.001
## csi_level 0.7031 0.406
## Group:csi_level 0.0197 0.889
A conservative conclusion is therefore not to consider the effect of CSI.
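The robust procedures above compare groups on location estimates that are less sensitive to extreme values than the mean: medians (`est = "median"` in the `pbad2way` calls) and, for `t2way`, trimmed means (to our understanding with 20% trimming by default, since the calls set no trimming argument). The toy example below, which does not use study data, shows how these estimators behave when one observation is extreme.

```r
# Toy scores with one extreme value; these are not study data.
scores <- c(1, 2, 3, 4, 100)

m_mean    <- mean(scores)              # pulled upward by the extreme value
m_median  <- median(scores)            # middle value of the sorted scores
m_trimmed <- mean(scores, trim = 0.2)  # drops the lowest and highest 20%

print(c(mean = m_mean, median = m_median, trimmed = m_trimmed))
```

With five values, 20% trimming removes one observation from each end, so the trimmed mean and the median coincide here while the ordinary mean is dominated by the single extreme score.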
Prior to moving on we also explore what would happen to the main effects model if model size (small versus large) were included as a factor. Small models have two alternatives and large models three. For that we resort to repeated-measures analysis via MANOVA as seen below.
m <- lm(cbind(Correct_Large, Correct_Small) ~ Group + amas_score + csi_score + Approach, data = dd)
Complexity <- ordered(c("Correct_Large", "Correct_Small"))
idata<-data.frame(Complexity)
modAn<-Manova(m,idata = idata, idesign = ~ Complexity,type = "III")
print(modAn)
##
## Type III Repeated Measures MANOVA Tests: Pillai test statistic
## Df test stat approx F num Df den Df Pr(>F)
## (Intercept) 1 0.31251 44.093 1 97 1.812e-09 ***
## Group 1 0.42665 72.180 1 97 2.345e-13 ***
## amas_score 1 0.05508 5.654 1 97 0.01938 *
## csi_score 1 0.03258 3.266 1 97 0.07382 .
## Approach 1 0.05424 5.563 1 97 0.02035 *
## Complexity 1 0.00051 0.050 1 97 0.82420
## Group:Complexity 1 0.01306 1.284 1 97 0.26003
## amas_score:Complexity 1 0.00295 0.287 1 97 0.59368
## csi_score:Complexity 1 0.00754 0.737 1 97 0.39266
## Approach:Complexity 1 0.02395 2.380 1 97 0.12616
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We find no evidence that model size plays a role, nor that it interacts with the other terms; the effects discussed earlier re-appear as expected.