Sample Characteristics

A total of 41 persons showed up for the experiment, 21 assigned to the symbolic instrument and 20 to the textual one. Of these, 5 were filtered out due to low performance on questions testing their comprehension of the concepts presented in the videos and 1 due to an incomplete response. The remaining 35 cases, 18 in the symbolic and 17 in the textual instrument, are used for the analysis. They are 26 females and 9 males; their ages are predominantly 18-29 and their field of study is primarily Business and Economics.

Descriptives

Descriptives: Positive Contributions

Models are annotated with the column name by which they are represented in the data file (C.1, C.2 etc).

Descriptives: Negative Contributions

Models are annotated with the column name by which they are represented in the data file (C.1, C.2 etc).

Agreement Analysis

To measure the distance between participant responses we first map FD, PD, N, PS, FS (their actual responses) to the interval scale \([1,5]\). Then, for each of the twenty scenarios and for each group (symbolic vs. textual), we perform all pair-wise comparisons between participant responses \(r_i\) and \(r_j\), \(i,j = 1\ldots N, i\neq j\), to calculate the normalized distance \(|r_i - r_j|/4\). The average of all these \(N(N-1)/2\) distances is taken, \(N\) being the number of participants in each group. The resulting set consists of \(2\) (groups) \(\times\) \(20\) (exercises) \(= 40\) data points, each expressing the average distance between the ratings of all pairs of participants of one group on one exercise.
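The distance measure described above can be sketched as follows. This is only an illustration: the coding FD=1 through FS=5 is assumed from the order in which the labels are listed, and the function name and example responses are hypothetical.

```python
from itertools import combinations

# Assumed coding of the five response labels onto the interval scale [1, 5].
SCALE = {"FD": 1, "PD": 2, "N": 3, "PS": 4, "FS": 5}


def mean_pairwise_distance(responses):
    """Average normalized distance |r_i - r_j| / 4 over all unordered pairs."""
    coded = [SCALE[r] for r in responses]
    pairs = list(combinations(coded, 2))  # the N(N-1)/2 unordered pairs
    if not pairs:
        raise ValueError("need at least two responses")
    return sum(abs(a - b) / 4 for a, b in pairs) / len(pairs)


# Hypothetical responses of four participants of one group to one scenario.
example = ["FS", "PS", "PS", "N"]
print(mean_pairwise_distance(example))
```

A value of 0 means that all participants of the group gave the same rating; a value of 1 can only occur when every pair sits at opposite ends of the scale.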

BoxPlots

Agreement wrt. Sign and Intensity

There is an obvious difference between positive and negative contributions, the latter yielding more disagreement. An effect of intensity can also be seen, though it is more subtle. Interestingly, agreement appears to be greater in the symbolic representation, especially for positive contributions.

Agreement wrt. Sign and Origin Satisfaction

In these two graphs we compare models in which the origin has a positive contribution with those whose origin has a negative one. Responses associated with models in which the satisfaction of the origin is “no information” (N) are omitted.

While a possible effect of the group (symbolic vs. textual) is very subtle, a clearer effect can be seen in the figure on the right: denial of the origin is always a source of disagreement. The difference is more pronounced in the positive contributions, where disagreement is nevertheless generally lower.

Relative Deviation from Normative (I)

To measure the distance between the participant responses and the normative ones according to the formal semantics (or simply the accuracy of the participants’ responses), we again code both responses on the scale \([1,5]\), which we interpret as interval. For each single response \(i\) on exercise \(j\) two distances are calculated: relative distance \(d_{ij} = obs_{ij} - norm_j\) and absolute distance \(|d_{ij}|\). When the former is positive, participants overestimate satisfaction of the destination; when it is negative, they underestimate it.
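As a minimal sketch of the two distances, assuming the same FD=1 through FS=5 coding as in the agreement analysis (the function name is hypothetical):

```python
# Assumed coding of the response labels onto the interval scale [1, 5].
SCALE = {"FD": 1, "PD": 2, "N": 3, "PS": 4, "FS": 5}


def deviations(observed, normative):
    """Return (relative, absolute) distance of one response from the normative answer."""
    relative = SCALE[observed] - SCALE[normative]  # > 0: overestimation, < 0: underestimation
    return relative, abs(relative)


# A participant answers FS where the formal semantics prescribe PS.
print(deviations("FS", "PS"))
# A participant answers PD where the formal semantics prescribe FS.
print(deviations("PD", "FS"))
```

The relative distance keeps the direction of the error, which is what the over- and underestimation analysis relies on; the absolute distance only records its size.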

Boxplots

In the top left figure pp, p, n, nn stand for “\(++\)”, “\(+\)”, “\(-\)”, “\(--\)” or make, help, hurt, break, depending on the group considered. The same figure shows clearly that satisfaction of the destination goal is overestimated with positive labels and underestimated with negative ones.

The top left figure seems to also show that when the origin goal is denied, the underestimation is more pronounced.

Is relative deviation zero?

Group x Sign x Intensity

The following table summarizes one-sample t-tests (as well as their non-parametric equivalent) of the hypothesis that the mean distance from normative is zero, for each of the \(2\times 2\times 2\) (visualization group, label sign, label intensity) cells, as well as for each of the \(2\times 4\times 2\) (visualization group, label, origin satisfaction) cells.

Relative distance is meaningful for this test. They are all independent comparisons.
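For reference, the one-sample t-test of zero mean deviation is assumed here to take its standard form. The symbols \(d_k\), \(\bar d\), \(s_d\) and \(n\) (the relative distances in a cell, their mean, their standard deviation and their number) are introduced only for this illustration.

```latex
\[
  t = \frac{\bar d}{s_d / \sqrt{n}},
  \qquad
  \bar d = \frac{1}{n}\sum_{k=1}^{n} d_k,
  \qquad
  s_d^2 = \frac{1}{n-1}\sum_{k=1}^{n} \left(d_k - \bar d\right)^2 ,
\]
```

The statistic is compared against a t distribution with \(n-1\) degrees of freedom; a cell is marked non-zero when the resulting two-sided p-value falls below the chosen significance level.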

Group            Mean Deviation   Sign. (* means p < 0.05)
Symbolic.pp       0.49            * (non-zero)
Symbolic.p        0.14
Symbolic.m       -0.31            * (non-zero)
Symbolic.mm      -0.38            * (non-zero)
Textual.pp        0.61            * (non-zero)
Textual.p         0.29            * (non-zero)
Textual.m        -0.20
Textual.mm       -0.49            * (non-zero)
Symbolic.ppSat    0.17            * (non-zero)
Symbolic.pSat     0.39            * (non-zero)
Symbolic.mSat     0.36
Symbolic.mmSat    0.72            * (non-zero)
Symbolic.ppDen    0.83            * (non-zero)
Symbolic.pDen    -0.14
Symbolic.mDen    -1.03            * (non-zero)
Symbolic.mmDen   -1.44            * (non-zero)
Textual.ppSat    -0.09
Textual.pSat      0.21
Textual.mSat      0.62
Textual.mmSat     0.82            * (non-zero)
Textual.ppDen     1.41            * (non-zero)
Textual.pDen      0.47
Textual.mDen     -1.00            * (non-zero)
Textual.mmDen    -1.88            * (non-zero)

Overestimation vs. Underestimation

We now investigate under what circumstances participants overestimate and underestimate satisfaction. The table below displays the average overestimation (positive value) or underestimation (negative value) per visualization style, contribution label and origin satisfaction.

##                 Label   Make (++)    Help (+)    Hurt (-)  Break (--)
## Group    Origin                                                      
## Symbolic Sat           0.16666667  0.38888889  0.36111111  0.72222222
##          Den           0.83333333 -0.13888889 -1.02777778 -1.44444444
## Textual  Sat          -0.08823529  0.20588235  0.61764706  0.82352941
##          Den           1.41176471  0.47058824 -1.00000000 -1.88235294

There are some differences between the two visualizations in terms of overestimation and underestimation. In addition, the extreme labels (\(++\)) and (\(--\)) may naturally feature greater error. Cases in which the average error is consistent across the two groups and substantial (more than \(0.7\) in absolute value) are:

  1. hurt and break labels with denied origin goals, where satisfaction is underestimated.

  2. break labels with satisfied origin goals, where satisfaction is overestimated. As in case 1, participants do not seem to perceive the satisfaction inversion of negative labels.

  3. make links with denied origin goals, where satisfaction is overestimated.
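The selection of these cases can be reproduced with a short sketch over the rounded cell means from the table above. The threshold of 0.7 is the one stated in the text; the data structure and function name are hypothetical.

```python
THRESHOLD = 0.7

# Mean relative deviation per (group, origin) and label, rounded from the table above.
MEANS = {
    ("Symbolic", "Sat"): {"make": 0.17, "help": 0.39, "hurt": 0.36, "break": 0.72},
    ("Symbolic", "Den"): {"make": 0.83, "help": -0.14, "hurt": -1.03, "break": -1.44},
    ("Textual", "Sat"): {"make": -0.09, "help": 0.21, "hurt": 0.62, "break": 0.82},
    ("Textual", "Den"): {"make": 1.41, "help": 0.47, "hurt": -1.00, "break": -1.88},
}


def substantial_cells(means, threshold):
    """List cells whose mean deviation exceeds the threshold in absolute value."""
    flagged = []
    for (group, origin), row in means.items():
        for label, value in row.items():
            if abs(value) > threshold:
                direction = "over" if value > 0 else "under"
                flagged.append((group, origin, label, direction))
    return flagged


for cell in substantial_cells(MEANS, THRESHOLD):
    print(cell)
```

Each flagged combination of origin and label appears for both groups with the same direction, which is the sense in which the error is consistent.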

Looking at the descriptive images above, it becomes clear that many participants do not seem to perceive the satisfaction inversion of negative labels (cases 1 and 2). In addition, they do not seem to accept that even a strong make relationship can result in a fully denied destination goal (case 3).

If we focus exclusively on cases in which the satisfaction of the origin is labelled as “no information” (N), the following table describes the average deviation from normative.

##          Label  Make (++)   Help (+)   Hurt (-) Break (--)
## Group                                                     
## Symbolic        0.4444444  0.2222222 -0.2222222 -0.4444444
## Textual         0.4117647  0.1176471 -0.2352941 -0.3529412

It seems that users perceive contributions as generators of satisfaction rather than just propagators thereof. Positive contributions result in some satisfaction, and negative contributions result in some denial, irrespective of the satisfaction of the origin. This mental model of labels conflicts with the links’ ability to completely invert satisfaction, which users also had some trouble comprehending.

Assumptions

Homogeneity - Box’s test

## 
##  Box's M-test for Homogeneity of Covariance Matrices
## 
## data:  dRelLab[, -c(1, 2)]
## Chi-Sq (approx.) = 14.706, df = 10, p-value = 0.1431

Box’s M test is not significant (\(p = 0.1431\)); we can assume homogeneity of covariance matrices.

Normality Tests

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.79087, p-value = 0.001132
## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.59411, p-value = 9.134e-06

QQ-Plots

We do seem to have issues with normality, and the small sample size does not entirely allow us to address them by appealing to robustness arguments.

MANOVA

## 
## Type III Repeated Measures MANOVA Tests: Pillai test statistic
##                     Df test stat approx F num Df den Df   Pr(>F)   
## (Intercept)          1  0.001781   0.0589      1     33 0.809793   
## Group                1  0.019667   0.6620      1     33 0.421674   
## Sig                  1  0.198117   8.1531      1     33 0.007378 **
## Group:Sig            1  0.005268   0.1748      1     33 0.678606   
## Intensity            1  0.178278   7.1596      1     33 0.011516 * 
## Group:Intensity      1  0.081121   2.9133      1     33 0.097245 . 
## Sig:Intensity        1  0.097017   3.5455      1     33 0.068546 . 
## Group:Sig:Intensity  1  0.012279   0.4102      1     33 0.526271   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sign Effect

## 
##  Response transformation matrix:
##    Sig1
## pp   -1
## p    -1
## m     1
## mm    1
## 
## Sum of squares and products for the hypothesis:
##          Sig1
## Sig1 31.46889
## 
## Sum of squares and products for error:
##          Sig1
## Sig1 127.3711
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df    Pr(>F)   
## Pillai            1 0.1981169 8.153131      1     33 0.0073777 **
## Wilks             1 0.8018831 8.153131      1     33 0.0073777 **
## Hotelling-Lawley  1 0.2470646 8.153131      1     33 0.0073777 **
## Roy               1 0.2470646 8.153131      1     33 0.0073777 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There is a significant main effect of Sign \(F(1,33)=8.1531, p=0.007378\).
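The relation between the sums of squares and the test statistics printed above can be checked with a short sketch. It assumes the single-contrast case shown here (one hypothesis degree of freedom, one transformed response), in which the multivariate statistics reduce to a univariate F; the function name is hypothetical.

```python
def single_contrast_test(ss_hypothesis, ss_error, df_error):
    """Pillai's trace and F for a contrast with one hypothesis degree of freedom."""
    pillai = ss_hypothesis / (ss_hypothesis + ss_error)
    hotelling_lawley = ss_hypothesis / ss_error
    f_stat = hotelling_lawley * df_error  # (SSH / 1) / (SSE / df_error)
    return pillai, hotelling_lawley, f_stat


# Values taken from the Sign contrast output above.
pillai, hotelling_lawley, f_stat = single_contrast_test(31.46889, 127.3711, 33)
print(round(pillai, 4), round(hotelling_lawley, 4), round(f_stat, 3))
```

With these inputs the three numbers agree with the Pillai, Hotelling-Lawley and approx F columns of the output up to rounding, which is why all four multivariate tests report the same F and p-value.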

Intensity Effect

## 
##  Response transformation matrix:
##    Intensity1
## pp          1
## p          -1
## m          -1
## mm          1
## 
## Sum of squares and products for the hypothesis:
##            Intensity1
## Intensity1   1.388889
## 
## Sum of squares and products for error:
##            Intensity1
## Intensity1   6.401699
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df   Pr(>F)  
## Pillai            1 0.1782778 7.159557      1     33 0.011516 *
## Wilks             1 0.8217222 7.159557      1     33 0.011516 *
## Hotelling-Lawley  1 0.2169563 7.159557      1     33 0.011516 *
## Roy               1 0.2169563 7.159557      1     33 0.011516 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There is a significant main effect of Intensity \(F(1,33)=7.1596, p=0.011516\).

Relative Deviation from Normative (II)

In these tests, we utilize a different aggregation of the data, in which the satisfaction of the origin is also considered as one of the factors. In this data set the data points coming from models in which satisfaction of origin is “No Information” (N) are excluded.

BoxPlots

Assumptions

Homogeneity - Box’s test

## 
##  Box's M-test for Homogeneity of Covariance Matrices
## 
## data:  dRelSatPr[, -c(1, 2)]
## Chi-Sq (approx.) = 64.497, df = 36, p-value = 0.002446

Given the number of DVs here, it is not surprising that Box’s M test is significant. Following Tabachnick and Fidell, we randomly remove one of the cases in order to bring all cells to equal size. With equal cell sizes, and given that the test is not significant at \(p<0.001\), we can proceed with the analysis.
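The random removal of cases can be sketched as below. This is an illustration only, not the procedure actually used for the report: the function name, the case identifiers and the seed are hypothetical, and the group sizes of 18 and 17 are the ones given in the sample description.

```python
import random


def equalize_groups(cases_by_group, seed=None):
    """Randomly drop cases from the larger groups so that all groups have equal size."""
    rng = random.Random(seed)  # a fixed seed makes the removal reproducible
    target = min(len(cases) for cases in cases_by_group.values())
    balanced = {}
    for group, cases in cases_by_group.items():
        keep = set(rng.sample(range(len(cases)), target))  # positions to retain
        balanced[group] = [case for i, case in enumerate(cases) if i in keep]
    return balanced


# Hypothetical participant identifiers: 18 symbolic and 17 textual cases.
groups = {"Symbolic": list(range(1, 19)), "Textual": list(range(19, 36))}
balanced = equalize_groups(groups, seed=1)
print({group: len(cases) for group, cases in balanced.items()})
```

Dropping one case from the larger group leaves 34 cases, which matches the 32 error degrees of freedom in the MANOVA tables that follow.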

Normality Tests

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.58052, p-value = 6.784e-06
## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.51064, p-value = 1.597e-06

(see commentary above on normality)

MANOVA

## 
## Type III Repeated Measures MANOVA Tests: Pillai test statistic
##                  Df test stat approx F num Df den Df   Pr(>F)   
## (Intercept)       1  0.001860   0.0596      1     32 0.808654   
## Group             1  0.021013   0.6868      1     32 0.413381   
## Sig               1  0.195266   7.7647      1     32 0.008884 **
## Group:Sig         1  0.006464   0.2082      1     32 0.651259   
## Origin            1  0.275118  12.1451      1     32 0.001450 **
## Group:Origin      1  0.015996   0.5202      1     32 0.475996   
## Sig:Origin        1  0.207551   8.3811      1     32 0.006778 **
## Group:Sig:Origin  1  0.033058   1.0940      1     32 0.303418   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sign Effect

## 
##  Response transformation matrix:
##       Sig1
## ppSat   -1
## ppDen   -1
## pSat    -1
## pDen    -1
## mSat     1
## mDen     1
## mmSat    1
## mmDen    1
## 
## Sum of squares and products for the hypothesis:
##          Sig1
## Sig1 132.7206
## 
## Sum of squares and products for error:
##          Sig1
## Sig1 546.9706
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df   Pr(>F)   
## Pillai            1 0.1952660 7.764693      1     32 0.008884 **
## Wilks             1 0.8047340 7.764693      1     32 0.008884 **
## Hotelling-Lawley  1 0.2426467 7.764693      1     32 0.008884 **
## Roy               1 0.2426467 7.764693      1     32 0.008884 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There is a main effect on Sign \(F(1,32)=7.7647, p=0.008884\).

Origin Satisfaction Effect

## 
##  Response transformation matrix:
##       Origin1
## ppSat      -1
## ppDen       1
## pSat       -1
## pDen        1
## mSat       -1
## mDen        1
## mmSat      -1
## mmDen       1
## 
## Sum of squares and products for the hypothesis:
##          Origin1
## Origin1 222.4853
## 
## Sum of squares and products for error:
##          Origin1
## Origin1 586.2059
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df    Pr(>F)   
## Pillai            1 0.2751177  12.1451      1     32 0.0014504 **
## Wilks             1 0.7248823  12.1451      1     32 0.0014504 **
## Hotelling-Lawley  1 0.3795344  12.1451      1     32 0.0014504 **
## Roy               1 0.3795344  12.1451      1     32 0.0014504 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There is a main effect of Satisfaction of Origin goal \(F(1,32) = 12.1451, p=0.00145\).

Interaction

The above interaction is fairly intuitive: for negative contributions a denial of the origin leads to underestimation of the destination satisfaction (participants do not perceive satisfaction inversion). Positive contributions appear to be perceived as “blocking” the denial of the origin goal, hence the slight overestimation in such configurations.

Absolute Deviation from Normative (I)

We now turn to deviations based on absolute distance.

BoxPlots

Assumptions

Homogeneity - Box’s test

## 
##  Box's M-test for Homogeneity of Covariance Matrices
## 
## data:  dAbsLab[, -c(1, 2)]
## Chi-Sq (approx.) = 13.059, df = 10, p-value = 0.2204

Box’s M test is not significant (\(p = 0.2204\)); we can assume homogeneity of covariance matrices.

Normality Tests

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.88285, p-value = 0.02916
## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.84106, p-value = 0.007859

QQ-Plots

We have a deviation from normality which, furthermore, cannot be cured with transformations. Judging from the “quantized” appearance of the graphs, we suspect that the culprit is the discrete scale [0..4] for measuring distance, probably combined with the small sample size. The tests that follow rely on their robustness to such deviations, as suggested in Tabachnick and Fidell, and on the problem being less critical for “large” \(N\).

Hypotheses Testing

## 
## Type III Repeated Measures MANOVA Tests: Pillai test statistic
##                     Df test stat approx F num Df den Df    Pr(>F)    
## (Intercept)          1   0.53737   38.331      1     33 5.502e-07 ***
## Group                1   0.02478    0.839      1     33   0.36646    
## Sig                  1   0.17961    7.225      1     33   0.01118 *  
## Group:Sig            1   0.00335    0.111      1     33   0.74111    
## Intensity            1   0.01794    0.603      1     33   0.44298    
## Group:Intensity      1   0.00868    0.289      1     33   0.59458    
## Sig:Intensity        1   0.05417    1.890      1     33   0.17848    
## Group:Sig:Intensity  1   0.02010    0.677      1     33   0.41651    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There is a significant main effect of Sign \(F(1,33)=7.225, p<0.05\); no other main effect or interaction is significant.

Sign Effect

## 
##  Response transformation matrix:
##    Sig1
## pp   -1
## p    -1
## m     1
## mm    1
## 
## Sum of squares and products for the hypothesis:
##          Sig1
## Sig1 16.05556
## 
## Sum of squares and products for error:
##          Sig1
## Sig1 73.33503
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df   Pr(>F)  
## Pillai            1 0.1796113 7.224833      1     33 0.011179 *
## Wilks             1 0.8203887 7.224833      1     33 0.011179 *
## Hotelling-Lawley  1 0.2189343 7.224833      1     33 0.011179 *
## Roy               1 0.2189343 7.224833      1     33 0.011179 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Intensity Effect

## 
##  Response transformation matrix:
##    Intensity1
## pp          1
## p          -1
## m          -1
## mm          1
## 
## Sum of squares and products for the hypothesis:
##            Intensity1
## Intensity1  0.6422222
## 
## Sum of squares and products for error:
##            Intensity1
## Intensity1   35.14837
## 
## Multivariate Tests: 
##                  Df test stat  approx F num Df den Df  Pr(>F)
## Pillai            1 0.0179439 0.6029678      1     33 0.44298
## Wilks             1 0.9820561 0.6029678      1     33 0.44298
## Hotelling-Lawley  1 0.0182718 0.6029678      1     33 0.44298
## Roy               1 0.0182718 0.6029678      1     33 0.44298

There is no Intensity effect.

Absolute Deviation from Normative (II)

To consider satisfaction of origin as one of the factors, a different data set is prepared, in which the N (no information) satisfaction value is eliminated.

Assumptions

Homogeneity - Box’s test

## 
##  Box's M-test for Homogeneity of Covariance Matrices
## 
## data:  dAbsSat[, -c(1, 2)]
## Chi-Sq (approx.) = 59.233, df = 36, p-value = 0.008688
## 
##  Box's M-test for Homogeneity of Covariance Matrices
## 
## data:  dAbsSatPr[, -c(1, 2)]
## Chi-Sq (approx.) = 55.805, df = 36, p-value = 0.01868

Given the number of DVs here, it is not surprising that Box’s M test is significant. Following Tabachnick and Fidell, we randomly remove one of the cases in order to bring all cells to equal size. With equal cell sizes, and given that the test is not significant at \(p<0.001\), we can proceed with the analysis.

Normality Tests

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.58437, p-value = 7.377e-06
## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.53423, p-value = 2.564e-06

See above for discussion on normality.

MANOVA

## 
## Type III Repeated Measures MANOVA Tests: Pillai test statistic
##                  Df test stat approx F num Df den Df    Pr(>F)    
## (Intercept)       1   0.55736   40.293      1     32 3.982e-07 ***
## Group             1   0.01968    0.642      1     32  0.428746    
## Sig               1   0.22793    9.447      1     32  0.004301 ** 
## Group:Sig         1   0.01126    0.364      1     32  0.550297    
## Origin            1   0.17249    6.670      1     32  0.014582 *  
## Group:Origin      1   0.00088    0.028      1     32  0.867480    
## Sig:Origin        1   0.00408    0.131      1     32  0.719534    
## Group:Sig:Origin  1   0.00051    0.016      1     32  0.898887    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There is a significant effect of Sign \(F(1,32) = 9.447, p<0.01\) and of the satisfaction value of the origin \(F(1,32) = 6.67, p<0.05\). As we saw in the graphs, both negative signs and denial values cause more deviation from the formal semantics.

Sign Effect

## 
##  Response transformation matrix:
##       Sig1
## ppSat   -1
## ppDen   -1
## pSat    -1
## pDen    -1
## mSat     1
## mDen     1
## mmSat    1
## mmDen    1
## 
## Sum of squares and products for the hypothesis:
##          Sig1
## Sig1 119.1176
## 
## Sum of squares and products for error:
##       Sig1
## Sig1 403.5
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df    Pr(>F)   
## Pillai            1  0.227925 9.446753      1     32 0.0043013 **
## Wilks             1  0.772075 9.446753      1     32 0.0043013 **
## Hotelling-Lawley  1  0.295211 9.446753      1     32 0.0043013 **
## Roy               1  0.295211 9.446753      1     32 0.0043013 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Satisfaction Origin Effect

## 
##  Response transformation matrix:
##       Origin1
## ppSat      -1
## ppDen       1
## pSat       -1
## pDen        1
## mSat       -1
## mDen        1
## mmSat      -1
## mmDen       1
## 
## Sum of squares and products for the hypothesis:
##          Origin1
## Origin1 84.94118
## 
## Sum of squares and products for error:
##         Origin1
## Origin1   407.5
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df   Pr(>F)  
## Pillai            1 0.1724900 6.670227      1     32 0.014582 *
## Wilks             1 0.8275100 6.670227      1     32 0.014582 *
## Hotelling-Lawley  1 0.2084446 6.670227      1     32 0.014582 *
## Roy               1 0.2084446 6.670227      1     32 0.014582 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Confidence Analysis

To analyze response confidence and what affects it, we first map the Likert-type scale to the scale {-2,-1,0,+1,+2} and then sum the correspondingly coded values per type of contribution link for each of the groups. We then proceed with the analysis as above. Neither visually nor through statistical tests do we find any main effect of the type of label on the response confidence of the participants.
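The coding and summation step can be sketched as follows. The wording of the Likert items is not given in this report, so the labels in the mapping are hypothetical placeholders, as are the function name and the example records.

```python
from collections import defaultdict

# Hypothetical Likert wording; only the numeric codes -2..+2 come from the text.
CONFIDENCE = {
    "not at all confident": -2,
    "slightly confident": -1,
    "neutral": 0,
    "confident": 1,
    "very confident": 2,
}


def confidence_totals(records):
    """Sum coded confidence per (group, link type) over (group, link, answer) records."""
    totals = defaultdict(int)
    for group, link_type, answer in records:
        totals[(group, link_type)] += CONFIDENCE[answer]
    return dict(totals)


# Hypothetical records for illustration.
records = [
    ("Symbolic", "pp", "very confident"),
    ("Symbolic", "pp", "confident"),
    ("Symbolic", "mm", "slightly confident"),
    ("Textual", "pp", "neutral"),
]
print(confidence_totals(records))
```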

The Data: BoxPlot

Hypotheses Testing

## 
## Type III Repeated Measures MANOVA Tests: Pillai test statistic
##                     Df test stat approx F num Df den Df   Pr(>F)    
## (Intercept)          1   0.52017   35.774      1     33 1.02e-06 ***
## Group                1   0.00517    0.172      1     33  0.68142    
## Sig                  1   0.06982    2.477      1     33  0.12505    
## Group:Sig            1   0.00898    0.299      1     33  0.58823    
## Intensity            1   0.03388    1.157      1     33  0.28984    
## Group:Intensity      1   0.04030    1.386      1     33  0.24754    
## Sig:Intensity        1   0.09285    3.378      1     33  0.07510 .  
## Group:Sig:Intensity  1   0.10918    4.045      1     33  0.05254 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response transformation matrix:
##    Sig1
## pp   -1
## p    -1
## m     1
## mm    1
## 
## Sum of squares and products for the hypothesis:
##           Sig1
## Sig1 0.8022222
## 
## Sum of squares and products for error:
##          Sig1
## Sig1 10.68719
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df  Pr(>F)
## Pillai            1 0.0698227 2.477109      1     33 0.12505
## Wilks             1 0.9301773 2.477109      1     33 0.12505
## Hotelling-Lawley  1 0.0750639 2.477109      1     33 0.12505
## Roy               1 0.0750639 2.477109      1     33 0.12505
## 
##  Response transformation matrix:
##    Intensity1
## pp          1
## p          -1
## m          -1
## mm          1
## 
## Sum of squares and products for the hypothesis:
##            Intensity1
## Intensity1  0.3755556
## 
## Sum of squares and products for error:
##            Intensity1
## Intensity1   10.70915
## 
## Multivariate Tests: 
##                  Df test stat approx F num Df den Df  Pr(>F)
## Pillai            1 0.0338805 1.157266      1     33 0.28984
## Wilks             1 0.9661195 1.157266      1     33 0.28984
## Hotelling-Lawley  1 0.0350687 1.157266      1     33 0.28984
## Roy               1 0.0350687 1.157266      1     33 0.28984