1 Descriptives

A total of 230 crowdworkers participated in this study: 114 from Applause (application testing), 16 from Atizo (innovation and market research) and 100 from Crowdguru (generic crowdsourcing).

1.1 Demographics

Figure 1.1: Participants by Age, Gender and Platform (3 rows with non-finite values were removed from the histogram)

Perhaps surprisingly, the crowdworkers surveyed here span a wide range of ages, with a mean age of about 38.1 years. The typical participant is in early middle age: the median age is 35 years. That the median lies below the mean indicates a skew towards younger participants overall, with a few older outliers. Crowdguru draws relatively more female participants, while Applause draws more male respondents. Across all three platforms, the numbers of female and male respondents are roughly equal (109 and 121, respectively).

Table 1.1: Educational Attainment
Education                                            n
Degree from a university of applied sciences         61
University degree                                    82
In-company vocational training or apprenticeship     47
Civil service training                               1
Other vocational training                            18
School-based vocational training                     20
No answer or missing value                           1

Participants are, overall, relatively well educated: over half (143 of 230) report a tertiary degree from a university or a university of applied sciences, though this may, in part, be explained by the age structure of participants.

Only 22 participants report caring for a sick, old or disabled person in their household, and 62 have children.

Table 1.2: Employment Status
Employment                                                   n
Full-time employed                                           108
Part-time employed                                           38
In in-company training, apprenticeship or retraining         7
Marginally or irregularly employed                           35
Not employed                                                 41
No answer or missing value                                   1

More than half of the participants are in some kind of regular employment, though it is unclear whether participants understood the question to include their crowdworking.

Participants report having had, on arithmetic average, 4.9 separate employers or periods of self-employment, though the typical respondent reports only 4, reflecting some outliers who declared great fluctuation. Responses of 0 or of more than 50 employers were dropped from the analysis, because both are likely erroneous entries.
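The cleaning rule and the gap between mean and typical value described above can be sketched as follows. This is a minimal illustration only: the response values are made up, the function name `drop_implausible` is hypothetical, and reading "typical" as the median is an assumption.

```python
from statistics import mean, median

# Hypothetical responses (NOT the survey data): number of employers per participant.
reported_employers = [1, 2, 3, 3, 4, 4, 4, 5, 6, 8, 15, 0, 120]

def drop_implausible(values, low=1, high=50):
    """Keep responses within [low, high]; 0 and values above 50 count as entry errors."""
    return [v for v in values if low <= v <= high]

cleaned = drop_implausible(reported_employers)

print("dropped:", len(reported_employers) - len(cleaned))
print("mean:   ", round(mean(cleaned), 1))
print("median: ", median(cleaned))
```

With a few high values left in the data, the mean sits above the median, which is the pattern reported for the number of employers.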

The surveyed crowdworkers assessed their professional life optimistically overall: 115 described it as “on the rise”, 90 as “unchanging” and 24 as “on the decline”.

1.2 Current Work

Study subjects said they worked, on arithmetic average, for 1.7 platforms, with a range from 0 to 5. There were again a few erroneous entries.

Table 1.3: Type of Platform
Platform type            n
Innovation platforms     19
Testing platforms        111
Design platforms         1
Microjob platforms       88
Other                    11

Testing and microjob platforms are most popular among the participants, as is also reflected in the many respondents recruited via Applause and Crowdguru.

Figure 1.2: Crowdworking Hours per Month

On arithmetic average, participants worked 48.7 hours per month (median 30), but the spread is substantial (standard deviation 58.3). Figure 1.2 displays probability density estimates of the reported hours worked on all platforms and on the platform in question. [1] The figure suggests that working hours are more broadly spread out for respondents from Crowdguru, while more Applause participants work shorter hours. In both cases, the hours worked on the platform in question track the total crowdworking hours quite closely, with an arithmetic average of only 8.6 hours worked outside the studied platforms. The density estimate for the platform in question can sometimes exceed that for the overall hours; this is because the estimate is an aggregate statistic. At the individual level, all participants who reported more hours on the platform in question than on all crowdworking platforms combined were dropped on both variables. [2]
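The consistency rule for the two hours variables can be sketched as below. The data and the helper name `drop_inconsistent` are hypothetical; the sketch only assumes that each participant reports total hours and hours on the studied platform.

```python
from statistics import mean, median, stdev

# Hypothetical (total_hours, platform_hours) per participant and month, NOT the survey data.
responses = [(40, 30), (10, 10), (120, 100), (20, 35), (60, 60), (5, 2)]

def drop_inconsistent(rows):
    """Set both values to missing when platform hours exceed total hours."""
    return [(t, p) if p <= t else (None, None) for t, p in rows]

cleaned = drop_inconsistent(responses)
valid = [(t, p) for t, p in cleaned if t is not None]

total_hours = [t for t, _ in valid]
outside_hours = [t - p for t, p in valid]  # hours worked on other platforms

print("mean total hours:  ", round(mean(total_hours), 1))
print("median total hours:", median(total_hours))
print("sd total hours:    ", round(stdev(total_hours), 1))
print("mean outside hours:", round(mean(outside_hours), 1))
```

Both values are discarded because, as the notes explain, it is impossible to tell which of the two entries is the erroneous one.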

Table 1.4: Predominant Working Hours
Time of day                  n
Mostly in the morning        17
Mostly at midday             5
Mostly in the afternoon      22
Mostly in the evening        59
Mostly at night              8
No fixed time of day         119

There appears to be a slight preference for working in the evening, though most study subjects (119) reported not working at a fixed time of day.

Table 1.5: Predominant Working Days
Time of week                 n
On weekdays                  53
At the weekend               19
Throughout the whole week    158

Similarly, most respondents reported working throughout the week, including weekends.

Table 1.6: Predominant Working Locations
Workspace                n
From home                217
From work                11
While on the move        2

The overwhelming majority of all participants said they worked from home.

About half of the participants (119) would like to do their current crowdworking as a permanent, regular job.

1.3 Work and Crowdwork Expectations

The participants were surveyed on 28 items concerning expectations towards work, under two separate contexts: work in general, and crowdwork. The items were slightly reworded to fit the two contexts.

Figure 1.3: Items, Wordings and Average Scores

It should be noted that the items in the two batteries (work and crowdwork) are not strictly comparable. The differences in wording sometimes go beyond, or are unrelated to, the change of context. For example, item fair is worded “to be treated fairly” for the crowdwork context, and “to be treated fairly at work” (emphasis added) for the regular work context. Such subtle shifts in emphasis may contribute to small shifts in ratings or may, in any event, make it hard to compare the ratings between the two contexts.

Figure 1.4: Work Expectations (Heatmap of Counts)

Figure 1.5: Crowdwork Expectations (Heatmap of Counts)

Figure 1.3 and the heatmaps above show relatively little variance within and across the 28 measured variables in both contexts. Most participants rate most of the items as relatively important or very important. Hardly any respondents give, and hardly any variables receive, negative ratings (“does not apply at all”, “does not apply”). This does not bode well for a correlation analysis or dimensionality reduction: there is simply not enough variance in either the person or the variable mode of the data matrix to expect much structure.

Figure 1.6: Work - Crowd Expectations (Heatmap of Counts)

There is also not much difference in the ratings of the variables between the two contexts. In figure 1.6, the individual crowdwork scores are subtracted from the work scores, and the resulting differences are then counted. For example, the count for -3 in figure 1.6 is the number of participants who rate the item in question 3 levels higher in the crowd than in the regular work context. There are, alas, very few such substantial differences. Most of the participants rate the statements very similarly for the two contexts.
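The difference counts behind such a heatmap can be sketched for a single item as follows. The ratings are invented, the coding of the scale from -2 to 2 is an assumption based on the codes mentioned for “agree” (1) and “strongly agree” (2), and `difference_counts` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical ratings for ONE item, coded -2..2; None marks a missing answer.
work = [2, 2, 1, 1, 2, 0, 1, None, 2, -1]
crowd = [2, 1, 1, 2, 2, 0, 1, 2, None, 2]

def difference_counts(work_scores, crowd_scores):
    """Count work-minus-crowd differences for participants who answered both items."""
    diffs = [w - c for w, c in zip(work_scores, crowd_scores)
             if w is not None and c is not None]
    return Counter(diffs)

counts = difference_counts(work, crowd)
for diff in sorted(counts):
    print(f"{diff:+d}: {counts[diff]} participant(s)")
```

A negative difference means the item was rated higher in the crowd context; in this toy example, as in the survey, most differences are zero.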

2 Structure

Because the great majority of participants rated most items as “agree” (1) or “strongly agree” (2), interpreting the Likert scores as interval-scaled, as is sometimes done, may not be appropriate in this context. When the scores are z-scored, as happens in most parametric procedures, the relatively few outlying ratings are transformed into great distances, an interpretation that may not accord with the usage or expectations of participants. Such z-scoring is not usually a problem when there are many, widely used levels, or when the normative measurement of the underlying phenomenon is easily accomplished (as for temperature), but in tightly concentrated, discrete distributions such as the present one, z-scoring forces some arbitrary choices. For example, a Pearson’s correlation coefficient \(\rho\) will be heavily swayed by possibly correlated items in the extremes of these faux-continuous, scaled distributions. Other procedures, such as the rank-based Kendall’s correlation coefficient \(\tau\) used below, improve upon this situation somewhat, but merely imply another arbitrary choice in how to weigh the rare extremes (in this case, swayed by the number of ties).

A proper analysis of the data, to the extent that it is worthwhile, will therefore require non-parametric procedures.
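The effect of z-scoring on such a concentrated distribution can be illustrated with a small sketch. The ratings below are invented for a single hypothetical item and assume a coding from -2 to 2; they are not taken from the survey.

```python
from statistics import mean, stdev

# Hypothetical item with 230 ratings, coded -2..2 (NOT the survey data):
# nearly everyone answers 1 ("agree") or 2 ("strongly agree"), two answer -2.
ratings = [2] * 130 + [1] * 98 + [-2] * 2

mu = mean(ratings)
sigma = stdev(ratings)

def z_score(value):
    """Standardize a raw rating against the item's mean and standard deviation."""
    return (value - mu) / sigma

for level in (2, 1, -2):
    print(f"rating {level:+d} -> z = {z_score(level):.2f}")
```

Under these made-up numbers, the two common answers both lie within one standard deviation of the mean, while the two dissenting answers end up more than five standard deviations away. Any statistic built on products of z-scores is then dominated by those two respondents.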

2.1 Correlations

Somewhat surprisingly given the small differences described above, the ratings in the two contexts display only middling correlations of around 0.4, though these are still fairly high for individual-level data (each data point is a person-variable rating). [3]
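For readers unfamiliar with the statistic, the following is a minimal pure-Python sketch of Kendall's \(\tau\) in its tie-corrected form (tau-b), computed on pairwise complete observations. It is not necessarily the implementation used for the figures in this report; the function name `kendall_tau_b` and the example ratings are hypothetical.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b; observations with a missing value (None) in x or y are skipped."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    concordant = discordant = ties_x = ties_y = 0
    for (a1, b1), (a2, b2) in combinations(pairs, 2):
        dx, dy = a1 - a2, b1 - b2
        if dx == 0 and dy == 0:
            continue          # tied on both variables: contributes to neither term
        elif dx == 0:
            ties_x += 1       # tied on x only
        elif dy == 0:
            ties_y += 1       # tied on y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x) *
                 (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")

# Hypothetical ratings of one item in the two contexts; None marks a missing answer.
work = [1, 2, 2, None, 1, 2]
crowd = [1, 2, 1, 2, None, 2]
print(round(kendall_tau_b(work, crowd), 3))
```

With only a few distinct levels, most participant pairs are tied, so the denominator correction does much of the work. This is the sense in which the coefficient is swayed by the number of ties.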

A surprising pattern stands out from the correlation coefficient heatmaps (figures 2.1 and 2.2): many of the largest correlations are found close to the diagonal, between adjacent items, such as balance_loc and balance_time, correlated at .7. This artefact, not otherwise easily explained, may reflect an ordering effect, as participants were likely to respond similarly to items presented together. Because the Likert items (or other questions) were not randomized in the survey, interpretation is difficult, and any correlations or patterns therein are suspect: an item pair may be correlated simply because of adjacency, but the correlation may also express a substantive association. Additionally, adjacent items are often worded similarly, such as deadline and quantity (correlated at .6), though the patterns of similarity differ between the two conditions of instruction.

Figure 2.1: Correlations of Work Expectations

Figure 2.2: Correlations of Crowdwork Expectations

As expected, the correlations between pairs of variables are relatively low in both contexts, with an average absolute Kendall’s correlation coefficient of 0.2 for the work context and 0.3 for the crowdwork context. The minimum significance of .99 chosen in the above plots should be considered generous in this case: given the high number of item pairs, some correlations are likely to pass this threshold by chance alone.
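The scale of the multiple-comparison problem can be made concrete with a short calculation. It assumes that the .99 threshold corresponds to a per-test error rate of 0.01 and that all null hypotheses are true; the Bonferroni threshold is shown only for orientation and is not a procedure used in this report.

```python
from math import comb

n_items = 28   # items per battery, as stated in the text
alpha = 0.01   # ASSUMPTION: per-test error rate implied by the .99 threshold

n_pairs = comb(n_items, 2)                   # distinct item pairs in one heatmap
expected_false_positives = n_pairs * alpha   # expected count if no true association exists
bonferroni_alpha = alpha / n_pairs           # per-test level for a family-wise rate of alpha

print("item pairs:              ", n_pairs)
print("expected false positives:", round(expected_false_positives, 2))
print("Bonferroni per-test level:", f"{bonferroni_alpha:.6f}")
```

Under these assumptions, each heatmap would be expected to show a handful of "significant" cells even if the items were entirely unrelated.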

2.2 Dimensionality Reduction

Figure 2.3: Adjusted Scree Plot for Work

## Parallel analysis suggests that the number of factors =  8  and the number of components =  5
Figure 2.4: Adjusted Scree Plot for Crowd

## Parallel analysis suggests that the number of factors =  5  and the number of components =  3

Concordant with the above low (Kendall’s \(\tau\)) correlations, a Monte-Carlo parallel analysis based on polychoric correlations suggests that, at best, 2 factors should be retained in each context, explaining a combined 39.2 and (notably higher) 48.4 percent of the variance for work and crowd, respectively. The parallel analysis was conducted using resampling and simulation, as well as listwise and pairwise complete observations (as in the above), all with similar results. It should be noted that while the parallel analysis retention criterion (de-biased Kaiser-Guttman) curtails overfitting in the number of parameters (here, factors), the effect size (here, eigenvalues or explained variance) is still biased upwards. The adjusted eigenvalues (the part not readily explained by random chance) sum to 25.7 and 34.9 percent of the variance, respectively.
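The logic of a simulation-based parallel analysis with adjusted eigenvalues can be sketched as below. This is one common formulation and makes several assumptions that differ from the analysis reported here: it uses Pearson rather than polychoric correlations, synthetic continuous data with two planted factors, and NumPy as a dependency. The function name `parallel_analysis` and all parameter values are hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, seed=0):
    """Compare observed eigenvalues with those of random normal data of the same shape.

    Returns (observed, adjusted, retained). The adjusted eigenvalue is the observed
    eigenvalue minus the estimated bias (mean random eigenvalue minus 1); leading
    components are retained while their adjusted eigenvalue exceeds 1.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    bias = simulated.mean(axis=0) - 1.0
    adjusted = observed - bias
    retained = 0
    for value in adjusted:
        if value <= 1.0:
            break
        retained += 1
    return observed, adjusted, retained

# Synthetic data (NOT the survey data): 230 cases, 8 variables, 2 planted factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((230, 2))
loadings = np.zeros((2, 8))
loadings[0, :4] = 0.8
loadings[1, 4:] = 0.8
data = factors @ loadings + 0.6 * rng.standard_normal((230, 8))

observed, adjusted, retained = parallel_analysis(data)
print("retained:", retained)
print("unadjusted share:", round(observed[:retained].sum() / data.shape[1], 3))
print("adjusted share:  ", round(adjusted[:retained].sum() / data.shape[1], 3))
```

The adjusted share is smaller than the unadjusted one because even pure noise produces leading eigenvalues above 1 in a finite sample. This is the upward bias in effect size that the adjusted eigenvalues are meant to remove.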

There appears to be a somewhat more pronounced pattern under the crowd condition.

Figure 2.5: Loadings for Work Condition

Figure 2.6: Loadings for Crowd Condition

The unrotated loadings presented above reveal a strong, general, unipolar first factor for both the work and crowd contexts, loading to varying degrees on all variables. A second, bipolar factor loads only on some variables.

Unfortunately, and in line with the above correlations, many of the highly loading variables are also adjacent in the survey, possibly reflecting mere ordering artefacts.

Because these potential artefacts cannot be remedied, there remains a real risk that the extracted factors are, in fact, little more than reflections of the question order. This potential defect renders any further analysis and rotation moot.


  1. There are too few non-missing responses for the Atizo crowdworkers to meaningfully include in this graph.

  2. Because it was impossible to figure out which of the two numbers was entered erroneously, both were dropped from the analysis for these participants.

  3. To get meaningful correlations in spite of the frequent missing values, this and all of the correlations below use pairwise complete observations. No systematic analysis was undertaken to check how many observations remain under this procedure, or whether any cases or variables should be excluded entirely.