Comments on Greenberg et al. (2019): When Big Data Are The Answer

Kelsey Perrykkad and Jakob Hohwy


In their reply to our letter in PNAS, Greenberg et al. argue that their original conclusion in favour of the Extreme Male Brain theory of autism was in fact justified by the results of their big data study. They argue that the Autism Spectrum Quotient (AQ) does accurately capture [diagnosed cases of] autism in both sexes, contrary to our claim that it may be biased against female symptomatology. They also argue that the AQ was not designed to distinguish the sexes in a typical population, but that it does show a significant correlation. Finally, they provide biological evidence for the Extreme Male Brain theory in addition to the data provided by the original study.

The most important part of their reply, we think, is the point that we focused on the development of the full original versions of the questionnaires, and on their creation with reference to the full AQ and/or a diagnosed autistic sample. We acknowledge that we may have dismissed too quickly the importance of the adjustments made to the questionnaires for this study; all except those used in the replication cohort were shortened versions. The replication cohort did use the full versions of the EQ and SQ to confirm the predictions of the Empathizing–Systemizing theory of sex differences.

While these short versions were not tested against the AQ or in autistic populations as part of their validation, they were validated by showing that they correlate highly (r=.82-.96) with previous versions of the questionnaires, including the full versions we cited. In other words, they indirectly relate to the AQ in virtue of their design, even if they “were not developed to have an expected relationship with the AQ”. Greenberg et al.’s reply further states that using these different versions of the questionnaires means that “the results are to some extent independent of which items are included and rather an indication of effects in the underlying domains”. The specific items in each short version were a subset of the full original questionnaires (except for two items [32 & 33] on the newly developed SQ-R-Short that came from the SQ-R*), and were chosen based on their discrimination index, a measure of how well a specific answer on that question distinguishes between a particularly high and a particularly low result on the original questionnaire. In developing a short questionnaire (or a revised questionnaire), we believe a tension arises when claiming both that it is 1) sufficiently similar to the previous version (in both specific item content and correlation in scores) to warrant its use in measuring the same underlying constructs, and 2) different enough from its predecessors to establish conceptual and statistical independence.
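To make this tension concrete, the following is a minimal sketch in Python on simulated data (not the study's data). It assumes a hypothetical 40-item yes/no questionnaire driven by a single latent trait, and it uses one common definition of the discrimination index: the difference in endorsement rates between the top and bottom 27% of scorers on the full questionnaire. Neither the simulation settings nor the 27% cut-off is taken from Greenberg et al.; they are illustrative assumptions only. The sketch picks the ten most discriminating items and then correlates the resulting short-form score with the full score.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def discrimination_index(responses, item, fraction=0.27):
    """Endorsement rate in the high-scoring group minus the low-scoring group.

    Groups are the top and bottom `fraction` of respondents ranked by their
    total score on the full questionnaire (an assumed, textbook-style
    definition, not necessarily the one used in the study).
    """
    totals = [sum(row) for row in responses]
    order = sorted(range(len(responses)), key=lambda i: totals[i])
    k = max(1, int(len(responses) * fraction))
    low, high = order[:k], order[-k:]
    p_high = sum(responses[i][item] for i in high) / k
    p_low = sum(responses[i][item] for i in low) / k
    return p_high - p_low


def simulate(n_people=500, n_items=40, seed=0):
    """Simulated binary answers: each item depends on one latent trait."""
    rng = random.Random(seed)
    # Hypothetical item strengths: some items track the trait better than others.
    strengths = [rng.uniform(0.2, 1.5) for _ in range(n_items)]
    data = []
    for _ in range(n_people):
        trait = rng.gauss(0, 1)
        row = [
            1 if rng.random() < 1 / (1 + math.exp(-a * trait)) else 0
            for a in strengths
        ]
        data.append(row)
    return data


responses = simulate()
n_items = len(responses[0])

# Rank items by discrimination index and keep the ten best as the short form.
indices = [discrimination_index(responses, j) for j in range(n_items)]
ranked = sorted(range(n_items), key=lambda j: indices[j], reverse=True)
short_items = ranked[:10]

full_scores = [sum(row) for row in responses]
short_scores = [sum(row[j] for j in short_items) for row in responses]
r = pearson(short_scores, full_scores)

print(f"short-form vs full-form correlation: r = {r:.2f}")
```

Because the short-form items are both drawn from the full questionnaire and selected for how well they track its total score, a high correlation between the two is close to guaranteed by construction in a sketch like this. That is the sense in which a high correlation supports point 1 above while sitting uneasily with point 2.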

We thank the authors for their thoughtful reply, and think this discussion has raised many interesting questions about questionnaires and correlational research. We acknowledge too that the development of psychological questionnaires is a difficult and often thankless task, and that the AQ has been hugely influential and remains a cornerstone of autism research. We commend Greenberg et al. on the work that we know goes into analysing such a large dataset.


*The shortened form of the Systemizing Quotient (SQ) developed for this study was based on its revised form, which, as Greenberg et al. highlight, attempted to alleviate potential male bias in the content of the questions (which had focused on “mechanical and abstract systems”) by including more traditionally female domains such as “social systems and domestic systems” (Wheelwright et al., 2006, p. 54). Setting aside the question of whether this is a good way to remove gender bias in responses, only two out of the ten questions in the SQ-R-10 were not in the original SQ, so only a small proportion of the questionnaire used was tapping into these added “feminine” domains.



Wheelwright, S., Baron-Cohen, S., Goldenfeld, N., Delaney, J., Fine, D., Smith, R., . . . Wakabayashi, A. (2006). Predicting Autism Spectrum Quotient (AQ) from the Systemizing Quotient-Revised (SQ-R) and Empathy Quotient (EQ). Brain Research, 1079(1), 47-56. doi:

Follow the conversation:

Original Study: Greenberg, D. M., Warrier, V., Allison, C., & Baron-Cohen, S. (2018). Testing the Empathizing–Systemizing theory of sex differences and the Extreme Male Brain theory of autism in half a million people. Proceedings of the National Academy of Sciences, 115(48), 12152-12157. doi:10.1073/pnas.1811032115.

Letter to the Editor: Perrykkad, K., & Hohwy, J. (2019). When big data aren’t the answer. Proceedings of the National Academy of Sciences, 116(28), 13738. doi:10.1073/pnas.1902050116.

Reply from authors: Greenberg, D. M., Warrier, V., Allison, C., & Baron-Cohen, S. (2019). Reply to Perrykkad and Hohwy: When big data are the answer. Proceedings of the National Academy of Sciences, 116(28), 13740. doi:10.1073/pnas.1903773116.