Student Research as a Tool for Investigating Replicability

  1. Birkbeck, University of London

Picture the following scenario: an undergraduate research course in which a group of students is tasked with setting up and running an experiment based on a paper from the research literature. The students are eager to do well and, over the academic term, painstakingly ensure that they set up, run, and analyze their experiment as the original paper describes. In the final weeks of the course, when they present their findings to the rest of their cohort, they are deflated: despite all of their hard work, their results are not statistically significant.

“Don’t worry,” the course instructor tells them. “That one never replicates.”

In recent years, the replicability of research has been much discussed, both within psychology (Open Science Collaboration, 2015; Pashler & Wagenmakers, 2012) and in other fields (Ioannidis, 2005); these general concerns have arisen alongside a number of high-profile cases of scientific misconduct (Borsboom & Wagenmakers, 2013; LaCour & Green, 2014; Wade, 2010). The overall consequence has been a drastic erosion of confidence in the research literature (Earp & Trafimow, 2015), with no clear solution in sight; whether there is even a problem to be solved has itself been the subject of debate (Pashler & Harris, 2012).

Even if it is questionable whether a replication crisis is indeed taking place, we can at least grant that more evidence on whether the findings of particular experiments replicate would benefit the scientific literature. While large-scale replication projects are feasible and have already been attempted (Klein et al., 2014), I wish to propose that there is already a rich and untapped source of data fitting this same purpose: student research projects. In 2006 alone, 90,000 undergraduate students were awarded bachelor’s degrees in psychology in the US (Halonen, 2011). Even on the conservative assumption that each of those students conducted only a single study based on the research literature during their degree, that implies a pool of up to 90,000 potential replications from a single year's graduates.

The question arises, however, as to how to access this untapped data source. One possibility would be to set up an online repository of student results, similar to websites that already exist (Psych File Drawer, n.d.), and to invite the instructors or lecturers who coordinate research methods courses to encourage their students to upload their findings to it. Alongside their raw data, students could note which paper their research was based on, whether it was an exact or a conceptual replication (Hendrick, 1990), and whether their results were statistically significant. Users would then be able to browse the uploaded results by the paper being replicated; summary pages with graphs breaking down the submitted projects could also be provided, making the student submissions easier to interpret.
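As a rough illustration of the record-keeping described above, a single submission and a per-paper summary might be sketched as follows. This is only a sketch under stated assumptions: the `Submission` class, its field names, the `summarize_by_paper` function, and the placeholder entries "Paper A" and "Paper B" are all hypothetical and do not describe any existing repository or any real results.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class ReplicationType(Enum):
    """Exact versus conceptual replication, as distinguished in the text."""
    EXACT = "exact"
    CONCEPTUAL = "conceptual"


@dataclass(frozen=True)
class Submission:
    """One student project uploaded to the (hypothetical) repository."""
    original_paper: str                # the paper the project was based on
    replication_type: ReplicationType  # exact or conceptual replication
    significant: bool                  # were the results statistically significant?
    raw_data_file: str                 # name of the uploaded raw data file (assumed field)


def summarize_by_paper(submissions):
    """Group submissions by original paper and count outcomes for each paper."""
    summary = defaultdict(
        lambda: {"total": 0, "significant": 0, "exact": 0, "conceptual": 0}
    )
    for sub in submissions:
        row = summary[sub.original_paper]
        row["total"] += 1
        row["significant"] += int(sub.significant)
        row[sub.replication_type.value] += 1
    return dict(summary)


if __name__ == "__main__":
    # Made-up placeholder entries, purely to show the shape of the summary.
    demo = [
        Submission("Paper A", ReplicationType.EXACT, True, "group1.csv"),
        Submission("Paper A", ReplicationType.CONCEPTUAL, False, "group2.csv"),
        Submission("Paper B", ReplicationType.EXACT, False, "group3.csv"),
    ]
    for paper, counts in summarize_by_paper(demo).items():
        print(paper, counts)
```

The per-paper counts produced this way are the kind of breakdown a summary page could graph. Whether to store only a significance flag or also effect sizes would be a design decision for whoever built such a repository.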

The potential benefits of this kind of online student research database are many. Most directly, it would make the data from a large number of replications that are already being conducted available to the broader scientific community; the general strengths of such crowd science have been noted previously (Franzoni & Sauermann, 2014). Because this research is already taking place, the database sidesteps the problem that many researchers prefer novel research to replication; and because it is an online database rather than research seeking publication in scientific journals, it similarly dodges the concern that journals may be biased against publishing replications (Neuliep & Crandall, 1990). There would also be a number of pedagogical perks. Allowing students to take part in large-scale, international open science may foster an appreciation for such endeavours, which could make future open science projects easier to organize; since open science has been adopted relatively slowly among established researchers (Friesike et al., 2015), focusing on individuals very early in their scientific careers may encourage them to be more welcoming of such approaches. Being part of what would essentially be an ongoing research project may also lead students to identify more closely with the scientific community, which has previously been shown to benefit student engagement (Olitsky, 2007; Pike, Kuh, & McCormick, 2011). More loosely, this form of collaboration would be an additional skill for students to develop, which may add some utility to subjects that have been accused of not providing adequate value to their students (Strohmetz et al., 2015).

There are, however, also clear drawbacks and caveats to this approach. Undergraduates, being new to research, may not be the most careful experimentalists, which raises the question of whether any given project failed to replicate because of researcher error or because there was no true effect to be found; the large number of replications the database is intended to collect may mitigate this concern to some extent. There is also the related issue that much research will be either too technically difficult or too expensive for undergraduate students to replicate; it is rare for an undergraduate to have access to an fMRI scanner, for instance, which excludes a large segment of the cognitive neuroscience literature from this initiative. That some research cannot be replicated by this approach, however, does not mean we should ignore the large proportion of the research literature that undergraduates are perfectly capable of replicating (and are already replicating). Perhaps the largest stumbling block for this student research database would be popularising it with course coordinators; if none encourage their students to submit results, the project has no value. To assume that no instructors would wish to collaborate with such a project seems overly pessimistic, and even if only a handful ask their students to submit findings, that would still allow potentially dozens of replications to be contributed to the broader scientific community.

These are evidently not insurmountable problems, and the benefits to both scientific research and teaching would outweigh them. These benefits will certainly not cure all of the ills currently plaguing scientific research, but student experiments are a commonplace form of replication, and there have been previous calls to employ them to a greater extent (Grahe et al., 2012). If we seek to produce more evidence that research in general is replicable, it would be absurd to ignore a tool that is already present in our labs, even if that tool has limitations.


Borsboom, D., & Wagenmakers, E. J. (2013). Derailed: The rise and fall of Diederik Stapel. APS Observer, 26(1).

Earp, B. D., & Trafimow, D. (2015). Replication, falsification, and the crisis of confidence in social psychology. Frontiers in Psychology, 6. doi: 10.3389/fpsyg.2015.00621

Franzoni, C., & Sauermann, H. (2014). Crowd science: The organization of scientific research in open collaborative projects. Research Policy, 43(1), 1-20. doi: 10.1016/j.respol.2013.07.005

Friesike, S., Widenmayer, B., Gassmann, O., & Schildhauer, T. (2015). Opening science: towards an agenda of open science in academia and industry. The Journal of Technology Transfer, 40(4), 581-601. doi: 10.1007/s10961-014-9375-6

Grahe, J. E., Reifman, A., Hermann, A. D., Walker, M., Oleson, K. C., Nario-Redmond, M., & Wiebe, R. P. (2012). Harnessing the undiscovered resource of student research projects. Perspectives on Psychological Science, 7(6), 605-607. doi: 10.1177/1745691612459057

Halonen, J. S. (2011). Are there too many psychology majors? White Paper. Retrieved from

Hendrick, C. (1990). Replications, strict replications, and conceptual replications: Are they important? Journal of Social Behavior and Personality, 5(4), 41.

Ioannidis, J. P. (2005). Contradicted and initially stronger effects in highly cited clinical research. JAMA, 294(2), 218-228. doi: 10.1001/jama.294.2.218

Klein, R., Ratliff, K., Vianello, M., Adams Jr, R., Bahník, S., Bernstein, M., ... & Cemalcilar, Z. (2014). Data from investigating variation in replicability: A “Many Labs” Replication Project. Journal of Open Psychology Data, 2(1). doi: 10.5334/

LaCour, M. J., & Green, D. P. (2014). When contact changes minds: An experiment on transmission of support for gay equality. Science, 346(6215), 1366-1369. doi: 10.1126/science.1256151

Neuliep, J. W., & Crandall, R. (1990). Editorial bias against replication research. Journal of Social Behavior and Personality, 5(4), 85.

Olitsky, S. (2007). Promoting student engagement in science: Interaction rituals and the pursuit of a community of practice. Journal of Research in Science Teaching, 44(1), 33-56. doi: 10.1002/tea.20128

Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251). doi: 10.1126/science.aac4716

Pashler, H., & Harris, C. R. (2012). Is the replicability crisis overblown? Three arguments examined. Perspectives on Psychological Science, 7(6), 531-536. doi: 10.1177/1745691612463401

Pashler, H., & Wagenmakers, E. J. (2012). Editors’ introduction to the special section on replicability in psychological science: A crisis of confidence? Perspectives on Psychological Science, 7(6), 528-530. doi: 10.1177/1745691612465253

Pike, G. R., Kuh, G. D., & McCormick, A. C. (2011). An investigation of the contingent relationships between learning community participation and student engagement. Research in Higher Education, 52(3), 300-322. doi: 10.1007/s11162-010-9192-1

Psych File Drawer - An Archive of Brief Reports of Replication Attempts in Experimental Psychology (n.d.). Retrieved from

Strohmetz, D. B., Dolinsky, B., Jhangiani, R. S., Posey, D. C., Hardin, E. E., Shyu, V., & Klein, E. (2015). The skillful major: Psychology curricula in the 21st century. Scholarship of Teaching and Learning in Psychology, 1(3), 200. doi: 10.1037/stl0000037

Wade, N. (2010). Harvard finds scientist guilty of misconduct. The New York Times.



Showing 1 Reviews

  • John Pellman

    One caveat to the point you make about fMRI studies. While data collection is indeed prohibitive for undergrads, data analysis is much more feasible due to an abundance of open data (see this table for a summary of what's out there). Undergraduates can always re-run statistical analyses on open datasets, or try their hand at preparing the data for these tests (e.g., orienting the data into a common space to facilitate comparisons between individuals), to see how robust the study's conclusions are. One difficulty posed by such analyses, however, is that the original researchers may not have fully documented their procedures (a point brought up in Carp, 2012). Another limit is computational, as such data can take up a lot of disk space and require at least one enterprise-grade server with plenty of CPU cores, although the advent of on-demand cloud computing and the tradition of on-campus HPC clusters make this less of a concern.


This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.