The rising trend in authorship

  1. Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724

Abstract

Big science is on the rise. Recent endeavors, such as the Large Hadron Collider and the Human Genome Project, illustrate the rise in large-scale scientific inquiries. To assess whether big science is part of a general trend towards increased authorship, we queried the publicly available database Pubmed and measured the trend in number of authors per paper over the last century. Here we show that authorship has increased five-fold since 1913 and predict that by 2034, publications will boast an average of 8 authors.

The Landscape of Authorship

To study the evolution of authorship over the last century, we obtained metadata for all ~24 million papers listed in Pubmed as published between 1913 and 2013 (Methods Section). Figure 1 shows the exponential trend in number of papers published over the last century. Starting in 2011, Pubmed indexed >1 million papers each year.

Figure 1: The number of papers published has increased exponentially over the last century

We next turn to studying authorship over time. As shown in Figure 2, the average number of authors per paper has increased more than 5-fold over the last 100 years, going from ~1 author per paper in 1913 to ~5.4 authors per paper in 2013 (see Figure 3 for the shift in authorship distribution between 1953 and 2013). Interestingly, the only significant drop in the average number of authors per paper occurs between 1945 and 1949. Although this dip occurs at the end of World War II, it is unclear whether there exists a causal link.

Figure 2: The average number of authors per paper has increased 5-fold over the last century

Figure 3: The distribution of authorship between 1953 and 2013

In Figure 4, we plot, for each year, the number of authors found in the paper with the largest author list. Two turning points are observed: In 1998, we see the first sign of papers with >500 authors, and in 2010, we observe papers with >2000 authors. Interestingly, the Large Hadron Collider project accounts for many of the papers with large author lists.

Figure 4: Many of the papers with over 1000 authors are from the Large Hadron Collider project
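
The per-year summaries behind Figures 2 and 4 (average and largest author count per year) can be sketched roughly as follows. This is a minimal illustration, not the code used for the study: the function name `summarize_by_year` is hypothetical, and the toy records are invented for the example and are not Pubmed data.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_year(records):
    """Group (year, author_count) pairs and return {year: (mean, max)}."""
    counts = defaultdict(list)
    for year, n_authors in records:
        counts[year].append(n_authors)
    # Sort by year so the result can be plotted directly as a time series.
    return {year: (mean(ns), max(ns)) for year, ns in sorted(counts.items())}


# Toy records, invented purely for illustration (NOT Pubmed data).
toy_records = [(1913, 1), (1913, 1), (2013, 3), (2013, 5), (2013, 10)]

summary = summarize_by_year(toy_records)
# Fold change in the yearly average between the first and last year.
fold_change = summary[2013][0] / summary[1913][0]

for year, (avg, largest) in summary.items():
    print(year, "average:", avg, "largest author list:", largest)
print("fold change:", fold_change)
```

With real data, each record would be one paper's publication year and the length of its author list.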

Outlook on the Future

We now turn to predicting the trend in authorship over the next 20 years. To validate our approach, we first predict the trends observed between 1994 and 2013 using only the data from 1913 to 1993. By fitting the data to a polynomial function (degree 5, R² = 0.9968), we accurately predict the trend seen between 1994 and 2013. Next, we apply the same analysis to predict the authorship trend in the next 20 years. As shown in Figure 5, we predict that papers written in 2034 will feature 8 authors on average.
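
The validation step described above (fit on 1913 to 1993, then check the fit against 1994 to 2013) can be sketched as below. This is an assumption-laden illustration rather than the study's code: the series is a synthetic smooth curve rising from 1 to 5.4, standing in for the real yearly averages, and the helper names (`fit_polynomial`, `evaluate`, `r_squared`, `scale`) are hypothetical. Years are rescaled to roughly [-1, 1] before fitting because raw years raised to the fifth power make the least-squares problem poorly conditioned. In practice a library routine such as NumPy's polynomial fitting would normally replace the hand-written solver.

```python
import math


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via normal equations (coefficients low to high)."""
    n = degree + 1
    gram = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(gram[r][col]))
        gram[col], gram[pivot] = gram[pivot], gram[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, n):
            factor = gram[r][col] / gram[col][col]
            for c in range(col, n):
                gram[r][c] -= factor * gram[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(gram[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rhs[i] - tail) / gram[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def r_squared(ys, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def scale(year):
    """Map the training years 1913..1993 onto [-1, 1]."""
    return (year - 1953) / 40.0


years = list(range(1913, 2014))
# SYNTHETIC placeholder series (NOT Pubmed data): smooth growth from 1 to 5.4.
authors = [math.exp(math.log(5.4) * (y - 1913) / 100.0) for y in years]

train = [(scale(y), a) for y, a in zip(years, authors) if y <= 1993]
holdout = [(scale(y), a) for y, a in zip(years, authors) if y >= 1994]

coeffs = fit_polynomial([x for x, _ in train], [a for _, a in train], degree=5)
train_r2 = r_squared([a for _, a in train], [evaluate(coeffs, x) for x, _ in train])
holdout_error = max(abs(evaluate(coeffs, x) - a) for x, a in holdout)

print("training R^2:", train_r2)
print("largest absolute error on 1994-2013:", holdout_error)
```

A high-degree polynomial can fit the training window closely and still diverge outside it, so the held-out error is the more informative of the two numbers.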

Figure 5: We predict that by 2034, the average paper will list ~8 authors

Effect of Journal Policies

The trend in number of authors is also influenced by a journal's authorship policies. In November of 1991, the New England Journal of Medicine (NEJM) published an editorial announcing a new policy that limited the number of authors to 12 (Kassirer and Angell 1991). As a result, we observe a significant decrease in the average number of authors in NEJM papers starting in 1992 (Figure 6). In 2002, however, the NEJM editorial board reversed its decision (Drazen and Curfman 2002) and, although not immediately, the average number of authors per paper increased significantly a few years later.

Figure 6: The New England Journal of Medicine saw a large decrease in authorship when it implemented a policy limiting the number of authors to 12 in 1992. Once the policy was lifted in 2002, authorship increased a few years later.

Conclusion

Here we demonstrate that authorship per paper has increased over 5-fold over the last century, and we predict that it will reach, on average, 8 authors per paper by 2034.

Two major explanations can be cited for this increase in authorship. First, it is possible that modern scientific inquiries have become so complex that answering them requires large teams of scientists from different fields, thereby driving up the number of authors per paper. Alternatively, it is conceivable that in a climate of scarce funding, the practice of granting authorship to minor contributors (also known as honorary authorship) is on the rise.

To combat the latter, many journals are now requiring papers to specify each author's contribution to the study. If, however, the increase in authorship can be traced to scientific inquiry becoming more complex, there is nothing inherently objectionable about this trend.

Methods

To query the Pubmed database programmatically, we made use of the Pubmed API. Using Pubmed "esearch", we first retrieved the unique IDs of all papers published between 1913 and 2013. Using Pubmed "efetch", we then retrieved the records of all papers, which included paper title, journal, and author list. Next, we parsed the authors from all downloaded records and, for each year, measured the distribution of the number of authors per paper. The software developed to perform the analysis outlined here is available open-source at github.com/robertaboukhalil/pubmed.
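
The esearch/efetch steps above can be sketched as follows. This is a rough illustration and not the code in the repository mentioned above: the helper names (`esearch_url`, `efetch_url`, `author_counts`) are hypothetical, the `[dp]` publication-date search tag and the one-query-per-year batching are assumptions about how such a query might be set up, and the XML sample is hand-written to mimic the layout of an efetch response. No network request is made; the sketch only builds the request URLs and parses a local sample.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities, which expose esearch and efetch.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(year, retstart=0, retmax=1000):
    """URL listing the IDs of papers published in `year` (one page of results)."""
    params = {"db": "pubmed", "term": f"{year}[dp]",
              "retstart": retstart, "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_url(ids):
    """URL fetching the full XML records for a batch of paper IDs."""
    params = {"db": "pubmed", "id": ",".join(ids), "retmode": "xml"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def author_counts(efetch_xml):
    """Return {paper ID: number of listed authors} for an efetch XML document."""
    root = ET.fromstring(efetch_xml)
    counts = {}
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID")
        authors = article.findall("MedlineCitation/Article/AuthorList/Author")
        counts[pmid] = len(authors)
    return counts


# Hand-written sample (NOT a real response), heavily trimmed; IDs are made up.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle><MedlineCitation>
    <PMID>1</PMID>
    <Article><AuthorList>
      <Author><LastName>Alpha</LastName></Author>
      <Author><LastName>Beta</LastName></Author>
    </AuthorList></Article>
  </MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation>
    <PMID>2</PMID>
    <Article></Article>
  </MedlineCitation></PubmedArticle>
</PubmedArticleSet>"""

counts = author_counts(SAMPLE_XML)
# Distribution of author-list sizes: {number of authors: number of papers}.
distribution = Counter(counts.values())

print(esearch_url(1913))
print(efetch_url(["1", "2"]))
print(counts, dict(distribution))
```

Records without an author list are counted as having zero authors here; whether to keep or drop such records is a choice the analysis would need to make explicitly.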

References

Drazen, J. M., and G. D. Curfman. 2002. "On authors and contributors." N Engl J Med 347 (1): 55. doi: 10.1056/NEJMe020063.

Kassirer, J. P., and M. Angell. 1991. "On authorship and acknowledgments." N Engl J Med 325 (21): 1510-2. doi: 10.1056/NEJM199111213252112.

Reviews

  • Dennis Evangelista

    1.  I think the paper would be stronger if it delved more into the things currently in "conclusions". Specifically, if the author could propose ways to test (or do the test) either hypothesis given.  

    2.  Can the authors comment on what other metrics are available in the data set?  Maybe some other quantity is conserved (words/author) or ?? that might be informative here. 

    3. Is it possible to examine other wider correlations - authors and words versus money? 

    4.  What is the basis for using a 5th order polynomial fit to project ahead by 20 years?  Why 20 years?  (If you go 100?)  If most papers had N authors for large N, but the academic workforce grows by rate R and the untenured workforce grows by S... Perhaps your projections could be made more relevant if they were tied to such things. 

    5.  Would the author care to venture a comment on whether having 500 authors for a paper is good / realistic / desirable?  This would not be a strictly scientific judgement and thus would not be done in a conventional journal - but it appears this particular platform is not a conventional journal. 

  • Konrad Hinsen

    This work raises more questions than it answers. This can of course be a desirable feature of scientific work, if the new questions have a chance to be answered by scientific enquiry. Unfortunately, I do not see any discussion of this. A historical trend is pointed out and then extrapolated in a somewhat arbitrary-looking manner. So what?

    I suspect that most readers would, like me, be most interested in testable explanations for the phenomenon. But the two explanations proposed do not seem to be testable. It's easy to add further speculative explanations, such as:

    1) Today's bibliometry never takes into account the number of authors per paper. If two teams decide to "join forces" and put the other team's members as authors on all papers, everybody's citation counts and h-factors will increase.

    2) The arrival of corporate management approaches in research favors big projects over small ones even in the absence of scientific complexity. Longer author lists could simply reflect the hierarchical structure imposed by project funding.

    An exploration of explanations would require measuring other features of scientific work, such as the complexity, interdisciplinarity, or number of distinct scientific techniques being employed. All that looks much more difficult than counting authors.

License

This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.