While early commenting on studies is seen as one of the advantages of preprints, the type of such comments, and the people who post them, have not been systematically explored.
Materials and methods
We analysed comments posted between 21 May 2015 and 9 September 2019 for 1983 bioRxiv preprints that received only one comment on the bioRxiv website. The comment types were classified by three coders independently, with all differences resolved by consensus.
Our analysis showed that 69% of comments were posted by non-authors (N = 1366), and 31% by the preprints’ authors themselves (N = 617). Twelve percent of non-author comments (N = 168) were full review reports traditionally found during journal review, while the rest most commonly contained praises (N = 577, 42%), suggestions (N = 399, 29%), or criticisms (N = 226, 17%). Authors’ comments most commonly contained publication status updates (N = 354, 57%), additional study information (N = 158, 26%), or solicited feedback for the preprints (N = 65, 11%).
Our results indicate that comments posted for bioRxiv preprints may have potential benefits for both the public and the scholarly community. Further research is needed to measure the direct impact of these comments on comments made by journal peer reviewers, subsequent preprint versions or journal publications.
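The percentages reported in this abstract can be rechecked from the stated counts. The sketch below is only an illustrative sanity check, not part of the study: the variable and function names are hypothetical, and it assumes each percentage is taken over its own group total (1983 for the author/non-author split, then 1366 or 617 within each group) and rounded to the nearest whole number.

```python
# Counts copied from the abstract above; all names here are illustrative only.
TOTAL = 1983  # preprints that received exactly one comment

# group name -> (number of comments in group, {comment type: count})
groups = {
    "non-author": (1366, {
        "full review reports": 168,
        "praises": 577,
        "suggestions": 399,
        "criticisms": 226,
    }),
    "author": (617, {
        "publication status updates": 354,
        "additional study information": 158,
        "solicited feedback": 65,
    }),
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


def summarize(total, groups):
    """Compute each group's share of all comments and each type's share of its group."""
    rows = {}
    for name, (n, types) in groups.items():
        rows[name] = pct(n, total)
        for label, count in types.items():
            rows[f"{name}: {label}"] = pct(count, n)
    return rows


summary = summarize(TOTAL, groups)
for label, value in summary.items():
    print(f"{label}: {value}%")
```

Under these assumptions the recomputed values agree with the percentages quoted in the abstract. The four non-author counts add up to 1370, slightly more than 1366, so the listed categories should not be read as a strict partition of the non-author comments.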
Abstract: The world continues to face a life-threatening viral pandemic. The virus underlying the Coronavirus Disease 2019 (COVID-19), Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), has caused over 98 million confirmed cases and 2.2 million deaths since January 2020. Although the most recent respiratory viral pandemic swept the globe only a decade ago, the way science operates and responds to current events has experienced a cultural shift in the interim. The scientific community has responded rapidly to the COVID-19 pandemic, releasing over 125,000 COVID-19–related scientific articles within 10 months of the first confirmed case, of which more than 30,000 were hosted by preprint servers. We focused our analysis on bioRxiv and medRxiv, 2 growing preprint servers for biomedical research, investigating the attributes of COVID-19 preprints, their access and usage rates, as well as characteristics of their propagation on online platforms. Our data provide evidence for increased scientific and public engagement with preprints related to COVID-19 (COVID-19 preprints are accessed more, cited more, and shared more on various online platforms than non-COVID-19 preprints), as well as changes in the use of preprints by journalists and policymakers. We also find evidence for changes in preprinting and publishing behaviour: COVID-19 preprints are shorter and reviewed faster. Our results highlight the unprecedented role of preprints and preprint servers in the dissemination of COVID-19 science and the impact of the pandemic on the scientific communication landscape.
“10 years later I ended up working at Cold Spring Harbor myself, and continuing my relationship with HighWire from a new perspective. The arXiv preprint server for physics had launched in 1991, and my colleague John Inglis and I had often talked about whether we could do something similar for biology. I remember saying we could put together some of HighWire’s existing components, adapt them in certain ways and build something that would function as a really effective preprint server—and that’s what we did, launching bioRxiv in 2013. It was great then to be able to take that experiment to HighWire meetings to report back on. Initially there was quite a bit of skepticism from the community, who thought there were cultural barriers that meant preprints wouldn’t work well for biology, but 7 years and almost 100,000 papers later it’s still there, and still being served very well by HighWire.
When we launched bioRxiv we made it very explicit that we would not take clinical work, or anything involving patients. But the exponential growth of submissions to bioRxiv demonstrated that there was a demand and a desire for this amongst the biomedical community, and people were beginning to suggest that a similar model be trialed for medicine. A tipping point for me was an OpEd in the New York Times (Don’t Delay News of Medical Breakthroughs, 2015) by Eric Topol (Scripps Research) and Harlan Krumholz (Yale University), who would go on to become a co-founder of medRxiv….”
“Early Evidence Base (EEB) is an experimental platform that combines artificial intelligence with human curation and expert peer-review to highlight results posted in preprints. EEB is a technology experiment developed by EMBO Press and SourceData.
Preprints provide the scientific community with early access to scientific evidence. For experts, this communication channel is an efficient way to access research without delay and thus to accelerate scientific progress. But for non-experts, navigating preprints can be challenging: in the absence of peer-review and journal certification, interpreting the data and evaluating the strength of the conclusions is often impossible; finding specific and relevant information in the rapidly accumulating corpus of preprints is becoming increasingly difficult.
The current COVID-19 pandemic has made this tradeoff even more visible. The urgency in understanding and combatting SARS-CoV-2 viral infection has stimulated an unprecedented rate of preprint posting. It has, however, also revealed the risk resulting from misinterpretation of preliminary results shared in preprints and from the amplification or perpetuation of premature claims by non-experts or the media.
To experiment with ways in which technology and human expertise can be combined to address these issues, EMBO has built the EEB. The platform prioritizes preprints in complementary ways:
Refereed Preprints are preprints that are associated with reviews. EEB prioritizes such preprints and integrates the content of the reviews as well as the authors’ response, when available, to provide rich context and in-depth analyses of the reported research.
To highlight the importance of experimental evidence, EEB automatically highlights and organizes preprints around scientific topics and emergent areas of research.
Finally, EEB provides an automated selection of preprints that are enriched in studies that were peer reviewed, may bridge several areas of research and use a diversity of experimental approaches….”
Abstract: While early commenting on studies is seen as one of the advantages of preprints, the nature of such comments, and the people who post them, have not been systematically explored. We analysed comments posted between 21 May 2015 and 9 September 2019 for 1,983 bioRxiv preprints that received only one comment. Sixty-nine percent of comments were posted by non-authors (n=1,366), and 31% by preprint authors (n=617). Twelve percent of non-author comments (n=168) were full review reports traditionally found during journal review, while the rest most commonly contained praises (n=577, 42%), suggestions (n=399, 29%), or criticisms (n=226, 17%). Authors’ comments most commonly contained publication status updates (n=354, 57%), additional study information (n=158, 26%), or solicited feedback for the preprints (n=65, 11%). Our study points to the value of preprint commenting, but further studies are needed to determine the role that comments play in shaping preprint versions and eventual journal publications.
Abstract: Engagement with scientific manuscripts is frequently facilitated by Twitter and other social media platforms. As such, the demographics of a paper’s social media audience provide a wealth of information about how scholarly research is transmitted, consumed, and interpreted by online communities. By paying attention to public perceptions of their publications, scientists can learn whether their research is stimulating positive scholarly and public thought. They can also become aware of potentially negative patterns of interest from groups that misinterpret their work in harmful ways, either willfully or unintentionally, and devise strategies for altering their messaging to mitigate these impacts. In this study, we collected 331,696 Twitter posts referencing 1,800 highly tweeted bioRxiv preprints and leveraged topic modeling to infer the characteristics of various communities engaging with each preprint on Twitter. We agnostically learned the characteristics of these audience sectors from keywords each user’s followers provide in their Twitter biographies. We estimate that 96% of the preprints analyzed are dominated by academic audiences on Twitter, suggesting that social media attention does not always correspond to greater public exposure. We further demonstrate how our audience segmentation method can quantify the level of interest from nonspecialist audience sectors such as mental health advocates, dog lovers, video game developers, vegans, bitcoin investors, conspiracy theorists, journalists, religious groups, and political constituencies. Surprisingly, we also found that 10% of the preprints analyzed have sizable (>5%) audience sectors that are associated with right-wing white nationalist communities. Although none of these preprints appear to intentionally espouse any right-wing extremist messages, cases exist in which extremist appropriation comprises more than 50% of the tweets referencing a given preprint. These results present unique opportunities for improving and contextualizing the public discourse surrounding scientific research.
Abstract: Preprints are becoming well established in the life sciences, but relatively little is known about the demographics of the researchers who post preprints and those who do not, or about the collaborations between preprint authors. Here, based on an analysis of 67,885 preprints posted on bioRxiv, we find that some countries, notably the United States and the United Kingdom, are overrepresented on bioRxiv relative to their overall scientific output, while other countries (including China, Russia, and Turkey) show lower levels of bioRxiv adoption. We also describe a set of ‘contributor countries’ (including Uganda, Croatia and Thailand): researchers from these countries appear almost exclusively as non-senior authors on international collaborations. Lastly, we find multiple journals that publish a disproportionate number of preprints from some countries, a dynamic that almost always benefits manuscripts from the US.
Abstract: Preprint servers, such as arXiv and bioRxiv, have disrupted the scientific communication landscape by providing rapid access to research before peer review. medRxiv was launched as a free online repository for preprints in the medical, clinical, and related health sciences in 2019. In this review, we present the uptake of preprint server use in nephrology and discuss specific considerations regarding preprint server use in medicine. Distribution of kidney-related research on preprint servers is rising at an exponential rate. A survey of nephrology journals identified that 15 of 17 (88%) are publishing accepted original research submissions that have been uploaded to preprint servers. After reviewing 52 clinically impactful trials in nephrology discussed in the online Nephrology Journal Club (NephJC), an average lag of 300 days was found between study completion and publication, indicating an opportunity for faster research dissemination. A rapid review of papers discussing benefits and risks of preprint server use from the researcher, publisher, or end user perspective identified 53 papers that met criteria. Potential benefits of biomedical preprint servers included rapid dissemination, improved transparency of the peer review process, greater visibility and recognition, and collaboration. However, these benefits come at the risk of rapid spread of results not yet subjected to the rigors of peer review. Preprint servers shift the burden of critical appraisal to the reader. Media may be especially at risk due to their focus on “late-breaking” information. Preprint servers have played an even larger role when late-breaking research results are of special interest, such as during the global coronavirus disease 2019 pandemic. Coronavirus disease 2019 has brought both the benefits and risks of preprint servers to the forefront. Given the prominent online presence of the nephrology community, it is poised to lead the medicine community in appropriate use of preprint servers.
Abstract: A potential motivation for scientists to deposit their scientific work as preprints is to enhance its citation or social impact. In this study we assessed the citation and altmetric advantage of bioRxiv, a preprint server for the biological sciences. We retrieved metadata of all bioRxiv preprints deposited between November 2013 and December 2017, and matched them to articles that were subsequently published in peer-reviewed journals. Citation data from Scopus and altmetric data from Altmetric.com were used to compare citation and online sharing behavior of bioRxiv preprints, their related journal articles, and nondeposited articles published in the same journals. We found that bioRxiv-deposited journal articles had sizably higher citation and altmetric counts compared to nondeposited articles. Regression analysis reveals that this advantage is not explained by multiple explanatory variables related to the articles’ publication venues and authorship. Further research will be required to establish whether such an effect is causal in nature. bioRxiv preprints themselves are being directly cited in journal articles, regardless of whether the preprint has subsequently been published in a journal. bioRxiv preprints are also shared widely on Twitter and in blogs, but remain relatively scarce in mainstream media and Wikipedia articles, in comparison to peer-reviewed journal articles.
Abstract: The world continues to face an ongoing viral pandemic that presents a serious threat to human health. The virus underlying the COVID-19 disease, SARS-CoV-2, has caused over 3.2 million confirmed cases and 220,000 deaths between January and April 2020. Although the last pandemic of respiratory disease of viral origin swept the globe only a decade ago, the way science operates and responds to current events has experienced a paradigm shift in the interim. The scientific community has responded rapidly to the COVID-19 pandemic, releasing over 16,000 COVID-19 related scientific articles within 4 months of the first confirmed case, of which at least 6,000 were hosted by preprint servers. We focused our analysis on bioRxiv and medRxiv, two growing preprint servers for biomedical research, investigating the attributes of COVID-19 preprints, their access and usage rates, characteristics of their sharing on online platforms, and the relationship between preprints and their published articles. Our data provides evidence for increased scientific and public engagement (COVID-19 preprints are accessed and distributed at least 15 times more than non-COVID-19 preprints) and changes in journalistic practice with reference to preprints. We also find evidence for changes in preprinting and publishing behaviour: COVID-19 preprints are shorter, with fewer panels and tables, and reviewed faster. Our results highlight the unprecedented role of preprints and preprint servers in the dissemination of COVID-19 science, and the likely long-term impact of the pandemic on the scientific publishing landscape.