2018: best year yet for net growth of open access

Highlights: this edition of the Dramatic Growth of Open Access features charts that illustrate that 2018 showed the strongest growth to date for open access by number of documents searchable through BASE, PubMedCentral, arXiv, DOAJ, texts added to Internet Archive, and journals added to DOAJ.

A Bielefeld Academic Search Engine (BASE) search encompassed over 19 million more items at the end of 2018 than a year earlier; about 60%, or 11.4 million, are open access. This brings the total documents searchable through BASE to close to 140 million (about 84 million open access).

PubMedCentral added 600,000 items in 2018 and surpassed the milestone of 5 million items this year (now 5.2 million items).

arXiv added 140,000 items in 2018, bringing the total close to 1.5 million items.

The DOAJ article search grew by more than 800,000 articles in 2018, bringing the total number of articles searchable through DOAJ to about 3.6 million.

2018 was also the best year to date for DOAJ net journal growth. 1,707 journals were added for a current total of over 12,000 journals. Negative growth in 2016 illustrates the impact of the DOAJ weeding / re-application process.

4.5 million more texts are available through Internet Archive, bringing the total close to 20 million.

The following table provides data on the total number of items as of December 31, 2018, and growth in 2018 by number and percentage, in descending order of percentage growth. In interpreting percentage growth, consider total and numeric growth. bioRxiv more than doubled in size this year, indicating a fairly new but healthy and rapidly growing service; but this reflects growth of about 20 thousand documents, a small fraction of the 600,000 items added by PMC for a 13% growth rate.

2018 growth (percent) | Item | 2018 total | 2018 growth (number)
110% | bioRxiv # articles | 39,570 | 20,748
74% | Internet Archive software | 346,320 | 147,320
39% | SCOAP3 # articles | 25,163 | 7,121
30% | Internet Archive texts | 19,570,789 | 4,570,789
30% | DOAJ searchable articles | 3,624,154 | 832,453
29% | Internet Archive audio (recordings) | 4,909,271 | 1,109,271
28% | DOAB # books | 13,253 | 2,938
25% | Internet Archive collections | 389,778 | 76,778
24% | Internet Archive videos (movies) | 4,701,129 | 901,129
21% | DOAJ journals searchable at article level | 9,479 | 1,670
16% | PubMed keyword search: cancer – last year – free fulltext | 65,766 | 9,154
16% | DOAJ # journals | 12,434 | 1,707
16% | BASE # documents | 139,476,029 | 19,092,606
16% | Internet Archive television | 1,733,000 | 233,000
15% | DOAB # publishers | 285 | 38
14% | PMC journals some articles OA | 758 | 94
13% | PMC # items | 5,200,000 | 600,000
13% | RePEc books | 39,086 | 4,449
12% | RePEc journal articles | 1,785,335 | 193,994
12% | PubMed keyword search: cancer – last 2 years – free fulltext | 153,875 | 16,026
11% | BASE # content providers | 6,732 | 694
11% | Internet Archive webpages (in billions) | 345 | 35
11% | RePEc online (fulltext) (downloadable as of March 2012) | 2,528,831 | 249,692
11% | PubMed keyword search: cancer – last 5 years – free fulltext | 391,691 | 37,230
10% | arXiv http://arxiv.org/ | 1,482,864 | 140,139
10% | OpenDOAR http://www.opendoar.org/ # repositories | 3,799 | 335
9% | RePEc chapters | 51,278 | 4,360
9% | PMC journals selected articles | 4,908 | 414
8% | RePEc working papers | 858,360 | 64,235
8% | Total Policies (ROARMAP) | 960 | 71
8% | PubMed keyword search: cancer – free fulltext | 1,027,541 | 75,655
7% | PMC journals immediate free access | 1,964 | 132
7% | DOAJ # countries | 129 | 8
7% | PubMed keyword search: cancer – last year – all results | 184,024 | 11,341
6% | PMC journals deposit all articles | 2,217 | 124
6% | Elektronische Zeitschriftenbibliothek – Electronic Journals Library # journals that can be read free of charge | 62,681 | 3,441
5% | PubMed keyword search: cancer – last 5 years – all results | 839,960 | 43,565
5% | PMC journals actively participating | 2,578 | 132
5% | PubMed keyword search: cancer – all results | 3,784,638 | 192,126
5% | PubMed keyword search: cancer – last 2 years – all results | 357,370 | 17,970
4% | RePEc software components | 4,206 | 178
4% | Internet Archive live music (concerts) | 192,534 | 7,534
3% | PMC journals all articles OA | 1,529 | 51
3% | ROAR # repositories | 4,735 | 138
2% | PMC journals NIH portfolio | 335 | 6
-12% | Internet Archive images | 3,247,253 | -452,747
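
For readers who want to check the percentages, here is a minimal sketch in Python. It assumes that each percentage is the 2018 numeric growth divided by the total at the end of 2017 (the 2018 total minus the 2018 growth). That formula is not stated in the post, but it reproduces the figures in the table. The function name `percent_growth` is only illustrative.

```python
def percent_growth(total_now, added):
    """Growth over the year as a percentage of the start-of-year total."""
    previous = total_now - added  # implied total at the end of 2017
    return 100 * added / previous


# (2018 total, 2018 growth) pairs copied from three rows of the table
rows = {
    "bioRxiv # articles": (39_570, 20_748),
    "PMC # items": (5_200_000, 600_000),
    "Internet Archive images": (3_247_253, -452_747),
}

for name, (total, added) in rows.items():
    print(f"{name}: {percent_growth(total, added):.0f}%")
```

The three rows illustrate the caution about interpreting percentages: bioRxiv has by far the highest rate, while PMC added roughly 29 times as many items at a much lower rate.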

Full data can be downloaded from the Dramatic Growth of Open Access dataverse: https://hdl.handle.net/10864/10660. This post is part of the Dramatic Growth of Open Access series. From 2004 to June 30, 2018, the series was posted on a quarterly basis. As of September 30, 2018, I continue to gather data quarterly but plan to release the series less frequently, most likely on an annual basis.

Celebrating community growth and Open Science – PeerJ’s 2018 in review – PeerJ Blog

“As 2018 comes to an end, we would like to take a moment and recognize the significant efforts of our staff, authors, editors, reviewers, and many collaborators over this past year. And what a year it has been at PeerJ! We are proud to share it has been another landmark year publishing excellent science and contributing to the development of Open Science worldwide.

Over the next few days, we will be highlighting the notable achievements and standout articles from the last year across our platforms. And a quick reminder that we are expanding in 2019 to launch five new peer-reviewed Open Access Chemistry journals….”

5 Questions With… Devin Soper | Association of Southeastern Research Libraries

“Q.4 If you had a magic wand and could change one thing in the scholcomm ecosystem, what would it be?

Like many other contributors to this blog series, my first choice would be changing the promotion and tenure process to incentivize faculty to make their work open. Perhaps the best example of this, for me, is the Liège model, where faculty are required to deposit the full text of their works in the institutional repository in order to have them considered for the purposes of internal research evaluation / P&T. If even a few U.S. institutions were able to implement similar policies, I think that belief in the value of institutional OA policies (and the feasibility of Green OA, more generally) would soar as a result.

To vary the conversation a bit, a close second for me would be increased collaboration around big deal cancellations. I’m thinking here about the nationwide cancellations and renegotiations that have taken place in the Netherlands, Germany, and Finland, for instance, where hundreds of universities have banded together to cancel (and later renegotiate) their big deal contracts with Elsevier on the grounds of unsustainable pricing practices, insufficient respect for authors’ rights, and reluctance (if not outright opposition) to advance the cause of open access. In following these developments, I’ve long wished that we could present a similarly united front on these issues here in the U.S., whether at the state, regional, or national level….”

Prevalence of publishing in predatory journals

Abstract: Objectives: In 2017 the journal Nature published challenges to the assumption that research-intensive U.S. institutions are immune to the hazards of predatory publishing. Sample articles from hundreds of potentially predatory journals were analyzed: the NIH was the most frequent funder and Harvard was among the most frequent institutions. Our study was designed to identify the publication prevalence at our institution.

Methods: Predatory publishers were defined using an archived version of Beall’s list, a now defunct website that was widely recognized as the only comprehensive black list for potential predators. The archive was collected January 15, 2017 and reflects updates made 1-2 weeks prior. To identify our NIH publications, records were collected from PubMed Central using an institution search and limiting to 2011-2016 to reflect a five-year period covered by Beall’s last update. PMC was selected under the assumption that direct journal inclusion in PubMed/MedLine serves as a proxy for quality. Journal and ISSN data were referenced against Ulrich’s Periodicals Directory to determine publishers. Data were then compared against Beall’s listing of potentially predatory publishers and standalone journals. The publication costs for the predatory journals were used to determine the total amount of NIH funding used to pay for publications in predatory journals.

Results: The review of the University’s publications submitted to PubMed Central from 2011 to 2016 revealed 15,090 publications. Of those 15,090 articles, 218 (1.4%) were from publishers on Beall’s list of predatory publishers. A review of publication fees for the publishers with which University faculty published revealed that approximately $300,000 of Federal grant money was spent over the five-year period on publishing in predatory publications.

Conclusions: Previously, it was thought that publishing in predatory journals was primarily a problem in developing countries. However, like the 2017 Nature study, we found that researchers at Emory are publishing in journals that are considered predatory. While the rate of publication in predatory journals is low (1.4%), it cost approximately $300,000 of Federal taxpayer money, which amounts to approximately 70% of the funds of one year of the average NIH R01 grant.
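
The arithmetic behind the abstract's headline figures can be checked with a short Python sketch. The inputs are the numbers reported in the abstract. The average fee per article and the implied annual R01 amount are not stated there; they are derived here from the approximate $300,000 and 70% figures, so treat them as rough estimates only.

```python
# Figures reported in the abstract
total_publications = 15_090
predatory_publications = 218
estimated_fees_usd = 300_000  # "approximately $300,000"
share_of_r01_year = 0.70      # "approximately 70%"

# Share of the institution's PMC publications on Beall's list
share_percent = 100 * predatory_publications / total_publications

# Derived values (not stated in the abstract; rough estimates only)
avg_fee_usd = estimated_fees_usd / predatory_publications
implied_r01_year_usd = estimated_fees_usd / share_of_r01_year

print(f"Share of publications: {share_percent:.1f}%")
print(f"Average fee per article: ${avg_fee_usd:,.0f}")
print(f"Implied one-year R01 funding: ${implied_r01_year_usd:,.0f}")
```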

Go To Hellman: Towards Impact-based OA Funding

“What if there was a funding channel for monographs that allocated support based on a measurement of impact, such as might be generated from data aggregated by a trusted “Data Trust”? (I’ll call it the “OA Impact Trust”, because I’d like to imagine that “impact” rather than a usage proxy such as “downloads” is what we care about.)

Here’s how it might work:


  1. Libraries and institutions register with the OA Impact Trust, providing it with a way to identify usage and impact relevant to the library or institutions.
  2. Aggregators and publishers deposit monograph metadata and usage/impact streams with the Trust.
  3. The Trust provides COUNTER reports (suitably adapted) for relevant OA monograph usage/impact to libraries and institutions. This allows them to compare OA and non-OA ebook usage side-by-side.
  4. Libraries and institutions allocate some funding to OA monographs.
  5. The Trust passes funding to monograph publishers and participating distributors. …”
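
One way to picture the final step is a pro-rata split of a funding pool by measured impact. The Python sketch below is purely illustrative. The post does not say how the Trust would divide funds, so the proportional rule, the function name `allocate`, the publisher names, and the impact scores are all hypothetical.

```python
def allocate(pool, impact_by_publisher):
    """Split a funding pool in proportion to each publisher's impact score.

    Hypothetical rule: the post does not specify an allocation formula.
    """
    total_impact = sum(impact_by_publisher.values())
    if total_impact == 0:
        # No measured impact anywhere: nothing is paid out
        return {name: 0.0 for name in impact_by_publisher}
    return {
        name: pool * score / total_impact
        for name, score in impact_by_publisher.items()
    }


# Made-up impact scores, as might be reported by the Trust
impact = {"Press A": 600, "Press B": 300, "Press C": 100}
payouts = allocate(10_000, impact)

for name, amount in payouts.items():
    print(f"{name}: {amount:,.2f}")
```

A proportional split is only the simplest option; a real scheme might add floors, caps, or weighting by discipline.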

Paper Digest

“Paper Digest uses AI to generate an automatic summary of a given research paper. You can simply provide a DOI (digital object identifier) or the URL of a PDF file, and Paper Digest will return a bulleted summary of the paper. This works only for open access full-text articles that allow derivative generation (i.e. CC-BY equivalent). If you receive an error message and no summary is generated, most likely either the full text is not available to use or the license does not allow derivative generation….”

2018 in review: Working towards an open scholarly world | About Hindawi

“Consistent with our mission to drive greater openness in research, in September of this year we released a new peer review system that has been built using an open-source framework. The new platform was developed as part of our collaboration with the Collaborative Knowledge Foundation (Coko) and was the first step towards a network of open publishing infrastructure that we, Coko, and other like-minded organizations like eLife and University of California Press are developing and plan to share with the research community. An earnest commitment to openness can only be built on open scholarly infrastructure….”

What does the recently revisited Belgian copyright law for scholarly publications say, actually ? | Ouvertures immédiates / Immediate openings

“My previous blog post triggered a lot of interpretations of the actual content, extent and meaning of the amendments to the Belgian copyright law. The best response is the actual text, translated here….

Except for the potential loophole that the King (i.e. the Federal Government) can, for some obscure reason (publishers’ lobbying?), extend the embargo period in an undefined way, which appears as a very weak point, the rest of the text is quite strong: the right to re-publish and re-use is mandatory and irrecusable. It overrides any previous contract between the author and the publisher, even one anterior to the law itself.

Of course, one would presume it applies to Belgian citizens in a scholarly institution in Belgium, leaving a fuzzy zone when the author is working abroad transiently or when he/she is a coauthor among foreign researchers….”

The Colombo Statement: IDUAI 2018

“Having attended “The Asian Digital Revolution: Transforming the Digital Divide into a Digital Dividend through Universal Access”, a commemorative event held to celebrate the International Day for Universal Access to Information (IDUAI) in Colombo on 28-29 September 2018:…

Considering the 2011 Strategy on UNESCO’s contribution to the promotion of Open Access (OA) to scientific information and research and taking into account specific needs in the countries of the South;…

The participants: …

Reaffirm the importance of empowering all citizens, especially young women and men and persons with disabilities, to develop a culture of openness and to become creators of content and innovation, including through access to information and quality education.

Reiterate the understanding of the Dakar Declaration on Open Access for the Global South, and state the necessity for establishing polycentric governance mechanisms for OA research and recommend that institutions and governments urgently collaborate to pilot and develop policies and enabling mechanisms to promote and publicize Open Scholarship and Open Science.

Call upon the governments to take firm steps and develop policies to mandate that all publicly funded research be available under Open Access; and also to earmark enough funding for necessary infrastructural and capacity enhancement.

Appreciate the Ljubljana Ministerial Statement and Open Educational Resources [OER] Action Plan 2017 which recognizes OER as a strategic opportunity to increase knowledge sharing and universal access to quality learning and teaching resources and call upon Governments and all relevant educational stakeholders, including civil society, to mainstream OER making them more broadly accessible including to persons with disabilities in support of achieving the Education 2030 Agenda.

Note the need to ensure institution-wide multi-sectoral training, attuned to people’s divergent and discrete needs, in particular those of disadvantaged groups and individuals, and designed to accustom and familiarize the community towards a more inclusive environment which can integrate the latest available technology (ODL, OER, FOSS, OA, etc.) into learning, teaching and training routines, applying the tenets of universal design for learning including UNESCO’s just published Competency Framework…

Recommend that OER be made accessible across media, including smart mobile devices and offline, in flexible and inclusive formats that support their effective and widest possible use, including by persons with disadvantages or disabilities, to learning, teaching and training, again in accordance with the tenets of relevant best practice….”