The theme for this issue of Dramatic Growth of Open Access is a celebration of successes in 2009, areas with room for improvement, and, in keeping with the times, suggested New Year’s Resolutions. We’ll begin with an OA status report, followed by “leaps and bounds” growth in 2009. Open data: download data, view full data, or view full data with 2009 growth. For other editions of this series, see the Dramatic Growth of Open Access Series.

OA Status Report 2009

Open Access Journals
DOAJ: 4,535 titles
Net growth 2009: 723 titles
Growth rate: 2 titles per day

Open Access Archives
OpenDOAR # archives: 1,558
New growth 2009 (ROAR): 318
Growth rate: 1 archive per day
BASE # documents: 22,007,367
Scientific Commons # documents: 32,265,678
Net growth 2009: 7.9 million documents (Scientific Commons)
Growth rate: 22,000 documents per day
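
As a rough check on the per-day growth rates above, each one follows from dividing the net 2009 increase by 365. The following is a minimal Python sketch using only the figures quoted in this report; the function and variable names are illustrative, not part of any tool used to compile the data.

```python
def per_day(net_growth_in_year, days=365):
    """Average daily growth implied by a net annual increase."""
    return net_growth_in_year / days

# Net 2009 growth figures quoted in the status report above.
growth_2009 = {
    "DOAJ titles": 723,
    "ROAR archives": 318,
    "Scientific Commons documents": 7_900_000,
}

daily_rates = {name: per_day(n) for name, n in growth_2009.items()}

for name, rate in daily_rates.items():
    print(f"{name}: about {rate:,.1f} per day")
```

Rounded, these come out close to the 2 titles, 1 archive, and 22,000 documents per day reported above.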

Open Access Mandate Policies (from ROARMAP):
Institutional: 79 (growth 52, more than doubled); growth rate 1-2 per week
Funder: 42 (growth 12, 40% increase, growth rate one per month)
Departmental: 18 (growth 14, more than tripled); growth rate one per month
Thesis: 39
Proposed mandates: 15 (growth 5, 45% increase); growth rate one per month

Leaps and Bounds: impressive growth by percentage, in decreasing order of percentage growth

More than doubled

  • departmental open access mandates, 350% growth from 4 to 18 (ROARMAP)
  • institutional open access mandates, 208% growth from 27 to 79 (ROARMAP)
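
The percentages in this section appear to be growth relative to the count at the start of the year. A minimal Python sketch of that calculation, assuming this is the formula used; the starting count of 30 funder mandates is derived here from the reported total of 42 and growth of 12, and the function name is illustrative:

```python
def percent_growth(start, end):
    """Growth from start to end, as a percentage of the starting count."""
    if start == 0:
        raise ValueError("percentage growth is undefined for a starting count of zero")
    return (end - start) * 100 / start

# Departmental mandates: from 4 to 18, as reported above.
departmental = percent_growth(4, 18)

# Funder mandates: 42 at year end with growth of 12, so 30 at the start (derived).
funder = percent_growth(42 - 12, 42)

print(f"Departmental mandates: {departmental:.0f}% growth")
print(f"Funder mandates: {funder:.0f}% growth")
```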

Over 40% growth
Total open access mandates: 198% (ROARMAP)
# items in CARL metadata harvester search: 74%
Proposed Open Access Mandates: 45%
# archives in CARL metadata harvester: 44%
Funder Open Access mandates: 40% (ROARMAP)
Peer-reviewed journals in Open J-Gate: 40%

Over 30% growth
DOAJ – # articles searchable at article level: 38%
PubMedCentral – # journals in PMC with all articles open access: 36%
Scientific Commons – # publications: 33%
DOAJ – # of journals searchable at article level: 32%
# journals in Open J-Gate: 31%

Over 20% growth
# repositories listed in ROAR: 26%
free fulltext in RePEc: 26%
# repositories listed in Scientific Commons: 20%

Welcome and good luck to BASE, aiming to be the world’s best and most comprehensive search engine for Open Access Archives.

Free back issues

Highwire Free: while the # of free articles actually decreased by 6% in 2009, it may be worth noting that about a third of the articles hosted on Highwire Free – mainly representing society publishers – are freely available online.

Electronic Journals Library lists over 23,000 titles that are freely available online.

PubMed: individual journal free fulltext performance

This is a continuation of a somewhat random exploration of why free full-text availability for citations in PMC covered by the NIH Public Access policy is lower than it should be. As with the Dec. 11 issue, my findings reveal a wide range of performance by journal.

Kudos!!! to the following journals with outstanding free fulltext track records:
Biomicrofluidics: 100% of the articles in PubMed in this journal published by the American Institute of Physics are available as free fulltext – even though none of these articles fall under the NIH policy!
Journal of Postgraduate Medicine: while very few of the articles in this India-based open access journal by Medknow Publications are NIH-funded, 100% of those that are, are available as free full-text, and over 76% of all of the articles indexed in PubMed are free fulltext.
Journal of Oncology by Hindawi Publications: 60 of 67 articles listed in PubMed are available as free fulltext, even though none are NIH-funded
Vulnerable Children and Youth Studies: only 3 of the articles in this Taylor & Francis journal are listed in PubMed, none NIH funded, but all are free full-text

Room for improvement

This section touches on a few journals with remarkably poor performance in taking advantage of the dissemination potential of the internet – particularly given the obvious public interest in the topics covered. Update January 18, 2010: please note that calculations of compliance with the NIH Public Access Policy reflect articles with a publication date from Jan. 1, 2005 up to the permitted 12-month embargo period. This covers both the original policy, which requested public access, and the newer policy, which requires open access and took effect April 7, 2008. Compliance rates under the new mandatory policy alone have not been calculated at this time, but may be added to a future DGOA. For my search method, see the DGOA Full Data edition (see the 3rd sheet), or this explanatory post. My apologies for any confusion.

Update January 19: according to Peter Suber, “In the period since the NIH policy became mandatory, HSCC has had two submissions based on NIH funding. In the first case it deposited the manuscript in PMC within six days of receipt. The second paper was received very recently and is still in process. (Thanks to Cliff Morgan for the correction.)” As of this morning, I am not able to find any articles from this journal indexed in PubMed using the original search. This could mean nothing: it might be a glitch at PubMed or persistent operator error, so I do not wish to draw any firm conclusions until I retry the search at another time. I re-ran the Dec. 31, 2009 search yesterday evening, and once again found the result of 6 NIH-externally funded articles from 2005-2008 with no fulltext available for any of the articles.

Wiley and Blackwell’s Health and Social Care in the Community authors show a 0% compliance rate with the NIH Public Access Policy. According to the journal website, the journal’s policy is to comply, with the expectation that Wiley will undertake the deposit. Perhaps the editor might like to get in touch with Wiley? Either that, or have a discussion with the editorial board about the future of the journal. There are a great many free or low cost journal hosting options these days; the selection is likely much richer than when the decision was made to go with Wiley-Blackwell. [Hint: if the purpose of your research is to improve health and social care in the community, why not make the research available to the community – and the many professionals, often working in agencies or volunteer organizations with minimal funding – who serve the community?]

The compliance rates are under 20% for authors of Wiley-Blackwell’s Alcoholism: Clinical and Experimental Research, Taylor and Francis’ AIDS Care and American Journal of Bioethics. While my selection technique is somewhat random, I selected these titles looking for topics with a high public interest, hoping to see more impressive results.

Comments: as mentioned in the December 11, 2009 early year-end edition, 2009 was the year of the open access mandate, with highly significant growth in this area. Something else that is worth noting is the dramatic growth both in open access archives and in documents available through open access archives. To some extent, this reflects early success of the open access mandate policies, but clearly, there is more to it than that. The CARL metadata harvester statistics, for example, show significant growth even though institutional open access policies in Canada are still quite rare. To me, this is an early sign that we are collectively beginning to get over the learning curve (of understanding what an open access archive is, and what it can do for us), which bodes very well indeed for future growth of open access. RePEc’s “leaps and bounds” growth is especially impressive for a mature repository – kudos to RePEc and the international economics community!

Suggested New Year’s Resolutions
Are you interested in contributing to further dramatic growth of open access – or perhaps looking to make sure your journal thrives in the emerging open access environment? Here are some suggested New Year’s Resolutions:

For libraries:
Join the Compact on Open Access Publishing Equity
Continue with all the great work you are already doing to advance and support open access and transformative change in scholarly communication!

For researchers:
Retain your copyright!
Publish in an open access journal if you can.
Self-archive a copy of your work for open access, no matter where you publish.

For journal people:

  • If you’re open access – hurray! Join OASPA.
  • If you’re not open access – why not? If a commercial publisher is hosting your journal, this might be a good time to review your options. Odds are that the scant options of a few years ago have evolved into a wealth of interesting opportunities.
  • If you’d really like to be open access but are concerned about economic support: have a frank discussion with your academic communities and your libraries. The more people realize that we can have a fully open access scholarly publishing system at much less than current expenditures (assuming reasonable journal costs, as is the case with almost all independent society journals), the sooner we can all transition to open access.

Universities and funding agencies:
Open access mandate policies – depending on local circumstances, start discussions, commit to a policy, implement, evaluate, or strengthen existing policy.

This post is part of the Dramatic Growth of Open Access Series. For more on 2010 predictions, see my Dec. 11, 2009 early year-end edition.

Policy Forum on Public Access to Federally Funded Research: Features and Technology

Following is my comment on the U.S. Office of Science and Technology Policy’s Policy Forum on Public Access to Federally Funded Research: Features and Technology (second phase). Reader note: this post is more technical than the average IJPE post.

Q: In what format should published papers be submitted in order to make them easy to find, retrieve, and search and to make it easy for others to link to them?

A: XML is the best format. It is important to also take into account how the researchers work; the process of submission should ideally fit into their workflow. Microsoft has been working on an automated upload feature for repositories. Ideally, researchers should be able to cross-deposit to as many open access archives as are desirable for their work (I already have 3 archives myself, and there are good reasons to deposit in all of them).

Q: Are there existing digital standards for archiving and interoperability to maximize public benefit?

– The Open Archives Initiative – Protocol for Metadata Harvesting (OAI-PMH) is key to harvesting and cross-searching metadata from all open access archives.
– Stable URLs, preferably ones that meet the standards for OpenURL (and possibly DOI), are essential.
– The SWORD protocol allows for cross-deposit into multiple archives.
– Creative Commons licensing, to facilitate both human and machine reading of licensing terms.
– For archiving (preservation): LOCKSS, CLOCKSS, and Portico. For preservation purposes as well as ensured ongoing access, multiple mirror sites are recommended.
– Open standards are recommended. For example, video materials should use a format like MPEG-4. Open standards will allow the most possible people to access the materials, and will facilitate the task of preservation.
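
To give a sense of how lightweight OAI-PMH harvesting is, here is a minimal Python sketch that builds a `ListRecords` request and pulls Dublin Core titles out of a response. The `ListRecords` verb, the `metadataPrefix=oai_dc` parameter, and the XML namespaces come from the OAI-PMH 2.0 specification; the base URL, the function names, and the sample response are hypothetical and only for illustration. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Dublin Core element namespace used by the oai_dc metadata format.
DC_NS = "{http://purl.org/dc/elements/1.1/}"

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"

def extract_titles(xml_text):
    """Return every dc:title found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(f"{DC_NS}title")]

# Hypothetical, heavily trimmed response for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example article title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical repository endpoint; substitute a real archive's base URL.
url = list_records_url("https://repository.example.org/oai")
titles = extract_titles(SAMPLE_RESPONSE)
print(url)
print(titles)
```

Because every compliant archive answers the same request in the same XML structure, a harvester can cross-search many archives with very little archive-specific code.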

Q: How are these anticipated to change?
A: OAI-PMH is quite stable. SWORD is new; the ability to cross-deposit is very important to researchers, so watch for growth.

Q: What are the best examples of usability in the private sector (both domestic and international) and what makes them exceptional?
– E-LIS, the Open Archive for Library and Information Studies, has exceptional tools for searching, including a custom-designed subject classification scheme – not surprising for a tool developed by and for librarians: http://eprints.rclis.org/
– Google provides a very effective search engine to materials in repositories, particularly for known items. Google strikes me as more effective in this instance than Google Scholar.
– BASE, the Bielefeld Academic Search Engine, aims to be the world’s most comprehensive search service for open archives, using OAI-PMH: http://www.base-search.net/
– It is worthwhile looking at initiatives that are using the same standards for journals, conferences, and archives, providing a foundation for cross-searching materials in all these venues. For example, the Directory of Open Access Journals (DOAJ) http://www.doaj.org features an article-level search, based on OAI-PMH. Open Journal Systems (OJS), which is free, open source software, also supports OAI-PMH, and there is a PKP harvester. http://pkp.sfu.ca/?q=ojs OJS is part of the Public Knowledge Project, which also includes Open Conference Systems and Open Monograph Press (in development, to be released this February).

Q: Should those who access papers be given the opportunity to comment or provide feedback?

A: Of course; the only question is which venues are best for providing comments or feedback. My perspective is that opening up access to these papers has tremendous potential to inform public debate and commentary on a wide variety of issues; this potential will come to fruition over a period of time, as there will need to be time for learning and exploration. The most fruitful discussions, in my opinion, will be when people take ideas from the papers and bring them to their communities for discussion.

For example, it makes sense to me that a patient advocacy group might lead a discussion on research in their advocacy area, perhaps on their own website, including references to articles of interest. Researchers in this area might well wish to participate in special events with such a group from time to time; this would provide them with feedback in a focused way, and could also be a way for researchers to connect with people who might be good candidates for clinical trials.

Another example: a variety of businesspeople, scientists, and the environmentally minded public might well be interested in research that has the potential to uncover new green technologies.

What would be most helpful to facilitate this kind of discussion would be to ensure that papers have stable URLs so that these communities can reference them, ideally an easy way to export a proper citation, and creative commons licensing to ensure that rights issues are clear (and also to encourage broadest re-use rights; for example, allowing a portion of an article to be posted, with appropriate attribution, to the website of a not-for-profit discussion group).

There can be roles for journalists and media here to act as intermediaries in setting up such discussions, and also for government staff to conduct groups on public policy issues, much like this one.

Q: By what metrics (e.g. number of articles or visitors) should the Federal government measure success of its public access collections?

A: The first important metric is the number of articles that are freely available. This can involve a simple count of articles, the percentage of articles covered under policies that are actually freely accessible, the percentage of all scholarly articles published anywhere that are freely accessible (an indirect measure of extended policy influence; as an example, hundreds of scholarly journals voluntarily participate fully in PubMedCentral in a way that goes far beyond what is required by the NIH Public Access Policy), and (a little harder) levels of inability to access materials; this may require developing a reporting system.
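
The percentage measures are simple to compute once the counts are available. A minimal Python sketch, with an illustrative function name, applied to the journal-level figure cited earlier for the Journal of Oncology (60 of 67 articles listed in PubMed available as free fulltext); the same calculation would apply to articles covered by a policy:

```python
def percent_free(free_count, total_count):
    """Share of articles that are freely accessible, as a percentage."""
    if total_count == 0:
        raise ValueError("no articles to measure")
    return free_count * 100 / total_count

# Journal of Oncology: 60 of 67 articles in PubMed are free fulltext.
journal_of_oncology = percent_free(60, 67)
print(f"Free fulltext: {journal_of_oncology:.1f}%")
```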

As for use, number of visitors, abstract views, or article downloads would be useful. It is important to focus on this kind of usage in the aggregate, and not at the individual paper level. There are potentially serious issues with using metrics to evaluate scholarly work, as I have touched on in my book chapter, The Implications of Usage Statistics as an Economic Factor in Scholarly Communication: http://eprints.rclis.org/4889/

Open access roundup

Data sharing in the life sciences: reality vs. policy

Patterns of information use and exchange: case studies of researchers in the life sciences, report by the Research Information Network and the British Library, November 2009. From the executive summary:

… Researchers communicate their findings – new knowledge, new methodologies and tools – primarily through conference proceedings and journal articles. These public activities have strong institutional and professional incentives in building reputations, securing promotion and so on. Incentives for other kinds of communication and sharing are weaker and indirect.

Most research councils have policies requiring researchers to set up formal mechanisms to manage created data, including provision for access and re-use. Moreover, the experience of sharing data such as gene sequences in high-profile research programmes in fields such as genomics or proteomics has come to be seen as something of a paradigm or model around which policies and practice will converge.

But our study suggests that such a model remains exceptional. Indeed, researchers highlight a number of barriers to sharing their research data, including concerns about potential misuse, ethical constraints, and intellectual property. Above all, they see data as a critical part of their ‘intellectual capital’, generated through a considerable investment of time, effort and skill. In a competitive environment, their willingness to share is therefore subject to reservations, in particular as to the control they have over the manner and timing of sharing.

Discussion of these issues has been hampered by confusions and inconsistent usage of the terms ‘data’ and ‘information’. The current preoccupation with sharing research data has diverted attention from the diverse range of formal and informal information exchanges that take place in the life sciences. Given the limited current understanding of which forms of sharing and exchange are most effective and beneficial, and under what circumstances, we suggest that policy-makers need to engage in further discussions with researchers to identify and address the constraints, as well as to preserve the exercise of informed choice that is fundamental to science.

Narrowly prescriptive approaches are unlikely to be effective. We recommend rather that funders should adopt a more pragmatic and experimental policy that recognises the multiplicity of contexts, and the different approaches to information sharing; and which builds upon the informal sharing that is already taking place, based on the recognition of mutual needs. Such a bottom-up view is needed in order:

  • to attend to the practicalities of data sharing: what makes information from other sources intelligible? Under what circumstances is such sharing useful and sufficiently beneficial to warrant the labour necessary to achieve it? and
  • to address existing barriers and drivers for change, including the perceived self-interests and goals of researchers, and their need to sustain their intellectual capital and advance their careers.

A key message from our work, therefore, is that policy intervention and support systems for researchers need to be built around the many different and successful tools and practices emerging within life science research communities themselves. …

60,000 OA books from Library of Congress

Sarah Rouse, Library of Congress Puts Thousands of Historic Books Online, America.gov, December 24, 2009.

Nearly 60,000 books prized by historians, writers and genealogists, many too old and fragile to be safely handled, have been digitally scanned as part of the first-ever mass book-digitization project of the U.S. Library of Congress (LOC), the world’s largest library. Anyone who wants to learn about the early history of the United States, or track the history of their own families, can read and download these books for free.

“The Library chose books that people wanted, but that were too old and fragile to serve to readers. They won’t stand up to handling,” said Michael Handy, who co-managed the project, which is called Digitizing American Imprints. …

[The] digitized books can be accessed through the Library’s catalog Web site and the Internet Archive (IA), a nonprofit organization dedicated to building and maintaining a free online digital library. …

The Library of Congress has digitized many of its other collections — more than 7 million photographs, maps, audio and video recordings, newspapers, letters and diaries can be found at the Library’s Digital Collections site, such as the popular American Memory and the multilingual Global Gateways collections — but “this is the first sustained book-digitization project on a high-volume basis,” Handy said. …

A $2 million grant from the Alfred P. Sloan Foundation inaugurated the LOC book digitization project. One of the grant’s objectives was “to address some of the issues that other book digitization projects had mainly avoided dealing with — for instance, the brittle book issue,” Handy said. “We established some procedures and preservation treatments to be able to scan books that otherwise couldn’t be scanned.” …

Handy said, “More funding will be sought to keep this going after this year. This is just the beginning.”

See also our past posts on the program.

OA mandate at Dublin Tech

Dublin Institute of Technology has adopted an OA mandate:

Academic staff, research assistants, research students and other members of the Institute are entitled and required to deposit digital copies of refereed and other research publications and documents. …

Exceptionally, material that is to be commercialised, or which can be regarded as confidential, or the publication of which would infringe a legal commitment of the Institute and/or the author, is exempt from inclusion in the repository.

Uploading of items into [the IR] is the responsibility of authors and researchers. It is desirable that items be self-archived. However, this task may be delegated to others or to Library Services.

All deposits of journal articles must comply with Publishers’ policies. …

OA mandate at U. Abertay Dundee

The University of Abertay Dundee has adopted an OA mandate:

It is the University’s policy to establish a comprehensive database of research outputs, recording bibliographic information and, where permissible under publishers’ copyright policies, providing access to the full text of published research produced by University staff and research students.

The University therefore requires that all staff and research students submit the following to the repository:

  • Full text electronic copies and bibliographic details of peer-reviewed research published from 1 January 2010.
  • Bibliographic details (including abstracts, where available) of peer-reviewed research published between 1 January 2001 and 31 December 2009.

and that:

  • The electronic version of theses accepted for research degrees after 10th July 2009 will be deposited in the repository on behalf of the students. …

Is changing copyright law the best way to OA?

Steven Shavell, Should Copyright of Academic Works Be Abolished?, working paper, December 18, 2009. Abstract:

The conventional rationale for copyright of written works, that copyright is needed to foster their creation, is seemingly of limited applicability to the academic domain. For in a world without copyright of academic writing, academics would still benefit from publishing in the major way that they do now, namely, from gaining scholarly esteem. Yet publishers would presumably have to impose fees on authors, because publishers would no longer be able to profit from reader charges. If these author publication fees would actually be borne by academics, their incentives to publish would be reduced. But if the publication fees would usually be paid by universities or grantors, the motive of academics to publish would be unlikely to decrease (and could actually increase) – suggesting that ending academic copyright would be socially desirable in view of the broad benefits of a copyright-free world. If so, the demise of academic copyright should probably be achieved by a change in law, for the “open access” movement that effectively seeks this objective without modification of the law faces fundamental difficulties.