Dramatic Growth of Open Access September 30, 2017

Happy Open Access Week!

In brief: best guesstimate – there are approximately 70 million OA documents today (a subset of BASE’s 115 million, about 60% OA), with OA documents at BASE growing at a rate of about 18,000 per day. Where do these come from? Thousands of OA archives – with PubMedCentral the largest by far at 4.5 million articles and active participation by thousands of journals. This quarter, by the numbers, the DOAJ team set a new record with a net growth of 689 journals, or 7.7 titles per day. However, percentage-wise the most remarkable quarterly growth was all about archives, with bioRxiv and SocArXiv topping the growth list by percentage, and as usual several sections of Internet Archive well up on the growth list. On an annual basis, Directory of Open Access Books was the fastest growing in terms of both # of books and # of publishers.

To download the raw data, go to the DGOA dataverse.


Bielefeld Academic Search Engine (BASE), in addition to being a great OA search engine, provides the best (if rough) guesstimate of how much we are achieving together. BASE added 2.7 million documents this quarter for a total of 115 million. About 60% of the content in BASE is OA, so this is growth of roughly 1.6 million open access items over the past quarter, or about 18,000 documents per day, with a total of about 69 million open access documents.

While the growth of open access is always amazing, sometimes it’s more evident by the numbers, other times by the percentage. By the numbers: this quarter DOAJ net growth was 689 titles – that’s 7.7 titles per day, a record for DOAJ! As of September 30, DOAJ included 10,114 titles. As the chart shows, growth in DOAJ at the searchable article level is particularly remarkable, growing from just over 60,000 in 2004 to close to 2.5 million articles today. Over at PubMedCentral there are now 4.5 million documents with close to 7 thousand journals actively contributing content.
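
The per-day figures quoted above for BASE and DOAJ are simple back-of-envelope arithmetic. The following is a minimal sketch of that calculation, not the method used to build the dataset. It assumes a 90-day quarter (the post does not state the divisor) and a flat 60% OA share applied to both the quarterly additions and the BASE total; the variable names are illustrative only.

```python
# Back-of-envelope check of the quarterly figures quoted in the post.
# ASSUMPTION: a quarter is treated as 90 days; the post does not say which divisor was used.
DAYS_IN_QUARTER = 90


def per_day(quarterly_total, days=DAYS_IN_QUARTER):
    """Average daily growth for a quarterly total."""
    return quarterly_total / days


# Figures taken from the post
base_added = 2_700_000      # documents added to BASE this quarter
base_total = 115_000_000    # total documents in BASE
oa_share = 0.60             # rough share of BASE content that is OA
doaj_net_growth = 689       # net new DOAJ titles this quarter

# ASSUMPTION: the same 60% share applies to new additions and to the total
oa_added = base_added * oa_share
oa_total = base_total * oa_share

print(f"OA items added at BASE this quarter: {oa_added:,.0f}")
print(f"OA items added at BASE per day:      {per_day(oa_added):,.0f}")
print(f"Estimated OA documents in BASE:      {oa_total:,.0f}")
print(f"DOAJ net new titles per day:         {per_day(doaj_net_growth):.1f}")
```

Changing `DAYS_IN_QUARTER` to 92 (the calendar length of July through September) shifts the daily figures only slightly, which is in keeping with the rough nature of the guesstimate.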

By the percentages, it was a particularly good quarter for open access archives. Newcomers bioRxiv and SocArXiv top the quarterly growth by percentage with growth rates of 25% for bioRxiv (equivalent to doubling in a year) and 22% for SocArXiv (just under doubling in a year). bioRxiv now has 15,000 preprints, SocArXiv close to 1,500. As usual growth at Internet Archive was very impressive, 14% growth in texts (now 14.5 million free texts), 12% growth in the recently added collections category (now close to 300,000 collections) and 9% growth in software (close to 200,000). The RePEC book collection grew by 12% to over 33,000*.
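
The annual equivalents given in parentheses above appear to multiply the quarterly rate by four; this is an inference, as the post does not say how they were derived. As a hedged illustration, the sketch below compares that simple annualisation with a compounded one, which assumes the same quarterly rate holds for four consecutive quarters. The function names are hypothetical.

```python
def simple_annual(quarterly_rate):
    """Annualise by multiplying the quarterly rate by four (no compounding)."""
    return 4 * quarterly_rate


def compound_annual(quarterly_rate):
    """Annualise assuming the same rate compounds over four quarters."""
    return (1 + quarterly_rate) ** 4 - 1


# Quarterly growth rates quoted in the post
quarterly_rates = {"bioRxiv": 0.25, "SocArXiv": 0.22}

for name, rate in quarterly_rates.items():
    print(
        f"{name}: simple {simple_annual(rate):.0%}, "
        f"compounded {compound_annual(rate):.0%}"
    )
```

On the simple reading, 100% annual growth is a doubling, which matches the wording in the post. If growth really did continue at the same quarterly rate, compounding would give a higher annual figure.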

On an annual basis by percentage, Directory of Open Access Books is at the top for growth both in # of books (65% growth, now close to 9,000 titles) and # of publishers (40% growth, 225 publishers). BASE continues to amaze with a 23% increase in content providers over the past year (edging up towards 6,000), and 15% growth in content (now at 115 million documents).

* The RePEC book chapter category also showed amazing growth, but perhaps this is an artefact due to a recent clean-up project as numbers were significantly down last quarter.

This post is part of the Dramatic Growth of Open Access series.

Dramatic Growth of Open Access June 30, 2017

Correction: DOAJ will soon surpass 2.5 million articles, not a quarter of a billion as originally reported. 


Open access continues to demonstrate robust growth on a global scale, in terms of works that are made available open access, ongoing growth in infrastructure (new repositories, journals, book publishers), strong growth for new initiatives such as SocArxiv, BioRxiv, the Directory of Open Access Books, SCOAP3, as well as ongoing strong growth in established services such as BASE, PubMed / PubMedCentral, Internet Archive (check out the new Collections including a Trump archive and FactChecker), DOAJ (almost 2.5 million articles searchable at the article level), RePEC and arXiv. Ongoing growth in infrastructure and OA policy gives every reason to expect this growth to continue.

Open Data Version

Morrison, Heather, 2014, “Dramatic Growth of Open Access”, hdl:10864/10660, Scholars Portal Dataverse, V17,


This edition of the Dramatic Growth of Open Access highlights two of the new kids on the OA block – SocArxiv and BioRxiv, modeled on early OA success story arXiv, topping the quarterly growth by percentage with percentage growth of about 30% each! SocArxiv now has 1,200 documents and BioRxiv 12,800.

Similarly, a relative newcomer, the Directory of Open Access Books, is in both first and second place for annual growth by percentage, with 68% growth for OA books and 40% for OA publishers in the past year, for a total of 8,172 open access books and 217 OA book publishers.

SCOAP3, a global initiative to transform high-energy physics publishing to open access, is showing remarkable growth, 39% in the last year and 8% in the last quarter for a total of 15,790 articles funded.

To celebrate the growth of all OA services two pictures are presented of the growth of the largest collective OA search engine that I am aware of. Together, the 5,000 content providers who contribute metadata to the Bielefeld Academic Search Engine (BASE) have made available over 112 million documents. Around 60% of these are open access, so the number of OA documents in the world can be said to be somewhere about 67 million. BASE also posts their own online statistics table and chart – check it out here.

I wish I had the time to applaud and celebrate the growth of each and every OA service, but with 5,000 services contributing to BASE (and others that don’t), if I worked on this 365 days a year I would have to cover 14 initiatives every day. So please feel free to help out by applauding and celebrating the services most relevant to you – the journals in your discipline, your institutional repository, the services you find most helpful to search.

Below you will find tables listing the top services by quarterly (5% or more) and annual growth (10% or more). For the full numbers download the open data version (link above). As usual Internet Archive is well represented, with 5 items in the list of the top 13 services by quarterly growth and the top 18 services by annual growth. Internet Archive also offers 2 intriguing new services under Collections – a Trump Archive with over a thousand videos and a Fact Checker collection with over 400 items, available at https://archive.org/details/tv

Of course PubMed and PubMedCentral are up there in the growth charts, in this quarter for total number of items (5% quarterly growth) as well as what looks (to me) like hesitant new steps by a substantial number of journals, with a 26% increase in the number of contributing journals that provide some OA and a 14% increase in the number of journals that provide OA to selected articles. The number of journals providing immediate free access and/or all articles open access continues to increase, so this is clearly growth, not backsliding.

DOAJ is included in the top growth services with 14% growth in the number of articles searchable at article level. DOAJ now has over 2.49 million articles searchable at the article level and should soon surpass 2.5 million articles.

arXiv and RePEC are on the list for strong growth in articles, and ROARMAP for growth in OA policies.


Quarterly growth (percentage) June 2017
32% SocArxiv preprints 1,200
29% BioRxiv all articles 12,280
18% # of academic peer-reviewed books (DOAB) 8,172
18% # publishers (DOAB) 217
8% SCOAP3 articles 15,790
8% Internet Archive Software 178,635
7% Video (movies)  (Internet Archive) 3,437,542
7% Texts  (Internet Archive) 12,821,051
5% Images (Internet Archive) 1,476,743
5% # of content providers (BASE) 5,621
5% Audio (recordings)  (Internet Archive) 3,477,033
5% Webpages (Internet Archive) (in billions) 298
5% PubMedCentral (number of items) 4,400,000

Annual growth (percentage) 06/30/17
68% # of academic peer-reviewed books (DOAB) 8,172
40% # publishers (DOAB) 217
39% SCOAP3 articles 15,790
34% Video (movies)  (Internet Archive) 3,437,542
33% Internet Archive: Software 178,635
29% # of content providers (BASE) 5,621
27% Texts  (Internet Archive) 12,821,051
26% PMC journals some OA 609
25% Internet Archive: Images 1,476,743
20% # of documents (BASE) 112,458,360
17% Audio (recordings)  (Internet Archive) 3,477,033
17% RePEc journal articles 1,491,037
14% # of articles searchable at article level (DOAJ) 2,493,835
14% PMC select deposit journals 4,296
13% RePEC downloadable 2,143,844
13% Total Policies (ROARMAP) 872
13% PMC # items 4,400,000
10% arXiv  http://arxiv.org/ 1,278,739

This post is part of the Dramatic Growth of Open Access Series. Feel free to copy and share - with love. Note that images are compressed by the software to reduce file size, and they are also quickly outdated. You are welcome to use the images, but my recommendation is to download the data and make your own graphics. It's easier than you think with tools like modern spreadsheet software.

Critical Data Literacy, why and how: an Open Education Resource (OER)

This OER was developed for presentation at the Data Power 2017 conference held at Carleton University, Ottawa, Ontario June 22 – 23. This is primarily a framework for how to go about teaching critical data literacy in the student-centered tradition of Freire, supplemented by the work of Tygel and colleagues. A sample introduction developed for Canadian university students, and a few references, are included. My definition of critical data literacy as used in this OER is: 

critical data literacy is the ability to understand and critique how the beliefs and values of people and groups (including government) influence what data is created, how it is shared and how it is used to tell compelling stories by storytellers whose beliefs and values shape the kind of stories they choose to tell and how they tell the stories. Critical data literacy also means having the ability to create and tell one’s own stories using data.

This OER is released under the terms of copy and share – with love, my latest statement on sharing which can be found at the bottom of this post. The Freire tradition of popular education involves starting with the lived experience of students. In this context, following is what I recommend for anyone who wishes to develop a full critical data literacy program based on the framework. I think that this framework could be adapted for teaching at any level, from community-based learning (led by community groups or organizers or as a participatory action research project) to graduate classes (that’s where I teach). Some of the details would change. For example, if you are teaching at a university, some parts of the process are likely to involve formal evaluation (marking), but if you are teaching the general public or a community group, this would not make sense. Please adjust as needed for your own context.

The overall approach:

  1. Identify your student group. Think about what kinds of issues or problems they might have that could potentially be helped by data, the kind of data stories they might be familiar with. 
  2. Develop an introduction to critical data literacy. Tygel and colleagues (2015, 2016) found that this was necessary. One way to think about the difference between critical data literacy and basic literacy (reading) is that, in recent history, people who do not know how to read are likely to be aware of the existence of reading as something that other people do. Data literacy / critical data literacy is not at this point in time as broadly understood as reading.
  3. Plan the 3 phases of the framework that follow directly from the Freire tradition: investigation, thematisation, and problematisation. In these phases, students should lead the learning process (active learning), pursuing problems and questions of their own devising. The teacher’s role is to provide support. 
  4. Plan a systematisation (synthesis) wrap-up approach that makes sense for your student group. In some cases this might be left for the students to decide the approach, and the teacher only helps to guide the students towards this closure. In a formal educational setting, this might involve a pre-determined assignment.
  5. Implement!

The 5 phases are: introduction, investigation, thematisation, problematisation, and systematisation (synthesis). Details follow. The introduction section is the most fully developed as this is the only teaching portion that involves imparting knowledge; all others begin with the student.


As noted above, it will not be obvious to everyone what data literacy or critical data literacy is or why they should learn about it, as discovered by Tygel and colleagues (2015, 2016). For this reason, an introduction to the topic may be helpful. In this phase one might invite in guest speakers from the community who use data in their storytelling and/or to provide examples of data storytelling. This is also where definitions of critical data literacy could be introduced. In addition to my definition (see above), I like this definition of data literacy from the Data Journalism Handbook  because it includes the element of critical thinking; not every definition that I have seen includes this, to me a significant omission.

data literacy is the ability to consume for knowledge, produce coherently and think critically about data [emphasis added] (Grey, Bounegru & Chambers, 2012)

Following is a sample introduction developed for an audience of Canadian university students. If you are teaching a different type of student group, I recommend that you develop your own introduction tailored to your group. If you do and you are willing to share this with others, please send me a link (via e-mail to Heather dot Morrison at uottawa dot ca) or as a comment to this post and I will include a link to your work in this post. If you would like to use this introduction as is, please see the link to the full presentation.

Introduction slide 1

This slide presents two conflicting stories that are told using basically the same underlying data. One of these (tax freedom day) will be very familiar to the audience, while the other will not as it is relatively new. 

This slide illustrates two very different perspectives on taxation in Canada. On the left, we see the Fraser Institute’s Tax Freedom Day. The Fraser Institute, a right-wing think tank, uses data to tell their story of over-taxed Canadians, working more than half the year for the government before earning a dime for themselves. The idea of tax freedom day has been very effective in Canada over the past few decades. On the right, we see one of the images from the Broadbent Institute’s report The Brass Tax which was published very recently. The left-wing Broadbent Institute challenges the numbers behind the Fraser Institute’s analysis, argues that Canadian taxation is pretty reasonable compared to other countries, and presents a different picture. In this case this graph illustrates Canada’s progressive approach to taxation and makes the point that people with little to no income pay no income tax and only a small percentage of Canadians age 25 to 54 are in the top income tax bracket, paying more than 30% of income in taxes. These are 2 groups of people with a different vision of what society should be like, using the same underlying data to tell 2 very different stories. If we go directly to the data source, will this eliminate the impact of the storyteller? Let’s see.

The following two slides might be more effective as a live demo or in-class lab activity. 
One of the underlying datasets used by both groups is the statistics provided by OECD. If you go to the OECD website there are some neat online tools that let us quickly visualize data in different ways. One of the elements of the data story told by the Fraser Institute is that individual families pay too much in taxes. I wondered if there has been any change in the portion of tax revenue contributed through personal and corporate taxes over the years. Here is what I found using the OECD website. It seems that more tax is gathered from personal rather than corporate taxes, but over the past few years the portions don’t seem to have changed much. This is the default view that shows trends from 2000 – 2015. If this had fit what I already believed, I suspect I would have stopped here. But I seem to recall a relative decrease in corporate taxation over the past few decades so I decided to slide the years covered…
And this is what I found. If we slide the start date of the visualization tool back to 1965, it does appear that there has been a relative increase in tax revenue from personal sources and a relative decrease in tax revenue from corporate sources. This shows how easy it would be for two people with different perspectives on what a data trend is likely to be to go to exactly the same dataset and make a slight change to how the data is visualized to tell two very different stories. 
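
The effect of sliding the start date can be sketched in a few lines of code. The numbers below are invented purely for illustration and are NOT OECD figures; the point is only that the same series can show almost no change or a clear rise depending on the window chosen. The names `toy_share` and `change_over_window` are hypothetical.

```python
# ILLUSTRATION ONLY: these values are made up for the sketch and are not OECD data.
# They stand in for "share of tax revenue from personal sources (%)" by year.
toy_share = {1965: 30.0, 1980: 34.0, 2000: 36.5, 2015: 36.0}


def change_over_window(series, start_year, end_year):
    """Change in percentage points between two years of the same series."""
    return series[end_year] - series[start_year]


# Same data, two different windows, two different stories
change_short = change_over_window(toy_share, 2000, 2015)
change_long = change_over_window(toy_share, 1965, 2015)

print(f"2000-2015: {change_short:+.1f} percentage points")
print(f"1965-2015: {change_long:+.1f} percentage points")
```

In a class or lab setting, students could replace the toy values with figures they download themselves and see which start year supports which story.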

Kaulfuss uses OECD data to tell a story about U.S. health care spending on a blog called Beyond Economics. The story  is that the U.S. spends two and a half times the OECD average on health. It doesn’t surprise me that the U.S. spends more than the OECD average on health, but I am surprised that the difference is this much. What I found even more intriguing is the author’s claim that U.S. public spending on health is above the OECD average. Who knew? Disclaimer: what I am doing here is presenting stories told through data, I have not examined the data itself so cannot comment on the accuracy of the story.
Wikipedia has a section called Health Care in Canada. Here in Canada many of us – I include myself – think highly of our public health care system, and I think I see this perspective here. This section states that “most health statistics in Canada are at or above the G8 average” in a paragraph that is followed by the table pictured above. The table draws from a number of data sources and appears to me to demonstrate above-average data literacy skills. However…
When you look at the statistics that are presented and calculate the averages, Canada is above average on 3 of 8 measures. This is not “most”. This suggests a need for data literacy. If you look at the specific measures where we are above average, an argument can be made that being above average in life expectancy is a good thing. However, an above-average infant mortality rate is probably not such a good thing. We are also slightly above average on % of government revenue spent on health, but what does this mean and is it a good thing? Looking at some of the areas where we are below average – such as the # of doctors & nurses per population & % of health costs paid by government – might give one reason to re-consider our narrative that we Canadians are above average in public health. This illustrates a need for critical data literacy. In other words, our beliefs might be getting in the way of understanding what our existing data tells us.
Some approaches and suggestions  for creating a meaningful introduction     
The introduction section is needed because, as Tygel and colleagues found, there is a need to start with some explanation about what data is and how people use it. There are many potential approaches to introducing the topic, such as having guest speakers come to explain how they make use of data and data visualization.
Suggested sample activity
One activity that would fit here is to have students create their own demonstrations. In the case of tax data, students could do a google search for tax data and limit to images. This search will yield lots of material to work on. The idea is to have students find out who created the visualization and what the story behind the visualization is. If this is done for evaluation purposes, I recommend a pass/fail approach because student success will depend a lot on which images are selected. Being there to hear the findings of all the students is sufficient for this learning exercise. A teacher in an area where computers are not readily available could bring in copies of materials to work with. This introductory phase may be more relevant for some student groups than others, for example university students. If this doesn’t seem to fit, you could skip this stage. 

Investigation, Thematisation & Problematisation

Two key points to keep in mind in these 3 phases: 1) the core focus should be lived experience, not imparting abstract knowledge, and 2) teaching involves helping people seek and find answers. This is important because in teaching data literacy one might be tempted to start with the data, teaching people how to understand and work with data. Keynote speaker and BC First Nations data activist Gwen Phillips provided a brilliant example at Data Power 2017 of why not to start with the data: the existing data might not be what is wanted at all. As Gwen said, we should measure what we do want (e.g. youth vitality) not just what we don’t want (e.g. teen suicide). This introduces a challenge to develop new metrics, but one that seems worthy of pursuit. If we start by teaching about existing data we risk missing the opportunity to identify gaps like this.

Disclosure: in understanding the following 3 phases, it may be helpful to know that although I teach at a university and am very engaged in pedagogy, I do not have an education degree and do not consider myself an expert on pedagogy. If you would like to know more about how to teach in the Freire tradition, I suggest starting with the Tygel references below and if desired supplementing with general educational books and articles covering the Freire tradition. My contributions below are limited to providing a very quick introduction and making the connection with critical data literacy.


The investigation phase is the first of 3 phases that follow the Freire tradition. The idea is to begin with lived experience, with real-world problems. If this approach is used for self-teaching by community groups independently or with an academic consultant as a participatory action research project, this is closest to the classic Freire scenario and the best example of a pure investigation stage. To modify this for an education setting, students could either choose problems or issues of direct interest to them, for example student debt, or they might brainstorm a particular target group whose problems they are familiar with such as First Nations, a salient issue here in Canada as many of us struggle to implement the recommendations of our Truth & Reconciliation commission. Classroom activities could include a brainstorm session, individual or small group reflection, and/or presentation of the results of the investigation stage.
Thematisation is the first analytic stage. Before searching for what data is available, the idea is to focus on the real-world issue and figure out what kind of data might help to understand or resolve the issue. Examples based on today’s case studies on taxation and health spending could include learning what sorts of taxes are collected and by which governments, or comparing public collective health spending with individual spending.


After thematisation, with some back-and-forth, comes problematisation. This is where we get into research on what kinds of data actually exist that are relevant to the problem, who collects the data and why. Some examples of the types of data sources students might look into at this point if they choose to focus on taxation and spending:

  • Canada Revenue Agency
  • OECD
  • Federal and provincial budgets
  • Academic Research 
  • NGO / Think Tank research (e.g. Fraser Institute and Broadbent Institute) 
One question that might be raised is whether the existing data is actually sufficient or not; that is, the scope of the inquiry is not focused just on understanding what data is available, but rather on what is needed to understand and resolve the problem of interest.
Finally, in the systematisation stage we put what we have together to come up with an action plan. The nature of the action plan might vary quite a bit depending on the students. An activist community group might want to develop an action campaign or an infographic or other data story to facilitate an existing action campaign. One approach to action could involve citizen data collection. In a graduate class on information policy, like the classes that I teach at the University of Ottawa’s School of Information Studies, developing a policy briefing and recommendations for evaluation as academic work might make sense.

Fraser Institute (n.d.). Tax freedom day calculator. Retrieved June 9, 2017 from https://www.fraserinstitute.org/tax-freedom-day-calculator

Grey, J., Bounegru, L., & Chambers, L. (2012). Data Journalism Handbook. OKFN. (as cited in Tygel & Kirsch 2016)
Kaulfuss, R. (2017). Health care: human right or expensive entitlement? Beyond economics. Retrieved June 15, 2017 from  https://beyondeconomics.org/2017/03/15/health-care-human-right-or-expensive-entitlement/
OECD (2017), Tax revenue (indicator). doi: 10.1787/d98b8cf5-en (Accessed on 15 June 2017)
Shillington, R. & Shaban, R. (2017). The brass tax: busting myths about overtaxed Canadians. Ottawa: Broadbent Institute. Retrieved June 9, 2017 from http://www.broadbentinstitute.ca/the_brass_tax

Tygel, A.; Campos, M.; De Alvear, C. (2015). Teaching open data for social movements: a research strategy. The Journal of Community Informatics 11:3. Retrieved June 19, 2017 from http://ci-journal.net/index.php/ciej/article/view/1220/1165
Tygel, A.; Kirsch, R. (2016). Contributions of Paulo Freire for a critical data literacy: a popular education approach. The Journal of Community Informatics 12:3 pp. 108 – 121. Retrieved June 19, 2017 from http://ci-journal.net/index.php/ciej/article/view/1296.
Wikipedia (n.d.). Healthcare in Canada. Retrieved June 15, 2017 from https://en.wikipedia.org/wiki/Healthcare_in_Canada 

Terms:  Please copy and share with love.
What does this mean? In brief, I have no interest in using intellectual property law to prevent anyone from using or re-using my work with intentions such as furthering the collective knowledge of humanity (truth with justice and compassion), protecting or restoring the environment or making the conditions of life of humanity better. That is what I mean by with love. If your motives in using my work are something other than love, such as making a profit for yourself or a corporation that you work for, subverting truth, justice, or compassion, then note that I reserve all rights under copyright. Please use attribution as appropriate. For example, if you use my work in an academic or journalistic context, you need to acknowledge me as author in order to avoid plagiarism (and confusion).

This post is part of the Creative Globalization series

Novel processes and metrics for a scientific evaluation: preliminary reflections

Reflections on  Michaël Bon, Michael Taylor, Gary S. McDowell. “Novel processes and metrics for a scientific evaluation rooted in the principles of science – Version 1”. SJS (26 Jan. 2017)

Following are my initial reflections on what I would describe as a ground-breaking effort toward articulating a radical transformation of scholarly communication, a transformation that I regard as much needed and highly timely as the current system is optimized for the technology of the 17th century (printing press and postal system) and is far from taking full advantage of the potential of the internet.

The basic idea described by the authors is to replace the existing practices of evaluation of scholarly work with a more collaborative and open system they call the Self-Journals of Science.


The title Self-journals of science: I recommend coming up with a new name. The name is likely to give the impression of vanity publishing, even though this is not what the authors are suggesting, which appears to be more along the lines of a new form of collaborative organization of peer review.  

Section 1 Introduction: the inherent shortcomings of an asymetric evaluation system appears to attempt to describe how scientific communication works, its purpose, and critique, with citations, in just a few pages. This is sufficient to tell the reader where the authors are coming from, but too broad in scope to have much depth or accuracy. I am not sure that it makes sense to spend a lot of time further developing this section. For example, the second paragraph refers to scientific recognition as artificially turned into a resource of predetermined scarcity. I am pretty sure that further research could easily yield evidence to back up this statement – e.g. Garfield’s idea of the “core journals” to eliminate the journals one needn’t bother buying or reading, and the apparently de facto assumption that a good journal is one with a high rejection rate. On page 3, first paragraph, 4 citations are given for one statement. A quick glance at the reference list suggests that this may be stretching what the authors of the cited works have said. For example, at face value it seems unlikely that reference 4, with a title of “Double-blind review favours increased representation of female authors”, actually supports the authors’ assertion that “Since peer-trials necessarily involve a degree of confidentiality and secrecy..many errors, biases and conflicts of interest may arise without the possibility of correction”. It seems that the authors of the cited article are making exactly the opposite argument, arguing that semi-open review results in bias. If I were doing a thorough review, I would look up at least a few of the cited works, and if the arguments cited are not justified in the cited works I would hand the work of reading the works cited and citing appropriately back to the authors.

The arguments presented are provocative and appropriate for initiating an important scholarly discussion. Like any provocative work, the arguments may be relatively stronger for the task of initiating needed discussion but somewhat weak due to lack of counter-argument. For example, the point of Section 1.4 is that “scientific conservatism is placing a brake on the pace of change”. Whether anything is placing a brake on the pace of change in 2017 is, I believe, arguable. However, the authors do not address the potential benefits of scientific conservatism here, although arguments made elsewhere, e.g. “The validity of an article is established by a process of open and objective debate by the whole community”, are arguments for scientific conservatism (or so I argue). For example, one needs to understand this tendency of science to fully appreciate the current consensus on climate change.

Section 2 defines scientific value as validity and importance

There are some interesting ideas here; however, the authors conflate methodological soundness with validity. A research study can reflect the very best practices in today’s methodology and present logical conclusions based on existing knowledge while still being incorrect or invalid (lacking external validity) for such reasons as limitations on our collective knowledge. A logical argument based on a premise incorrectly perceived to be true can lead to logical but incorrect conclusions.

The authors state that “the validity of an article is established by a process of open and objective debate by the whole community”. This is one instance of what I see as overstatement of both current and potential future practice. Only in a very small scholarly community would it be possible for every member of the community to read every article, never mind have an open and objective debate about each article. I think the authors have a valid point here, but direct this at the wrong level. This kind of debate occurs with the big picture paradigmatic issues such as climate change, not at the level of the individual article.

Perceived importance of an article is given along with validity as the other measure for evaluation of an article.  This argument needs work and critique. I agree with the author (and Kuhn) about the tendency towards scientific conservatism, and I think we should be aware of bias in any new system, especially one based on open review. People are likely to perceive articles as more important if they advance work that falls within an existing paradigm or a new one that is gaining traction than truly pioneering work. With open review, I expect that authors with existing high status are more likely to be perceived to be writing important work while new, unknown, female authors or those from minority groups are more likely to have their work perceived as unimportant.

I do not wish to dismiss the idea of importance, rather I would like to suggest that this needs quite a bit of work to define appropriately. For example, if I understand correctly replication of previous experiments is perceived as a lesser contribution than original work. This is a disincentive to replication that seems likely to increase the likelihood of perpetuating error. Assuming this is correct, and we wish to change the situation, what is needed is something like a consensus that replication should be more highly valued, otherwise if we rely on perceived importance this work is likely to continue to be undervalued.

Section 2.2 Assessing validity by open peer review

This section presents some very specific suggestions for a review system. One comment that I have is that this approach reflects a particular style. The idea of embedded reviews likely appeals more to some people than to others. Journals often provide reviewers with a variety of options depending on their preferred style; a written review like this, or go through the article and track changes. The + / – vote system for reviews strikes me as a tool very likely to reflect the personal popularity of reviewers and/or particular viewpoints rather than adding to the quality of reviews. There are advantages and disadvantages to authors being able to select the reviews that they feel are of most value to them. The disadvantage is that authors with a blind spot or conscious bias are free to ignore reviews that a really good editor would force them to address before a work could be published.

Section 3 Benefits of this evaluation system

Here the authors argue that this evaluation system can be transformed into metrics for the purpose of evaluation (number of scholars engaged in peer review, fraction that consider the article is up to scientific standards) and for importance (the number of authors that have curated the article in their self-journal). Like the authors, I think we need to move away from publishing in high impact factor journals as a surrogate of quality. However, I argue against metrics-based evaluation, period. This is a topic that I will be writing about more over the coming months. For now, suffice it to say that quickly moving to new metrics-based evaluation systems appears to me likely to create worse problems than such a move is meant to solve. For example, if we assume that scientific conservatism is a thing and is a problem, isn’t a system where people are evaluated based on the number of people who review one’s work and find it up to standards likely to increase conservatism?

Some strengths of the article:

  •  recognizing the need for change and hopefully kick-starting an important discussion
  • starting with the idea that we scholars can lead the transformation ourselves
  • focus on collaboration rather than competition

To think about from my perspective:

  • researcher time: realism is needed. An article that is reviewed by two or three people who are qualified to judge soundness of method, logic of arguments and clarity of writing should be enough. It isn’t a good use of the time of researchers to have a whole lot of people looking at whether a particular statistical test was appropriate or not.
  •  this is work for scholarly communities, not individuals. The conclusion speaks to the experience of arXiv. arXiv is a service shared by a large community and supported by a partnership of libraries that has staff and hosting support.  
  • the Self-Journals of Science uses the CC-BY license as a default.  Many OA advocates regard this license as the best option for OA, however I argue that this is a major strategic error for the OA movement. My arguments on the overlap between open licensing and open access are complex and can be found in my series Creative Commons and Open Access Critique. To me this is a central issue that the OA movement must deal with, and so I raise it here and continue to avoid participating in services that require me to use this license for my work.

Key take-aways I hope people will get out of this review:

  • forget metrics – don’t come up with a replacement for impact factor, let’s get out of metrics-based evaluation altogether
  • look for good models, like arXiv because communities are complicated. What works?
  • let’s talk – some of us may want immediate radical transformation of scholarly communication, but doing this well is going to take some time, to figure out the issues, come up with potential solutions, let people try stuff and see what works and what doesn’t, and research too
  • be realistic about time and style – researchers have limited time, and people have different preferred styles. New approaches need to take this into account.

For more on this topic, watch for my keynote at the What do rankings tell us about higher education? roundtable at UBC this May. 

      Dramatic Growth of Open Access December 31, 2016

      Download data here


      Arguably the best indicator of the global collaborative growth of open access, whether through archives or publications, is the ongoing impressive growth of what we can access through the Bielefeld Academic Search Engine, which surpassed two major milestones in 2016: over 100 million documents (about 60% open access) and 5,000 content providers. The growth rates (22% for documents, 27% for content providers) are particularly impressive given the high pre-existing content rate. This is amazing success not just for BASE, but for all of us. If you’ve published a thesis through an institutional repository that allows for metadata harvesting, or published an article in a journal that contributes article-level data for metadata harvesting, your contribution is reflected here. This is a meta-level indicator of our global success.

      I’ve added a new metric for medical open access, a keyword search of PubMed for “cancer” for articles with no date limit, last 5 years, last 2 years, and last year, further limited to free fulltext to determine the percentage of items for which fulltext is available. This ranges from 26% overall (no date limit), to 40 – 44% for items published in the last 2 – 5 years, to 32% for articles published in the last year.

      Also added this quarter: OECD iLibrary – with more than 11,000 free books, this one publisher’s OA collection is nearly double the size of the combined collection of the 167 publishers included in the impressively growing Directory of Open Access Books! arXiv, in addition to an over 10% growth rate last year, inspired the recent development of two similar services, SocArXiv and bioRxiv, newly added to facilitate future growth tracking. The DOAJ get-tough inclusion policy and March 2016 major weeding mean that the DOAJ counts for titles, countries and journals searchable at the article level are all down from last year, while articles searchable at the article level through DOAJ continued to show robust growth of 13%. DOAJ’s quarterly growth is back to an impressive rate of just under 3 titles per day. RePEC surpassed a milestone of 2 million downloadable items this year, while Internet Archive surpassed 3 milestones: there are now more than 3 million video and audio recordings, and more than 11 million texts (the number of IA web pages archived is way down, by the billions – such a difference it strikes me as likely due to a glitch in counting, whether before or after). Recently Open Journal Systems announced that OJS is now used by more than 10,000 active journals which <>.

      Kudos and thanks to everyone in the open access movement – every researcher, author, editor, publisher, archive manager, librarian, policy-maker, and activist who is making open access happen. What of 2017? My advice: let’s remember the beautiful vision of the potential unprecedented public good of open access – forged not at a time of peace and certainty, but rather within months of the trauma of 9/11 – repeated below – and keep on making it happen.

      BOAI vision:

      An old tradition and a new technology have converged to make possible an unprecedented public good. The old tradition is the willingness of scientists and scholars to publish the fruits of their research in scholarly journals without payment, for the sake of inquiry and knowledge. The new technology is the internet. The public good they make possible is the world-wide electronic distribution of the peer-reviewed journal literature and completely free and unrestricted access to it by all scientists, scholars, teachers, students, and other curious minds. Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge.

      Selected numbers and growth by service:

      Directory of Open Access Journals 

      Highlights: in March 2016 DOAJ removed more than 3,000 journals, reflecting a new get-tough inclusion policy. All journals that had not gone through DOAJ’s new application process were removed. As a result, in spite of a robust quarter since the removal process, most of DOAJ’s key data are lower at the end of 2016 than at the end of 2015, with the exception of the number of articles searchable through DOAJ, which grew by 13%.

      • 9,455 journals (down from 10,963 in 2015, a 14% decrease. Note that this quarter DOAJ added 246 journals for a current growth rate of close to 3 titles per day).
      • 6,634 journals searchable at article level (down from 6,780 in 2015, a 2% decrease. Note that this quarter DOAJ increased the number of searchable journals by 217).
      • 2,400,258 articles (up 13% from 2,123,402 at the end of 2015, very impressive given the journal weeding process)
      • 128 countries (down from 136 at the end of 2015)
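For readers who want to check the arithmetic behind figures like these, here is a minimal sketch of the two calculations used throughout this series: percentage change between two year-end counts, and titles added per day. The function names are my own for illustration, and the 92-day length of the October to December quarter is an assumption.

```python
def pct_change(previous: float, current: float) -> float:
    """Percentage change from previous to current (negative means a decrease)."""
    return (current - previous) / previous * 100


def per_day(net_added: int, days: int) -> float:
    """Average number of items added per day over a period."""
    return net_added / days


# DOAJ figures quoted in the list above (end of 2015 vs. end of 2016)
journals = pct_change(10_963, 9_455)
articles = pct_change(2_123_402, 2_400_258)

# Assumption: the quarter October 1 to December 31 is counted as 92 days
QUARTER_DAYS = 92
titles_per_day = per_day(246, QUARTER_DAYS)

print(f"journals: {journals:.0f}%")
print(f"articles: {articles:.0f}%")
print(f"titles per day: {titles_per_day:.1f}")
```

The same two functions can be applied to any of the service counts listed in this post.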

      Electronic Journals Library

      • 55,562 journals that can be read free-of-charge (up from 51,983 at the end of 2015, a 7% growth rate)

      OECD iLibrary  * (selected data points) (just added, no growth figures)

      • 11,050 e-book titles
      • 5,130 multilingual summaries
      • 5,200 working papers
      • 5 billion data points across 42 databases

        Directory of Open Access Books

        • 5,602 books (up from 3,789 at the end of 2015, a 48% growth rate)
        • 167 publishers (up from 134 at the end of 2015, 33 publishers added, a 25% growth rate)


        OpenDOAR

        3,000 repository milestone!!!

        • 3,285 repositories (up from 2,991 at the end of 2015, a 10% growth rate)

        Registry of Open Access Repositories

        •  4,365 repositories (up from 4,147 at the end of 2015, a 5% growth rate)

        Bielefeld Academic Search Engine 

        100 million document milestone!!!
        5,000 content providers milestone!!!

        • 103,090,961 documents (up from 84.25 million at the end of 2015, a 22% growth rate)
        • 5,023 content sources (up from 3,965 at the end of 2015, a 27% growth rate)


        PubMedCentral

        4 million article milestone!!!

        •  4.1 million articles (up from 3.7 million at the end of 2015, an 11% growth rate)
        • 2,326 journals actively participating in PubMedCentral (up from 2,021 at the end of 2015, a 10% growth rate)
        • 1,720 journals with immediate free access (up from 1,553 at the end of 2015, an 11% growth rate)
        • 1,426 journals with all articles open access (up from 1,331 at the end of 2015, a 7% growth rate)
        • 569 journals with some articles open access (up from 423 at the end of 2015, a 35% growth rate)


          arXiv

          • 1,219,224 preprints (up from 1,105,906 at the end of 2015, a 10% growth rate)

          SocArXiv Preprints (launched December 7, 2016, inspired by arXiv)  **

          • 631 searchable preprints

          bioRxiv (in beta December 31, 2016, inspired by arXiv) ***

          • 7,500 articles (based on “all articles” search, 750 pages X 10 articles / page)


          RePEC

          2 million downloadable items milestone!!!

          • 2,021,534 downloadable items (up from 1,942,541 at the end of 2015, a 4% growth rate)


          • 803 total open access mandate policies (up from 762 at the end of 2015, a 5% growth rate)

          Internet Archive

          3 million milestones for video and audio recordings!!!
          10 million milestone for texts (now 11 million)!!!

          • 11 million texts (up from 8.8 million at the end of 2015, a 26% growth rate)


           * OECD iLibrary statement on free-to-read (from About page):

          All book and journal content is available to all users to read online by clicking the READ icon. Read editions are optimised for browser-enabled mobile devices and can be read online wherever there is an internet connection – desktop computer, tablets or smart phones. They are also shareable and embeddable.
          The site also features content for all users to access and download such as the OECD Factbook, OECD Working Papers, Indicators, and more.
          Subscribers benefit from full access to all content in all available formats.

          ** about SocArXiv (from the Dec. 7, 2016 launch announcement):

          SocArXiv, the open access, open source archive of social science, is officially launching in beta version today. Created in partnership with the Center for Open Science, SocArXiv provides a free, noncommercial service for rapid sharing of academic papers; it is built on the Open Science Framework, a platform for researchers to upload data and code as well as research results

          *** about bioRxiv (from about page):

          bioRxiv (pronounced “bio-archive”) is a free online archive and distribution service for unpublished preprints in the life sciences. It is operated by Cold Spring Harbor Laboratory, a not-for-profit research and educational institution. By posting preprints on bioRxiv, authors are able to make their findings immediately available to the scientific community and receive feedback on draft manuscripts before they are submitted to journals.

          This post is part of the Dramatic Growth of Open Access series.

            Dramatic Growth of Open Access September 30, 2016


            There is plenty to celebrate for this year’s Open Access Week October 24 – 31 everywhere! 

            As of Oct. 6, 2016, a Bielefeld Academic Search Engine (BASE) search includes over 100 million documents! Globally the collections of open access archives are now collectively an order of magnitude larger than the 10 million articles and books claimed by Elsevier for Science Direct. Congratulations to BASE and everyone in the repositories movement that is making this happen!

            In spite of a vigorous weeding process, a new get-tough inclusion policy and negative growth in journal numbers over the past year, the Directory of Open Access Journals showed an amazing 11% growth in the past year in articles searchable at the article level – about half a million more articles today than a year ago. This past quarter DOAJ showed healthy net growth of 135 titles, or about 1.5 titles per day.

            For every journal added by DOAJ in the past quarter, another repository was added to the vetted OpenDOAR collection of repositories.

            The Internet Archive now has more than 3 million audio recordings.

            The Directory of Open Access Books added over 2 thousand titles in the past year for a current total of over 5,000 titles (60% annual growth rate) from 161 publishers (41% annual growth rate in publishers).

            The number of journals actively contributing to PubMedCentral continues to show strong growth in every measure: there are 212 more journals actively participating in PMC today than a year ago, a 10% growth rate; 170 more journals provide immediate free access, an 11% growth rate; 113 more journals provide all articles as open access, a 9% growth rate; and the number of journals with some articles open access increased by 123, a 31% growth rate.

            Full data is available for download from here.

            This post is part of the Dramatic Growth of Open Access series. 

            Dramatic Growth of Open Access June 30, 2016

            Highlights this quarter include a new indicator illustrating that 42% of the cancer literature indexed by PubMed is available as free full-text within 3 years of publication; ongoing strong growth in open access archives and their content; milestone of over 10 million free texts for Internet Archive; a mix of negative growth reflecting clean-up at DOAJ and growth in articles searchable at the article level; over 50% annual growth at the Directory of Open Access Books; and concern noted about the apparent ongoing growth of Elsevier and what this might mean for open access.


            New indicator: a search of the PubMed index for “cancer” for all articles and with limits by date of publication demonstrates that 42% of the cancer literature indexed in PubMed published in the last 3 years is available as free fulltext. 17% is available as free fulltext within 30 days of publication, 31% within one year of publication. With no date limits the overall percentage is 26% of the 3.3 million articles on cancer indexed by PubMed.


            Results of PubMed search for “cancer”
            with limits by date of publication and free fulltext
            Date limit       # of articles    Free fulltext    % free fulltext
            30 days                 19,050            3,206    17%
            60 days                 32,562            6,540    20%
            90 days                 46,057           10,382    23%
            180 days                85,913           23,421    27%
            last 1 year            162,335           50,499    31%
            last 2 years           323,252          126,867    39%
            last 3 years           475,973          198,505    42%
            no date limit        3,318,957          861,168    26%
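The percentage column can be reproduced from the two count columns. The following is a minimal sketch, using only the counts in the table above; the names `rows`, `free_share` and `shares` are my own for illustration and are not part of any PubMed tool.

```python
# (date limit, articles indexed, articles with free fulltext), copied from the table above
rows = [
    ("30 days", 19_050, 3_206),
    ("60 days", 32_562, 6_540),
    ("90 days", 46_057, 10_382),
    ("180 days", 85_913, 23_421),
    ("last 1 year", 162_335, 50_499),
    ("last 2 years", 323_252, 126_867),
    ("last 3 years", 475_973, 198_505),
    ("no date limit", 3_318_957, 861_168),
]


def free_share(total: int, free: int) -> int:
    """Free fulltext articles as a whole-number percentage of all articles."""
    return round(free / total * 100)


shares = {label: free_share(total, free) for label, total, free in rows}

for label, share in shares.items():
    print(f"{label}: {share}%")
```

Re-running the same PubMed searches at a later date and substituting the new counts would show how the free fulltext share changes over time.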

            Kudos to Internet Archive for exceeding 10 million free texts!

            Ongoing strong open access archives growth is illustrated by OpenDOAR adding close to 200 repositories over the past year, a 7% growth rate and a total of over 3,000 repositories. The Registry of Open Access Repositories added 269 repositories over the past year, also a 7% annual growth rate for a total of over 4,000 repositories. The Bielefeld Academic Search Engine is now searching over 93 million documents from over 4,000 repositories. With growth of over 18 million documents over the past year (24% annual growth rate), it won’t be long before BASE passes the 100 million milestone. arXiv grew by over 10% over the past year, adding over 100,000 documents for a total of 1.6 million.

            The Directory of Open Access Books grew by over 50% in the past year for a total of close to 5,000 books from more than 150 publishers. 

            In spite of overall negative growth reflecting a major “get-tough” clean-up project, the number of articles searchable at the article level through the Directory of Open Access Journals grew by 16% over the past year, over 300,000 more articles for a total of over 2.1 million articles. On May 9, DOAJ removed over 3,000 journals that had not filled out the new application form. Since that date, DOAJ has added 234 titles for a DOAJ growth rate of 4.5 journals per day. Watch for continuing strong growth in the next few quarters as DOAJ has hired a team of international ambassadors. 

            The ongoing dramatic growth of Elsevier 

            The Social Sciences Research Network (SSRN) is still included in the downloadable data.  I would like to note concern about its current and future open access status and commitment, particularly since it was recently bought by Elsevier, features the ad for “free subscriptions to more than 500 partner-sponsored abstracting e-journals [emphasis added]”, (copied below for purposes of academic critique – please contact SSRN for other uses), the SSRN website indicates partnerships with providers of pay-per-view, and the message from chairman Michael Jensen on the Elsevier sale indicates that part of what is behind this is Elsevier’s desire to expand into social sciences.

            In addition to the SSRN buyout, as noted on my research blog Sustaining the Knowledge Commons, Elsevier is now the world’s largest open access publisher as measured by number of fully open access journals.

            Open access may resolve the access problem, however OA per se does not address the increasing commercialization of scholarly journal publishing and increasing market concentration that has been happening since the end of the second world war. The growing presence of large traditional commercial scholarly publishers in open access is something to watch, in particular because ongoing open access is likely not compatible with maximal profit-making.

            As usual the full data is available for download from the DGOA Dataverse:  http://dataverse.scholarsportal.info/dvn/dv/dgoa

            This post is part of the Dramatic Growth of Open Access series.

            Canada’s draft new action plan on open government 2016 – 2018

            Following are my comments on Canada’s draft new action plan on open government 2016 – 2018

            Canada’s Draft New Plan on Open Government 2016-2018
            Individual Comments by Dr. Heather Morrison
            Kudos are in order to Canada’s government for global leadership, commitment, and swift moves by our new government to action, notably in the areas of commitment to open access and open data to both academic and government information, commitment to creation of a Chief Science Officer position, restoring the mandatory long form census, forthcoming free and more timely access to Statistics Canada data, and initiating electoral reform (to mention a few moves!). Following are my comments as an expert in the area of information policy, notably open access, intended to help strengthen a solid, ambitious but realistic draft plan. In the spirit of openness and transparency, note that I am a professor at the University of Ottawa’s bilingual School of Information Studies and I see career opportunities for our graduates and research opportunities for me arising from this plan and some of my suggestions.
            Summary of key points
            ·       Reconsider centralization or the “one-stop” approach. Sometimes this is a good idea (one stop search for grants and contributions, single point of access to all geospatial data). However, centralization can also be a bottleneck and even a muzzling device. Decentralization with website and open data development in the hands of departmental experts who understand the information they are working with and how people will want to use it is probably in many instances the most effective means of providing open government information and data. I want my weather information directly from Environment Canada and my tax data directly from Canada Revenue Agency, not indirectly from a central service where staff are not likely to be experts in these areas.
            ·       Consider expanding information services to include reference service (professional service by intermediaries with understanding of information seeking behavior as well as government information), both through government and indirectly through libraries of all types (through advocacy for this role with key partners). This has the potential to provide better service and sometimes reduce cost. For example, in the area of Access to Information, overly broad requests may reflect lack of knowledge of the specific documents or data most likely to address a need. Direct communication with requestors may be the best means to hone requests.
            ·       Beware what I characterize as a blind spot of completely unrestricted re-use, which could lead to unintended consequences (for example effective privatization of currently free public services). Impose reasonable expectations of behaviour by re-users that is in the public interest, and encourage development along these lines at the global level.  
            ·       Remember the vulnerable. Sometimes the best approach to open government is in-person offices. Open data and data visualization are a boon for those of us who can see but a challenge for the visually disabled. Proactively address this challenge rather than waiting for complaints. Consider and consult First Nations peoples before releasing data about resources on their lands or lands that they depend on that could be exploited to their detriment.
            ·       Build in protection against the inevitable temptations of power and the understandable human tendency to want to look good. Access to Information – an effective means to demand information that the government does not choose to make open – will always be needed for really open government. I also recommend an arms-length approach to developing data visualization services, because it is easy to develop services that help people to see what we want them to see; our truth rather than the truth.
            ·       Considerable research is needed on how to go about meaningfully engaging a whole population in open dialogue and policy-making. This particular potential of open government will take an extended period of time for full development. This should be factored into assessment of progress.
            ·       Immediately apply principles and best practices of open dialogue and policy-making in trade treaty negotiations, beginning with the Trans Pacific Partnership.
            ·       Expand on corporate accountability through a review of legislation on corporations and consultations with the private sector, academics and other stakeholders to understand barriers to triple bottom line accounting (finance, people and environment) and propose solutions.
            Detailed comments
            Detailed comments are presented below in two sections, Overarching comments and specific comments on the draft plan.
            Overarching comments
            To centralize or not to centralize?
            The draft plan refers in several places to centralization (single portal, one-stop etc.). I recommend re-thinking of the benefits of centralization versus decentralization. Sometimes, centralization can result in streamlining of access for the citizen; commitment 11, one-stop access to data on grants and contributions is a good example of this.  However, centralization can also be a bottleneck or even a muzzling device. Weather information is both interesting and important to the public. To have the best information on whether a potentially dangerous storm is headed in my direction, I look to the experts at Environment Canada to post what they know as soon as they possibly can. Sending information to a central service would simply create delays and likely impede good decision-making by Canadians. Governments create different departments for good reasons. The type of information provided and how it is best structured to be understood by the public will vary with the type of information. When it’s time to reconcile my taxes I want a website that is under the control of the best experts in taxation and web development for this type of information. I note below particular sections of the plan where I see centralization as beneficial or problematic.
            What’s missing?
            Reference and information literacy services are needed (directly through government and indirectly through libraries) and would in some cases reduce the workload.
            As a professor in the area of information studies, former practicing professional librarian and researcher in the areas of open access, open government, and access to information, I have had many discussions with students, experts, and government staffers who provide services such as responding to ATI requests about the challenges and opportunities. In my professional opinion, the Government of Canada could provide better service, sometimes at lower cost through a kind of service akin to the tradition of library reference services. For example, one of the reasons ATI requests can seem to be “frivolous and vexatious” appears to be that people request very large amounts of information because they do not have sufficient understanding of government operations to know what to ask for. Having a professional serving in an intermediary role who understands both information seeking behaviour and the kind of information that is held by government would likely be more efficient in many cases.
            Helping people find the information they need (reference services) and providing education on how to understand the need for information, find, evaluate and effectively use it (information literacy), is a traditional role of public, school, corporate and academic libraries.
            Recommendation: work with Library and Archives Canada and open government representatives at all levels (municipal, provincial, global) to advocate for an emerging role for libraries of all types in the areas of open government and incorporate professional information services within government departments.
            Openness and transparency in trade treaty negotiations
            Moving towards openness and transparency in government while at the same time failing to engage with citizens on trade agreements that will impact our jobs, communities, and businesses, is moving in opposite directions at the same time. Recommendation: extend open dialogue to trade treaty negotiations, beginning with the Trans Pacific Partnership.
            Open government and access to government services for people with disabilities
            Open data and the potential for data visualization offer tremendous potential for the advancement of Canadian society and should be embraced. However, the formats also create new challenges for people with disabilities such as print disabilities. Recommendation: address these challenges proactively through working with groups representing disabled communities and show global leadership in advocating for technological solutions to facilitate equitable open government.
            Consider restrictions on access to data to avoid harm to vulnerable groups
            The plan appropriately recognizes the need to consider the protection of personal privacy in the release of open data. I recommend that potential harm to vulnerable groups be another consideration in deciding whether data should be released. For example, data about valuable exploitable resources on lands our First Nations peoples own or depend on should not be released without consultation with the peoples who would be affected.
            Specific comments on the draft plan
            Introduction – Towards an Open and Transparent Government
            Re third bullet: “a review of the Access to Information Act, and efforts to accelerate and expand initiatives to help Canadians easily access and use open data, by the President of the Treasury Board working with the ministers of Justice and Democratic Institutions”
            Suggestion: split into 2 bullet points to avoid confusion because Access to Information and open data initiatives are two very different types of activities.

            The Open Government Partnership
            Re: the fifth grand challenge, “Increasing corporate accountability”: measures that address corporate responsibility on issues such as the environment, anti-corruption, consumer protection, and community engagement.
            Comment: addressing this challenge would be a golden opportunity to begin to address the limitations of the corporate sector’s single bottom line focus on profit, financially defined. This draft plan is weak in this sector and I would like to see expansion of commitments in this area. Some suggestions:
            ·       Review legislation on corporations and other businesses to recognize triple bottom line accounting (financial, social, environment)
            ·       Develop a consultation process with citizens, civil society organizations, academics and business to uncover challenges to corporate accountability and draft solutions

            IV. A. Open by Default
            Re: Third paragraph, “Being “open by default” also means allowing Canadians to more easily access government services through a single online window [emphasis added]”.
            Recommendation: change this sentence to “Being “open by default” also means allowing Canadians to more easily access government services through effective access mechanisms designed to facilitate accountability on service delivery [emphasis added]”.
            Comments: see “to centralize or not to centralize” above.

            Commitment 1: Enhance Access to Information
            It is good to see a commitment to updating the Access to Information Act. Open government will never replace the need for a mechanism for citizens to effectively demand access to information. Government by definition holds power, and power inevitably will attract those who wish to pursue personal gain through corruption. Also, mistakes and poor decisions or even good decisions that did not produce the expected results cannot always be avoided. There will always be a temptation for government staff as well as elected representatives to open or close, highlight or suppress information based on whether it makes the government look good. If you don’t want to release a piece of information it’s all too easy to perceive a request for the information as “frivolous and vexatious”. An important strength of the action plan is “giving the Information Commissioner the power to order the release of government information”.
            Re first bullet: “Making government data and information open by default, in formats that are modern and easy to use;”
            Suggestion: add a second and third bullet to address the ongoing need for ATI and to streamline the process through the provision of reference services:
            ·       Providing easy-to-use, cost-free mechanisms for requesting any information that is not open by default;
            ·       Developing professional intermediary services to help requestors identify with precision the information required.
            Comment: re the second suggested bullet, see the section “reference and information services” above.

            Commitment 2: Streamline Requests for Personal Information
            Re: How it will be done – line 2: “a simple, central website [emphasis added] where Canadians can submit requests to any government institution”.
            Suggest change to: “a simple, central website where Canadians can submit requests to any government institution to supplement requesting services that are most efficiently handled by the collecting department”.
            Comment: see the section “to centralize or not to centralize?” above.

            Commitment 3: Expand and Improve Open Data
            Re: 5th milestone: “Improve Canadians’ access to data and information proactively disclosed by departments and agencies through a single, common online search tool [emphasis added]”
            Suggest change to “Improve Canadians’ access to data and information proactively disclosed by departments and agencies through departmental websites as well as a single, common online search tool”
            Comment: see the section on “to centralize or not to centralize” above.

            Commitment 4: Provide and Preserve Open Information
            Re: Milestone 4: “Update Library and Archives Canada’s online archive of the Government of Canada’s web presence to ensure Canadians’ long-term access to federal web content”.
            Recommendation – add a Milestone: consult with academic experts and Library and Archives Canada to develop a plan, recommendation and funding analysis to capture Canadian content on the web.
            Comment: I applaud the addition of this milestone, but would note that we need to capture Canadian content on the web in general, not just federal web content. Currently, some of this content is voluntarily captured by Internet Archive, however I think Canadians have a duty to take this on ourselves, for profound social, legal and cultural reasons. Material that until recently was produced in print and often archived and preserved by libraries and archives is increasingly available only online and risks being lost, sometimes after only a short period of time.

            Commitment 7: Embed Transparency Requirements in the Federal Service Strategy
            Re first Milestone: “Develop a Government of Canada Clients-First Service Strategy that aims to create a single online window [emphasis added] for all government services”.
            Suggest change to: “Develop a Government of Canada Clients-First Service Strategy that aims to create efficient and effective online access [emphasis added] for all government services through a departmental or centralized online window, whichever is most effective for citizens”.
            Comments: see “to centralize or not to centralize” above.

            Commitment 8: Enhance Access to Culture & Heritage Collections
            Re: “The Government of Canada will expand collaboration with its provincial, territorial, and municipal partners and key stakeholders to develop a searchable National Inventory of Cultural and Heritage Artefacts to improve access across museum collections”.
            Comment / question: how does this relate to Library and Archives Canada’s Building a Canadian National Heritage Digitization Strategy? http://www.bac-lac.gc.ca/eng/about-us/Pages/national-heritage-digitization-strategy.aspx

            B. Fiscal Transparency
            Re: second paragraph, “…the government will provide Canadians [emphasis added] with the tools they need to visualize spending data and compare fiscal information across departments, between locations, and over time”.
            Suggested change to “…the government will develop an arms-length service to provide Canadians with the tools they need to visualize spending data and compare fiscal information across departments, between locations, and over time and encourage all members of the open government partnership to do likewise”.
            Comment: it is fairly easy for an interested party to set up visualization tools to “help” people see things like financial data from a particular perspective. This can be deliberate or reflect unconscious biases. For example, to help people understand tax data, one can choose from a number of different potential comparison points. The tax freedom date approach showing how long it takes an average Canadian to work to pay taxes before they get to keep money is a good choice for people ideologically opposed to taxation and seeking tax breaks. In contrast, those of us who think public health care is the right way to go both for social and financial reasons tend to see data demonstrating the lower per-capita health spending in Canada as compared to countries with private health care as an obvious and important way of demonstrating the truth. A government that has succeeded in lowering corporate taxes by two-thirds and does not want public critique creeping into public budget discussions might be tempted to present budget data showing how little is gained by a small to medium increase in the existing corporate tax rate and avoid historical comparisons. A government determined to reverse the corporate tax rate cuts would likely emphasize historical comparisons.

            Commitment 10: Increase Transparency of Budget Data and Economic and Fiscal Analysis
            Re: “The Government of Canada will provide access to the datasets used in the Federal Budget each year in near real time [emphasis added]”.
            Suggested change (addition) to: “The Government of Canada will provide access to the datasets used in the Federal Budget each year in near real time starting with Budget 2017 and will explore the feasibility of providing as many of these datasets as possible in advance of the release of the budget”.
            Comment: near real time datasets to help Canadians understand the budget would be a major leap forward, however in the long term for Canadians to have meaningful input into the budget process and parliamentarians to have full information for decision-making purposes, we have to have access to the datasets before the Budget is developed. One thought is that after Budget 2017 the datasets identified for release could be prioritized for timely open data release after that point in time.

            Commitment 11: Increase Transparency of Grants and Contributions Funding
            Re: “one stop access”: in this instance centralized access makes a lot of sense!

            C. Innovation, Prosperity, and Sustainable Development
            Re: “Making government data and information openly available to Canadians without restrictions on reuse [emphasis added]”…
            Suggested change to: “Making government data and information openly available with minimal restrictions on reuse and the expectation of reuse in the spirit of the public good…”
            Comments: although the spirit of “no restrictions” is one that I agree with, a major positive change, and internationally embraced by open government advocates as consensus, this is an area where in my professional opinion too open an approach invites problems as well as benefits for the social good. For example, as contributors to the Social Science Research Network (SSRN) recently discovered, their free sharing of their work in what they thought of as an open access archive enabled not only open access but also the sale of SSRN to the world’s largest commercial scholarly publisher, Elsevier, a corporation that earns a profit of about $1 billion US a year (a 39% profit rate), based primarily on toll access, and that has incentive to create new locked-down services. I believe this is an early indication of a potential danger of open data that is too open. For example, in the case of government data, too open an approach to data release could result in effective privatization of public services. “Without restrictions on reuse” is so broad that it can include charging for services, paying Internet service providers to have for-pay services prioritized over free public services, making the latter less useful, and using profits to lobby against funding for free public services that profitable commercial re-users are likely to see as competition.
            Open data should be open to anyone, not just Canadians. In order to have the full benefit of open access to government data we need to be able to use data from any jurisdiction and compare data across jurisdictions.

            C. Innovation, Prosperity, and Sustainable Development
            Re – second paragraph: “the Government of Canada will be building strategic partnerships with other governments at the provincial, territorial, and municipal level, to support the development of common standards and principles for open data”.
            Comment: good idea, but add the global level; this will be necessary to create innovations that work across jurisdictions and allow cross-jurisdictional comparison.

            Commitment 14: Increase Openness of Federal Science Activities (Open Science)
            Comments: kudos, this is great to see!!! Note that the granting councils already have policies on open access to research outputs and digital data management strategies. With respect to open access to documents, it might be worth looking at the tri-agency policy. With respect to digital data management strategies, there are important differences between government data, collected by the government for purposes of public policy, typically collected by government staff in the course of their employment and originally owned and controlled by the government, and academic research data which frequently involves third parties such as research subjects and third party organizations (e.g. police data is important to criminologists, business data to business researchers). Here I see many more issues arising from opening of data and I recommend separate treatment of academic research and government data.

            Commitment 15: Stimulate Innovation through Canada’s Open Data Exchange (ODX)
            This is a great initiative, but this is where building in the concept of free reuse in the context of commitment to the public good (see C above) is important to avoid the potential privatization of free public services.

            Commitment 20: Enable Open Dialogue and Open Policy Making
            Re: Milestone 1 “Promote common principles for Open Dialogue and common practices across the Government of Canada to enable the use of new methods for consulting and engaging Canadians”.
            Comments: I think that this is a great idea, but the potential of Web 2.0 to facilitate open dialogue and open policy making is in its infancy. Consider that we are still working towards universal basic literacy centuries after the invention of the printing press. I think that considerable research into how to use the web for open dialogue and policy making is needed, and how to engage citizens who may not have access to the web or are otherwise unlikely to use this means of participation. Perhaps this could be one of the upcoming challenge areas for the granting councils? (Disclosure: if this happens I might apply for such a grant). 

            Commitment 22: Engage Canadians to Improve Key Canada Revenue Agency Services
            Re: 3rd milestone: “Engage with indigenous Canadians to better understand the issues, root causes, and data gaps that may be preventing eligible individuals from accessing benefits.”
            Recommendation: add a strong, specific commitment to increase the number of indigenous Canadians receiving benefits or perhaps a specific type of benefit to which they are entitled.
            In conclusion, please consider these detailed comments as input intended to improve a solid, ambitious plan by a new government that already deserves kudos for swift action in a number of important areas. Thank you for the opportunity to provide these comments, and to be actively engaged in the preceding in-person and online consultation processes.
            Respectfully submitted,
            Dr. Heather Morrison
            Assistant Professor
            École des sciences de l’information / School of Information Studies
            University of Ottawa
            The Imaginary Journal of Poetic Economics
            Sustaining the Knowledge Commons http://sustainingknowledgecommons.org/
            Heather dot Morrison at uottawa.ca
            June 23, 2016

            Dramatic Growth of Open Access March 31, 2016


            There are now 150 publishers of peer-reviewed open access books listed in the Directory of Open Access Books, publishing more than 4,400 open access books. 620 books were published this quarter alone, a 16% increase. The Directory of Open Access Journals has been adding titles at a net rate of 6 titles per day, with 540 journals added this quarter for a total of over 11,000 journals. This is the highest DOAJ growth rate since this series started!
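
For readers who want to check the arithmetic, here is a minimal sketch of how the quarterly figures above can be derived. It is an illustration only, not the method used to compile the series: the function names are hypothetical, the totals are the rounded values quoted in the text, and the quarter is assumed to be 91 days long (January 1 to March 31, 2016, a leap year).

```python
def percent_growth(added: int, total_now: int) -> float:
    """Percentage growth over a period, relative to the total at the start of the period."""
    total_before = total_now - added
    return 100.0 * added / total_before


def per_day(added: int, days: int) -> float:
    """Average number of items added per day over a period."""
    return added / days


# Assumption: the quarter runs January 1 to March 31, 2016 (31 + 29 + 31 days).
DAYS_IN_QUARTER = 91

# Rounded figures quoted in the text above.
doab_growth = percent_growth(added=620, total_now=4400)
doaj_per_day = per_day(added=540, days=DAYS_IN_QUARTER)

print(f"DOAB quarterly growth: {doab_growth:.1f}%")
print(f"DOAJ net titles per day: {doaj_per_day:.1f}")
```

Because the published totals are rounded ("more than 4,400", "over 11,000"), results from this sketch will only approximate figures calculated from the raw data in the dataverse.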

            Bielefeld Academic Search Engine repositories collectively added more than 4.7 million documents this quarter for a total of just under 89 million documents.

            SCOAP3 nearly doubled in size this past year (87% annual growth) for a total of 4,690 documents. arXiv grew by over 107,000 documents to over 1.1 million documents during the same time frame. 

            Internet Archive is likely to be featured in the next issue as it is currently edging towards a milestone of 10 million free texts.

            The number of journals actively participating in PubMedCentral, making all content immediately freely accessible, and making all content open access, continues to grow. Meanwhile at PubMed a transition in indexing practice (from manual to automatic) means that a search for NIH-funded articles in the last 90 days significantly underreports results (1,402 NIH-funded articles in the past 90 days compared with a range of 7,846 – 19,790 with a 90-day search limit for NIH-funded articles since 2008). Without the indexing, it is not possible to determine the percentage of full text. Here’s hoping the automated indexing process results in a catch-up soon; it doesn’t matter very much if the statistics for this series fall a bit behind, but people rely on this indexing to search for medical information.

            The Electronic Journals Library added 3,612 journals that can be read free-of-charge in the past year, for a total of 52,000 journals, a 7% growth rate.

            This post is part of the Dramatic Growth of Open Access series. Open data can be downloaded from the Dramatic Growth of Open Access dataverse.

            Editorial: open access, copyright and licensing: basics for open access publishers.

            Just published (February 2016) in the open access Journal of Orthopaedic Case Reports at the invitation of Editor-In-Chief Dr. Ashok Shyam: Editorial: open access, copyright and licensing: basics for open access publishers. Journal of Orthopaedic Case Reports 6:1 p. 1-2. DOI: 10.13107/jocr.2250-0685.360

            This post is part of the Open Access and Creative Commons critique series.