# What’s the Real Value of a Scholarly Publication? Part I

I’ve been invited to a very timely meeting in Oxford next week to discuss the future of Scholarship. “Open Science and the Future of Publishing” http://www.evolutionofscience.org/webFlyer.pdf . The question I want to ask is (roughly):

“We the public pay 10 billion USD annually in journal subscription fees [*] and 200 billion USD for research; what value do WE get? And what value do WE lose by closed access?”

[*] Throughout this post I use guesstimates which are probably off by half an order of magnitude either way (i.e. a factor of 3). This is partly because much of the information is secret (and some so secret that you will be sued if you divulge it) and partly because academia and we the public don’t yet care enough to find out. I am also removing CC-BY publications from the argument to avoid having to say “except for CC-BY” all the time. It’s about 5% of the market, if that. So I’d like your help.

I am also working this up for a (unfortunately virtual) presentation I am giving in Poland next month. I am taking my text from Wikipedia: http://en.wikipedia.org/wiki/Value_%28economics%29 (This is 6 years old and not disputed so I take it as more-or-less correct. If anyone can fault this, we shall all benefit)

Let me tackle COST and PRICE first.

The COST to the public purse of scholarly publishing is of the order of 10 billion USD. There are also contributions from industrial subscriptions, and from student fees, and 1% from pay-per-view, but the bulk is from taxpayers. In return for this the public get virtually no value or rights. If you the public, you the government, you the NHS want to read a paper you either have to pay again or walk to St Pancras and read it on the British Library premises (you cannot get this online because of publisher restrictions – mad and sad but true. The BL even charges me to read my own CC-BY papers if I’m not at St P.).

This is set by the PRICE of electronic journals. This bears no relation to the COST of production. The cost of production can be very low. It’s USD 7 per article for arXiv (not peer-reviewed) and about 100 USD for Acta Cryst E (a very high-quality peer-reviewed data journal). In an efficient organisation it’s inconceivable that the COST of production of a journal article is more than 200 USD. Any higher PRICE comes from the following:

• Inefficiencies (often gross) in the publishing system. (For example almost all author manuscripts are retyped from scratch).
• Profits

Publishers like Nature estimate costs-per-paper at 20,000 USD. That is not related to the cost of production but something else. Perhaps the high rejection rate? The basis of these “costs” is kept highly secret.

The PRICE of pay-per-view articles (about 35 USD for one day’s rent) is the only part with real elasticity http://en.wikipedia.org/wiki/Elasticity_%28economics%29 . The only evidence I have is from my FOI requests to Oxford/Cambridge University presses (they are public organizations, parts of the Universities, so have to reply – if you want publishing facts consider University presses).

CUP:  [http://www.whatdotheyknow.com/request/88390/response/224094/attach/html/2/FOI%202011%20236%20Murray%20Rust%20response%20letter.pdf.html ]

In 2010, 13,646 articles were purchased as PPV. In 2010, the total number of articles for potential purchase via CJO was 680,000.  Revenues from PPV approximated to 1.3% of Journal subscription revenues in 2010.

OUP: In 2010, 37,157 PPV articles were purchased [OUP do not know how many purchasable articles they publish]. PPV represents around 1.5% of total journal subscription income.

I take heart from the consistency of the figures (TWO coincident points!) and surmise that other publishers get about 1.5% of their income from pay-per-view. It’s possible, but unlikely, that the large profits of other publishers come from pay-per-view, but I and you will doubt that. It’s clear that the price is far too high and it amazes me that publishers still use these levels which were – I assume – set by the cost of paper in interlibrary loans. I’m no economist, but it’s actually stupid to run these prices. If they cut their prices to a fifth – 7 USD – and gained 5 times more custom they’d still make the same income, incur no more costs (really!) and gain a great deal of goodwill. And even if they gained no more readers they’d only have lost about 1% of their income. But they probably know something about a small subset of customers who have to use this service and they don’t care about everyone else. That captive demand is also inelastic.
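The arithmetic behind that price-cut argument can be sketched in a few lines. This is only a back-of-envelope illustration: the 1.5% share is the rough figure from the CUP and OUP replies above, the function name `income_change` is made up for this sketch, and it assumes subscription income and costs stay exactly as they are.

```python
# Pay-per-view (PPV) share of journal subscription income.
# Rough figure taken from the CUP (1.3%) and OUP (1.5%) replies; a guesstimate.
PPV_SHARE = 0.015


def income_change(price_factor: float, demand_factor: float,
                  ppv_share: float = PPV_SHARE) -> float:
    """Change in total income, as a fraction of current income, when the
    PPV price is multiplied by price_factor and the number of purchases
    by demand_factor. Subscription income is assumed not to move."""
    new_ppv_income = ppv_share * price_factor * demand_factor
    return new_ppv_income - ppv_share


# Price cut to a fifth and five times the custom: PPV income is unchanged.
same_income = income_change(price_factor=0.2, demand_factor=5)

# Price cut to a fifth and no extra readers at all: the worst case.
worst_case = income_change(price_factor=0.2, demand_factor=1)

print(f"5x custom:  {same_income:.4f} of total income")
print(f"no new custom: {worst_case:.4f} of total income")
```

On these assumptions the worst case costs a little over one percent of total income, which is the "only 1%" in the paragraph above.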

If any closed access publisher can give figures here we’d be delighted.

It’s also a serious condemnation of the effort to promote scholarship. Only 2% of all articles are purchased in any one year. I imagine the 680,000 includes historical articles, and if we take this as 50 years’ worth, that is about 13,600 articles published per year, so there is roughly one purchase per year for each newly published article. Which shows that its value to the public is almost zero.
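For anyone who wants to check the figures, here is the same calculation written out. The purchase and article counts are the CUP numbers quoted above; the 50-year backfile is my assumption, not something CUP stated, and the variable names are just for this sketch.

```python
# CUP figures for 2010, from the FOI reply quoted above.
ppv_purchases_2010 = 13_646      # articles bought as pay-per-view
purchasable_articles = 680_000   # articles available for purchase via CJO

# ASSUMPTION: the 680,000 covers about 50 years of publishing.
backfile_years = 50

# Fraction of the whole catalogue that is bought in a year.
purchase_rate = ppv_purchases_2010 / purchasable_articles

# Average number of articles published per year under that assumption.
articles_per_year = purchasable_articles / backfile_years

# PPV purchases per year for each newly published article.
purchases_per_new_article = ppv_purchases_2010 / articles_per_year

print(f"purchase rate: {purchase_rate:.1%}")
print(f"articles per year: {articles_per_year:.0f}")
print(f"purchases per new article: {purchases_per_new_article:.2f}")
```

If the backfile is really shorter or longer than 50 years the last figure moves in proportion, but it stays of the order of one.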

We now need to establish the cost of publicly funded (including charity-funded) research. I have asked many times without finding authoritative results. So here’s a beer-mat calculation, and allow ± half an order of magnitude. I approach it from these directions:

• Wellcome Trust allow about 2% of a grant to cover publishing. So if scholarly publishing is USD 10 billion, then public research is 500 billion USD
• The income for Cambridge, Stanford, etc is ca 500 million. Assume 1000 research universities in the world (can anyone do better?) and a power law and we get ca USD 200 billion
• The NIH is funded at USD 35 billion. It’s probably the largest, but add in national funders and you are well over USD 100 billion

Let’s use a figure of USD 200 billion (though I am sure it’s higher).
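Here is the beer-mat written out, as a rough sketch only. The inputs are the guesstimates from the list above, "half an order of magnitude" is taken to mean a factor of the square root of 10 (about 3), and the names `plausible_range` and `HALF_ORDER` are invented for this illustration. The university power-law estimate is left out because it needs more assumptions than I can justify.

```python
import math

# Guesstimates from the text, in USD per year.
PUBLISHING_SPEND = 10e9    # public spend on scholarly publishing
WELLCOME_SHARE = 0.02      # share of a grant Wellcome allows for publishing
NATIONAL_FUNDERS = 100e9   # "well over 100 billion": NIH plus other funders
WORKING_FIGURE = 200e9     # the figure adopted in the text

# If publishing is 2% of research spend, research spend is publishing / 2%.
wellcome_estimate = PUBLISHING_SPEND / WELLCOME_SHARE

# Half an order of magnitude either way is a factor of sqrt(10), about 3.16.
HALF_ORDER = math.sqrt(10)


def plausible_range(value: float) -> tuple[float, float]:
    """Return (low, high) bounds half an order of magnitude around value."""
    return value / HALF_ORDER, value * HALF_ORDER


low, high = plausible_range(WORKING_FIGURE)
print(f"Wellcome-based estimate: {wellcome_estimate / 1e9:.0f} billion USD")
print(f"range around working figure: {low / 1e9:.0f} to {high / 1e9:.0f} billion USD")

# Do the separate estimates fall inside the stated error bar?
for name, estimate in [("Wellcome", wellcome_estimate),
                       ("national funders", NATIONAL_FUNDERS)]:
    print(name, low <= estimate <= high)
```

The point is only that the independent estimates sit within the stated error bar of the 200 billion working figure; none of them is authoritative.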

I’m now using VALUE in the sense (from Wikipedia):

Value in the most basic sense can be referred to as “Real Value” or “Actual Value.” This is the measure of worth that is based purely on the utility derived from the consumption of a product or service. Utility derived value allows products or services to be measured on outcome instead of demand or supply theories that have the inherent ability to be manipulated. Illustration: The real value of a book sold to a student who pays $50.00 at the cash register for the text and who earns no additional income from reading the book is essentially zero. However; the real value of the same text purchased in a thrift shop at a price of $0.25 and provides the reader with an insight that allows him or her to earn $100,000.00 in additional income is $100,000.00 or the extended lifetime value earned by the consumer. This is value calculated by actual measurements of ROI instead of production input and or demand vs. supply. No single unit has a fixed value. Value is intrinsically related to the worth derived by the consumer. [Burke(2005)].

And asking “What VALUE do the public get for their 200 billion dollars?”

And

“what extra VALUE would they get if the research was published openly?”

And again, if you have insights let me know.

# JAHA — New Open Access Journal NOW LIVE!

The American Heart Association/American Stroke Association continues to set the gold standard for publishing cardiology and stroke research with the introduction of a new online-only, peer-reviewed, Open Access publication to its portfolio of 11 scientific journals. Welcome JAHA — Journal of the American Heart Association!

JAHA provides a global forum for basic and clinical research articles and timely reviews on cardiovascular disease and stroke. As an Open Access journal, its content is rapidly and freely available, accelerating the translation of strong science into effective practice.

The journal has launched with five original research articles and three editorials that are free for you to read, download, and share! Of particular note,  ‘Relationship of National Institutes of Health Stroke Scale to 30-Day Mortality in Medicare Beneficiaries With Acute Ischemic Stroke’ by Gregg C. Fonarow and colleagues offers an important insight and will be part of the AHA Emerging Science Series webinar on February 29th.

# @ccess: #scholarlypoor: Craig Dylke, teacher and artist

There’s an arrogant assumption among many academics that scholarly publishing is produced by academics (maybe 1% of the population) to be read only by other academics (1% of the population) and that no-one else matters. After all, why would anyone other than a dinosaur scholar be competent to read a paper on dinosaurs? And surely dinosaur papers have no financial benefit to the world.

WRONG – on both counts.

Mike Taylor has done an awesome – truly awesome – job in pulling together our ideas and hopes for the @ccess movement – the imperative to make scholarship available for the #scholarlypoor. Those are the people who don’t have access to a University library. And access doesn’t mean driving to a building, filling out forms and getting a paper copy. It means online access. Immediate and expansive. Because that’s the only form of access that’s now reasonable for scholarly articles [I deliberately omit books].

Mike’s been interviewing the scholarly poor. I’ve done an interview [http://whoneedsaccess.org/2012/02/18/peter-murray-rust-chemistry-researcher/ ]– just because I’m at a rich university doesn’t mean I can use the electronic library as I want to. My research is stalled because the publishers forbid it. Everyone is scholarly poor when it comes to text-, data- and image-mining. But you know all that.

What’s tremendous are the stories that are emerging. And I get the impression from Mike that he’s got a number yet to be published. So here’s someone who passionately wants to read the dinosaur literature. http://whoneedsaccess.org/2012/02/21/craig-dylke-teacher-and-artist/ You’ll need to read it yourself, best beloved, because I can’t show his dinosaur pictures. Here he is teaching, and I’ll give some excerpts below:

CD: I try to help connect the science of palaeontology to a larger audience. Palaeo-art lets me do this in a way that combines my childhood obsession with palaeontology and my love of digital art. I’ve become so interested in the philosophy and methodology of palaeo-art that, together with Peter Bond, I co-founded the community blog ART Evolved where we discuss and encourage palaeo-art of all forms.

But why does Craig need the literature?

When you scientifically reconstruct an animal, every detail of its physical appearance is important. For most prehistoric life, the only place to get details about fossilized remains and informed speculation on what that extinct life might have looked is in the scientific literature. From my perspective as an artist rather than a researcher, the most useful part of papers is the diagrams and photographs of the fossils

Craig cares about getting it right. As simple and as important as that.

… there are times when I would love to have it to check “facts” in popular children’s books. The number of factual mistakes in these books is sometimes quite alarming. Being on top of the most recent publications can also lead to good discussion topics for my students: news outlets only report a fraction of new science discoveries.

And the problems?

The fees for subscriptions, or for single papers are simply outrageous. Many of my digital art software packages cost less!

Limited access to scientific literature has also created an interesting problem in palaeo-art. Without access to source material, many artists resort to referencing other artists. Then you get artistic “memes” in which organisms are consistently shown with characteristics that we have no actual evidence for. (Since the art is the closest thing we have to photographs, they gain an implied credibility when repeated enough times). This runs completely counter to my science education goal.

What changes would you like to see?

Frankly that answer is simple. Either researchers only publish in free access journals or the publishers get with the times and open access to their content.

I’d also like to see more journals offer unlimited illustrations for authors. On any given subject PLoS papers are almost always the superior source material for me as an artist, as the authors tend to fill them liberally with photos and diagrams of their specimens. Too often I’ve been disappointed to track down a critical paper on topic from a mainstream journal only to find there are no diagrams or photos, leaving me at square one on my restoration.

As I have already noted, even a fraction of the scholarly literature is valuable. We’re fighting to get it all, but until that time we are trying to get as much as possible together for Craig.

And there’s no money in dinosaurs, is there? Jurassic Park grossed 900M USD. By depriving the creative #scholarlypoor of the literature we are denying them their full potential.

# Darkroom and open disclosure: two library solutions for dealing with copyright extremists

Elsevier, the scholarly publisher currently being boycotted by close to 7,000 researchers, does not appear on the exclusions list of the copyright extremist group Access Copyright. To me, this raises the question: are Elsevier and its ilk receiving monies from Access Copyright in addition to the substantial fees paid by libraries for subscriptions, and if so, is this a breach of the typical “entire agreement” clause in a library license? Since Access Copyright does not tell us who they are giving money to, why not ask when we purchase? We could call this an “open disclosure” policy. Whenever libraries are purchasing or subscribing to resources, let’s ask – IS this really the entire agreement, or are you looking for money from copyright collectives, too?

Of course, open disclosure would be most effective if it were practiced by Access Copyright. If people knew who they are representing (rather than who is excluded), then we could take appropriate actions. Such actions could include:

• buying their stuff if we must, but putting it away in the darkest, most remote corner we can find, in a separate room covered with stern warnings like: “These materials are covered by Access Copyright. Don’t even THINK about copying!”
• setting up a bank of computers featuring open access resources, which people pass by on their way to the dark room

Another thought: if Access Copyright and those represented by Access Copyright don’t want to participate in open disclosure, then let’s start by encouraging those who aren’t members of Access Copyright to openly proclaim their non-membership. This could be a selling point! Come to think of it, I wonder if anyone is using that Access Copyright exclusions list as an acquisitions tool?

# @ccess is launched!

Today we have launched @ccess – a new site, and more importantly a new community – to make scholarly information available as REALLY LIBRE. I’ll stress to start with that this means all disciplines and all types of information and means of communication. Because I’m a scientist I’m concentrating on STEM but it covers everything. By LIBRE we mean free to use, re-use, and redistribute for any purpose. It’s covered by the Open Knowledge Definitions and the actual text of the Budapest Declaration on Open Access 10 years ago.

I’ve blogged about this before. Any information is better visible than not, but simply “being on the web” isn’t good enough for many (I’d say most) modern uses. There are 101 reasons why information must be fully LIBRE and why GRATIS is not good enough. There are 10 million paragraphs on chemical reactions I want to read each year and I must use machines to do this. GRATIS does not work for machines. They can’t work out rights or protect me from being sued. And that’s the reality. If I use a scientific paper beyond what I am allowed to do I’ll be sued and the University of Cambridge will be cut off.

The only way to ensure this is to make sure all the information we want is LIBRE. Free to use, re-use, redistribute for any purpose, commercial as well.

Note that the term “Open Access” is operationally meaningless. The term “fully Open Access” is even worse because it is seriously misused. Some publishers offer “fully open access” and give the reader no rights at all.

The problem is that only about 3-5 percent of current scholarly information is LIBRE. It’s actually very difficult to get a figure, because information isn’t generally labelled with its rights. Print a typical scholarly pub and the print will often tell you very little about the rights. It may not even give the actual copyright owner – so you don’t know whether you can copy it and who will sue you. Some “open access” publishers DO label the material – here’s BMC:

All articles are immediately and permanently available online. Unrestricted use, distribution and reproduction in any medium is permitted, provided the article is properly cited. See our open access charter.

But almost all hybrid papers – where you pay substantial money (perhaps 2000 USD) to make the paper “Open Access” – are neither labelled nor LIBRE. Ross Mounce has shown that only 5% of publishers offer LIBRE “open access” – the rest still impose restrictions or severe restrictions on use. And in my simple study of avian malaria in Pubchem only about 3 papers out of 70 were LIBRE at first glance.

So let’s say 5% of the current published scholarly output can be reused without thinking and without worrying. Because that’s the only guide. If you have to think, then it’s effectively not re-usable on a large scale. Machines can’t understand lawyers. And they can’t interpret information that isn’t given.

What can you do with 5%?

More than you might think at first glance. Much more.

Academics often have a narrow mindset that the only reason for publishing a paper is so some other academic can read your paper. That if we don’t have access to the precise paper we cannot do anything. Sometimes that’s true. But sometimes we just need representative material in that area. Let’s say I want to know the conditions for making an ester (a type of chemical) and there are 500,000 esterifications published a year. 5% of that is 25,000 different reports. My machines will certainly find all the mainstream types of reaction. If I want to know how to grow a common cell type, or prepare a specimen, or find the methods used for recognising motifs in genes or … I’ll certainly find enough examples. If I want to find images of mosquitoes, or a graph of the average rainfall in W Africa, the LIBRE literature is almost certainly good enough. If I want to analyse the type of language and terms used in malaria articles the LIBRE literature is more than enough. If I want to find which countries the work is done in, the LIBRE literature is all I need.
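To put numbers on the esterification example, here is a tiny sketch. The 500,000 figure and the 3–5% LIBRE share are the guesstimates used in this post, and `libre_reports` is just an illustrative name.

```python
def libre_reports(published_per_year: int, libre_fraction: float) -> int:
    """Number of reports per year that can be mined without worrying
    about rights, given the LIBRE fraction of the literature."""
    return round(published_per_year * libre_fraction)


ESTERIFICATIONS_PER_YEAR = 500_000  # guesstimate from the text

# The LIBRE share is somewhere around 3-5%, so try both ends.
for fraction in (0.03, 0.05):
    count = libre_reports(ESTERIFICATIONS_PER_YEAR, fraction)
    print(f"{fraction:.0%} LIBRE: {count} reusable reports per year")
```

Even at the low end that is thousands of worked examples a year, far more than a human could read and plenty for a machine to find the mainstream conditions.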

So we need to label and liberate LIBRE scholarship. And then persuade people to label their articles properly. And hopefully to persuade them of the immense value of LIBRE over GRATIS.

So the recent heroes of our effort have been

• Tom Olijhoek and Bart Knols. Here’s Tom’s report in Malaria World http://www.malariaworld.org/blog/how-easy-can-you-find-information-you-need . Malaria is a really good place to start as the concept is well contained and we can find everything through UK/PubMedCentral. They have also helped to create the site http://access.okfn.org/ . That’s a really good place to start
• Mike Taylor, sauropodologist (http://en.wikipedia.org/wiki/Sauropoda ). Mike has campaigned tirelessly and burnt midnight oil to create the site http://whoneedsaccess.org/ which runs in parallel with the @ccess site. He’s collecting interviews, including one from me, on why we need LIBRE @ccess.
• Mark MacGillivray who continues to add fantastic design and power to http://bibsoup.net . Mark’s Bibserver uses faceted search in an incredibly powerful manner. The technical details are completely hidden from the user. The technology can interact with the Semantic Web / Linked Open data and is a great community builder

Anyone can be a member of this effort – you just need passion and energy and a need to provide LIBRE resources. And if you have a story about how and why you need LIBRE material and can’t get it , then highlight it on the mailing list or help populate the questions on the wiki.