# I have been awarded a Shuttleworth Fellowship to change the world; my first reactions

The Shuttleworth Foundation has done me the honour of appointing me as a Fellow, starting today. The remit (http://www.shuttleworthfoundation.org/fellowship/) is:

The holy grail of every funder is sustainability, an idea and approach living long after the money has run out. That is why we fund people not projects. The only true way to sustainability is not a business plan but a champion, someone who will drive an idea through an ever changing landscape, to make a real difference in the world.

We are looking for social innovators who are helping to change the world for the better and are seeking support through an innovative social investment model.

My new entry is here: http://www.shuttleworthfoundation.org/fellows/current/peter-murray-rust/

This is incredible. I’ve had a week or two to adjust but I’m still finding new ideas, visions and people on a daily basis. So this is a first reaction.

I am going to change the world for the better. Yes. Over the last few years, when people have asked me what I want to do, I have replied “change the world”. It’s what we should all aspire to. This is the most concentrated time of innovation in the history of the planet, and changing the world is much easier than it used to be. In the past heroes such as Diderot had to rely on print to reach people – I can reach millions of people with a few keystrokes.

It’s this ability to create communities that makes us different from our predecessors. As exemplars I look to my own immediate circle of electronic communities: Wikipedia, Mozilla, Open Knowledge Foundation, Creative Commons, Open Rights Group, Blue Obelisk …

All were started by people – often just one. And all are self-sufficient without their founders. That’s my immediate model for sustainability. I don’t know exactly *how* it will happen, but I am certain it will. (Certainty is an essential ingredient of success.) So the goal is to build a community of vision and practice.

This year I have undertaken to liberate 100,000,000 FACTs from the scientific/technical/medical literature. FACTs belong to the world, not individuals and not corporations. I use uppercase to stress that they are not protectable as intellectual property (IP). FACTs save lives (think Helicobacter and ulcers). FACTs help to create new materials. FACTs lead to better decision making (e.g. climate change). FACTs generate new information-based industries which generate new wealth (the 4 Billion USD invested in the human genome generated 700 Billion of downstream wealth). (I’ve blogged a lot about the Content Mine and I’ll be blogging a lot more, of course.)

Because it is freely available to everyone on the planet who can connect to the Internet.

But most of all I must thank the Shuttleworth Foundation. They have a wonderful vision and wonderful people. There’s a lot I am discovering.

But, simply, they put in the effort to make sure people succeed.

They have a wonderful infrastructure that I suspect few other funding bodies can emulate. I have a very real relationship with Karien Bezuidenhout and Helen Turvey, who run the Fellowship program. We’ve spent a lot of time bouncing ideas around and I shall be meeting Helen in a few days in London. Karien and I will have virtual meetings twice a month! This can make all the difference to being focussed and setting achievable objectives.

And I know several of the Fellows already. Rufus Pollock (OKFN), Daniel Lombraña González (Crowdcrafting)  …

… and François Grey who will be my buddy / mentor. This is a wonderful idea. I’m hoping I can visit New York and run a workshop there with his Citizen Science community.

And then there is the community of the Fellowship – again this is a wonderful resource. Fellows come from all disciplines and experience and the cross-fertilisation will be massive. We meet virtually every week and we have 2 physical meetings a year. I’ll be doing a lot of listening.

It’s a huge responsibility, but that’s absolutely how it should be. I shall give it my best. I cannot know how it will work out in detail. I’ve a loose group of current collaborators and I’ll be talking with Helen about the best way of involving them. We’ve already plotted some activities.

Massive thanks to those who have helped with my application, acted as sounding boards and acted as referees.

Shuttleworth is the difference between *hoping* your ideas will take root and *knowing* they will.

# Latest Article Alert from BMC Health Services Research

The following new articles have just been published in BMC Health Services Research

For articles using Author Version-first publication you will see a provisional PDF corresponding to the accepted manuscript. In these instances, the fully formatted Final Version PDF and full text (HTML) versions will follow in due course.

Research article
A national survey of inpatient medication systems in

# Latest Article Alert from BMC Infectious Diseases

The following new articles have just been published in BMC Infectious Diseases

Case report
Outcome of acute East African trypanosomiasis in a Polish

# The dramatic growth of BioMedCentral open access article processing charges

The average article processing charge for BioMedCentral journals requested from the University of Ottawa (uO) Library’s author’s fund increased 27% from 2010-11 to 2012-13. The 15% increase from 2011-12 to 2012-13 is 10 times the rate of inflation.

The data indicates that this reflects increases in journal prices rather than changes in which journals uO authors publish in. For example:

Globalization and Health (a BMC journal)

• 2010-11: uO paid an APC of $1,300 US. Assuming this reflects a BMC membership rate in effect at this time (15% discount), that’s still less than $1,500 US.
• 2011-12: uO paid APCs at 2 different rates: $1,425 US and $1,715 US
• 2012-13: uO paid APCs at $1,670 and $1,715 US
• The BMC rate listed on BMC’s own website as of Feb. 27, 2014 is $2,155 US from: http://www.globalizationandhealth.com/manuscript

An increase in APC from $1,715 US to $2,155 US in the last year is about a 25% increase in the APC for this particular journal. Currency fluctuations could account for about one-tenth of this increase (see below for calculations), and the modest inflation rate would account for about a 1.5% increase. This still leaves more than a 20% increase in price above and beyond currency variations and inflation.

Currency variations, UK pound sterling to USD, based on the Bank of Canada daily and 10-year currency converters:

• UK pound sterling to USD conversion rate:
• Jan. 2011: 1.5586
• Jan. 2012: 1.5654 (.0043 increase over 2011)
• Jan. 2013: 1.6254 (.0383 increase over 2012)
• as of Feb. 27, 2014: 1.6691 (.02688 increase over 2013)
• Total increase in value of UK pound sterling in comparison with US dollar 2014 / 2011: 7%
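
The arithmetic above can be reproduced with a few lines of Python. This is only a sketch for checking the figures: the variable and function names are illustrative, the exchange rates and prices are the ones quoted in this post, and the 1.5% inflation figure is simply the approximate value mentioned above.

```python
# Exchange rates (USD per UK pound) as quoted in this post.
gbp_usd = {
    "Jan. 2011": 1.5586,
    "Jan. 2012": 1.5654,
    "Jan. 2013": 1.6254,
    "Feb. 27, 2014": 1.6691,
}


def relative_increase(old, new):
    """Fractional increase from old to new (0.25 means +25%)."""
    return (new - old) / old


# Year-over-year change in the value of the pound against the dollar.
dates = list(gbp_usd)
for earlier, later in zip(dates, dates[1:]):
    change = relative_increase(gbp_usd[earlier], gbp_usd[later])
    print(f"{earlier} -> {later}: {change:.4f}")

# Total change in the pound over the whole period.
total_currency = relative_increase(gbp_usd["Jan. 2011"], gbp_usd["Feb. 27, 2014"])

# APC increase for Globalization and Health over the last year.
apc_increase = relative_increase(1715, 2155)

# Share of that increase that currency and inflation could explain.
last_year_currency = relative_increase(gbp_usd["Jan. 2013"], gbp_usd["Feb. 27, 2014"])
inflation = 0.015  # approximate figure quoted in the text, an assumption here
unexplained = apc_increase - last_year_currency - inflation

print(f"Total currency change 2011-2014: {total_currency:.1%}")
print(f"APC increase: {apc_increase:.1%}")
print(f"Increase beyond currency and inflation: {unexplained:.1%}")
```

Subtracting the percentages is a simplification (strictly, the effects compound rather than add), but at these magnitudes the difference is small and does not change the conclusion.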

Public Library of Science (PLoS), by contrast, has kept prices for their journals at exactly the same rates during this time frame. PLoS’ achievement of a 23% surplus during this time frame indicates that this was done without financial sacrifice. While I continue to call on the not-for-profit PLoS to actually lower their prices to facilitate the transition to open access, the remarkable contrast between PLoS holding the line on prices and BMC raising their prices at rates far above inflation is worth noting.

Thanks to Jeanette Hatherill and the University of Ottawa Library for posting the Open Access publication rates in the uO institutional repository. This dataset contains the amounts paid through the library’s author’s fund for open access article processing charges from 2010 – 2013. Watch for further calculations and release of my calculations spreadsheet as part of the open access article processing charges series.

This post also illustrates the value of open data. By posting this data for open access in the University of Ottawa’s institutional repository, uO is making it possible for me to conduct research like this that could be useful to uO’s own decision-making processes in future. Let’s hope this post inspires others to follow uO’s lead and share their data, too.

This post is part of the Open access article processing charges research series

# Open access article processing charges series

This post gathers posts on my open access article processing charges research.

# Latest Article Alert from Respiratory Research

The following new articles have just been published in Respiratory Research

Research
Inhibition of mTORC1 induces loss of E-cadherin through AKT/GSK-3beta

# Latest Article Alert from BMC Medicine

The following new articles have just been published in BMC Medicine

Commentary
Epigenetics in the pathogenesis of rheumatoid arthritis
Glant TT, Mikecz K,

# 101 uses for Content Mining

It’s often said by detractors and obfuscators that “there is no demand for content mining”. It’s difficult to show demand for something that isn’t widely available and which people have been scared to use publicly. So this is an occasional post to show the very varied things that content mining can do.

It wouldn’t be difficult to make a list of 101 things that a book can be used for. Or television. Or a computer (remember when IBM told the world that it only needed 10 computers?) Content mining of the public Internet is no different.

I’m listing them in the order they come into my head, and varying them. The primary target will be scientific publications (open or closed – FACTs cannot be copyrighted) but the technology can be applied to government documents, catalogues, newspapers, etc. Since most people probably limit “content” to words in the text (e.g. in a search engine) I’ll try to enlarge the vision. I’ll put the scale of the problem in brackets.

1. Which universities in SE Asia do scientists from Cambridge work with? (We get asked this sort of thing regularly by Vice-Chancellors). By examining the list of authors of papers from Cambridge and the affiliations of their co-authors we can get a very good approximation. (Feasible now).
2. Which papers contain grayscale images which could be interpreted as gels? A polyacrylamide gel (http://en.wikipedia.org/wiki/Polyacrylamide_gel) is a universal method of identifying proteins and other biomolecules. [Image: a typical gel (Wikipedia, CC-BY-SA)] Literally millions of such gels are published each year and they are highly diagnostic for molecular biology. They are always grayscale and have vertical tracks, so very characteristic. (Feasibility – good summer student project in simple computer vision using histograms).
3. Find me papers in subjects which are (not) editorials, news, corrections, retractions, reviews, etc. Slightly journal/publisher-dependent but otherwise very simple.
4. Find papers about chemistry in the German language. Highly tractable. Typical approach would be to find the 50 commonest words (e.g. “ein”, “das”,…) in a paper and show the frequency is very different from English (“one”, “the” …)
5. Find references to papers by a given author. This is metadata and therefore FACTual. It is usually trivial to extract references and authors. More difficult, of course to disambiguate.
6. Find uses of the term “Open Data” before 2006. Remarkably the term was almost unknown before 2006 when I started a Wikipedia article on it.
7. Find papers where authors come from chemistry department(s) and a linguistics department.  Easyish (assuming the departments have reasonable names and you have some aliases (“Molecular Sciences”, “Biochemistry”)…)
8. Find papers acknowledging support from the Wellcome Trust. (So we can check for OA compliance…).
9. Find papers with supplemental data files. Journal-specific but easily scalable.
10. Find papers with embedded mathematics. Lots of possible approaches. Equations are often whitespaced, text contains non-ASCII characters (e.g. greeks, scripts, aleph, etc.) and there is heavy use of sub- and superscripts. A fun project for an enthusiast.
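
To give a flavour of how simple some of these can be, here is a minimal Python sketch of the idea in item 4: count the commonest words in a paper and see whether they look German or English. The word lists and the function name `guess_language` are illustrative assumptions, not part of any existing tool, and a real system would use longer lists and handle more languages.

```python
import re
from collections import Counter

# Tiny illustrative lists of very common words (assumptions for this sketch).
# Words shared by both languages, such as "in", are deliberately left out.
GERMAN = {"der", "die", "das", "und", "ein", "eine", "ist", "nicht",
          "mit", "von", "zu", "den"}
ENGLISH = {"the", "and", "of", "a", "is", "to", "that", "with", "one",
           "for", "not", "this"}


def guess_language(text, top_n=50):
    """Guess 'de' or 'en' from the top_n commonest words in text."""
    # Split into lowercase words made of letters only (Unicode-aware).
    words = re.findall(r"[^\W\d_]+", text.lower())
    common = Counter(words).most_common(top_n)
    # Sum the occurrences of the common words that appear in each list.
    german = sum(count for word, count in common if word in GERMAN)
    english = sum(count for word, count in common if word in ENGLISH)
    if german == english:
        return "unknown"
    return "de" if german > english else "en"


german_text = ("Die Synthese der Verbindung ist in der Literatur beschrieben "
               "und das Produkt wurde mit einer Ausbeute von 80 % isoliert.")
english_text = ("The synthesis of the compound is described in the literature "
                "and the product was isolated with a yield of 80%.")

print(guess_language(german_text))
print(guess_language(english_text))
```

On a full-length paper the difference in frequencies is far larger than in these one-sentence samples, which is why the approach is so tractable.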

So that’s just a start. I can probably get to 50 fairly easily but I’d love to have ideas from…

…YOU

[The title may or may not allude to http://en.wikipedia.org/wiki/101_Uses_for_a_Dead_Cat ]

# Latest Article Alert from BMC Cardiovascular Disorders

The following new articles have just been published in BMC Cardiovascular Disorders

Research article
Association between serum uric acid levels and

# Content Mining Myths 1: “It’s too hard for me to do”; no it’s easy

One of the many myths about content mining is that it’s difficult and only experts can do it.

Quite the opposite – with the right tools anyone can do it. And in fact most of you do content-mining every day…

• When you type a phrase into a search engine (Google, Bing)  you are using the mined content of the web. You phrase your question to try to get the most precise, most relevant answers. Agreed, it’s not easy to WRITE a search engine, but it is easy to use one. If we know what questions you want to ask the scientific literature then we can work out how to build the engine.
• When you use software to examine photographs it can pick out faces. Again it’s not easy to write such software but it’s easy to use it. And that’s what we are doing for chemistry – recognising compounds and reactions in pictures. We’ll present this at the upcoming American Chemical Society meeting in Dallas next month so if you are there you’ll get an idea. It’s only 3 months old but we’ve come a long way.
• When you search your mail for a name you are mining the content. Again it’s easy to do.

Because content-mining in science has been held back by restrictive practices there are lots of valuable tools waiting to be applied. That’s what we are doing. We expect progress to be rapid. Obviously we’ll appreciate direct help, but we’ll also appreciate general interest.

What do you want to be able to do? What FACTs do you want to extract (or for us to extract and publish)? It won’t all be possible, but a huge amount will be.

And when we have tens of thousands of scientists mining the literature and making the results public there will be a huge acceleration.