The plan to mine the world’s research papers

“Carl Malamud is on a crusade to liberate information locked up behind paywalls — and his campaigns have scored many victories. He has spent decades publishing copyrighted legal documents, from building codes to court records, and then arguing that such texts represent public-domain law that ought to be available to any citizen online. Sometimes, he has won those arguments in court. Now, the 60-year-old American technologist is turning his sights on a new objective: freeing paywalled scientific literature. And he thinks he has a legal way to do it.

Over the past year, Malamud has — without asking publishers — teamed up with Indian researchers to build a gigantic store of text and images extracted from 73 million journal articles dating from 1847 up to the present day. The cache, which is still being created, will be kept on a 576-terabyte storage facility at Jawaharlal Nehru University (JNU) in New Delhi. “This is not every journal article ever written, but it’s a lot,” Malamud says. It’s comparable to the size of the core collection in the Web of Science database, for instance. Malamud and his JNU collaborator, bioinformatician Andrew Lynn, call their facility the JNU data depot.

No one will be allowed to read or download work from the repository, because that would breach publishers’ copyright. Instead, Malamud envisages, researchers could crawl over its text and data with computer software, scanning through the world’s scientific literature to pull out insights without actually reading the text….”

Peter Suber: The largest obstacles to open access are unfamiliarity and misunderstanding of open access itself

“I’ve already complained about the slowness of progress. So I can’t pretend to be patient. Nevertheless, we need patience to avoid mistaking slow progress for lack of progress, and I’m sorry to see some friends and allies make this mistake. We need impatience to accelerate progress, and patience to put slow progress in perspective. The rate of OA growth is fast relative to the obstacles, and slow relative to the opportunities.”

India’s Open Government Data Platform Is Helping Data Scientists Kick-start Their ML Journey

“The NDA government has come into its new term with a renewed gusto towards analytics in the public sector. Recognising the disruptive effect that the upcoming AI wave will have on citizens’ day-to-day activities, the government has put it in the spotlight.

One of the biggest needs for a healthy analytics ecosystem in any given environment is data. Identifying the data-hungry nature of the new data science and analytics startups in India, the government initiated the Open Government Data Platform at data.gov.in….

This move allows data scientists and machine learning engineers alike to harness one of the biggest collections of datasets available to the public….”

Should India adopt Plan S to realise Open Access to Public-funded Scientific Research?

“Timely and affordable access to scientific research remains a problem in this digital day and age. Around three decades ago, the radical response that emerged was making public-funded scientific research “open access”, i.e. publishing it on the Web without any legal, technical or financial barriers to access and use such research. Several Indian public research institutions also adopted open access mandates and built self-archiving digital tools; however, the efforts haven’t yielded much. Most countries, including India, continue to struggle with implementing open access. The latest international initiative (created in Europe) to remedy this problem is Plan S. Plan S has been positioned as a strategy to implement immediate open access to scientific publications from 2021 – which India is considering adopting. This article unpacks the disorderly growth of open access in India, and discusses the gap between the Plan’s vision and current Indian scenario in some respects….”

Influence of open educational resources on educational practices in the Global South | Nature Human Behaviour

“Open educational resources enable the effective use and sharing of knowledge with those who have been denied an education due to economic or social circumstances. Sarita Kumar outlines how open educational resources can benefit education systems across the Global South by opening up an entire generation to new ideas, technologies and advancements….”

Making the case for a Public Library of India – Bangalore International Centre

“Can India lead a global revolution in access to knowledge? In this talk, Carl Malamud will discuss some efforts in India to take some small initial steps to change how we access information. He will discuss public interest litigation in the Hon’ble High Court of Delhi with two co-petitioners to make all Indian standards available.

In Bengaluru, the Indian Academy of Sciences has embarked on an ambitious program to digitize scientific literature, a program which will soon expand to other kinds of institutions in Chennai, Mangalore, and other locations, a program driven by a volunteer group known as the Servants of Knowledge. And, in Delhi, 750 terabytes of disk is spinning at JNU and IIT Delhi, the beginnings of a research facility for big data and text mining as well as a distribution depot for moving content throughout India. Carl will explain how these components are part of his vision for what might become a Public Library of India, making available the vast treasures of knowledge of India to all….”

Flaws in Academic Publishing Perpetuate a Form of Neo-Colonialism

“Some publications charge up to $3,900 (Rs 2.7 lakh) as APCs, which leaves researchers from lower to middle-income countries such as India much poorer. And if academic publication is skewed in favour of high-income countries, science becomes skewed in favour of them.

Explaining real-world phenomena objectively has always been touted as the “white man’s burden” and has been the backbone of the colonising mission. Often only researchers and academics from certain privileged pockets have the resources to conduct and publish cutting-edge research. After all, they enjoy superior infrastructure and funding opportunities.

This disparity is exacerbated when they have sufficient resources to publish their work, often allowing knowledge to be created by only a certain kind of individual. Further, their blinkers and biases may continue to play a role in what they propose is a universal phenomenon – a form of neo-colonialism. Therefore, making science open access from both the production and the consumption perspectives is essential to make knowledge more democratic….”

Geographic trends in attitudes to open access | Research Information

“In the OA report, when asked whether authors had ever published in an OA journal, the majority of researchers from each country responded affirmatively (B, 68% of 1,133 respondents; I, 57% of 213; J, 59% of 708; UK 60% of 111; US, 51% of 419), except for China (34% of 2,085) and South Korea (44% of 409; roughly equal, yes versus no). Overall, across all survey respondents, with Yes at 45% and No at 35%, OA advocates may feel comfortable that the pendulum is swinging in the right direction. However, there are some striking differences in the geographic profiles of whether or not an author chooses to publish in an OA journal, with an overall 9% of responding authors indicating that they don’t know what OA publishing is.

For example, in response to why respondents chose to publish in an OA journal, more than 60% of authors in almost all geographic areas responded “I wanted my paper to be read by a larger audience” (B, 60% of 766; C, 69% of 710; I, 64% of 121; J, 64% of 415; UK, 63% of 67; US, 60% of 215). In South Korea, however, only 37% of 181 authors responded in such a manner; instead, 71% of 181 authors indicated that “I chose the journal that was the best fit for my paper and it happened to be OA”. This was in striking contrast to authors in the UK, for which the “best fit being OA” response was only indicated by 31% of 67 authors. Notably, when authors in the UK who had “never” published in an OA journal were asked why, 65% (of 34) said “I chose the journal that was the best fit for my paper and it happened to be a subscription journal”. …”

Plan S and the Global South – What do countries in the Global South stand to gain from signing up to Europe’s open access strategy? | Impact of Social Sciences

“Plan S raises challenging questions for the Global South. Even if Plan S fails to achieve its objectives the growing determination in Europe to trigger a “global flip” to open access suggests developing countries will have to develop an alternative strategy. In this post Richard Poynder asks: what might that strategy be?…”