Data mining: why the EU’s proposed copyright measures get it wrong – An opinion by Prof. Kretschmer and Dr. Margoni published in The Conversation | CREATe

This is an interesting time for EU copyright law. In order to offer a comprehensive overview of a complex process, CREATe has collected various resources available here, here and here. In short, within the EU copyright reform package, which is intended to “modernise” EU copyright law, the Proposal for a Directive on copyright in the Digital Single Market stands out for its numerous provisions. Some of those have been the object of a great deal of attention from academics, lawyers and citizens in general, as they will significantly change the current EU copyright framework, and not necessarily in the right direction. In an opinion published today in The Conversation (here), CREATe’s Martin Kretschmer and Thomas Margoni discuss one of these provisions, dedicated to Text and Data Mining (TDM).

 

An interview with the co-founder of Iris.ai – the world’s first Artificial Intelligence science assistant | The Saint

“Have you ever spent hours sifting through journal papers? Ever got frustrated at your inability to find relevant research? Ever wished that there was an easier way to filter the seemingly endless stream of information on the web? The team at Iris.ai certainly did, which is why they have created an AI-powered science assistant to help anyone that wants to find related papers for an original research question. The software – Iris.ai – can be used to build a precise reading list of research documents, and the company claims that it can solve your research problems 78% faster (without compromising quality) than if you were carrying out the tasks manually. The concept for Iris.ai was first established three years ago at NASA Ames Research Centre. The team was taking part in a summer programme run by Singularity University (SU) when they were set the task of creating a concept that would positively affect the lives of a billion people. This exercise got the team thinking about the current state of scientific research, and more specifically about the restrictions created by paywalls, and the inability of human intelligence alone to process the three thousand or so research papers that are published around the world every single day….

When asked about challenges that the team have experienced so far, Ms Ritola was quick to point out the issue of paywalls. She explained that the Iris.ai system is connected to about 130 million open access papers – almost all those available to the public – but that many useful documents are still hidden behind systems that require users to pay for access.

However, rather than just accepting this situation as it is, the Iris.ai team have devised a scheme to solve the problem – Project Aiur – an initiative that aims to revolutionise the current workings of the research world.

“What we’re trying to do is to build a community, which is not owned by us, but by a community of researchers, a community of coders, anyone who wants to contribute to building a new economic model for science that works around a community-governed, AI-based Knowledge Validation Engine and an open, validated repository of science. Over time, the goal is to give access to all the research articles that are in this world”, Ms Ritola told The Saint.

This is not a straightforward task, as the Iris.ai team are faced with the challenge of encouraging researchers to publish and carry out their investigations using Aiur rather than the current systems – something that will take a fair amount of research and incentivisation. The team have started a pledge, offering students and researchers the chance to be an “advocate for validated, reproducible, open-access scientific research.” At the time of the interview, Ms Ritola informed The Saint that more than 5,000 people had signed the pledge….”

ScienceFair

“ScienceFair uses blazing-fast search and a clean user interface to help you find and filter the literature you need. No hidden menus or complex settings….Instead of static PDFs, ScienceFair uses the eLife Lens reader for a rich reading experience that helps you navigate and interpret scientific papers better….Search your own library and any number of distributed literature collections simultaneously – the results are seamlessly merged as they stream in from the peer-to-peer network….Results are automatically data-mined in real-time, giving you a live updating dashboard you can use to analyse the literature and refine your discovery process.”

Releasing 1.8 million open access publications from publisher systems for text and data mining

Text and data mining offers an opportunity to improve the way we access and analyse the outputs of academic research. But the technical infrastructure of the current scholarly communication system is not yet ready to support TDM to its full potential, even for open access outputs. To address this problem, Petr Knoth, Nancy Pontika and Lucas Anastasiou have developed the CORE Publisher Connector, a toolkit service designed to assist text miners in accessing content through a single machine interface. The Connector aims to solve the heterogeneity among publisher APIs and assist text miners with data collection, provide a centralised point of access to all openly available scientific publications, and provide a high-performance, constantly updated access interface.
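The idea of hiding heterogeneous publisher APIs behind a single interface can be illustrated with a minimal sketch. This is not the CORE Publisher Connector's actual code or API: the publisher names, the record layouts, and the `Article`, `ADAPTERS` and `normalise` names are all hypothetical, chosen only to show how per-publisher adapters could map differing records onto one common shape.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Article:
    """Common record shape handed to the text miner (hypothetical)."""
    doi: str
    title: str
    full_text_url: str


# Hypothetical adapters: each maps one publisher's record layout to Article.
def from_publisher_a(raw: dict) -> Article:
    # Assumed layout: flat keys "doi", "title", "pdf".
    return Article(doi=raw["doi"], title=raw["title"], full_text_url=raw["pdf"])


def from_publisher_b(raw: dict) -> Article:
    # Assumed layout: nested "metadata" and "links" objects.
    meta = raw["metadata"]
    return Article(
        doi=meta["DOI"],
        title=meta["articleTitle"],
        full_text_url=raw["links"]["fulltext"],
    )


# Registry that selects the adapter for each (hypothetical) publisher.
ADAPTERS: Dict[str, Callable[[dict], Article]] = {
    "publisher_a": from_publisher_a,
    "publisher_b": from_publisher_b,
}


def normalise(records: Iterable[Tuple[str, dict]]) -> List[Article]:
    """Convert (publisher, raw_record) pairs into uniform Article objects."""
    return [ADAPTERS[publisher](raw) for publisher, raw in records]
```

With this arrangement, supporting a further publisher would mean adding one adapter function and one registry entry, while code that consumes `Article` objects stays unchanged.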

Extracting research evidence from publications | EMBL-EBI Train online

“Bioinformaticians are routinely handling big data, including DNA, RNA, and protein sequence information. It’s time to treat biomedical literature as a dataset and extract valuable facts hidden in the millions of scientific papers. This webinar demonstrates how to access text-mined literature evidence using the Europe PMC Annotations API. We highlight several use cases, including linking diseases with potential treatment targets, or identifying which protein structures are cited along with a gene mutation.

This webinar took place on 5 March 2018 and is for wet-lab researchers and bioinformaticians who want to access scientific literature and data programmatically. Some prior knowledge of programmatic access and common programming languages is recommended.

The webinar covers: available data (annotation types and sources) (1:50); API operations and parameters and web service outputs (8:08); use case examples (16:56); how to get help (24:16).

You can download the slides from this webinar here. You can learn more about Europe PMC in our Europe PMC: Quick tour and our previous webinar Europe PMC, programmatically.

For documentation, help and support visit the Europe PMC help pages or download the developer-friendly web service guide. For web-service-related questions you can get in touch via the Google group or contact the help desk (helpdesk [at] europepmc.org).”
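As a rough illustration of the programmatic access the webinar describes, the sketch below builds a request URL for article annotations and pulls the annotated terms out of a response. The base URL, the `articleIds`, `type` and `format` parameters, and the `extId`, `annotations` and `exact` fields are assumptions about the Annotations API and should be checked against the Europe PMC web service guide. The article ID is a placeholder, the sample payload is hand-written rather than real API output, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlencode

# Assumed endpoint of the Annotations API; verify against the official guide.
BASE_URL = "https://www.ebi.ac.uk/europepmc/annotations_api/annotationsByArticleIds"


def build_annotations_url(article_ids: Iterable[str], annotation_type: str) -> str:
    """Build a request URL for the given article IDs, e.g. "PMC:0000000"."""
    query = urlencode({
        "articleIds": ",".join(article_ids),  # assumed: comma-separated IDs
        "type": annotation_type,              # assumed: e.g. "Gene_Proteins"
        "format": "JSON",
    })
    return f"{BASE_URL}?{query}"


def annotated_terms(payload: List[dict]) -> Dict[str, List[str]]:
    """Map each article ID to the exact text spans that were annotated."""
    return {
        article["extId"]: [a["exact"] for a in article.get("annotations", [])]
        for article in payload
    }


# Hand-written sample in the assumed response shape, not real API output.
SAMPLE = [
    {"extId": "0000000",
     "annotations": [{"exact": "BRCA1", "type": "Gene_Proteins"}]},
]

if __name__ == "__main__":
    print(build_annotations_url(["PMC:0000000"], "Gene_Proteins"))
    print(annotated_terms(SAMPLE))
```

The URL can then be fetched with any HTTP client; keeping URL construction and response parsing in separate functions makes both easy to check without network access.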

MyScienceWork: The Global Scientific Platform

“Founded in 2010 by Virginie Simon, a biotech engineer and PhD in nanotechnology, and Tristan Davaille, a financial engineer with a degree in economics, MyScienceWork serves the international scientific community and promotes easy access to scientific publications, unrestricted diffusion of knowledge and open science. Our comprehensive database includes more than 70 million scientific publications and 12 million patents.

We host a community of professional scientists and science enthusiasts from around the world who use MyScienceWork’s open network to deposit and discover scientific publications of all disciplines. Join the community! 

For Research Institutions, Scientific Publishers & private-sector R&D companies, MyScienceWork provides a suite of data-driven solutions. Learn more about our products.

Our vision for the near future is to empower research institutions and industries with more intelligence related to research fields by aggregating all available data related to research results to accelerate findings, optimize funding and research efforts, improve transparency, bridge the knowledge gap between academia and industry and avoid duplicate research.

MyScienceWork believes that making science more accessible will foster data sharing amongst science organizations….”

Open Data Grant Winners to Conduct Sentiment Analysis of Thousands of French Revolution Pamphlets | Newberry

“Today, you can see trends on Twitter at a glance and get immediate insights into the public discourse surrounding current events. But how can we learn about trending topics and public opinion in centuries past? The recipients of the Newberry’s Open Data Grant intend to find out. The Open Data Grant helps support innovative scholarship that applies technologies such as digital mapping, text mining, and data visualization to digitized primary sources. Joseph Harder, a chemist and data scientist, and Mimi Zhou, an expert in digital humanities studying early French literature, will use the award to complete a sentiment analysis of the Newberry’s recently digitized collection of more than 30,000 French Revolution pamphlets….”

About OpenAIRE-Advance

“OpenAIRE-Advance continues the mission of OpenAIRE to support the Open Access/Open Data mandates in Europe. By sustaining the current successful infrastructure, comprised of a human network and robust technical services, it consolidates its achievements while working to shift the momentum among its communities to Open Science, aiming to be a trusted e-Infrastructure within the realms of the European Open Science Cloud.

In this next phase, OpenAIRE-Advance strives to empower its National Open Access Desks (NOADs) so they become a pivotal part within their own national data infrastructures, positioning OA and open science onto national agendas. The capacity building activities bring together experts on topical task groups in thematic areas (open policies, RDM, legal issues, TDM), promoting a train-the-trainer approach, strengthening and expanding the pan-European Helpdesk with support and training toolkits, training resources and workshops. It examines key elements of scholarly communication, i.e., co-operative OA publishing and next generation repositories, to develop essential building blocks of the scholarly commons.

On the technical level OpenAIRE-Advance focuses on the operation and maintenance of the OpenAIRE technical TRL8/9 services, and radically improves the OpenAIRE services on offer by: a) optimizing their performance and scalability, b) refining their functionality based on end-user feedback, c) repackaging them into products, taking a professional marketing approach with well-defined KPIs, d) consolidating the range of services/products into a common e-Infra catalogue to enable a wider uptake.

OpenAIRE-Advance steps up its outreach activities with concrete pilots with three major RIs, citizen science initiatives, and innovators via a rigorous Open Innovation programme. Finally, via its partnership with COAR, OpenAIRE-Advance consolidates OpenAIRE’s global role, extending its collaborations with Latin America, US, Japan, Canada, and Africa….”