An interview with the co-founder of Iris.ai – the world’s first Artificial Intelligence science assistant | The Saint

“Have you ever spent hours sifting through journal papers? Ever got frustrated at your inability to find relevant research? Ever wished that there was an easier way to filter the seemingly endless stream of information on the web? The team at Iris.ai certainly did, which is why they have created an AI-powered science assistant to help anyone that wants to find related papers for an original research question. The software – Iris.ai – can be used to build a precise reading list of research documents, and the company claims that it can solve your research problems 78% faster (without compromising quality) than if you were carrying out the tasks manually. The concept for Iris.ai was first established three years ago at NASA Ames Research Centre. The team was taking part in a summer programme run by Singularity University (SU) when they were set the task of creating a concept that would positively affect the lives of a billion people. This exercise got the team thinking about the current state of scientific research, and more specifically about the restrictions created by paywalls, and the inability of human intelligence alone to process the three thousand or so research papers that are published around the world every single day….

When asked about challenges that the team have experienced so far, Ms Ritola was quick to point out the issue of paywalls. She explained that the Iris.ai system is connected to about 130 million open access papers – almost all those available to the public – but that many useful documents are still hidden behind systems that require users to pay for access.

However, rather than just accepting this situation as it is, the Iris.ai team have devised a scheme to solve the problem – Project Aiur – an initiative that aims to revolutionise the current workings of the research world.

“What we’re trying to do is to build a community, which is not owned by us, but by a community of researchers, a community of coders, anyone who wants to contribute to building a new economic model for science that works around a community governed AI-based Knowledge Validation Engine and an open, validated repository of science. Over time, the goal is to give access to all the research articles that are in this world”, Ms Ritola told The Saint.

This is not a straightforward task, as the Iris.ai team are faced with the challenge of encouraging researchers to publish and carry out their investigations using Aiur rather than the current systems – something that will take a fair amount of research and incentivisation. The team have started a pledge, offering students and researchers the chance to be an “advocate for validated, reproducible, open-access scientific research.” At the time of the interview, Ms Ritola informed The Saint that more than 5,000 people had signed the pledge….”

Project AIUR by Iris.ai: Democratize Science through blockchain-enabled disintermediation

“There are a number of problems in the world of science today hampering global progress. In an almost monopolized industry with terrible incentive misalignments, a radical change is needed. The only way to change this is with a grassroots movement – of researchers and scientists, librarians, scientific societies, R&D departments, universities, students, and innovators – coming together. We need to remove the powerful intermediaries, create new incentive structures, build commonly owned tools to validate all research and build a common Validated Repository of human knowledge. A combination of blockchain and artificial intelligence provides the technology framework, but as with all research, the scientist herself needs to be in the center. That is what we are proposing with Project Aiur, and we hope you will join us….

The outlined core software tool of the community will be the Knowledge Validation Engine (KVE). It will be a fully-fledged technical platform able to pinpoint:

- the building blocks of a scientific text;
- what the reader needs to know to be able to understand the text;
- what are the text’s factual sources; and,
- what is the reproducibility level of the different building blocks.

The platform will take a scientific document in the form of a scientific paper or technical report as an input, and it will provide an analytical report presenting:

- the knowledge architecture of the document;
- the hypotheses tree supporting the presented document’s hypothesis;
- the support level found for each of the hypotheses on the hypotheses tree; and,
- their respective reproducibility.

All of this will be based on the knowledge database of scientific documents accessible to the system at any given point in time (knowledge in an Open Access environment). …”
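To make the shape of that analytical report more concrete, here is a minimal sketch in Python of how such output could be represented. Every name in it (HypothesisNode, ValidationReport, the scoring fields) is hypothetical; the Project Aiur material excerpted above describes the KVE’s behaviour, not its implementation.

```python
# Illustrative sketch only: one possible shape for the KVE's analytical report.
# All names and fields are hypothetical, not taken from Project Aiur.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class HypothesisNode:
    """One node in the hypotheses tree supporting the document's main hypothesis."""
    statement: str
    support_level: float        # e.g. 0.0-1.0, support found in the knowledge database
    reproducibility: float      # e.g. 0.0-1.0, reproducibility of this building block
    sources: List[str] = field(default_factory=list)        # factual sources relied on
    children: List["HypothesisNode"] = field(default_factory=list)

@dataclass
class ValidationReport:
    """Analytical report produced for one input paper or technical report."""
    document_id: str
    building_blocks: List[str]       # knowledge architecture of the document
    prerequisites: List[str]         # what the reader needs to know to understand it
    hypothesis_tree: HypothesisNode  # root = the document's main hypothesis

def summarise(report: ValidationReport) -> Dict[str, float]:
    """Roll the per-node scores up into a single document-level summary."""
    nodes: List[HypothesisNode] = []
    stack = [report.hypothesis_tree]
    while stack:
        node = stack.pop()
        nodes.append(node)
        stack.extend(node.children)
    return {
        "mean_support": sum(n.support_level for n in nodes) / len(nodes),
        "mean_reproducibility": sum(n.reproducibility for n in nodes) / len(nodes),
    }
```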

Taylor & Francis is bringing AI to academic publishing – but it isn’t easy | The Bookseller

“Leading academic publisher Taylor & Francis is developing natural language processing technology to help machines understand its books and journals, with the aim to enrich customers’ online experiences and create new tools to make the company more efficient.

The first step extracts topics and concepts from text in any scholarly subject domain, and shows recommendations of additional content to online users based on what they are already reading, allowing them to discover new research more easily. Further steps will lead to semantic content enrichment for more improvements in areas such as relatedness, better searches, and finding peer-reviewers and specialists on particular subjects….”
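The recommendation step described here is, in outline, classic content-based filtering: represent each document by the topics and terms it contains, then surface the nearest neighbours of whatever the reader currently has open. The sketch below illustrates that general technique with TF-IDF vectors and cosine similarity from scikit-learn; it is not Taylor & Francis’s system, whose internals are not public.

```python
# Minimal content-based recommendation sketch (not Taylor & Francis's system):
# represent each document as a TF-IDF vector and recommend the nearest neighbours
# of whatever the user is currently reading.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Deep learning methods for protein structure prediction",
    "A survey of natural language processing in scholarly publishing",
    "Protein folding simulations on commodity hardware",
    "Topic modelling of open access journal articles",
]

vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(documents)

def recommend(current_index: int, top_k: int = 2) -> list[int]:
    """Return the indices of the top_k documents most similar to the one being read."""
    similarities = cosine_similarity(doc_vectors[current_index], doc_vectors).ravel()
    similarities[current_index] = -1.0   # never recommend the current document itself
    return similarities.argsort()[::-1][:top_k].tolist()

print(recommend(0))   # documents most related to the protein-structure paper
```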

AI experts call for support of STEM education, research and open data policies at House hearing | CIO Dive

“The industry and government have spent many years collecting data, and now there is finally a tool [AI] to derive insight from it. The panel [at a House Oversight and Government Reform committee hearing Wednesday] agreed that open data policies are needed so the industry can begin making use of it, but it won’t be an easy process. Most of the government’s data is still unstructured and needs to be organized in a meaningful way.”

Semantic Scholar – An academic search engine for scientific articles

“We’ve pulled over 40 million scientific papers from sources like PubMed, Nature, and ArXiv….Our AI analyzes research papers and pulls out authors, references, figures, and topics….We link all of this information together into a comprehensive picture of cutting-edge research….What if a cure for an intractable cancer is hidden within the results of thousands of clinical studies? We believe that in 20 years’ time, AI will be able to connect the dots between studies to identify hypotheses and suggest experiments that would otherwise be missed. That’s why we’re building Semantic Scholar and making it free and open to researchers everywhere.

Semantic Scholar is a project at the Allen Institute for Artificial Intelligence (AI2). AI2 was founded to conduct high-impact research and engineering in the field of artificial intelligence. We’re funded by Paul Allen, Microsoft co-founder, and led by Dr. Oren Etzioni, a world-renowned researcher and professor in the field of artificial intelligence….”
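Much of the structure Semantic Scholar extracts (authors, references, fields of study and so on) can be retrieved programmatically. The sketch below queries what is, at the time of writing, the public Graph API for a single paper; endpoint paths and field names may change, so treat it as an illustration rather than a reference.

```python
# Fetch the structured record Semantic Scholar has extracted for one paper.
# Endpoint and field names reflect the public Graph API at the time of writing
# (https://api.semanticscholar.org) and may change.
import requests

API = "https://api.semanticscholar.org/graph/v1/paper/"

def fetch_paper(paper_id: str) -> dict:
    """Return title, authors, topics and references for an arXiv/DOI/S2 paper id."""
    fields = "title,authors,fieldsOfStudy,referenceCount,references.title"
    response = requests.get(API + paper_id, params={"fields": fields}, timeout=30)
    response.raise_for_status()
    return response.json()

paper = fetch_paper("arXiv:1707.04207")   # the citation-importance paper discussed below
print(paper["title"])
print([author["name"] for author in paper["authors"]])
print(paper["referenceCount"], "references")
```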

An artificial future | Research Information

“One of the most exciting data projects we [Elsevier] are working on at the moment is with a UK-based charity, Findacure. We are helping the charity to find alternative treatment options for rare diseases such as Congenital Hyperinsulinism by offering our informatics expertise, and giving them access to published literature and curated data through our online tools, at no charge.

We are also supporting The Pistoia Alliance, a not-for-profit group that aims to lower barriers to collaboration within the pharmaceutical and life science industry. We have been working with its members to collaborate and develop approaches that can bring benefits to the entire industry. We recently donated our Unified Data Model to the Alliance, with the aim of publishing an open and freely available format for the storage and exchange of drug discovery data. I am still proud of the work I did with them back in 2009 on the SESL project (Semantic Enrichment of Scientific Literature), and my involvement continues as part of the special interest group in AI….”

[1707.04207] Incidental or influential? – Challenges in automatically detecting citation importance using publication full texts

“This work looks in depth at several studies that have attempted to automate the process of citation importance classification based on the publications’ full text. We analyse a range of features that have been previously used in this task. Our experimental results confirm that the number of in-text references is highly predictive of influence. Contrary to the work of Valenzuela et al., we find abstract similarity to be one of the most predictive features. Overall, we show that many of the features previously described in the literature are not particularly predictive. Consequently, we discuss challenges and potential improvements in the classification pipeline, provide a critical review of the performance of individual features and address the importance of constructing a large-scale gold-standard reference dataset.”
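The features the authors examine lend themselves to a straightforward supervised formulation. The sketch below is not the paper’s pipeline; it is a minimal illustration of the task, in which each citation is described by a small feature vector (here just the number of in-text references and abstract similarity, with invented values) and a binary incidental/influential label, fed to a standard classifier.

```python
# Minimal illustration of citation-importance classification (not the paper's pipeline):
# each cited work is described by hand-crafted features and labelled incidental (0)
# or influential (1); a logistic regression is then fit on those feature vectors.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical features per citation: [number of in-text references, abstract similarity]
X = np.array([
    [1, 0.10], [2, 0.15], [1, 0.05], [8, 0.62],
    [6, 0.55], [1, 0.20], [7, 0.70], [2, 0.12],
])
y = np.array([0, 0, 0, 1, 1, 0, 1, 0])   # 0 = incidental, 1 = influential

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```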

Science Beam – using computer vision to extract PDF data | Labs | eLife

“There’s a vast trove of science out there locked inside the PDF format. From preprints to peer-reviewed literature and historical research, millions of scientific manuscripts today can only be found in a print-era format that is effectively inaccessible to the web of interconnected online services and APIs that are increasingly becoming the digital scaffold of today’s research infrastructure….Extracting key information from PDF files isn’t trivial. …It would therefore certainly be useful to be able to extract all key data from manuscript PDFs and store it in a more accessible, more reusable format such as XML (of the publishing industry standard JATS variety or otherwise). This would allow for the flexible conversion of the original manuscript into different forms, from mobile-friendly layouts to enhanced views like eLife’s side-by-side view (through eLife Lens). It will also make the research mineable and API-accessible to any number of tools, services and applications. From advanced search tools to the contextual presentation of semantic tags based on users’ interests, and from cross-domain mash-ups showing correlations between different papers to novel applications like ScienceFair, a move away from PDF and toward a more open and flexible format like XML would unlock a multitude of use cases for the discovery and reuse of existing research….We are embarking on a project to build on these existing open-source tools, and to improve the accuracy of the XML output. One aim of the project is to combine some of the existing tools in a modular PDF-to-XML conversion pipeline that achieves a better overall conversion result compared to using individual tools on their own. In addition, we are experimenting with a different approach to the problem: using computer vision to identify key components of the scientific manuscript in PDF format….To this end, we will be collaborating with other publishers to collate a broad corpus of valid PDF/XML pairs to help train and test our neural networks….”
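Neither the existing extraction tools nor the computer-vision work are reproduced here, but the first step of such a pipeline is easy to picture. The sketch below, assuming the open-source PyMuPDF library, pulls text blocks out of a PDF page by page and wraps them in a bare-bones, JATS-like XML skeleton that downstream tools could then enrich and correct.

```python
# Minimal sketch of the first step of a PDF-to-XML pipeline (not eLife's Science Beam):
# extract text blocks with PyMuPDF and wrap them in a bare-bones, JATS-like skeleton.
import sys
import xml.etree.ElementTree as ET

import fitz  # PyMuPDF

def pdf_to_xml(pdf_path: str) -> bytes:
    article = ET.Element("article")
    body = ET.SubElement(article, "body")
    with fitz.open(pdf_path) as doc:
        for page in doc:
            sec = ET.SubElement(body, "sec", attrib={"id": f"page-{page.number + 1}"})
            # get_text("blocks") yields (x0, y0, x1, y1, text, block_no, block_type) tuples
            for *_, text, _, block_type in page.get_text("blocks"):
                if block_type == 0 and text.strip():   # 0 = text block, 1 = image block
                    p = ET.SubElement(sec, "p")
                    p.text = " ".join(text.split())
    return ET.tostring(article, encoding="utf-8")

if __name__ == "__main__":
    sys.stdout.buffer.write(pdf_to_xml(sys.argv[1]))
```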