Taylor & Francis is bringing AI to academic publishing – but it isn’t easy | The Bookseller

“Leading academic publisher Taylor & Francis is developing natural language processing technology to help machines understand its books and journals, with the aim to enrich customers’ online experiences and create new tools to make the company more efficient.

The first step extracts topics and concepts from text in any scholarly subject domain, and shows recommendations of additional content to online users based on what they are already reading, allowing them to discover new research more easily. Further steps will lead to semantic content enrichment for more improvements in areas such as relatedness, better searches, and finding peer-reviewers and specialists on particular subjects….”
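The extract-and-recommend step described in the excerpt can be illustrated with a minimal sketch: TF-IDF vectors plus cosine similarity over a toy corpus, in pure Python. This shows the general technique only, not Taylor & Francis's actual pipeline; the corpus and all function names here are invented for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute a TF-IDF vector (term -> weight) for each tokenized document."""
    n = len(docs)
    df = Counter()                      # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf}
        vectors.append(vec)
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(query_idx, vectors, top_n=1):
    """Rank the other documents by similarity to the one being read."""
    scores = [(cosine(vectors[query_idx], v), i)
              for i, v in enumerate(vectors) if i != query_idx]
    return [i for _, i in sorted(scores, reverse=True)[:top_n]]
```

With a toy corpus of three tokenized abstracts, two about cancer genetics and one about medieval history, `recommend(0, vectors)` returns the index of the other genetics document. A production system would of course layer concept extraction, controlled vocabularies, and behavioural signals on top of this kind of baseline.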

Extracting research evidence from publications | EMBL-EBI Train online

“Bioinformaticians are routinely handling big data, including DNA, RNA, and protein sequence information. It’s time to treat biomedical literature as a dataset and extract valuable facts hidden in the millions of scientific papers. This webinar demonstrates how to access text-mined literature evidence using Europe PMC Annotations API. We highlight several use cases, including linking diseases with potential treatment targets, or identifying which protein structures are cited along with a gene mutation.

This webinar took place on 5 March 2018 and is for wet-lab researchers and bioinformaticians who want to access scientific literature and data programmatically. Some prior knowledge of programmatic access and common programming languages is recommended.

The webinar covers:

  • Available data (annotation types and sources) (1:50)
  • API operations, parameters and web service outputs (8:08)
  • Use case examples (16:56)
  • How to get help (24:16)

You can download the slides from this webinar here. You can learn more about Europe PMC in our Europe PMC: Quick tour and our previous webinar Europe PMC, programmatically.

For documentation, help and support visit the Europe PMC help pages or download the developer-friendly web service guide. For web service related questions you can get in touch via the Google group or contact the help desk at helpdesk [at] europepmc.org.”
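As a rough sketch of the programmatic access the webinar describes, the snippet below builds a request URL for the Annotations API's `annotationsByArticleIds` operation and pulls the text-mined terms out of a response. The parameter names (`articleIds`, `type`, `format`) and the JSON fields follow my reading of the Europe PMC documentation; verify them against the current API reference before use, and note the sample response in the usage note is fabricated for illustration.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://www.ebi.ac.uk/europepmc/annotations_api/annotationsByArticleIds"

def build_query(article_ids, annotation_type):
    """Build an Annotations API request URL.

    article_ids: list like ["MED:28585529"] (source prefix + external ID).
    annotation_type: e.g. "Gene_Proteins" or "Diseases".
    """
    params = {
        "articleIds": ",".join(article_ids),
        "type": annotation_type,
        "format": "JSON",
    }
    return BASE + "?" + urllib.parse.urlencode(params)

def fetch_annotations(url):
    """Perform the HTTP GET (requires network access)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def extract_terms(response):
    """Collect the matched text ("exact") of each annotation per article."""
    terms = []
    for article in response:
        for ann in article.get("annotations", []):
            terms.append(ann.get("exact"))
    return terms
```

For example, `extract_terms` applied to a response shaped like `[{"source": "MED", "extId": "28585529", "annotations": [{"exact": "BRCA1", "type": "Gene_Proteins"}]}]` yields `["BRCA1"]`; chaining gene annotations with disease annotations for the same articles is the kind of use case the webinar walks through.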

Knowtro

“Knowtro has:

  • Identified elements of knowledge shared across research disciplines and mapped the elements critical to the successful transfer of knowledge from document to user.
  • Built a technology-facilitated process whereby complex analyses can be distilled for ease of discovery and use. Findings from published research papers in only the top academic journals are added to the platform daily.
  • Implemented a search results display feature that (1) uses consistent, logical expressions about research findings rather than happenstance excerpts of text, and (2) prioritizes results not by popularity, but according to validity and usefulness (e.g., research design). …”

Release ‘open’ data from their PDF prisons using tabulizer | R-bloggers

“As a political scientist who regularly encounters so-called “open data” in PDFs, this problem is particularly irritating. PDFs may have “portable” in their name, making them display consistently on various platforms, but that portability means any information contained in a PDF is irritatingly difficult to extract computationally.”

Scraping Scientific Web Repositories: Challenges and Solutions for Automated Content Extraction

Abstract: “Aside from improving the visibility and accessibility of scientific publications, many scientific Web repositories also assess researchers’ quantitative and qualitative publication performance, e.g., by displaying metrics such as the h-index. These metrics have become important for research institutions and other stakeholders to support impactful decision making processes such as hiring or funding decisions. However, scientific Web repositories typically offer only simple performance metrics and limited analysis options. Moreover, the data and algorithms to compute performance metrics are usually not published. Hence, it is not transparent or verifiable which publications the systems include in the computation and how the systems rank the results. Many researchers are interested in accessing the underlying scientometric raw data to increase the transparency of these systems. In this paper, we discuss the challenges and present strategies to programmatically access such data in scientific Web repositories. We demonstrate the strategies as part of an open source tool (MIT license) that allows research performance comparisons based on Google Scholar data. We would like to emphasize that the scraper included in the tool should only be used if consent was given by the operator of a repository. In our experience, consent is often given if the research goals are clearly explained and the project is of a non-commercial nature.”
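The abstract's point is that repositories display metrics such as the h-index without publishing the underlying data or algorithms. Given access to the raw citation counts, the metric itself is a short computation; here is a standard implementation of the h-index definition (the largest h such that at least h publications have at least h citations each), independent of any particular repository or of the paper's tool.

```python
def h_index(citations):
    """Return the h-index for a list of per-publication citation counts."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:   # the rank-th most-cited paper has >= rank citations
            h = rank
        else:
            break
    return h
```

For instance, citation counts `[10, 8, 5, 4, 3]` give an h-index of 4: four papers have at least four citations each, but not five with at least five. Transparency disputes of the kind the paper describes usually come down not to this formula but to which publications a repository includes in the list.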