ScholarPhi: A Novel Interface for Reading Scientific Papers | UC Berkeley School of Information

“To help scientists deal with the increasing volume of published scientific literature, a research team at the I School is designing ScholarPhi, an augmented reading interface that makes scientific papers more understandable and contextually rich.

The project is led by UC Berkeley School of Information Professor Marti Hearst, and includes UC Berkeley postdoctoral fellows Andrew Head and Dongyeop Kang, and collaborators Raymond Folk, Kyle Lo, Sam Sjonsberg, and Dan Weld from the Allen Institute for AI (AI2) and the University of Washington. It is funded in part by the Alfred P. Sloan Foundation and by AI2. 

ScholarPhi broadens access to scientific literature by developing a new document reader user interface and natural language analysis algorithms for context-relevant explanations of technical terms and notation….”

Semantic Scholar | Semantic Reader

“Semantic Reader Beta is an augmented reader with the potential to revolutionize scientific reading by making it more accessible and richly contextual.

Observations of scientists reading technical papers showed that readers frequently page back and forth looking for the definitions of terms and mathematical symbols as well as for the details of cited papers. This need to jump around through the paper breaks the flow of paper comprehension.

Semantic Reader provides this information directly in context by dimming unrelated text and providing details in tooltips, and soon will also provide corresponding term definitions. It uses artificial intelligence to understand a document’s structure. Usability studies show readers answered questions requiring deep understanding of paper concepts significantly more quickly with ScholarPhi than with a baseline PDF reader; furthermore, they viewed much less of the paper.

Based on the ScholarPhi research from the Semantic Scholar team at AI2, UC Berkeley and the University of Washington, and supported in part by the Alfred P. Sloan Foundation, the Semantic Reader is now available in beta for a select group of arXiv papers, with plans to add additional features and expand coverage soon….”

We’re building a replacement for Microsoft Academic Graph – Our Research blog

“This week Microsoft Research announced that their free bibliographic database–Microsoft Academic Graph, or MAG for short–is being discontinued. This is sad news, because MAG was a great source of open scholcomm metadata, including citation counts and author affiliations. MAG data is used in Unsub, as well as several other well-known open science tools.

Thankfully, we’ve got a contingency plan for this situation, which we’ve been working on for a while now. We’re building a successor to MAG. Like all our projects, it’ll be open-source and the data will be free to everyone via data dump and API. It will launch at the end of the year, when MAG is scheduled to disappear.

It’s important to note that this new service will not be a perfect replacement, especially right when it launches. MAG has excellent support for conference proceedings, for example; we won’t match that for a while, if ever.  Instead, we’ll be focusing on supporting the most important use-cases, and building out from there. If you use MAG today, we’d love to hear what your key use-cases are, so we can prioritize accordingly. Here’s where you can tell us.

We plan to have this launched by the time MAG disappears at year’s end. That’s an aggressive schedule, but we’ve built and launched other large projects (Unpaywall, Unsub) in less time. We’ve also got a good head start, since we’ve been working toward this as an internal project for a while now….”

Read, Hot & Digitized: Visualizing Wikipedia’s Gender Gap | TexLibris

“However, Wikipedia has a long-standing problem of gender imbalance both in terms of article content and editor demographics. Only 18% of content across Wikimedia platforms are about women. The gaps on content covering non-binary and transgender individuals are even starker: less than 1% of editors identify as trans, and less than 1% of biographies cover trans or nonbinary individuals. When gender is combined with other factors, such as race, nationality, or ethnicity, the numbers get even lower. This gender inequity has long been covered in the scholarly literature via editor surveys and analysis of article content (Hill and Shaw, 2013; Graells-Garrido, Lalmas, and Menczer, 2015; Bear and Collier, 2016; Wagner, Graells-Garrido, Garcia, and Menczer, 2016; Ford and Wajcman, 2017). To visualize these inequalities in nearly real time, the Humaniki tool was developed….”

Tilting the balance back towards libraries | Research Information

Jason Priem tells of his hopes for a ‘long-overdue’ change in academic publishing.

“This presents a compelling opportunity for us as OA advocates: by helping libraries quantify the alternatives to toll-access publishing, we can empower librarians to cancel multi-million dollar big deals. This in turn will begin to turn off the faucet of money flowing from universities to toll-access publishing houses. In short: by helping libraries cancel big deals, we can make toll-access publishing less profitable, and accelerate the transition toward universal OA.”

Day-to-day discovery of preprint–publication links | SpringerLink

Abstract:  Preprints promote the open and fast communication of non-peer reviewed work. Once a preprint is published in a peer-reviewed venue, the preprint server updates its web page: a prominent hyperlink leading to the newly published work is added. Linking preprints to publications is of utmost importance as it provides readers with the latest version of a now certified work. Yet leading preprint servers fail to identify all existing preprint–publication links. This limitation calls for a more thorough approach to this critical information retrieval task: overlooking published evidence translates into partial and even inaccurate systematic reviews on health-related issues, for instance. We designed an algorithm leveraging the Crossref public and free source of bibliographic metadata to comb the literature for preprint–publication links. We tested it on a reference preprint set identified and curated for a living systematic review on interventions for preventing and treating COVID-19 performed by international collaboration: the COVID-NMA initiative. The reference set comprised 343 preprints, 121 of which appeared as a publication in a peer-reviewed journal. While the preprint servers identified 39.7% of the preprint–publication links, our linker identified 90.9% of the expected links with no clues taken from the preprint servers. The accuracy of the proposed linker is 91.5% on this reference set, with 90.9% sensitivity and 91.9% specificity. This is a 16.26% increase in accuracy compared to that of preprint servers. We release this software as supplementary material to foster its integration into preprint servers’ workflows and enhance a daily preprint–publication chase that is useful to all readers, including systematic reviewers. This preprint–publication linker currently provides day-to-day updates to the biomedical experts of the COVID-NMA initiative.
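
For readers who want to see how the reported rates fit together, here is a minimal sketch of the arithmetic. The abstract gives only the set sizes (343 preprints, 121 of them published) and the percentages. The counts of correctly linked and correctly unlinked preprints (110 and 204) are an assumption: they are not stated in the abstract, but they are the only whole numbers that round to the reported 90.9% sensitivity and 91.9% specificity.

```python
# Figures stated in the abstract.
TOTAL_PREPRINTS = 343
PUBLISHED = 121                             # preprints with a peer-reviewed version
UNPUBLISHED = TOTAL_PREPRINTS - PUBLISHED   # 222 preprints without one

# Assumption (not in the abstract): counts inferred from the reported rates.
TRUE_POSITIVES = 110   # published preprints the linker matched correctly
TRUE_NEGATIVES = 204   # unpublished preprints the linker correctly left unlinked


def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Sensitivity: share of existing links that were found.
sensitivity = pct(TRUE_POSITIVES, PUBLISHED)
# Specificity: share of unpublished preprints that were not wrongly linked.
specificity = pct(TRUE_NEGATIVES, UNPUBLISHED)
# Accuracy: share of all preprints that were handled correctly.
accuracy = pct(TRUE_POSITIVES + TRUE_NEGATIVES, TOTAL_PREPRINTS)

print(f"sensitivity={sensitivity}% specificity={specificity}% accuracy={accuracy}%")
```

Under these inferred counts, the three rates agree with the figures in the abstract. The sketch does not attempt to reproduce the 16.26% comparison with preprint servers, since the abstract does not say how the servers' accuracy was computed.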


OA.Works: Powerfully simple open access tools

“OA.Works is a non-profit building tools so that open access is easy and equitable. They’re free, open source, and co-designed with advocates for a just world….

We build tools that make it simple to freely and legally unlock papers from behind paywalls. We helped create a global network of libraries sharing knowledge freely during COVID through RSCVD. Our tools like InstantILL and Open Access Button support accessing research without a steep price tag. In the past decade, our products have been used millions of times, across the world….”