Datasets on arXiv. We’re excited to announce our… | by Robert Stojnic | PapersWithCode | May, 2021 | Medium

“We’re excited to announce our partnership with arXiv to support links to datasets on arXiv!

Machine learning articles on arXiv now have a Code & Data tab to link to datasets that are used or introduced in a paper….

This makes it much easier to track dataset usage across the community and quickly find other papers using the same dataset. From Papers with Code you can discover other papers using the same dataset, track usage over time, compare models and find similar datasets….”

Call for Nominees for Two Open Library Foundation Board Positions

“The Open Library Foundation is accepting nominations to fill two Board of Directors seats. The Foundation is looking for proven leaders who support open source projects and open source software development and are interested in enabling discovery and research. Nominees are welcome either from within the Open Library Foundation community or the open source development community at large….”

We’re building a replacement for Microsoft Academic Graph – Our Research blog

“This week Microsoft Research announced that their free bibliographic database–Microsoft Academic Graph, or MAG for short–is being discontinued. This is sad news, because MAG was a great source of open scholcomm metadata, including citation counts and author affiliations. MAG data is used in Unsub, as well as several other well-known open science tools.

Thankfully, we’ve got a contingency plan for this situation, which we’ve been working on for a while now. We’re building a successor to MAG. Like all our projects, it’ll be open-source and the data will be free to everyone via data dump and API. It will launch at the end of the year, when MAG is scheduled to disappear.

It’s important to note that this new service will not be a perfect replacement, especially right when it launches. MAG has excellent support for conference proceedings, for example; we won’t match that for a while, if ever. Instead, we’ll be focusing on supporting the most important use-cases, and building out from there. If you use MAG today, we’d love to hear what your key use-cases are, so we can prioritize accordingly. Here’s where you can tell us.

We plan to have this launched by the time MAG disappears at year’s end. That’s an aggressive schedule, but we’ve built and launched other large projects (Unpaywall, Unsub) in less time. We’ve also got a good head start, since we’ve been working toward this as an internal project for a while now….”

Day-to-day discovery of preprint–publication links | SpringerLink

Abstract: Preprints promote the open and fast communication of non-peer reviewed work. Once a preprint is published in a peer-reviewed venue, the preprint server updates its web page: a prominent hyperlink leading to the newly published work is added. Linking preprints to publications is of utmost importance as it provides readers with the latest version of a now certified work. Yet leading preprint servers fail to identify all existing preprint–publication links. This limitation calls for a more thorough approach to this critical information retrieval task: overlooking published evidence translates into partial and even inaccurate systematic reviews on health-related issues, for instance. We designed an algorithm leveraging the Crossref public and free source of bibliographic metadata to comb the literature for preprint–publication links. We tested it on a reference preprint set identified and curated for a living systematic review on interventions for preventing and treating COVID-19 performed by international collaboration: the COVID-NMA initiative. The reference set comprised 343 preprints, 121 of which appeared as a publication in a peer-reviewed journal. While the preprint servers identified 39.7% of the preprint–publication links, our linker identified 90.9% of the expected links with no clues taken from the preprint servers. The accuracy of the proposed linker is 91.5% on this reference set, with 90.9% sensitivity and 91.9% specificity. This is a 16.26% increase in accuracy compared to that of preprint servers. We release this software as supplementary material to foster its integration into preprint servers’ workflows and enhance a daily preprint–publication chase that is useful to all readers, including systematic reviewers. This preprint–publication linker currently provides day-to-day updates to the biomedical experts of the COVID-NMA initiative.
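The percentages in the abstract can be cross-checked with a little arithmetic. The sketch below is not from the paper: the raw counts (110 correct links, 18 false links, and so on) are inferred from the quoted percentages, and the assumption that preprint servers report no false links is ours. Under those assumptions the figures are mutually consistent, and the 16.26% appears to be a relative increase computed from the rounded accuracies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a preprint-publication linker."""
    tp: int  # published preprints whose link was found
    fn: int  # published preprints whose link was missed
    tn: int  # unpublished preprints correctly left unlinked
    fp: int  # unpublished preprints wrongly linked

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)


def pct(x: float) -> float:
    """Express a ratio as a percentage rounded to one decimal place."""
    return round(100 * x, 1)


# From the abstract: 343 preprints, 121 of them published (so 222 are not).
# ASSUMPTION: the counts below are inferred from the quoted percentages.
linker = Confusion(tp=110, fn=11, tn=204, fp=18)
# ASSUMPTION: preprint servers never display a wrong link (fp = 0).
servers = Confusion(tp=48, fn=73, tn=222, fp=0)

# Relative accuracy gain, computed from the rounded percentages.
relative_gain = round(
    100 * (pct(linker.accuracy) - pct(servers.accuracy)) / pct(servers.accuracy), 2
)

print(pct(linker.sensitivity), pct(linker.specificity), pct(linker.accuracy))
print(pct(servers.sensitivity), pct(servers.accuracy), relative_gain)
```

Read this way, the linker's accuracy gain is about 12.8 percentage points in absolute terms, which corresponds to the 16.26% relative figure quoted in the abstract.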


Enabling research with Open Software and Data Tickets, Tue 11 May 2021 at 11:00 | Eventbrite

“On the 10 – 14 May 2021, during Open Scholarship Week (OSW2021) staff, students, members of the public and a variety of other stakeholders will come together to talk about changing the ways scholarly information is openly communicated, shared and used. OSW2021 will offer a diverse range of talks and workshops representing many different perspectives and disciplines on Open practices in research and education….”

WordPress Saves Creative Commons Search Engine From Shutting Down

“Creative Commons Search is joining WordPress, which will help keep the search engine of free-to-use images running for the foreseeable future.

Matt Mullenweg, CEO of WordPress parent company Automattic, says he decided to bring CC Search on board after hearing it was in danger of shutting down….”

Open Library Foundation

“The Open Library Foundation enables the development, accessibility and sustainability of open source and open access projects for and by libraries. The Foundation seeks to enable and support creative collaboration among librarians, technologists, designers, service providers and vendors to share expertise and resources and to create innovative new software and resources that support libraries….”


2i2c: Interactive computing infrastructure for your community

“We make interactive computing more accessible and powerful for research and education. We strive to accelerate research and discovery, and to empower education to be more accessible, intuitive, and enjoyable. We do this through these primary actions: …”

The Customer Right to Replicate | 2i2c

“To ensure the Right to Replicate to our customers, 2i2c makes the following commitments to infrastructure we build and operate:

We MUST use only open source software to run our infrastructure. By only using software that is available to everyone on the same terms, we can ensure that customers can replicate the infrastructure without having to negotiate licensing terms with proprietary software vendors. In addition, any changes we make to open source software will be made in public and/or contributed upstream, so customers continue to have access to them regardless of where their infrastructure is.

We MUST NOT directly depend on proprietary cloud vendor specific products or APIs. Instead, we use cloud-managed open source software, or hide the dependency behind a layer of abstraction. This ensures that customers can port their infrastructure to any cloud provider of their choice, or run it on their own hardware with purely open source software.

This set of commitments acts as a business continuity plan for our customers, ensuring 2i2c will follow best practices within the open source, open education and open research ecosystems….”
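The second commitment, hiding a vendor dependency behind a layer of abstraction, can be illustrated with a short sketch. This is not 2i2c's code: `ObjectStore`, `LocalDiskStore`, `InMemoryStore` and `save_notebook` are hypothetical names chosen for illustration. The idea is that application code talks only to a small interface, so a cloud-specific backend could be swapped for an open source or self-hosted one without touching the callers.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """Hypothetical storage interface; callers depend only on this."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class LocalDiskStore:
    """Backend that needs no cloud vendor: plain files under a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryStore:
    """Second backend, to show that callers do not care which one they get."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def get(self, key: str) -> bytes:
        return self._items[key]


def save_notebook(store: ObjectStore, user: str, name: str, content: bytes) -> str:
    """Application code: written against the interface, not a vendor SDK."""
    key = f"{user}/{name}"
    store.put(key, content)
    return key


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for store in (LocalDiskStore(Path(tmp)), InMemoryStore()):
            key = save_notebook(store, "alice", "demo.ipynb", b"{}")
            print(type(store).__name__, key, store.get(key))
```

A vendor-specific backend would be one more class with the same two methods, which keeps the proprietary dependency confined to a single replaceable module.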