2.5% for Open: An inventory of investment opportunities – Google Docs

“David Lewis has recently proposed that libraries devote 2.5% of its total budget to support the common infrastructure needed to create the open scholarly commons….In the early stages of exploring this idea, we want to come to some level agreement about what would in fact count as such an investment, and then build a registry that would allow libraries to record their investments in this area, track their investments over time, and compare their investments with like institutions. The registry would also serve as a guide for those looking for ideas for how to make the best investments for their institution, providing a listing of all ‘approved’ ways to invest in open, and as a place for those seeking investment to be discovered. As a first step towards building such a thing, we are crowdsourcing the creation of the inventory of ways to invest….”

Open sourcing our repository code – ResearchSpace

“To celebrate RepositoryFringe 2017, we’re now opening up some of our source code to enable third party developers to implement their own repository integrations with RSpace. This code, our Repository Service Provider Interface (SPI) is a collection of utility Java classes and interfaces whose implementations connect and send ELN data and metadata from RSpace to its destination. The project also contains some example code showing the implementation we did for Figshare deposits, which we hope will give people a head start in writing implementations for other repositories.”

“How automated workflows helped us ingest 600 faculty publications in t” by Shilpa Rele and Jessea Young

Abstract: “Conducting copyright clearance and ingesting appropriate versions of faculty publications can be a labor intensive and time consuming process. At Loyola Marymount University (LMU), a medium-size, private institution, the Digital Library Program (DLP) had been conducting copyright clearance one publication at a time. This meant that it took an enormous amount of time from start to finish to review and process the list of publications on a given faculty member’s CV. In October 2016, the Digital Program Librarian learned about the automated workflow developed by librarians at University of North Texas and decided to give it a try. At this time, the DLP hired a Library Assistant who then began exploring and experimenting with this automated workflow. The goal of such experimentation was to increase efficiency in our processes to ingest more faculty publications in LMU’s institutional repository.

In this session, we will share information about our workflows and tools used to manage our various processes. […]”

Release ‘open’ data from their PDF prisons using tabulizer | R-bloggers

“As a political scientist who regularly encounters so-called “open data” in PDFs, this problem is particularly irritating. PDFs may have “portable” in their name, making them display consistently on various platforms, but that portability means any information contained in a PDF is irritatingly difficult to extract computationally.”

Building Manifold | Building Manifold

“Welcome to Building Manifold, a blog that will document the process of creating Manifold Scholarship, a project at the University of Minnesota Press in partnership with the GC Digital Scholarship Lab at the Graduate Center, CUNY and Cast Iron Coding. Manifold Scholarship is funded through a generous grant from the Andrew W. Mellon Foundation as part of a series of 2015 grants made to university presses.

Manifold Scholarship is composed of two parts:

1) The creation of Manifold, an intuitive, collaborative, open-source platform for scholarly works. With iterative texts, powerful annotation tools, rich media support, and robust community dialogue, Manifold will transform scholarly publications into living digital works.

2) Rethinking the print-focused mode of scholarly authorship and university press editorial procedures and production workflows to accommodate the differences in creating content for iterative, networked publication….”

Automatically Determining Versions of Scholarly Articles | Rothchild | Scholarly and Research Communication

Abstract: Background: Repositories of scholarly articles should provide authoritative information about the materials they distribute and should distribute those materials in keeping with pertinent laws. To do so, it is important to have accurate information about the versions of articles in a collection.

Analysis: This article presents a simple statistical model to classify articles as author manuscripts or versions of record, with parameters trained on a collection of articles that have been hand-annotated for version. The algorithm achieves about 94 percent accuracy on average (cross-validated).

Conclusion and implications: The average pairwise annotator agreement among a group of experts was 94 percent, showing that the method developed in this article displays performance competitive with human experts.
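For readers unfamiliar with the benchmark used in that abstract, average pairwise annotator agreement can be computed roughly as follows. This is a minimal sketch, not the article's code: the function names, the label strings "AM" and "VoR", and the sample annotations are all made up for illustration.

```python
from itertools import combinations


def pairwise_agreement(labels_a, labels_b):
    """Fraction of items on which two annotators assigned the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("annotators must label the same set of items")
    if not labels_a:
        raise ValueError("at least one labeled item is required")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def average_pairwise_agreement(annotations):
    """Mean agreement over every unordered pair of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("at least two annotators are required")
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: "AM" = author manuscript, "VoR" = version of record.
annotations = {
    "annotator_1": ["AM", "VoR", "VoR", "AM"],
    "annotator_2": ["AM", "VoR", "AM", "AM"],
    "annotator_3": ["AM", "VoR", "VoR", "AM"],
}

print(round(average_pairwise_agreement(annotations), 3))
```

A classifier's accuracy against hand-annotated labels is the same calculation with the model standing in for one annotator, which is why the two 94 percent figures can be compared directly.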

DOI-OA-Crawler/README.md at master · atmire/DOI-OA-Crawler · GitHub

“Wiley article page urls can be extended with /epdf. From these EPDF pages, we retrieve the value of the field ‘WOL-Article-Access-State’ in the returned HTML source. Currently, this approach only works for articles hosted on onlinelibrary.wiley.com. However, we are very interested in developing similar approaches for other publishers/vendors….”
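The approach described in that README might look something like the sketch below. It is not the crawler's actual code. The excerpt does not say how the field is embedded in the page, so the `<meta name="…" content="…">` form, the sample value "OPEN", and the example article URL are all assumptions made for illustration.

```python
import re
from typing import Optional

FIELD = "WOL-Article-Access-State"

# ASSUMPTION: the field is exposed as <meta name="..." content="...">.
# The README excerpt only says the value is read from the returned HTML source.
_FIELD_PATTERN = re.compile(
    r'<meta\s+name=["\']' + re.escape(FIELD) + r'["\']\s+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)


def epdf_url(article_url: str) -> str:
    """Extend an article page URL with /epdf, as the README describes."""
    return article_url.rstrip("/") + "/epdf"


def access_state(html: str) -> Optional[str]:
    """Return the field's value from the HTML source, or None if it is absent."""
    match = _FIELD_PATTERN.search(html)
    return match.group(1) if match else None


# Hypothetical article URL and hypothetical page fragment.
url = epdf_url("https://onlinelibrary.wiley.com/doi/10.1000/example")
sample_html = '<head><meta name="WOL-Article-Access-State" content="OPEN"></head>'

print(url)
print(access_state(sample_html))
```

Fetching the page is left out on purpose, so the sketch runs without network access. A real crawler would download the EPDF page first and pass its source to `access_state`.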