“David Lewis has recently proposed that libraries devote 2.5% of its total budget to support the common infrastructure needed to create the open scholarly commons….In the early stages of exploring this idea, we want to come to some level agreement about what would in fact count as such an investment, and then build a registry that would allow libraries to record their investments in this area, track their investments over time, and compare their investments with like institutions. The registry would also serve as a guide for those looking for ideas for how to make the best investments for their institution, providing a listing of all ‘approved’ ways to invest in open, and as a place for those seeking investment to be discovered. As a first step towards building such a thing, we are crowdsourcing the creation of the inventory of ways to invest….”
“To celebrate RepositoryFringe 2017, we’re now opening up some of our source code to enable third party developers to implement their own repository integrations with RSpace. This code, our Repository Service Provider Interface (SPI) is a collection of utility Java classes and interfaces whose implementations connect and send ELN data and metadata from RSpace to its destination. The project also contains some example code showing the implementation we did for Figshare deposits, which we hope will give people a head start in writing implementations for other repositories.”
Abstract: “Conducting copyright clearance and ingesting appropriate versions of faculty publications can be a labor intensive and time consuming process. At Loyola Marymount University (LMU), a medium-size, private institution, the Digital Library Program (DLP) had been conducting copyright clearance one publication at a time. This meant that it took an enormous amount of time from start to finish to review and process the list of publications on a given faculty member’s CV. In October 2016, the Digital Program Librarian learned about the automated workflow developed by librarians at University of North Texas and decided to give it a try. At this time, the DLP hired a Library Assistant who then began exploring and experimenting with this automated workflow. The goal of such experimentation was to increase efficiency in our processes to ingest more faculty publications in LMU’s institutional repository.
In this session, we will share information about our workflows and tools used to manage our various processes. […]”
“As a political scientist who regularly encounters so-called “open data” in PDFs, this problem is particularly irritating. PDFs may have “portable” in their name, making them display consistently on various platforms, but that portability means any information contained in a PDF is irritatingly difficult to extract computationally.”
“Welcome to Building Manifold, a blog that will document the process of creating Manifold Scholarship, a project at the University of Minnesota Press in partnership with the GC Digital Scholarship Lab at the Graduate Center, CUNY and Cast Iron Coding. Manifold Scholarship is funded through a generous grant from the Andrew W. Mellon Foundation as part of a series of 2015 grants made to university presses.
Manifold Scholarship is composed of two parts:
1) The creation of Manifold, an intuitive, collaborative, open-source platform for scholarly works. With iterative texts, powerful annotation tools, rich media support, and robust community dialogue, Manifold will transform scholarly publications into living digital works.
2) Rethinking the print-focused mode of scholarly authorship and university press editorial procedures and production workflows to accommodate the differences in creating content for iterative, networked publication….”
Abstract: Background: Repositories of scholarly articles should provide authoritative information about the materials they distribute and should distribute those materials in keeping with pertinent laws. To do so, it is important to have accurate information about the versions of articles in a collection.
Analysis: This article presents a simple statistical model to classify articles as author manuscripts or versions of record, with parameters trained on a collection of articles that have been hand-annotated for version. The algorithm achieves about 94 percent accuracy on average (cross-validated).
Conclusion and implications: The average pairwise annotator agreement among a group of experts was 94 percent, showing that the method developed in this article displays performance competitive with human experts.
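The "average pairwise annotator agreement" quoted in the abstract above can be illustrated with a short sketch. It assumes each annotator assigns one label per article, here "AM" for author manuscript and "VoR" for version of record, and that agreement for a pair of annotators is simply the fraction of articles on which their labels match. The function name, the label strings, and the sample data are illustrative assumptions, not taken from the article.

```python
from itertools import combinations


def average_pairwise_agreement(annotations):
    """Mean fraction of matching labels over every pair of annotators.

    annotations: list of label lists, one per annotator, all covering
    the same articles in the same order.
    """
    if len(annotations) < 2:
        raise ValueError("need at least two annotators")
    n_items = len(annotations[0])
    if n_items == 0 or any(len(a) != n_items for a in annotations):
        raise ValueError("every annotator must label the same non-empty set")

    pair_scores = []
    for first, second in combinations(annotations, 2):
        matches = sum(x == y for x, y in zip(first, second))
        pair_scores.append(matches / n_items)
    return sum(pair_scores) / len(pair_scores)


# Hypothetical labels from three annotators for four articles.
sample_labels = [
    ["AM", "VoR", "VoR", "AM"],
    ["AM", "VoR", "AM", "AM"],
    ["AM", "VoR", "VoR", "AM"],
]
agreement = average_pairwise_agreement(sample_labels)
```

With the sample data, two of the three annotator pairs agree on three of four articles and one pair agrees on all four, so the average is (0.75 + 1.0 + 0.75) / 3. A classifier's accuracy against hand-annotated labels can be compared against this figure in the way the abstract describes.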
“Wiley article page urls can be extended with /epdf. From these EPDF pages, we retrieve the value of the field ‘WOL-Article-Access-State’ in the returned HTML source. Currently, this approach only works for articles hosted on onlinelibrary.wiley.com. However, we are very interested in developing similar approaches for other publishers/vendors….”
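The approach described in the quote above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the quote does not say how the field is encoded in the HTML, so the sketch assumes it appears either as a `<meta>` tag or as a quoted key/value pair. The example URL and the value "FREE" are placeholders invented for the demonstration, and fetching the page is left out.

```python
import re

FIELD_NAME = "WOL-Article-Access-State"

# ASSUMPTION: the field shows up in one of two shapes, e.g.
#   <meta name="WOL-Article-Access-State" content="...">
#   "WOL-Article-Access-State": "..."
# The source does not specify the real markup.
_FIELD_PATTERN = re.compile(
    re.escape(FIELD_NAME)
    + r"""['"]?\s*(?:[:=]|content\s*=)\s*['"]([^'"]*)['"]"""
)


def epdf_url(article_url):
    """Extend an article page URL with /epdf, as the quote describes."""
    return article_url.rstrip("/") + "/epdf"


def extract_access_state(html):
    """Return the field's value from the HTML source, or None if absent."""
    match = _FIELD_PATTERN.search(html)
    return match.group(1) if match else None


# Placeholder article URL; the path after the host is hypothetical.
example_url = epdf_url("https://onlinelibrary.wiley.com/doi/10.1002/example/")

# Hypothetical HTML fragments standing in for a fetched EPDF page.
meta_style = '<meta name="WOL-Article-Access-State" content="FREE">'
json_style = '{"WOL-Article-Access-State": "FREE"}'
state_from_meta = extract_access_state(meta_style)
state_from_json = extract_access_state(json_style)
state_missing = extract_access_state("<html><body>no field</body></html>")
```

In practice the HTML would come from an HTTP request to the URL returned by `epdf_url`, and the pattern would need adjusting to whatever markup the EPDF page actually uses.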
Abstract: “This is the first in-depth study on the coverage of Microsoft Academic (MA). The coverage of a verified publication list of a university was analyzed on the level of individual publications in MA, Scopus, and Web of Science (WoS). Citation counts were analyzed and issues related to data retrieval and data quality were examined. A Perl script was written to retrieve metadata from MA. We find that MA covers journal articles, working papers, and conference items to a substantial extent. MA surpasses Scopus and WoS clearly with respect to book-related document types and conference items but falls slightly behind Scopus with regard to journal articles. MA shows the same biases as Scopus and WoS with regard to the coverage of the social sciences and humanities, non-English publications, and open-access publications. Rank correlations of citation counts are high between MA and the benchmark databases. We find that the publication year is correct for 89.5% of all publications and the number of authors for 95.1% of the journal articles. Given the fast and ongoing development of MA, we conclude that MA is on the verge of becoming a bibliometric superpower. However, comprehensive studies on the quality of MA data are still lacking.”