“With support from the Andrew W. Mellon Foundation, the Linked Data for Production Phase 2 (LD4P2) partners (Cornell University, Harvard University, Stanford University, and the University of Iowa’s School of Library and Information Science), in collaboration with the Library of Congress and the Program for Cooperative Cataloging (PCC), are building a pathway for the cataloging community to begin shifting to linked data to describe library resources. LD4P2 builds on the foundational work of Linked Data for Production (LD4P) Phase 1 and Linked Data for Libraries Labs (LD4L Labs). More on LD4P2 Project Background and Goals….”
Abstract: This thesis is about research communication in the context of the Web. I analyse literature which reveals how researchers are making use of Web technologies for knowledge dissemination, as well as how individuals are disempowered by the centralisation of certain systems, such as academic publishing platforms and social media. I share my findings on the feasibility of a decentralised and interoperable information space where researchers can control their identifiers whilst fulfilling the core functions of scientific communication: registration, awareness, certification, and archiving.
The contemporary research communication paradigm operates under a diverse set of sociotechnical constraints, which influence how units of research information and personal data are created and exchanged. Economic forces and non-interoperable system designs mean that researcher identifiers and research contributions are largely shaped and controlled by third-party entities; participation requires the use of proprietary systems.
From a technical standpoint, this thesis takes a deep look at the semantic structure of research artifacts, and how they can be stored, linked and shared in a way that is controlled by individual researchers, or delegated to trusted parties. Further, I find that the ecosystem lacked a technical Web standard able to fulfill the awareness function of research communication. Thus, I contribute a new communication protocol, Linked Data Notifications (published as a W3C Recommendation), which enables decentralised notifications on the Web, and provide implementations pertinent to the academic publishing use case. So far we have seen decentralised notifications applied in research dissemination or collaboration scenarios, as well as for archival activities and scientific experiments.
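The Linked Data Notifications protocol mentioned above has two mechanical steps: a sender discovers a target resource's inbox (advertised, for example, via an HTTP Link header with the ldp#inbox relation) and then POSTs a JSON-LD notification to that inbox. A minimal sketch of both steps in Python, using only the standard library; all URLs and the Announce payload shape are illustrative assumptions, not taken from the thesis:

```python
import re

LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"

def discover_inbox(link_header):
    """Extract the inbox URL from an HTTP Link header value,
    per the LDN discovery step (rel = ldp#inbox)."""
    for match in re.finditer(r'<([^>]+)>\s*;\s*rel="([^"]+)"', link_header):
        target, rel = match.groups()
        if LDP_INBOX in rel.split():
            return target
    return None

def build_notification(actor, obj, target):
    """Assemble a minimal JSON-LD notification body, to be POSTed to
    the discovered inbox with Content-Type application/ld+json."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": actor,
        "object": obj,
        "target": target,
    }

# Hypothetical header and URLs for illustration only.
header = '<https://example.org/inbox/>; rel="http://www.w3.org/ns/ldp#inbox"'
inbox = discover_inbox(header)
note = build_notification(
    "https://researcher.example/profile#me",
    "https://journal.example/articles/1",
    "https://archive.example/snapshots/1",
)
```

Because discovery and delivery are decoupled, any conforming consumer can read the same inbox that any conforming sender writes to, which is what allows the sender and receiver applications to be swapped independently.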
Another core contribution of this work is a Web standards-based implementation of a client-side tool, dokieli, for decentralised article publishing, annotations and social interactions. dokieli can be used to fulfill the scholarly functions of registration, awareness, certification, and archiving, all in a decentralised manner, returning control of research contributions and discourse to individual researchers.
The overarching conclusion of the thesis is that Web technologies can be used to create a fully functioning ecosystem for research communication. Using the framework of Web architecture, and loosely coupling the four functions, an accessible and inclusive ecosystem can be realised whereby users are able to use and switch between interoperable applications without interfering with existing data.
Technical solutions alone do not suffice of course, so this thesis also takes into account the need for a change in the traditional mode of thinking amongst scholars, and presents the Linked Research initiative as an ongoing effort toward researcher autonomy in a social system, and universal access to human- and machine-readable information. Outcomes of this outreach work so far include an increase in the number of individuals self-hosting their research artifacts, workshops publishing accessible proceedings on the Web, in-the-wild experiments with open and public peer-review, and semantic graphs of contributions to conference proceedings and journals (the Linked Open Research Cloud).
Some of the future challenges include: addressing the social implications of decentralised Web publishing, as well as the design of ethically grounded interoperable mechanisms; cultivating privacy aware information spaces; personal or community-controlled on-demand archiving services; and further design of decentralised applications that are aware of the core functions of scientific communication.
“My colleagues Jean Godby, Karen Smith-Yoshimura, and Bruce Washburn, along with a host of partners, have just released Creating Library Linked Data with Wikibase: Lessons Learned from Project Passage, a fascinating account of their experiences working with a customized instance of Wikibase to create resource descriptions in the form of linked data. In the spirit of their report, I’d like to offer a modest yet illustrative use case showing how access to the relationships and properties of the linked data in another Wikibase environment – Wikidata – smoothed the way for OCLC Research’s recent study of the Canadian presence in the published record.
Maple Leaves: Discovering Canada Through the Published Record is the latest in a series of OCLC Research studies that explore national contributions to the world’s accumulated body of published materials. A national contribution is defined as materials published in, about, and/or by the people of that country. The last category presents a special challenge: how to assemble a list of entities – people and organizations – associated with a particular country from which authors, musicians, film makers, and other creators of published works can be identified?…”
Abstract: Library catalogues may be connected to the linked data cloud through various types of thesauri. For name authority thesauri in particular I would like to suggest a fundamental break with the current distributed linked data paradigm: to make a transition from a multitude of different identifiers to using a single, universal identifier for all relevant named entities, in the form of the Wikidata identifier. Wikidata (https://wikidata.org) seems to be evolving into a major authority hub that is lowering barriers to access the web of data for everyone. Using the Wikidata identifier of notable entities as a common identifier for connecting resources has significant benefits compared to traversing the ever-growing linked data cloud. When the use of Wikidata reaches a critical mass, for some institutions, Wikidata could even serve as an authority control mechanism.
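Part of what makes the Wikidata identifier attractive as a single universal key is that every entity has a predictable concept URI and a predictable data-document URL, so no per-source lookup logic is needed. A small sketch of both patterns (the identifier Q1234 below is a placeholder, not a reference to a real entity):

```python
WD_ENTITY = "http://www.wikidata.org/entity/"                   # concept URI prefix
WD_DATA = "https://www.wikidata.org/wiki/Special:EntityData/"   # data document prefix

def concept_uri(qid: str) -> str:
    """Canonical URI identifying the entity itself, as used in
    linked data statements."""
    return WD_ENTITY + qid

def entity_data_url(qid: str, fmt: str = "json") -> str:
    """URL of the entity's data document in a chosen serialization
    (json, ttl, and rdf are among the formats served)."""
    return f"{WD_DATA}{qid}.{fmt}"

uri = concept_uri("Q1234")
ttl = entity_data_url("Q1234", "ttl")
```

Any system holding only the Q-identifier can therefore dereference the entity's full set of statements and external identifiers without traversing the rest of the linked data cloud.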
“There has been a growing interest from libraries and other cultural heritage organizations in Wikidata. Of the many potential uses for Wikidata, one emerging area of focus has been using Wikidata as a hub for institutional identifiers. Many organizations maintain unique identifiers for people, subjects, works, etc. If these IDs are all added to Wikidata then you could seamlessly access data from dozens of sources if you know the Wikidata ID. If we return to the author example from above you can see the Wikidata page for Virginia Woolf has ninety external links to various organizations. Many of these are national libraries, museums, and other cultural heritage institutions including the Library of Congress.
The Library of Congress maintains many authority files that are widely used. Two of the largest are the Name Authority File (NAF) and Library of Congress Subject Headings (LCSH). The Network Development and MARC Standards Office maintains the Linked Open Data version of these files at the site id.loc.gov. For example, authority data for Virginia Woolf is located at //id.loc.gov/authorities/names/n79041870. This data ensures that items being cataloged are all referencing the same person. One of the goals of linked data is to make sure you link out to others’ data. With id.loc.gov we maintain links to many other institutions’ authority files, including the French and German national libraries, other government services such as the Department of Agriculture, and other cultural institutions like the Getty Museum. You’ll notice these links on the page; they are also present in the machine-readable data. With the potential of Wikidata being a hub of identifiers, we wanted to also include links in our authority records out to Wikidata….
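Going the other direction, Wikidata records Library of Congress authority IDs on its items, so a NAF identifier like the one above can be used to find the corresponding Wikidata item with a SPARQL query against the public query service. A sketch that only builds the query string (P244 is Wikidata's Library of Congress authority ID property; the code does not contact the live endpoint, and results would depend on its current contents):

```python
def wikidata_item_by_lc_id(lc_id: str) -> str:
    """Build a SPARQL query that finds the Wikidata item whose
    Library of Congress authority ID (property P244) equals lc_id."""
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f'  ?item wdt:P244 "{lc_id}" .\n'
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )

# The NAF identifier from the Virginia Woolf example above:
query = wikidata_item_by_lc_id("n79041870")
```

The same one-triple pattern works for any of the external identifier properties, which is what makes Wikidata usable as a crosswalk between institutional authority files.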
Using records from the Library of Congress Prints & Photographs Division I built an interface that combines Library of Congress collection items with Wikidata information. This tool demonstrates the possibilities in connecting these two knowledge systems….”
Abstract: Knowledge workers like researchers, students, journalists, research evaluators or funders need tools to explore what is known, how it was discovered, who made which contributions, and where the scholarly record has gaps. Existing tools and services of this kind are not available as Linked Open Data, but Wikidata is. It has the technology, active contributor base, and content to build a large-scale knowledge graph for scholarship, also known as WikiCite. Scholia visualizes this graph in an exploratory interface with profiles and links to the literature. However, it is just a working prototype. This project aims to “robustify Scholia” with back-end development and testing based on pilot corpora. The main objective at this stage is to attain stability in challenging cases such as server throttling and handling of large or incomplete datasets. Further goals include integrating Scholia with data curation and manuscript writing workflows, serving more languages, generating usage stats, and documentation.
“In a new Association of Research Libraries (ARL) white paper, a task force of expert Wikidata users recommend a variety of ways for librarians to use the open knowledge base in advancing global discovery of their collections, faculty, and institutions.
Librarians are using Wikidata’s structured data about people, topics, concepts, and objects to populate open source faculty profiling systems, to enhance bibliographic records in online catalogs, and to collaborate with communities on meaningful, culturally relevant, descriptive metadata for special collections and archives. The white paper, circulated for public comment in fall 2018, contains examples of Wikidata applications, screenshots, and recommendations for involvement on an individual or organizational level….”
“[Recommendations] For individual librarians:
• Use and experiment with Wikidata, for example:
  • Contribute local name authorities to Wikidata, particularly for underrepresented creators and organizations.
  • Add institutional holdings to existing Wikidata items using the “archives at” property.
  • Create items for faculty in an institution.
  • Explore and experiment with Wikidata editing tools such as Mix’n’match, batch uploading, and database dumps.
• Create a “hub of hubs” for authority controls, metadata vocabularies, and other data sources, to facilitate the connection between existing external metadata sources and Wikidata.
• Get involved in the greater Wikimedia community by holding edit-a-thons and workshops, participating in discussions on email lists and in social media channels, and by joining the Wikimedia and Libraries User Group.
• Advocate within your research communities and organizations for open, compatible licensing of data sets so that they can be incorporated into Wikidata. …
[Recommendations] For library leadership and organizations:
• Give staff time to experiment and contribute to Wikidata, including by determining tasks that can be added to existing positions and workflows, or incorporating Wikidata participation into existing incentive and reward structures.
• Expand capacity with Wikimedians in Residence or fellowships.
• Inform and advocate with your patrons/scholars/research community to use LOD for their research projects that involve data/data sets.
• Make data sets and scholarship from existing institutional projects visible on Wikidata as part of a global network of knowledge. Large-scale cooperative projects like Social Networks and Archival Context (SNAC) and VIAF, the Virtual International Authority File, for example, have added identifiers to Wikidata.
• Provide linked data support to researchers, academics, and other patrons wishing to expand the context of their own research and data or to develop web applications representing knowledge from their field.
• Engage scholars and communities working in underrepresented knowledge areas to help extend existing sets of knowledge in Wikidata.
• Explore and advocate for the use of Wikidata identifiers (“Q IDs”) or equivalent uniform resource identifiers (URIs) in library and archival systems, repositories, and platforms.
• Consider the use of Wikibase as a LOD store for local identifiers and authority-like data….”
Abstract: Interdisciplinary collaborations and data sharing are essential to addressing the long history of human-environmental interactions underlying the modern biodiversity crisis. Such collaborations are increasingly facilitated by, and dependent upon, sharing open access data from a variety of disciplinary communities and data sources, including those within biology, paleontology, and archaeology. Significant advances in biodiversity open data sharing have focused on neontological and paleontological specimen records, making available over a billion records through the Global Biodiversity Information Facility. But to date, less effort has been placed on the integration of important archaeological sources of biodiversity, such as zooarchaeological specimens. Zooarchaeological specimens are rich with both biological and cultural heritage data documenting nearly all phases of human interaction with animals and the surrounding environment through time, filling a critical gap between paleontological and neontological sources of data within biodiversity networks. Here we describe technical advances for mobilizing zooarchaeological specimen-specific biological and cultural data. In particular, we demonstrate adaptations in the workflow used by biodiversity publisher VertNet to mobilize Darwin Core formatted zooarchaeological data to the GBIF network. We also show how a linked open data approach can be used to connect existing biodiversity publishing mechanisms with archaeoinformatics publishing mechanisms through collaboration with the Open Context platform. Examples of ZooArchNet published datasets are used to show the efficacy of creating this critically needed bridge between biological and archaeological sources of open access data. 
These technical advances and efforts to support data publication are placed in the larger context of ZooArchNet, a new project meant to build community around new approaches to interconnect zooarchaeological data and knowledge across disciplines.
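The Darwin Core side of the workflow described above boils down to flat term/value rows, the row shape used in Darwin Core Archive CSVs. A minimal sketch of one such occurrence record; every value, the Open Context-style URI, and the choice of required terms are invented for illustration, not drawn from the ZooArchNet datasets:

```python
# One Darwin Core occurrence record as a flat dict of term -> value.
record = {
    "occurrenceID": "urn:example:zooarch:0001",       # placeholder identifier
    "basisOfRecord": "PreservedSpecimen",             # a controlled dwc value
    "scientificName": "Odocoileus virginianus",       # white-tailed deer
    "institutionCode": "EXAMPLE",
    "catalogNumber": "0001",
    "country": "United States",
    # The linked-open-data bridge: point back to the archaeological
    # context record (an Open Context-style URI; URL is illustrative).
    "references": "https://opencontext.org/subjects/example-context",
}

def validate(rec, required=("occurrenceID", "basisOfRecord", "scientificName")):
    """Check that the chosen required Darwin Core terms are present
    and non-empty before publishing the record."""
    return all(rec.get(term) for term in required)
```

Keeping the cultural-context link as just another term in the row is what lets an unmodified biodiversity publisher such as VertNet carry the archaeological connection through to GBIF without schema changes.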
“In June 2018, the Association of Research Libraries (ARL) charged a task force to look at Wikidata. The task force emerged from several years of discussion between ARL and the Wikimedia Foundation on where the two communities can effectively collaborate. The focus on Wikidata and Wikibase came from two points of alignment in particular: interest in linked open data for both library discovery systems and Wikipedia, and advancing a diversity and inclusion agenda in the cultures of both libraries and Wikimedia….”