The Linked Commons 2.0: What’s New?

This is part of a series of posts introducing the projects built by open source contributors mentored by Creative Commons during Google Summer of Code (GSoC) 2020 and Outreachy. Subham Sahu was one of those contributors and we are grateful for his work on this project.


The CC Catalog data visualization—the Linked Commons 2.0—is a web application which aims to showcase and establish relationships between the millions of data points of CC-licensed content using graphs. In this blog post, I’ll discuss the motivation for this visualization and explore the latest features of the newest edition of the Linked Commons.

Motivation

The number of websites using CC-licensed content is enormous, and snowballing. The CC Catalog collects and stores these millions of data points, and each node (a unit in the data structure) contains information about a website’s URL and the licenses it uses. It’s possible to do rigorous data analysis to fully understand how these websites are interconnected and to identify trends, but that kind of analysis is accessible only to people with a technical background. By visualizing the data, however, broad patterns and trends become much easier to identify.
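Concretely, each record can be pictured as a small key-value structure: one entry per domain, plus directed links between domains. The field names below are purely illustrative and not the actual CC Catalog schema:

```python
# Illustrative shapes only; the real CC Catalog schema may differ.
node = {
    "name": "flickr.com",                  # the domain this node represents
    "licenses": ["by", "by-sa", "by-nc"],  # CC licenses observed on that domain
}

link = {
    "source": "flickr.com",     # domain that contains the outbound link
    "target": "wikipedia.org",  # domain being linked to
}
```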

For example, by identifying other websites that link to your content, you can set up a targeted outreach program or collaborate with them. Out of the billions of webpages on the web, this lets you focus your effort on the ones where you are most likely to see growth.

Latest Features

Let’s look at some of the new features in the Linked Commons 2.0.

  • Filtering based on the node name

The Linked Commons 2.0 allows users to search for their favorite node and then explore all of that node’s neighbors among the thousands present in the database. We have color-coded the links connecting the neighbors to the root node, and neighbors that are connected to the root node in different ways are colored differently. This makes it easy for users to sort the neighbors into two categories.
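Under the hood, finding a node’s neighborhood amounts to scanning the link list for edges that touch the root. The sketch below reuses the illustrative shapes above and reads the two categories as incoming versus outgoing links, which is one plausible interpretation rather than the visualization’s exact rule:

```python
def neighborhood(root, links):
    """Split a root node's neighbors by link direction (illustrative sketch)."""
    outgoing = {l["target"] for l in links if l["source"] == root}
    incoming = {l["source"] for l in links if l["target"] == root}
    return {
        "root": root,
        "links_out": sorted(outgoing),  # domains the root links to
        "links_in": sorted(incoming),   # domains that link to the root
    }

# Example:
links = [
    {"source": "flickr.com", "target": "wikipedia.org"},
    {"source": "wikipedia.org", "target": "creativecommons.org"},
]
print(neighborhood("wikipedia.org", links))
# {'root': 'wikipedia.org', 'links_out': ['creativecommons.org'], 'links_in': ['flickr.com']}
```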

  • A sleek and revamped design

The Linked Commons 2.0 has a sleek design, with a clean and refreshing look along with both a light and dark theme.

The Linked Commons new design

  • Tools for smooth interaction with the canvas

The Linked Commons 2.0 ships with a few tools that let the user zoom in, zoom out, and reset the zoom with just one tap. These tools are especially useful for users on touch devices or trackpads.

The Linked Commons toolbox

  • Autocomplete feature

The current database of the Linked Commons 2.0 contains around 240 thousand nodes and 4.14 million links. Unfortunately, some of the node names are uncommon and lengthy. To spare users the exhausting work of typing complete node names, this version ships with an autocomplete feature: with every keystroke, it suggests node names that match what the user might be looking for.
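One simple way to picture the autocomplete is a prefix lookup over the sorted list of node names. The production service is likely more elaborate, but a minimal sketch could look like this:

```python
import bisect

class Autocomplete:
    """Prefix lookup over a sorted list of node names (illustrative sketch)."""

    def __init__(self, names):
        self.names = sorted(names)

    def suggest(self, prefix, limit=10):
        # Binary-search to the first name >= prefix, then collect names
        # sharing that prefix, stopping after `limit` suggestions.
        idx = bisect.bisect_left(self.names, prefix)
        out = []
        while idx < len(self.names) and len(out) < limit:
            name = self.names[idx]
            if not name.startswith(prefix):
                break
            out.append(name)
            idx += 1
        return out

# Example:
index = Autocomplete(["wikipedia.org", "wikimedia.org", "flickr.com"])
print(index.suggest("wiki"))  # ['wikimedia.org', 'wikipedia.org']
```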

The Linked Commons autocomplete

What’s next for the Linked Commons?

In the current version, some nodes are very densely connected. For example, the node “Wikipedia” has around 89k neighboring nodes and 102k links. That is far too much for web browsers to render, so we need a way to reduce it to a more reasonable number.
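One straightforward option is to cap a hub’s neighborhood by sampling its links before sending the subgraph to the browser. The sketch below reuses the illustrative link shape from earlier; whether the project ultimately samples, ranks, or aggregates neighbors is still an open question:

```python
import random

def cap_neighbors(root, links, max_links=500, seed=0):
    """Randomly sample a hub's links so the browser has a renderable subgraph."""
    touching, others = [], []
    for link in links:
        if root in (link["source"], link["target"]):
            touching.append(link)
        else:
            others.append(link)
    if len(touching) > max_links:
        touching = random.Random(seed).sample(touching, max_links)
    return others + touching
```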

During preprocessing, we dropped a large number of nodes, removing more than 3 million that didn’t have CC license information. In general, the current version shows only those nodes which are well linked with other domains and whose license information is available. However, to provide a more complete picture of the CC Catalog, the Linked Commons needs additional filtering methods and other tools. These potentially include the following (both are sketched just after the list):

  • filtering based on top-level domain
  • filtering based on the number of web links associated with a node
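Both filters are conceptually simple. A rough sketch, again reusing the illustrative node and link shapes from earlier, might look like this:

```python
from collections import Counter

def filter_by_tld(nodes, tld=".org"):
    """Keep only nodes whose domain ends with the given top-level domain."""
    return [n for n in nodes if n["name"].endswith(tld)]

def filter_by_link_count(nodes, links, min_links=5):
    """Keep only nodes with at least `min_links` web links touching them."""
    degree = Counter()
    for link in links:
        degree[link["source"]] += 1
        degree[link["target"]] += 1
    return [n for n in nodes if degree[n["name"]] >= min_links]
```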

Contributing

We plan to continue working on the Linked Commons. You can follow the project’s development by visiting our GitHub repo. We encourage you to contribute to the Linked Commons by reporting bugs, suggesting features, or helping us write code. The new Linked Commons makes it easy for anyone to set up the development environment.

The project consists of a dedicated server which powers the filtering by node name and query autocompletion. The frontend is built using ReactJS, for smooth rendering performance. So, it doesn’t matter whether you’re a frontend developer, a backend developer, or a designer: there is some part of the Linked Commons that you can work on and improve. We look forward to seeing you on board with sparkling ideas!
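As an illustration of how those pieces fit together, the snippet below exposes a toy autocomplete lookup over HTTP, the kind of endpoint a React frontend could query on each keystroke. Flask, the route name, and the response shape are assumptions made for this example, not the project’s actual stack:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
NODE_NAMES = sorted(["wikipedia.org", "wikimedia.org", "flickr.com"])  # toy data

@app.route("/autocomplete")
def autocomplete():
    # e.g. GET /autocomplete?q=wiki  ->  ["wikimedia.org", "wikipedia.org"]
    prefix = request.args.get("q", "")
    return jsonify([n for n in NODE_NAMES if n.startswith(prefix)][:10])

if __name__ == "__main__":
    app.run()
```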

We are extremely proud and grateful for the work done by Subham Sahu throughout his 2020 Google Summer of Code internship. We look forward to his continued contributions to the Linked Commons as a project core committer in the CC Open Source Community! 

Please consider supporting Creative Commons’ open source work on GitHub Sponsors.


Wikidata, Wikibase and the library linked data ecosystem: an OCLC Research Library Partnership discussion – Hanging Together

“In late July the OCLC Research Library Partnership convened a discussion that reflected on the current state of linked data. The discussion format was (for us) experimental — we invited participants to prepare by viewing a pre-recorded presentation, Re-envisioning the fabric of the bibliographic universe – From promise to reality. The presentation covers experiences of national and research libraries as well as OCLC’s own journey in linked data exploration. OCLC Researchers Annette Dortmund and Karen Smith-Yoshimura looked at relevant milestones in the journey from entity-based description research, prototypes, and on to actual practices, based on work that has been undertaken with library partners right up to the present day….”

How your library will benefit from linked data

“When operationalized, linked data will provide participating libraries with:

  • A massive collection of descriptive information and identifiers for creative works, persons, and other things libraries need to refer to
  • The capability to enhance these descriptions, or add them for things missing from the collection
  • An ecosystem (including a lightweight UI and APIs) that will allow library workers to create linked data natively, instead of through conversion from MARC
  • Tools to reconcile local library metadata with that of the ecosystem, and connect library metadata with nonlibrary sources….”

DHQ: Digital Humanities Quarterly: A Prosopography as Linked Open Data: Some Implications from DPRR

Abstract:  The Digital Prosopography of the Roman Republic (DPRR) project has created a freely available structured prosopography of people from the Roman Republic. As a part of this work the materials that were produced by the project have been made available as Linked Open Data (LOD): translated into RDF, and served through an RDF Server. This article explains what it means to present the material as Linked Open Data by means of working, interactive examples. DPRR didn’t do some of the work which has been conventionally associated with Linked Open Data. However, by considering the two conceptions of the Semantic Web and Linked Open Data as proposed by Tim Berners-Lee one can see how DPRR’s RDF Server fits best into the LOD picture, including how it might serve to facilitate new ways to explore its material. The article gives several examples of ways of exploiting DPRR’s RDF dataset, and other similarly structured materials, to enable new research approaches.

 

Modelling Overlay Peer Review Processes with Linked Data Notifications

“In November 2017, the Confederation of Open Access Repositories (COAR) published a report outlining the technologies and behaviours of the Next Generation Repository (NGR). In the report, the NGR Working Group argues that repositories must take their place in a resource-centric network, where the individual resources (metadata and actual content) within the repositories are interconnected on the Web both with each other and, more importantly, with resource-oriented networked services. These links between resources and overlay services can bring many new opportunities for broadening the scope of the services offered by repositories and 3rd party initiatives. The emphasis on moving to a fully resource-centric paradigm presented in the vision for the Next Generation Repository offers an opportunity to exploit what programmers call “pass by reference” – a notion which underlies the fundamental function of the Web.

One specific use case related to this vision is the linking of repository resources with services providing commentary, annotation and peer reviews; a use case that is currently being considered by several different initiatives in the scholarly communications landscape. The wide distribution of resources (typified by articles) in repositories, coupled with the growing interest in overlay journals, introduces the possibility of adopting an asynchronous notification paradigm to achieve interoperability between repositories and peer review systems….”
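For readers unfamiliar with Linked Data Notifications, the basic interaction is a small JSON-LD payload POSTed to a resource’s advertised inbox. The sketch below is purely illustrative: the URLs are placeholders, and ActivityStreams is just one common vocabulary choice, not something the COAR report prescribes:

```python
import json
import urllib.request

# Purely illustrative: placeholder URLs and one common vocabulary choice.
notification = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Announce",
    "actor": "https://review-service.example/",
    "object": "https://review-service.example/reviews/42",  # the peer review
    "target": "https://repository.example/records/123",     # the repository resource
}

req = urllib.request.Request(
    "https://repository.example/inbox/",  # the resource's advertised LDN inbox
    data=json.dumps(notification).encode("utf-8"),
    headers={"Content-Type": "application/ld+json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to actually deliver the notification
```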

Introduction to Democratic Openbook Humanism and LODLIBs | Zenodo

“Hypothesis 1. If we humans create and share LODLIBs for free with the world, those books will make us free individually.

 

Hypothesis 2. If we humans connect our LODLIBs with each other’s LODLIBs, that will make us free socially.

 

Hypothesis 3. If we humans demand that enslaved digital books become LODLIBs, too, then that will make the whole world and almost all of its knowledge free.

 

Put all three into practice, and the vast majority of scientific and cultural knowledge can truly become universal, which is exactly what it should be. That’s one of the foundational principles of librarianship….

 

A LODLIB is a Linked Open Data Living Informational Book….”

Library’s linked-data project gets new grant | Cornell Chronicle

“A $2.5 million grant from The Andrew W. Mellon Foundation is boosting a multi-institution initiative to develop tools and workflows that improve the sharing of catalog data among libraries and help internet users discover library resources on the web.

Known as Linked Data for Production, the project is part of a long-term collaboration among Cornell University Library, Stanford Libraries and the School of Library and Information Science at the University of Iowa.

Through linked data, information about books and other items in library records will be enhanced by related information from external online sources….”
