Towards a Global Dataset of Digitised Texts: Final Report of the Global Digitised Dataset Network

“This report is one output of the AHRC-funded Global Digitised Dataset Network (GDDNetwork). The network was set up in response to a research networking highlight notice for UK-US collaborations focused upon digital scholarship in cultural institutions. We set out to develop a new collaboration, led by the University of Glasgow and HathiTrust, to investigate the feasibility of developing a single global dataset documenting the extent of digitized works. The core network partners also include the British Library (BL), the National Library of Wales (NLW), the National Library of Scotland (NLS), and Research Libraries UK (RLUK). The network ran from February 2019 to January 2020, and aimed to:

– Seek to answer the question of whether it is feasible and worthwhile to create a global dataset of digitised texts for digital scholars, libraries, and readers;

– Develop a stronger understanding of the potential impact of a global dataset of digitised texts, particularly in relation to supporting digital scholarship;

– Investigate models for developing a sustainable global dataset, expanding on the network’s initial scope to develop a truly global network….”

“… we will demonstrate that no single resource currently achieves these objectives at a global, and comprehensive, scale. We hope that the report will act as a catalyst for further research and development towards realising the value of the proposed dataset, in a manner that would support collective approaches to maximising the benefits of digitised materials….”

The GDDNetwork Final Report (and, well, a global pandemic) – GDD Network

“First, I’m delighted to announce that the GDDNetwork report is now available! It describes our project findings, looking at the feasibility of developing a single global dataset documenting the extent of digitised works. It sets out two key areas of work – identification of core use cases, and metadata aggregation and data matching – and identifies a clear value proposition for the development of the dataset….

I’m also excited to say that a new experimental service called OpenTexts.World has recently been launched! OpenTexts is based upon the work of the GDDNetwork, but has built upon and expanded our publicly available dataset to create an experimental service that provides free access to digitised text collections from around the world….”

“What is the GDD Network?

Digital scholarship relies on access to digital sources but finding these sources, whether a large corpus for digital scholarship or a single text for in-depth study, is often difficult.

All around the world, libraries, archives and search providers are digitising collections to make them available. While this is making millions of texts available online, there is still no single place where you can search all of them at once.

The difficulty of discovering digitised texts, including but extending beyond the millions of items digitised by national libraries and mass digitisation programmes, often means that these efforts do not have the impact they could and should.

The Global Digitised Dataset Network (GDD Network) is a research collaboration investigating the feasibility of creating a global catalogue of digitised texts, which would enable people to search and find texts, and access them for reading, digital scholarship, collections analysis, and more….”