It’s No Secret – Millions of Books Are Openly in the Public… | HathiTrust Digital Library

“Since 2008 the HathiTrust Copyright Review Program has been researching hundreds of thousands of books to find ones that are in the public domain and can be opened for view in the HathiTrust Digital Library. Over the past 11 years, 168 people across North America have worked together for a common goal: the ability to share public domain works from our libraries. As of September 2019, the HathiTrust Copyright Review Program has performed copyright reviews on 506,989 US publications; of those, 302,915 (59.7%) have been determined to be in the public domain in the United States. The opening of these works in HathiTrust has brought the total of openly available volumes to 6,540,522.

The Copyright Review Program, now an operational program of HathiTrust, began as a grant-funded ambition of the University of Michigan Library, under the leadership of Melissa Levine. The Institute of Museum and Library Services (IMLS) funded three consecutive grants enabling the University of Michigan Library and grant collaborators to build a copyright review management system. The program is still going strong eleven years later, resulting in hundreds of publications determined to be in the public domain each week.

One way the Copyright Review Program determines the copyright status of items in the HathiTrust corpus is to determine whether they were properly renewed. In the United States, the copyright in works published between 1924 and 1964 had to be renewed about 28 years after the item was published; works could move into the public domain when their initial term of protection expired. The Stanford Copyright Renewal Database was one of the first to host monograph renewal records in an open access database, but much of the initial copyright registration information remains difficult to search. …”

Data-mining reveals that 80% of books published 1924-63 never had their copyrights renewed and are now in the public domain / Boing Boing

“But there’s another source of public domain works: until the 1976 Copyright Act, US works were not copyrighted unless they were registered, and then they quickly became public domain unless that registration was renewed. The problem has been to figure out which of these works were in the public domain, because the US Copyright Office’s records were not organized in a way that made it possible to easily cross-check a work with its registration and renewal.

For many years, the Internet Archive has hosted an archive of registration records, which were partially machine-readable.

Enter the New York Public Library, which employed a group of people to encode all these records in XML, making them amenable to automated data-mining.

Now, Leonard Richardson (previously) has done the magic data-mining work to affirmatively determine which of the 1924-63 books are in the public domain, which turns out to be 80% of those books; what’s more, many of these books have already been scanned by the Hathi Trust (which uses a limitation in copyright to scan university library holdings for use by educational institutions, regardless of copyright status)….”

New collaborative effort to develop a national digital ebooks platform for libraries announced | DPLA

“The Digital Public Library of America (DPLA), The New York Public Library (NYPL), and LYRASIS are pleased to announce a new collaboration to help provide all public libraries with a free, open, library-controlled platform for managing their ebook and audiobook services….

The DPLA Exchange (https://exchange.dp.la), launched in 2017 and now providing access to over 300,000 titles including thousands of openly-licensed works, offers a new model for a library-centered marketplace for ebooks and audiobooks. …”