Abstract: This paper introduces the Archives Unleashed Cloud, a web-based interface for working with web archives at scale. Current access paradigms, largely driven by the scope and scale of web archives, generally involve using the command line and writing code. This access gap means that subject-matter experts, as opposed to developers and programmers, have few options to directly work with web archives beyond the page-by-page paradigm of the Wayback Machine. Drawing on first-hand research and analysis of how scholars use web archives, we present the interface design and underpinning architecture of the Archives Unleashed Cloud. We also discuss the sustainability implications of providing a cloud-based service for researchers to analyze their collections at scale.
“Big bibliographic datasets hold promise for revolutionizing the scientific enterprise when combined with state-of-the-science computational capabilities. Yet, hosting proprietary and open big bibliographic datasets poses significant difficulties for libraries, both large and small. Libraries face significant barriers to hosting such assets, including cost and expertise, which has limited their ability to provide stewardship for big datasets, and thus has hampered researchers’ access to them. What is needed is a solution to address the libraries’ and researchers’ joint needs. This article outlines the theoretical framework that underpins the Collaborative Archive and Data Research Environment project. We recommend a shared cloud-based infrastructure to address this need built on five pillars: 1) Community: a community of libraries and industry partners who support and maintain the platform and a community of researchers who use it; 2) Access: the sharing platform should be accessible and affordable to both proprietary data customers and the general public; 3) Data-Centric: the platform is optimized for efficient and high-quality bibliographic data services, satisfying diverse data needs; 4) Reproducibility: the platform should be designed to foster and encourage reproducible research; 5) Empowerment: the platform should empower researchers to perform big data analytics on the hosted datasets. In this article, we describe the many facets of the problem faced by American academic libraries and researchers wanting to work with big datasets. We propose a practical solution based on the five pillars: The Collaborative Archive and Data Research Environment. Finally, we address potential barriers to implementing this solution and strategies for overcoming them.”
“EOSC-Life has launched its first Digital Life Sciences Open Call: A European Open Science Cloud (EOSC-Life) call for projects sharing data, tools and workflows in the cloud. This call offers financial support for projects, to enable life science researchers to connect their research to the cloud, alongside training, advice and assistance from data experts, tool developers and cloud specialists. Proposals should align with the goals of EOSC (European Open Science Cloud) – i.e. enabling data sharing for the purpose of furthering scientific research. The project’s overarching aim is to make life science research data publicly available in a FAIR (Findable, Accessible, Interoperable, Reusable) way in the EOSC….”
“The Library of Congress is the largest library in the world, with millions of books, recordings, photographs, newspapers, maps and manuscripts in its collections. One of the missions of Library of Congress’ Labs (Labs) at the Library of Congress (Library) is to enable transformational experiences between the Library’s digital collections and the American people.
LC Labs (Labs), a division in the Digital Strategy Directorate in the Office of the Chief Information Officer of the Library of Congress, was awarded an Andrew W. Mellon Foundation grant titled “Computing Cultural Heritage in the Cloud” to test a cloud-based approach for interacting with digital collections as data, supporting those researchers who are creatively applying emerging styles of research to Library material. In collaboration with subject matter experts and IT specialists at the Library, the Library is seeking to award contracts to up to four research experts (Research Experts) to experiment with solutions to problems that can only be explored at scale. See attached BAA for details about this opportunity….”
“When big data intersects with highly sensitive data, both opportunity to society and risks abound. Traditional approaches for sharing sensitive data are known to be ineffective in protecting privacy. Differential Privacy, deriving from roots in cryptography, is a strong mathematical criterion for privacy preservation that also allows for rich statistical analysis of sensitive data. Differentially private algorithms are constructed by carefully introducing “random noise” into statistical analyses so as to obscure the effect of each individual data subject. OpenDP is an open-source project for the differential privacy community to develop general-purpose, vetted, usable, and scalable tools for differential privacy, which users can simply, robustly and confidently deploy.
Dataverse is an open source web application to share, preserve, cite, explore, and analyze research data. It facilitates making data available to others, and allows you to replicate others’ work more easily. Researchers, journals, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility. A Dataverse repository is the software installation, which then hosts multiple virtual archives called Dataverses. Each dataverse contains datasets, and each dataset contains descriptive metadata and data files (including documentation and code that accompany the data).
This session examines ongoing efforts to realize a combined use case for these projects that will offer academic researchers privacy-preserving access to sensitive data. This would allow both novel secondary reuse and replication access to data that otherwise is commonly locked away in archives. The session will also explore the potential impact of this work outside the academic world.”
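The noise-adding idea described in the excerpt above can be illustrated with the Laplace mechanism applied to a simple counting query. This is a minimal sketch and not OpenDP's API: the function names `laplace_noise` and `private_count` are hypothetical, the sensitivity of 1 assumes that adding or removing one person changes the count by at most one, and the floating-point sampling used here is not hardened for production use.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent Exp(1) draws follows Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy predicate.

    Hypothetical helper for illustration only. A smaller epsilon means
    more noise and therefore stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Assumption: one individual changes a count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 52, 67]
    # Noisy answer to "how many people are 40 or older?"
    print(private_count(ages, lambda age: age >= 40, epsilon=0.5))
```

Each call returns a different value, so no single answer reveals whether a particular person is in the data, while the average over many hypothetical releases stays close to the true count.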
“This document sets the draft general framework for future strategic research, development and innovation (RDI) activities to be further defined in the context of the candidate EOSC European Partnership proposed under the Horizon Europe Programme.
• It uses elements of the candidate Partnership document proposed by the EOSC governing bodies as well as further work by the Executive Board, in order to develop, by October 2020, a first full version of the Strategic Research and Innovation Agenda (SRIA) for EOSC.
• With the consultation launched on 20 July, the EOSC governing bodies are seeking the views and contributions of different stakeholders on the content of this document through the accompanying questionnaire. The consultation will remain open until 31 August 2020.
• The feedback obtained in the consultation process will serve as input for the SRIA. The draft SRIA will be presented at the EOSC Governance Board meeting on 1 October 2020….”
“Welcome to the EOSC Strategic Research and Innovation Agenda (SRIA) Open Consultation page.
The European Open Science Cloud (EOSC) is the envisioned federation of research (data) infrastructures that will enable the Web of FAIR Data and Services, help researchers to perform Open Science, and open up and exploit their data, publications and code.
The Strategic Research and Innovation Agenda (SRIA) provides general guidelines to help develop the work programmes for EOSC in Horizon Europe. The SRIA is open for public consultation until 31 August 2020, involving stakeholders from inside and outside the EOSC Community. Research infrastructures, universities, researchers, industry, national and international initiatives, policymakers, citizen scientists are all invited to take part in this collective effort across countries and disciplines.
The consultation takes the form of an online questionnaire (below) where respondents can give their views on topics such as EOSC’s guiding principles, action areas and priorities. This includes information relating to rewarding Open Science practices and skills; standards, tools and services to find, access and reuse results; and shared and federated infrastructures to enable open sharing of scientific results.
Before completing the consultation questionnaire below, respondents are advised to first read the SRIA consultation document which sets out a general framework and provides key information. The questionnaire refers directly to the consultation document.
Complete the online questionnaire below and help shape the content of the Agenda….”
“The European Open Science Cloud (EOSC) recently launched an open consultation on its Strategic Research & Innovation Agenda (SRIA) which will provide general guidelines to help develop the work programmes for EOSC in Horizon Europe. Stakeholders from inside and outside the EOSC Community, Research infrastructures, universities, researchers, industry, national and international initiatives, policymakers, citizen scientists are all invited to take part in this collective effort across countries and disciplines.
The consultation takes the form of an online questionnaire (below) where respondents can give their views on topics such as EOSC’s guiding principles, action areas and priorities. This includes information relating to rewarding Open Science practices and skills; standards, tools and services to find, access and reuse results; and shared and federated infrastructures to enable open sharing of scientific results….”
“National Initiatives for Open Science in Europe – NI4OS Europe, aims to be a core contributor to the European Open Science Cloud (EOSC) service portfolio, commit to EOSC governance and ensure inclusiveness on the European level for enabling global Open Science.
Lines of action
Support the development and inclusion of the national Open Science Cloud initiatives in 15 Member States and Associated Countries in the EOSC governance.
Instill within the community the EOSC philosophy and FAIR principles for data Findability, Accessibility, Interoperability and Reusability.
Provide technical and policy support for on-boarding of service providers into EOSC, including generic services (compute, data storage, data management), thematic services, repositories and data sets….”
The Photon and Neutron Open Science Cloud (PaNOSC) is a European project for making FAIR data a reality in 6 European Research Infrastructures (RIs), developing and providing services for scientific data and connecting these to the European Open Science Cloud (EOSC).