Opening Your Scholarship: Why should I DASH and Dataverse?

“Learn practices and platforms to achieve your open access goals!

Highlights on Harvard DASH and Dataverse.

Panelists:

– Sonia Barbosa, Manager of Data Curation, Harvard Dataverse, Manager of the Murray Research Archive

– Julie Goldman, Research Data Services Librarian

– Colin Lukens, Senior Repository Manager, Harvard Library Office for Scholarly Communication

– Katie Mika, Data Services Librarian …”

Dataverse and OpenDP: Tools for Privacy-Protective Analysis in the Cloud | Mercè Crosas

“When big data intersects with highly sensitive data, both opportunity to society and risks abound. Traditional approaches for sharing sensitive data are known to be ineffective in protecting privacy. Differential Privacy, deriving from roots in cryptography, is a strong mathematical criterion for privacy preservation that also allows for rich statistical analysis of sensitive data. Differentially private algorithms are constructed by carefully introducing “random noise” into statistical analyses so as to obscure the effect of each individual data subject. OpenDP is an open-source project for the differential privacy community to develop general-purpose, vetted, usable, and scalable tools for differential privacy, which users can simply, robustly and confidently deploy.

Dataverse is an open source web application to share, preserve, cite, explore, and analyze research data. It facilitates making data available to others, and allows you to replicate others’ work more easily. Researchers, journals, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility. A Dataverse repository is the software installation, which then hosts multiple virtual archives called Dataverses. Each dataverse contains datasets, and each dataset contains descriptive metadata and data files (including documentation and code that accompany the data).

This session examines ongoing efforts to realize a combined use case for these projects that will offer academic researchers privacy-preserving access to sensitive data. This would allow both novel secondary reuse and replication access to data that otherwise is commonly locked away in archives. The session will also explore the potential impact of this work outside the academic world.”
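
To make the "random noise" idea in the session description concrete, here is a minimal sketch of one textbook technique, the Laplace mechanism applied to a counting query. It is an illustration only: it is not OpenDP's API or the method used by Dataverse, and the names `laplace_noise`, `private_count`, and the sample `ages` list are hypothetical. A plain floating-point implementation like this one is not vetted; real deployments should rely on a reviewed library.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution centred on zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one person's record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon is the standard
    calibration for an epsilon-differentially-private count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: ages of seven study participants.
    ages = [23, 35, 41, 58, 62, 67, 70]
    rng = random.Random()  # unseeded, so each run gives a different answer
    noisy = private_count(ages, lambda age: age >= 60, epsilon=1.0, rng=rng)
    print(f"Noisy count of participants aged 60+: {noisy:.2f}")
```

A smaller `epsilon` means more noise and stronger privacy; a larger one gives answers closer to the true count. Each released answer also spends part of a privacy budget, which this sketch does not track.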

Dataverse Community Meeting 2020

“The annual Dataverse Community Meeting is an opportunity to build, grow, and enrich the global community. Like the open-source Dataverse product itself, the activities of the Dataverse Community Meetings are community-driven. Over three days of presentations, workshops, and working group meetings we aim to promote and learn about behavioral and technical solutions and standards for curating, sharing, and preserving data that can be discovered and reused across disciplines to reproduce and advance research.

The Dataverse Community Meeting is hosted by Harvard’s Institute for Quantitative Social Science. Learn more about The Dataverse Project at our dataverse.org site….”

Advancing computational reproducibility in the Dataverse data repository platform

Abstract: Recent reproducibility case studies have raised concerns showing that much of the deposited research has not been reproducible. One of their conclusions was that the way data repositories store research data and code cannot fully facilitate reproducibility due to the absence of a runtime environment needed for the code execution. New specialized reproducibility tools provide cloud-based computational environments for code encapsulation, thus enabling research portability and reproducibility. However, they do not often enable research discoverability, standardized data citation, or long-term archival like data repositories do. This paper addresses the shortcomings of data repositories and reproducibility tools and how they could be overcome to improve the current lack of computational reproducibility in published and archived research outputs.

COVID-19 Data Collection

“This is a general collection of COVID-19 data deposited in the Harvard Dataverse repository. The list in this collection is maintained by the Harvard Dataverse data curation team (IQSS and Harvard Library). Researchers who deposit their related data into Harvard Dataverse will have their data linked to this collection, to increase discoverability of their data. Please use the contact link if you have any questions about this collection.”

Dataverse: Sign Up for a One-Day Workshop | Institute for Quantitative Social Science

“The Harvard Dataverse data management and curation team is holding a one day workshop on Monday, March 16th 2020, from 9-12 pm. Come learn about the Harvard Dataverse for data sharing and preservation. You will have an opportunity to discuss your research project and data sharing needs, including:

The purpose of research data sharing
How to organize research data for sharing
Options for sharing deidentified or sensitive data
Data analysis and visualization tools provided by Harvard Dataverse
Dataset and file level DOIs and data citations
How to manage a team project on Harvard Dataverse, and much more.

We will also discuss the importance of curating data to meet FAIR data guidelines when sharing your data on Harvard Dataverse. Space is limited and laptops are required….”

The Big Data Challenge – Recommendations by Mercè Crosas – Big Data Value

“Currently, Mercè’s team is in the process of implementing datatags for datasets in the Harvard Dataverse repository. This has been a big task due to legal compliance issues, security requirements and the conditions set by various data agreements. These datasets often contain sensitive information about individuals and therefore safeguards need to be put in place to protect these individuals. Policies on data sharing play a critical role in balancing the benefits and risks. The average citizen wants privacy and safety of his data but has little time for data governance. As the amount of data driven products is only expected to increase, so is the demand of citizens for privacy management. It is important to map the data beforehand because the manner in which relevant regulation is to be attached to the data is dependent on the data itself. When regulation changes, the datatags will have to be adopted as well, for instance by providing an updated version of the tag. For these purposes, they teamed up with lawyers helping them with the verification of the datatags. More recently, Mercè has been involved with the OpenDP project as one of the co-PIs, an open-source platform for differential privacy libraries. This work would allow to mine and analyze sensitive datasets while preserving their privacy and never been accessed directly by the researchers. Dataverse, DataTags, and OpenDP will together provide a privacy-preserving platform for sharing and analyzing sensitive data….”

European Dataverse Workshop 2020

“Are you looking for a repository software to run your research data repository?

Are you already using Dataverse and want to exchange experiences and learn more about Dataverse?

>> Join us at the European Dataverse Workshop 2020!

Date: January 23-24, 2020 Venue: UiT The Arctic University of Norway

Dataverse is an open source web application to share, preserve, cite, explore, and analyze research data.

For more information about the European Dataverse Workshop 2020, see the workshop webpage.

Save the date!”