Update from Gary King and Nate Persily | SOCIAL SCIENCE ONE

“When we created Social Science One to facilitate access for the world’s social scientific community to social media data, we promised to release periodic updates noting our progress and describing the challenges we confront….

Of course, we recognized that working with Facebook would invite heavy scrutiny, given the maelstrom of controversy on many fronts that has engulfed the company since the 2016 election, not the least of it for the Cambridge Analytica scandal, which was an academic scandal as well.  We hoped, however, that rigorous and careful scientific analysis of Facebook data, without funding from or pre-publication approval by Facebook, would provide valuable independent assessment of the conventional wisdom as to the platform’s varied effects on elections and democracy around the world.  We also hoped that we could prove the model we had developed for industry-academic partnerships and show how company data could be made accessible in a legal, trusted, privacy-preserving, and secure fashion that benefits everyone. The potential benefits for the social sciences, and for society at large, are so large that getting this right is critical….

We are close to being able to announce the first set of research teams approved for financial awards and data access….

[W]e plan to release access to data for approved researchers in two stages instead of all at once….

We continue to believe in the critical importance of opening up access for researchers to the most important information private companies possess on the nature of modern society and social interaction.  …”

Congress votes to make open government data the default in the United States | E Pluribus Unum

“On December 21, 2018, the United States House of Representatives voted to enact H.R. 4174, the Foundations for Evidence-Based Policymaking Act of 2017, in a historic win for open government in the United States of America.

The Open, Public, Electronic, and Necessary Government Data Act (AKA the OPEN Government Data Act) is about to become law as a result. This codifies two canonical principles for democracy in the 21st century:

  1. public information should be open by default to the public in a machine-readable format, where such publication doesn’t harm privacy or security
  2. federal agencies should use evidence when they make public policy….”

Open Humans: A platform for participant-centered research and personal data exploration | bioRxiv

“Background: Many aspects of our lives are now digitized and connected to the internet. As a result, individuals are now creating and collecting more personal data than ever before. This offers an unprecedented chance for fields of human subject research ranging from the social sciences to precision medicine. With this potential wealth of data come practical problems – such as how to merge data streams from various sources – as well as ethical problems – how can people responsibly share their personal information? Results: To address these problems we present Open Humans, a community-based platform that enables personal data collections across data streams, enables individuals to take control of their personal data, and enables academic research as well as patient-led projects. We showcase data streams that Open Humans combines – such as personal genetic data, wearable activity monitors, GPS location records and continuous glucose monitor data – along with use cases of how that data is used by various participants. Conclusions: Open Humans highlights how a community-centric ecosystem can be used to aggregate personal data from various sources as well as how these data can be ethically used by academic and citizen scientists.”

OPEN DATA, GREY DATA, AND STEWARDSHIP: UNIVERSITIES AT THE PRIVACY FRONTIER

Abstract:  As universities recognize the inherent value in the data they collect and hold, they encounter unforeseen challenges in stewarding those data in ways that balance accountability, transparency, and protection of privacy, academic freedom, and intellectual property. Two parallel developments in academic data collection are converging: (1) open access requirements, whereby researchers must provide access to their data as a condition of obtaining grant funding or publishing results in journals; and (2) the vast accumulation of “grey data” about individuals in their daily activities of research, teaching, learning, services, and administration. The boundaries between research and grey data are blurring, making it more difficult to assess the risks and responsibilities associated with any data collection. Many sets of data, both research and grey, fall outside privacy regulations such as HIPAA, FERPA, and PII. Universities are exploiting these data for research, learning analytics, faculty evaluation, strategic decisions, and other sensitive matters. Commercial entities are besieging universities with requests for access to data or for partnerships to mine them. The privacy frontier facing research universities spans open access practices, uses and misuses of data, public records requests, cyber risk, and curating data for privacy protection. This Article explores the competing values inherent in data stewardship and makes recommendations for practice by drawing on the pioneering work of the University of California in privacy and information security, data governance, and cyber risk.

EPA quietly puts controversial ‘secret science’ plan on hold — for now – ThinkProgress

“The Environmental Protection Agency (EPA) appears to have put a deeply controversial plan limiting the use of scientific data in policymaking on hold for the time being. The move follows significant outcry from experts and the agency’s own staff….

On its face, that push for transparency might resonate with some — but experts have repeatedly emphasized that confidential data is private for a reason. Making it public could violate patient privacy or industry confidentiality, in many instances breaking the law and potentially allowing for distortions of the information. Limiting the data government officials can use, meanwhile, could hinder efforts to protect both human health and the environment….”

Patterns of information – clustering books and readers in open access libraries

Abstract:  Open access libraries operate in a continuum between two distinct organisation models: online retailers versus ‘traditional’ libraries. Online retailers such as Amazon.com are successful in recommending additional items that match the specific needs of their customers. The success rate of the recommendation depends on knowledge of the individual customer: more knowledge about persons leads to better suggestions. Thus, to optimally profit from the retailers’ offerings, the client must be prepared to share personal information, leading to the question of privacy.

In contrast, protection of privacy is a core value for libraries. The question is how open access libraries can offer comparable services while retaining the readers’ privacy. A possible solution can be found in analysing the preferences of groups of like-minded people: communities. According to Lynch (2002), digital libraries are bad at identifying or predicting the communities that will use their collections. It is however our intention to explore the possibility to uncover sets of documents with a meaningful connection for groups of readers – the communities. The solution depends on examining patterns of usage, instead of storing information about individual readers.

This paper will investigate the possibility to uncover the preferences of user groups within an open access digital library using social networking analysis techniques.

RA21: Resource Access in the 21st Century

[Less about OA than convenient access to non-OA sources.]

“Resource Access for the 21st Century (RA21) is a joint STM – NISO initiative aimed at optimizing protocols across key stakeholder groups, with a goal of facilitating a seamless user experience for consumers of scientific communication. In addition, this comprehensive initiative is working to solve long-standing, complex, and broadly distributed challenges in the areas of network security and user privacy. Community conversations and consensus building to engage all stakeholders is currently underway in order to explore potential alternatives to IP-authentication, and to build momentum toward testing alternatives among researcher, customer, vendor, and publisher partners.”

RA21: Resource Access for the 21st Century – Improving Access to Scholarly Resources, from Anywhere, on Any Device

[Less about OA than convenient access to non-OA sources.]

“Publishers, libraries, and consumers have all come to the understanding that authorizing access to content based on IP address no longer works in today’s distributed world. The RA21 project hopes to resolve some of the fundamental issues that create barriers to moving to federated identity in place of IP address authentication by looking at some of the products and services available in the identity discovery space today, and determining best practice for future implementations going forward.”

Think you have remarkable memory traits? Share them by participating in the Harvard PGP-Lumosity Memory Challenge – Citizen Science Salon

“In 2005, [George Church] launched the Personal Genome Project (PGP), which collects data on a person’s DNA, environmental background, and relevant health and disease information from consenting participants. The premise of the PGP is grounded in open science, meaning that all this data is publicly available to researchers, who then study the relationship between specific DNA sequences and various displayed traits, like having an especially good memory.

This openness is the hallmark of the PGP, described on their website as “a vision and coalition of projects across the world dedicated to creating public genome, health, and trait data.” The PGP seeks to share data for the “greater good” in ways that have been previously “hampered by traditional research practices.” In other words, by being set up as an open-access project that allows individuals to freely share their data with researchers, no single researcher can “control” access to the data. By inviting participants to openly share their own personal data, this project allows individuals to directly impact scientific progress….”