Gardner, Peters Introduce Bill to Keep Government Research Data Publicly Available

“U.S. Senators Cory Gardner (R-CO) and Gary Peters (D-MI) today introduced bipartisan legislation to help federal agencies maintain open access to machine-readable databases and datasets created by taxpayer-funded research. The Preserving Data in Government Act would require federal agencies to preserve public access to existing open datasets, and prevent the removal of existing datasets without sufficient public notice. Small businesses rely on a range of publicly available machine-readable datasets to launch or grow their companies, and researchers and scientists use data to conduct studies for a variety of fields and industries….”

BEAT: An Open-Source Web-Based Open-Science Platform

“With the increased interest in computational sciences, machine learning (ML), pattern recognition (PR) and big data, governmental agencies, academia and manufacturers are overwhelmed by the constant influx of new algorithms and techniques promising improved performance, generalization and robustness. Sadly, result reproducibility is often an overlooked feature accompanying original research publications, competitions and benchmark evaluations. The main reasons behind such a gap arise from natural complications in research and development in this area: the distribution of data may be a sensitive issue; software frameworks are difficult to install and maintain; test protocols may involve a potentially large set of intricate steps which are difficult to handle. Given the rising complexity of research challenges and the constant increase in data volume, the conditions for achieving reproducible research in the domain are also increasingly difficult to meet. To bridge this gap, we built an open platform for research in computational sciences related to pattern recognition and machine learning, to help on the development, reproducibility and certification of results obtained in the field. By making use of such a system, academic, governmental or industrial organizations enable users to easily and socially develop processing toolchains, re-use data, algorithms, workflows and compare results from distinct algorithms and/or parameterizations with minimal effort. This article presents such a platform and discusses some of its key features, uses and limitations. We overview a currently operational prototype and provide design insights.”

We need a GitHub for academic research.

“[T]he academic paper has some inherent limitations—chief among them that it can provide only a summary of a given research project. Even an outstanding paper cannot provide direct access to all of the research data collected or to the record of discussions among scientists that is reflected in lab notes. These windows into the messy and halting process of science, which can be extremely valuable learning objects, are not yet part of the official record of a research study.

But it doesn’t have to be this way. If we take advantage of the unique capabilities of the web to tell the full story of a research project—rather than merely using it as a faster printing press as we do today—we can build greater transparency into our approach to reporting science. Besides improving information-sharing among scientists, a push toward transparency could improve public trust in science and scientists. Now, when the very concepts of fact and truth are under assault and many scientists feel compelled to march in response, is the perfect time to rethink our approach to scientific communication altogether….”

Release ‘open’ data from their PDF prisons using tabulizer | R-bloggers

“As a political scientist who regularly encounters so-called “open data” in PDFs, this problem is particularly irritating. PDFs may have “portable” in their name, making them display consistently on various platforms, but that portability means any information contained in a PDF is irritatingly difficult to extract computationally.”

Steve Ballmer Serves Up a Fascinating Data Trove – The New York Times

“On Tuesday, Mr. Ballmer plans to make public a database and a report that he and a small army of economists, professors and other professionals have been assembling as part of a stealth start-up over the last three years called USAFacts. The database is perhaps the first nonpartisan effort to create a fully integrated look at revenue and spending across federal, state and local governments….In an age of fake news and questions about how politicians and others manipulate data to fit their biases, Mr. Ballmer’s project may serve as a powerful antidote. Using his website, USAFacts.org, a person could look up just about anything: How much revenue do airports take in and spend? What percentage of overall tax revenue is paid by corporations? At the very least, it could settle a lot of bets made during public policy debates at the dinner table….With an unlimited budget, he went about hiring a team of researchers in Seattle and made a grant to the University of Pennsylvania to help his staff put the information together. Altogether, he has spent more than $10 million between direct funding and grants….Mr. Ballmer is hoping that the website is just the beginning. He hopes to open it up so that individuals and companies can build on top of it and pull out customized reports….”

Sharing Data and Materials in Psychological Science – Apr 17, 2017

“Psychological Science is now introducing some minor changes designed to increase the frequency and ease with which editors and reviewers of submissions can access data and materials as part of the peer-review process. I anticipate that, in addition to enhancing the review process, these changes will further increase the percentage of Psychological Science articles for which researchers can quickly and easily access data and materials postpublication. The changes we are introducing are tweaks and nudges, not radical shifts. In the following, I explain the changes and why they are worth undertaking.”

Guidelines on the Implementation of Open Access to Scientific Publications and Research Data in Projects Supported by the European Research Council under Horizon 2020

“According to the ERC Scientific Council’s Open Access Guidelines: ‘The mission of the European Research Council (ERC) is to support excellent research in all fields of science and scholarship. The main outputs of this research are new knowledge, ideas and understanding, which the ERC expects its researchers to publish in peer-reviewed articles and monographs. The ERC considers that providing free online access to these materials is the most effective way of ensuring that the fruits of the research it funds can be accessed, read, and used as the basis for further research. […] The ERC therefore supports the principle of open access to the published output of research as a fundamental part of its mission.’”

Public Attitudes toward Consent and Data Sharing in Biobank Research: A Large Multi-site Experimental Survey in the US

Abstract:  Individuals participating in biobanks and other large research projects are increasingly asked to provide broad consent for open-ended research use and widespread sharing of their biosamples and data. We assessed willingness to participate in a biobank using different consent and data sharing models, hypothesizing that willingness would be higher under more restrictive scenarios. Perceived benefits, concerns, and information needs were also assessed. In this experimental survey, individuals from 11 US healthcare systems in the Electronic Medical Records and Genomics (eMERGE) Network were randomly allocated to one of three hypothetical scenarios: tiered consent and controlled data sharing; broad consent and controlled data sharing; or broad consent and open data sharing. Of 82,328 eligible individuals, exactly 13,000 (15.8%) completed the survey. Overall, 66% (95% CI: 63%–69%) of population-weighted respondents stated they would be willing to participate in a biobank; willingness and attitudes did not differ between respondents in the three scenarios. Willingness to participate was associated with self-identified white race, higher educational attainment, lower religiosity, perceiving more research benefits, fewer concerns, and fewer information needs. Most (86%, CI: 84%–87%) participants would want to know what would happen if a researcher misused their health information; fewer (51%, CI: 47%–55%) would worry about their privacy. The concern that the use of broad consent and open data sharing could adversely affect participant recruitment is not supported by these findings. Addressing potential participants’ concerns and information needs and building trust and relationships with communities may increase acceptance of broad consent and wide data sharing in biobank research.

Development of a Suicidal Ideation Detection Tool for Primary Healthcare Settings: Using Open Access Online Psychosocial Data

Abstract:  Background: Suicidal patients often visit healthcare professionals in their last month before suicide, but medical practitioners are unlikely to raise the issue of suicide with patients because of time constraints and uncertainty regarding an appropriate approach. Introduction: A brief tool called the e-PASS Suicidal Ideation Detector (eSID) was developed for medical practitioners to help detect the presence of suicidal ideation (SI) in their clients. If SI is detected, the system alerts medical practitioners to address this issue with a client. The eSID tool was developed due to the absence of an easy-to-use, evidence-based SI detection tool for general practice. Material and Methods: The tool was developed using binary logistic regression analyses of data provided by clients accessing an online psychological assessment function. Ten primary healthcare professionals provided advice regarding the use of the tool. Results: The analysis identified eleven factors in addition to the Kessler-6 for inclusion in the model used to predict the probability of recent SI. The model performed well across gender and age groups 18–64 (AUR 0.834, 95% CI 0.828–0.841, N = 16,703). Healthcare professionals were interviewed; they recommended that the tool be incorporated into existing medical software systems and that additional resources be supplied, tailored to the level of risk identified. Conclusion: The eSID is expected to trigger risk assessments by healthcare professionals when this is necessary. Initial reactions of healthcare professionals to the tool were favorable, but further testing and in situ development are required.