OpenDP Hiring Scientific Staff | OpenDP

“The OpenDP project seeks to hire 1-2 scientists to work with faculty directors Gary King and Salil Vadhan and the OpenDP Community to formulate and advance the scientific goals of OpenDP and solve research problems that are needed for its success. Candidates should have a graduate-level degree (preferably a PhD), familiarity with differential privacy, and one or both of the following: 

Experience with implementing software for data science, privacy, and/or security, and an interest in working with software engineers to develop the OpenDP codebase.
Experience with applied statistics, and an interest in working with domain scientists to apply OpenDP software to data-sharing problems in their field. In particular, we are looking for a researcher to engage in an immediate project on COVID-19, mobility, and epidemiology. See HDSI Fellow for more details….”

Wikidata, Wikibase and the library linked data ecosystem: an OCLC Research Library Partnership discussion – Hanging Together

“In late July the OCLC Research Library Partnership convened a discussion that reflected on the current state of linked data. The discussion format was (for us) experimental — we invited participants to prepare by viewing a pre-recorded presentation, Re-envisioning the fabric of the bibliographic universe – From promise to reality. The presentation covers experiences of national and research libraries as well as OCLC’s own journey in linked data exploration. OCLC Researchers Annette Dortmund and Karen Smith-Yoshimura looked at relevant milestones in the journey from entity-based description research, prototypes, and on to actual practices, based on work that has been undertaken with library partners right up to the present day….”

Symptom Data Challenge

“Can you develop a novel analytic approach that uses the CMU/UMD COVID-19 Symptom Survey data to enable earlier detection and improved situational awareness of the outbreak by public health authorities and the general public? …

Semi-finalists and finalists are eligible for cash prizes, and finalists will join discussions with partners on how to improve and deploy their submissions….”

A database of zooplankton biomass in Australian marine waters – PubMed

Abstract: Zooplankton biomass data have been collected in Australian waters since the 1930s, yet most datasets have been unavailable to the research community. We have searched archives, scanned the primary and grey literature, and contacted researchers, to collate 49187 records of marine zooplankton biomass from waters around Australia (0-60°S, 110-160°E). Many of these datasets are relatively small, but when combined, they provide >85 years of zooplankton biomass data for Australian waters from 1932 to the present. Data have been standardised and all available metadata included. We have lodged this dataset with the Australian Ocean Data Network, allowing full public access. The Australian Zooplankton Biomass Database will be valuable for global change studies, research assessing trophic linkages, and for initialising and assessing biogeochemical and ecosystem models of lower trophic levels.

What’s Wrong with Social Science and How to Fix It: Reflections After Reading 2578 Papers | Fantastic Anachronism

[Some recommendations:]

Ignore citation counts. Given that citations are unrelated to (easily-predictable) replicability, let alone any subtler quality aspects, their use as an evaluative tool should stop immediately.
Open data, enforced by the NSF/NIH. There are problems with privacy but I would be tempted to go as far as possible with this. Open data helps detect fraud. And let’s have everyone share their code, too—anything that makes replication/reproduction easier is a step in the right direction.
Financial incentives for universities and journals to police fraud. It’s not easy to structure this well because on the one hand you want to incentivize them to minimize the frauds published, but on the other hand you want to maximize the frauds being caught. Beware Goodhart’s law!
Why not do away with the journal system altogether? The NSF could run its own centralized, open website; grants would require publication there. Journals are objectively not doing their job as gatekeepers of quality or truth, so what even is a journal? A combination of taxonomy and reputation. The former is better solved by a simple tag system, and the latter is actually misleading. Peer review is unpaid work anyway; it could continue as is. Attach a replication prediction market (with the estimated probability displayed in gargantuan neon-red font right next to the paper title) and you’re golden. Without the crutch of “high-ranked journals” maybe we could move to better ways of evaluating scientific output. No more editors refusing to publish replications. You can’t shift the incentives: academics want to publish in “high-impact” journals, and journals want to selectively publish “high-impact” research. So just make it impossible. Plus, as a bonus side effect, this would finally sink Elsevier….”

Knowledge Infrastructure and the Role of the University · Commonplace

“As open access to research information grows and publisher business models adapt accordingly, knowledge infrastructure has become the new frontier for advocates of open science. This paper argues that the time has come for universities and other knowledge institutions to assume a larger role in mitigating the risks that arise from ongoing consolidation in research infrastructure, including the privatization of community platforms, commercial control of analytics solutions, and other market-driven trends in scientific and scholarly publishing….

The research community is rightfully celebrating more open access and open data, yet there is growing recognition in the academic community that pay-to-publish open access is not the panacea people were hoping for when it comes to affordable, sustainable scholarly and scientific publishing. Publication is, after all, only one step in a flow of research communication activities that starts with the collection and analysis of research data and ends with assessment of research impact. Open science is the movement towards open methods, data, and software, to enhance reproducibility, fairness, and distributed collaboration in science. The construct covers such diverse elements as the use of open source software, the sharing of data sets, open and transparent peer review processes, open repositories for the long-term storage and availability of both data and articles, as well as the availability of open protocols and methodologies that ensure the reproducibility and overall quality of research. How these trends can be reconciled with the economic interests of the publishing industry as it is currently organized remains to be seen, but the time is ripe for greater multi-stakeholder coordination and institutional investment in building and maintaining a diversified open infrastructure pipeline.”

Viral Science: Masks, Speed Bumps, and Guard Rails: Patterns

“With the world fixated on COVID-19, the WHO has warned that the pandemic response has also been accompanied by an infodemic: an overabundance of information, ranging from demonstrably false to accurate. Alas, the infodemic phenomenon has extended to articles in scientific journals, including prestigious medical outlets such as The Lancet and NEJM. The rapid reviews and publication speed for COVID-19 papers have surprised many, including practicing physicians, for whom the guidance is intended….

The Allen Institute for AI (AI2) and Semantic Scholar launched the COVID-19 Open Research Dataset (CORD-19), a growing corpus of papers (currently 130,000 abstracts plus full-text papers being used by multiple research groups) that are related to past and present coronaviruses.

Using this data, AI2, working with the University of Washington, released a tool called SciSight, an AI-powered graph visualization tool enabling quick and intuitive exploration of associations between biomedical entities such as proteins, genes, cells, drugs, diseases, and patient characteristics as well as between different research groups working in the field. It helps foster collaborations and discovery as well as reduce redundancy….
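
SciSight’s own pipeline uses trained biomedical entity extractors over the full CORD-19 corpus, which is not reproduced here. Purely to illustrate the idea of an association graph, the sketch below counts how often pairs of terms from a small, hypothetical vocabulary co-occur in the same abstract; the vocabulary, sample abstracts, and matching logic are assumptions made for this example, not part of SciSight or CORD-19.

```python
# Illustrative sketch only: build a term co-occurrence "association graph" from
# abstracts, in the spirit of what SciSight visualizes. The vocabulary and the
# sample abstracts below are invented for the example.
from collections import Counter
from itertools import combinations

# Hypothetical entity vocabulary; real systems use NER models, not a fixed list.
ENTITIES = {"ace2", "spike protein", "remdesivir", "il-6", "lung epithelial cells"}

abstracts = [
    "The spike protein binds ACE2 on lung epithelial cells.",
    "Remdesivir was evaluated alongside IL-6 inhibitors.",
    "ACE2 expression in lung epithelial cells modulates the IL-6 response.",
]

def extract_entities(text):
    """Return the vocabulary terms mentioned in the text (naive substring match)."""
    lower = text.lower()
    return {term for term in ENTITIES if term in lower}

# Count how often each pair of entities appears together in the same abstract.
edge_weights = Counter()
for abstract in abstracts:
    for a, b in combinations(sorted(extract_entities(abstract)), 2):
        edge_weights[(a, b)] += 1

# The weighted edge list is the kind of association graph such a tool renders.
for (a, b), weight in edge_weights.most_common():
    print(f"{a} -- {b}: {weight}")
```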

The research community and scientific publishers working together need to develop and make accessible open-source software tools to permit the dual-track submission discussed above. Repositories such as GitHub are a start….”

OPTIMISING THE OPERATION AND USE OF NATIONAL RESEARCH INFRASTRUCTURES

Abstract: Research Infrastructures (RIs) play a key role in enabling and developing research in all scientific domains and represent an increasingly large share of research investment. Most RIs are funded, managed and operated at a national or federal level, and provide services mostly to national research communities. This policy report presents a generic framework for improving the use and operation of national RIs. It includes two guiding models, one for portfolio management and one for user-base optimisation. These guiding models lay out the key principles of an effective national RI portfolio management system and identify the factors that should be considered by RI managers with regards to optimising the user-base of national RIs. Both guiding models take into consideration the diversity of national systems and RI operation approaches.

This report also contains a series of more generic policy recommendations and suggested actions for RI portfolio managers and RI managers.

[From the body of the report:]

As described in Section 8.1.2, data-driven RIs often do not have complex access mechanisms in place, as they mostly provide open access. Such access often means reducing the number of steps needed by a user to gain access to data. This can have knock-on implications for the ability of RIs to accurately monitor user access: for instance, the removal of login portals that were previously used to provide data access statistics….

Requiring users to submit Data Management Plans (DMPs) prior to the provision of access to an RI may encourage users to consider compliance with FAIR (Findable, Accessible, Interoperable, Reusable) data principles whilst planning their project (Wilkinson et al., 2016[12]). The alignment of requirements for Data Management Plans (Science Europe, 2018[13]) used for RI access provision and those used more generally in academic research should be considered to facilitate their adoption by researchers….

The two opposing extremes, described above, of either FAIR / open access or very limited data access provision, highlight the diversity in approaches of national RIs towards data access, and the lack of clear policy guidance…..

It is important that RIs have open and transparent data policies in line with the FAIR principles to broaden their user base. Collaborating with other RIs to federate repositories and harmonize metadata may be an important step in standardising open and transparent data policies across the RI community. …

There are a wide variety of pricing policies, both between and also within individual RIs, and the need for some flexibility is recognised. RIs should ensure that their pricing policies for all access modes are clear and cost-transparent, and that merit-based academic usage is provided openly and ‘free-from-costs’, wherever possible. …

ROIS-DS Center for Open Data in the Humanities (CODH)

“The Center for Open Data in the Humanities (CODH), part of the Joint Support-Center for Data Science Research, Research Organization of Information and Systems, has the following missions toward the promotion of data-driven research and the formation of a collaborative center for humanities research.

1. We establish a new discipline of data science-driven humanities, or digital humanities, and build a center of excellence across organizations through the promotion of openness.
2. We develop “deep access” to the content of humanities data using state-of-the-art technologies from informatics and statistics.
3. We aggregate, process and deliver humanities knowledge from Japan to the world through collaboration across organizations and countries.
4. We promote citizen science and open innovation based on open data and applications….”

Wellcome and Ripeta partner to assess dataset availability in funded research – Digital Science

“Ripeta and Wellcome are pleased to announce a collaborative effort to assess data and code availability in the manuscripts of funded research projects.

The project will analyze papers funded by Wellcome from the year before it established a dedicated Open Research team (2016) and from the most recent calendar year (2019). It supports Wellcome’s commitment to maximising the availability and re-use of results from its funded research.

Ripeta, a Digital Science portfolio company, aims to make better science easier by identifying and highlighting the important parts of research that should be transparently presented in a manuscript and other materials.

The collaboration will leverage Ripeta’s natural language processing (NLP) technology, which scans articles for reproducibility criteria. For both data availability and code availability, the NLP will produce a binary yes-no response for the presence of availability statements. Those with a “yes” response will then be categorized by the way that data or code are shared….”
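
Ripeta’s models are proprietary NLP systems, but the yes/no check described above can be illustrated with a far simpler heuristic. The sketch below is not Ripeta’s method: the regular expressions, category labels, and example text are assumptions made for illustration. It flags whether a manuscript appears to contain a data or code availability statement and, when one is found, makes a rough guess at how the material is shared.

```python
# Illustrative heuristic only -- not Ripeta's NLP. It mimics the two-step output
# described above: (1) a yes/no flag for data and code availability statements,
# and (2) a rough category for how the material is shared when a statement exists.
import re

AVAILABILITY_PATTERNS = {
    "data": re.compile(r"data (availability|are available|is available)", re.I),
    "code": re.compile(r"(code|scripts?|software) (availability|is available|are available)", re.I),
}

SHARING_CATEGORIES = {
    "repository": re.compile(r"zenodo|figshare|dryad|osf|github", re.I),
    "on request": re.compile(r"upon (reasonable )?request", re.I),
    "supplementary material": re.compile(r"supplementary (material|information|files)", re.I),
}

def assess(manuscript_text):
    """Return a presence flag and a sharing category for data and for code."""
    report = {}
    for kind, pattern in AVAILABILITY_PATTERNS.items():
        present = bool(pattern.search(manuscript_text))
        category = "unclassified"
        if present:
            for label, cat_pattern in SHARING_CATEGORIES.items():
                if cat_pattern.search(manuscript_text):
                    category = label
                    break
        report[kind] = {"statement_present": present, "sharing": category}
    return report

example = (
    "Data availability: all sequencing data are available from Zenodo. "
    "Analysis code is available at https://github.com/example/project."
)
print(assess(example))
```

A production pipeline would classify at the sentence level and resolve which repository a statement points to, but the binary presence check followed by categorization mirrors the workflow described in the announcement.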