# The beginning of the Authors Alliance – Creative Commons

Yesterday marked the launch of the Authors Alliance, a nonprofit organization that supports authors who want “to harness the potential of digital networks to share their creations more broadly in order to serve the public good.”

In an interview with Publishers Weekly, Authors Alliance founder Pamela Samuelson explained that the Authors Alliance will have a few different roles. Inwardly, the group will “provide authors with information about copyrights, licensing agreements, alternative contract terms,” and other practical legal information so that they can make their works widely and openly available. And externally, the Alliance will “represent the interests of authors who want to make their works more widely available in public policy debates,” and advocate for these reforms alongside like-minded public interest organizations.

The Authors Alliance was developed by Samuelson and several of her colleagues at the University of California, Berkeley, including Molly Van Houweling, Carla Hesse, and Thomas Leonard. The Alliance also has an advisory board made up of pre-eminent scholars, writers, and public interest advocates, including several members of the Creative Commons board of directors. The Authors Alliance is now accepting new members.

The Alliance has already developed a set of copyright reform principles, outlining its vision for changes to copyright law to support authors who write to be read.

We have formed an Authors Alliance to represent authors who create to be read, to be seen, and to be heard. We believe that these authors have not been well served by misguided efforts to strengthen copyright. These efforts have failed to provide meaningful financial returns to most authors, while instead unacceptably compromising the preservation of our own intellectual legacies and our ability to tap our collective cultural heritage. We want to harness the potential of global digital networks to share knowledge and products of the imagination as broadly as possible. We aim to amplify the voices of authors and creators in all media who write and create not only for pay, but above all to make their discoveries, ideas, and creations accessible to the broadest possible audience.

The principles include:

1. Further empower authors to disseminate their works.
3. Affirm the vitality of limits on copyright that enable us to do our work and reach our audiences.
4. Ensure that copyright’s remedies and enforcement mechanisms protect our interests.

At the core, the Authors Alliance and Creative Commons share a similar goal: to provide useful resources and tools for creators who aren’t being served well by the existing copyright system. We’re excited to work with the Alliance on issues that support authors who write to be read–and the public interest for whom these authors create.

# PLOS’ New Data Policy: Public Access to Data

Access to research results, immediately and without restriction, has always been at the heart of PLOS’ mission and the wider Open Access movement. However, without similar access to the data underlying the findings, the article can be of limited use. For this reason, PLOS has always required that authors make their data available to other academic researchers who wish to replicate, reanalyze, or build upon the findings published in our journals.

In an effort to increase access to this data, we are now revising our data-sharing policy for all PLOS journals: authors must make all data publicly available, without restriction, immediately upon publication of the article. Beginning March 3rd, 2014, all authors who submit to a PLOS journal will be asked to provide a Data Availability Statement, describing where and how others can access each dataset that underlies the findings. This Data Availability Statement will be published on the first page of each article.

What do we mean by data?

“Data are any and all of the digital materials that are collected and analyzed in the pursuit of scientific advances.” Examples could include spreadsheets of original measurements (of cells, of fluorescent intensity, of respiratory volume), large datasets such as next-generation sequence reads, verbatim responses from qualitative studies, software code, or even image files used to create figures. Data should be in the form in which it was originally collected, before summarizing, analyzing or reporting.

What do we mean by publicly available?

All data must be in one of three places:

• the body of the manuscript; this may be appropriate for studies where the dataset is small enough to be presented in a table
• in the supporting information; this may be appropriate for moderately-sized datasets that can be reported in large tables or as compressed files, which can then be downloaded
• in a stable, public repository that provides an accession number or digital object identifier (DOI) for each dataset; there are many repositories that specialize in specific data types, and these are particularly suitable for very large datasets

Do we allow any exceptions?

Yes, but only in specific cases. We are aware that it is not ethical to make some datasets fully public, such as private patient data or specific information relating to endangered species. Some authors also obtain data from third parties and therefore do not have the right to make that dataset publicly available. In such cases, authors must state that “Data is available upon request”, and identify the person, group or committee to whom requests should be submitted. The authors themselves should not be the only point of contact for requesting data.

The revised data sharing policy, along with more information about the issues associated with public availability of data, can be reviewed in full at:

http://www.plos.org/data-access-for-the-open-access-literature-ploss-data-policy/

http://www.plos.org/update-on-plos-data-policy/

Image: Open Data stickers by Jonathan Gray

The post PLOS’ New Data Policy: Public Access to Data appeared first on EveryONE.

# A Bechdel test for scientific workshops

After attending two recent scientific conferences, one that was gender balanced, and one that was so gender-imbalanced that it engendered snarky out-of-band Twitter comments, it struck me that we might need a Bechdel Test for scientific workshops. The Bechdel test is a simple test for movies. To pass the test, a movie has to have:

1. at least two [named] women in it,
2. who talk to each other,
3. about something besides a man.

Seems simple, right?  You’d be amazed at just how few popular movies pass the test, including some set in universes that were originally designed for equality. (I’m talking about you, Star Trek reboot.)

Here’s an analogous test for scientific workshops or conference symposia.  Does the workshop have:

1. at least two female invited speakers,
2. who are asked questions by female audience members,

Again, this seems simple, right?  But you’d be shocked how few scientific conference symposia or workshops can live up to this standard.  I suspect this depends strongly on specific research fields.

Rigoberto Hernandez has been talking about advancing science through diversity for quite a while.  I finally got to hear him speak about the OXIDE project on this latest trip, and he’s got a lot of great things to say about how diversity can strengthen science. I think one great way to help is to point out the good conferences we attend which live up to this standard.

Rigoberto also happened to be one of the organizers of the gender-balanced conference, which was also one of the best meetings I’ve ever attended.

# OpenScience comes of age

In 1998, Open Science seemed like a pretty obvious projection of basic scientific principles into the digital age.  I didn’t think the ideas would meet much, if any, resistance from the scientific community.   And in October 1999, Brookhaven National Lab sponsored a meeting called Open Source / Open Science that, in retrospect, was a pretty utopian gathering.  There were a lot of the current OpenScience community members present at the meeting (notably Brian Glanz and Greg Wilson).   It felt like everyone would be convinced to do Open Source & Open Data science in short order.

The past 14 years have been instructive in just how long it can take to make cultural changes in the scientific community.

So, it was an amazing experience to be present when the Office of Science and Technology Policy (OSTP) announced the Champions of Change for Open Science.  These are 13 incredible individuals and organizations with great stories about sharing their science.  It feels like we’ve made significant motion on implementing policies that are friendly to Open Science.   I should note that we’re particularly happy to see OSTP use the phrase Open Science, and not the more narrow terms: Open Data or Open Access.  I’m hopeful that Open Source will also be part of science policy going forward.

There was a second group who got the opportunity to present at this event at a poster session later that day.  I haven’t seen the list publicized elsewhere, but these are some sharp folks who deserve recognition for their work.  I’m going to highlight some of these in the coming week.  Here’s the list of posters:

1. Richard Judson & Ann Richard from the National Center for Computational Toxicology presented on “ACToR & DSSTox: EPA Open Information Tools for Chemicals in the Environment”
2. Tom Bleier, Clark Dunson & Michael Lencioni from the QuakeFinder project presented on “Electromagnetic Earthquake Forecasting Research”
3. David C. Van Essen from WUSTL presented on the “Human Connectome Project”
4. Heather Piwowar & Jason Priem presented a poster on “ImpactStory: Open Carrots for Open Science”
5. Jean-Claude Bradley (Drexel) and Andrew Lang (Oral Roberts University) presented a poster on “Open Notebook Science”.
6. Dan Gezelter (that’s me) presented on “The OpenScience Project”.
7. John Wilbanks from Sage Bionetworks presented on “Portable Legal Consent – Let Patients Donate Data to Science”
8. Matt Martin from the National Center for Computational Toxicology presented on “ToxRefDB & ToxCastDB: High-Throughput Toxicology Resources”
9. Brian Athey and Christoph Brockel presented on “The tranSMART Platform: Accelerating Open Science, Data Analytics and Data Sharing”
10. Alexander Wait Zaranek, Ward Vandewege & Jonathan Sheffi from Clinical Future, Inc. presented on “Transparent Informatics: A Foundation for Precision Medicine”

It was an intense day, and I’m delighted that Open Science has finally come of age.

# OpenScience poster

I’m giving a poster in a few days about openscience.org, and it has been a very long time since I’ve had to make a poster.  This one turned out quite text-heavy, but I wanted to make a few arguments that seemed difficult or impossible to translate into graphics.   A PDF (9.3 MB) of the draft is available by clicking the image on the right…

Comments and suggestions, as always, are quite welcome.

# Not a kickstarter for science, a prize clearinghouse

Yesterday’s post on the reversible random number generators received some interesting reactions from my colleagues. They were uniformly impressed with the solution to what everyone thought was a hard problem, but surprisingly, most of the scientists I talked to were most excited about the fact that dangling a $500 reward for solving a hard problem generated nearly instantaneous results. Typical comments: I wonder if I similarly spent my startup how much science I could get done… Also, it is amazing what $500 buys these days!

Think how many problems we could solve if we dangled a few prizes for other knotty problems.

• The problem itself was well-framed and finite: “We need a time-reversible random number generator.” It was something that a lot of people in the field could agree was interesting when framed to them properly.
• The group offering the prize was widely-respected for previous work on related problems.
• The prize and the solution were both posted on a highly visible physics site (arXiv).
• The reward was about fame and recognition by the community more than it was about money.
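For readers wondering what “time-reversible” means for a random number generator, here is a minimal sketch, assuming a plain linear congruential generator. This is not the prize-winning solution (which this post does not describe); the class name `ReversibleLCG` and the constants are hypothetical choices made for illustration only. The idea is that if the multiplier has a modular inverse, each step can be undone exactly, so a simulation can walk its random sequence backward.

```python
# Illustrative sketch only: a linear congruential generator (LCG) whose
# steps can be undone. Names and constants are assumptions, not taken
# from the arXiv solution mentioned in the post.

M = 2**64                      # modulus
A = 6364136223846793005        # odd multiplier, so it is invertible mod 2**64
C = 1442695040888963407        # increment
A_INV = pow(A, -1, M)          # modular inverse of A (requires Python 3.8+)


class ReversibleLCG:
    def __init__(self, seed):
        self.state = seed % M

    def forward(self):
        """Advance one step and return the new value."""
        self.state = (A * self.state + C) % M
        return self.state

    def backward(self):
        """Undo the most recent forward step and return the value it produced."""
        value = self.state
        self.state = (A_INV * (self.state - C)) % M
        return value


rng = ReversibleLCG(seed=42)
produced = [rng.forward() for _ in range(5)]
replayed = [rng.backward() for _ in range(5)]
print(replayed == produced[::-1])  # the backward walk retraces the forward one
```

A plain LCG is a weak generator by modern standards, so treat this purely as a picture of the reversibility property, not as something to drop into a production simulation.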

I’m now wondering if all of  the attempts to get a kickstarter or crowdsourced funding model for science (e.g. sciflies, petridish, scifundchallenge, fundageek) are just a bit misguided.  Science is darned expensive, and for better or worse, we’re going to be wedded to federal and foundation funding for science for a long time.  All funding models have an aspect of salesmanship to them – a scientist must convince the funder that the problem itself is interesting enough to need solving, and that their lab is the one to solve it.   In the NSF-style funding model, scientific communities do have significant input into what the “good problems” are, but the necessary delays in funding and the scarcity of funds means that we’re not very agile.

Perhaps we need a clearinghouse where scientific communities can agree on tough challenges, pool some minimal award money (like $500 or $1000) and let their young colleagues have a go at winning fame by solving them.

# Why aren’t voting machines required to be Open Source?

If ever there was a need for the transparency that open source software brings, it is in the realm of voting machine technology. This story makes that point crystal clear. There may or may not be shenanigans going on in Ohio. The point is that we have no way of knowing what the patches on those Ohio voting machines actually do, and no faith in the code reading, debugging, and auditing ability of elected officials. If we want to be confident in the workings of our democracy, closed-source voting machines should be banned.

For that matter, why aren’t voting systems required to leave a physical paper trail so that we can check up on the tabulating algorithms?

# On Reproducibility

I just got back from a fascinating one-day workshop on “Data and Code Sharing in Computational Sciences” that was organized by Victoria Stodden of the Yale Information Society Project. The workshop had a wide-ranging collection of contributors including representatives of the computational and data-driven science communities (everything from Astronomy and Applied Math to Theoretical Chemistry and Bioinformatics), intellectual property lawyers, the publishing industry (Nature Publishing Group and Seed Media, but no society journals), foundations, funding agencies, and the open access community. The general recommendations of the workshop are going to be closely aligned with open science suggestions, as any meaningful definition of reproducibility requires public access to the code and data.

There were some fascinating debates at the workshop on foundational issues: What does reproducibility mean? How stringent a reproducibility test should be required of scientific work? Reproducible by whom? Should resolution of reproducibility problems be required for publication? What are good roles for journals and funding agencies in encouraging reproducible research? Can we agree on a set of reproducible science guidelines which we can encourage our colleagues and scientific communities to take up?

Each of the attendees was asked to prepare a thought piece on the subject, and I’ll be breaking mine down into a couple of single-topic posts in the next few days / weeks.

The topics are roughly:

• Being Scientific: Falsifiability, Verifiability, Empirical Tests, and Reproducibility
• Barriers to Computational Reproducibility
• Data vs. Code vs. Papers (they aren’t the same)
• Simple ideas to increase openness and reproducibility

Before I jump in with the first piece, I thought it would be helpful to jot down a minimal idea about science that most of us can agree on, which is “Scientific theories should be universal”. That is, multiple independent scientists should be able to subject these theories to similar tests in different locations, on different equipment, and at different times and get similar answers. Reproducibility of scientific observations is therefore going to be required for scientific universality. Once we agree on this, we can start to figure out what reproducibility really means.