PLOS’ New Data Policy: Public Access to Data

Access to research results, immediately and without restriction, has always been at the heart of PLOS’ mission and the wider Open Access movement. However, without similar access to the data underlying the findings, the article can be of limited use. For this reason, PLOS has always required that authors make their data available to other academic researchers who wish to replicate, reanalyze, or build upon the findings published in our journals.

In an effort to increase access to this data, we are now revising our data-sharing policy for all PLOS journals: authors must make all data publicly available, without restriction, immediately upon publication of the article. Beginning March 3rd, 2014, all authors who submit to a PLOS journal will be asked to provide a Data Availability Statement, describing where and how others can access each dataset that underlies the findings. This Data Availability Statement will be published on the first page of each article.

What do we mean by data?

“Data are any and all of the digital materials that are collected and analyzed in the pursuit of scientific advances.” Examples could include spreadsheets of original measurements (of cells, of fluorescent intensity, of respiratory volume), large datasets such as next-generation sequence reads, verbatim responses from qualitative studies, software code, or even image files used to create figures. Data should be in the form in which it was originally collected, before summarizing, analyzing or reporting.

What do we mean by publicly available?

All data must be in one of three places:

  • the body of the manuscript; this may be appropriate for studies where the dataset is small enough to be presented in a table
  • in the supporting information; this may be appropriate for moderately-sized datasets that can be reported in large tables or as compressed files, which can then be downloaded
  • in a stable, public repository that provides an accession number or digital object identifier (DOI) for each dataset; there are many repositories that specialize in specific data types, and these are particularly suitable for very large datasets

Do we allow any exceptions?

Yes, but only in specific cases. We are aware that it is not ethical to make some datasets fully public, such as those containing private patient data or specific information relating to endangered species. Some authors also obtain data from third parties and therefore do not have the right to make that dataset publicly available. In such cases, authors must state that “Data is available upon request”, and identify the person, group or committee to whom requests should be submitted. The authors themselves should not be the only point of contact for requesting data.

Where can I go for more information?

The revised data sharing policy, along with more information about the issues associated with public availability of data, can be reviewed in full at:

http://www.plos.org/data-access-for-the-open-access-literature-ploss-data-policy/

http://www.plos.org/update-on-plos-data-policy/

Image: Open Data stickers by Jonathan Gray

The post PLOS’ New Data Policy: Public Access to Data appeared first on EveryONE.

A way with words: Data mining uncloaks authors’ stylistic flair

As any writer or wordsmith knows, searching for the right word can be a painful struggle. Here’s comforting news: word choice may be the key to understanding your stylistic flair.

New research in the field of text mining suggests that distinct writing styles are discernible by word selection and frequency. Even the use of common words, such as “you” and “say,” can help distinguish one writer from another. To learn more about style, the authors of a recent PLOS ONE paper turned to the famed lord of language, William Shakespeare.

The researchers assembled a pool of 168 plays written during the 16th and 17th centuries. After accounting for duplicates, 55,055 unique words were identified and then cross-referenced against the work of four writers from that time period: William Shakespeare, Ben Jonson, Thomas Middleton, and John Fletcher. The researchers counted how often these writers used words from the pool and ranked words by their frequency. Lists of twenty of the most-used and least-used words were then compiled for each writer and considered “markers” of their individual styles.

Fletcher, for one, frequently used the word “ye” in his plays, so a relatively high frequency of “ye” would be a strong marker of Fletcher’s particular writing style. Similarly, Middleton often used “that” in the demonstrative sense, and Jonson favored the word “or.” Shakespeare himself used “thou” the most frequently, and the word “all” the least.
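To make the counting-and-ranking idea concrete, here is a minimal Python sketch. It is a simplified illustration, not the researchers' actual method: the scoring rule (an author's share of a word minus everyone else's share), the tokenizer, the function names, and the toy lines standing in for whole plays are all assumptions made for this example.

```python
from collections import Counter
import re


def relative_frequencies(texts):
    """Return each word's share of all word tokens in the given texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def marker_words(corpus, author, n=20):
    """Return (overused, underused) word lists for `author`.

    Assumed scoring rule: the author's relative frequency for a word
    minus the pooled relative frequency among all other authors.
    """
    own = relative_frequencies(corpus[author])
    others = relative_frequencies(
        [t for name, texts in corpus.items() if name != author for t in texts]
    )
    vocabulary = set(own) | set(others)
    # Positive score: overused by the author; negative score: underused.
    score = {w: own.get(w, 0.0) - others.get(w, 0.0) for w in vocabulary}
    ranked = sorted(vocabulary, key=lambda w: (-score[w], w))
    return ranked[:n], ranked[-n:][::-1]


# Toy, invented lines standing in for whole plays (not real quotations).
corpus = {
    "Fletcher": ["ye know it well and ye shall see", "come ye hither"],
    "Shakespeare": ["thou art kind and thou shalt see", "come thou hither"],
}
overused, underused = marker_words(corpus, "Fletcher", n=1)
print(overused, underused)
```

With this toy corpus, “ye” comes out as the word most overused by the stand-in Fletcher and “thou” as the word he most underuses. A real analysis would need full play texts and a more careful statistical measure than a simple difference in shares.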

In addition to looking at individual word use, the researchers analyzed specific works where the writer’s style changed significantly, such as in Middleton’s political satire “A Game at Chess,” which was notably different from his other works. They also compared word choice between writers. Their findings indicate that, unlike that of his contemporaries, Shakespeare’s style was marked more by his underuse of words than by his overuse. Take, for example, Shakespeare’s use of “ye.” Fletcher used this word liberally, whereas “ye” is one of Shakespeare’s least frequently used words.

Such analyses, the researchers suggest, may help with authorship controversies and disputes, but they can also address other concerns. In a post in The Conversation, the authors of this paper suggest that the mathematical method used to identify words as markers of style may also be helpful to identify biomarkers in medical research. In fact, the research team currently uses these methods to study cancer and the selection of therapeutic combinations, multiple sclerosis, and Alzheimer’s disease.

Citation: Marsden J, Budden D, Craig H, Moscato P (2013) Language Individuation and Marker Words: Shakespeare and His Maxwell’s Demon. PLoS ONE 8(6): e66813. doi:10.1371/journal.pone.0066813

Image: First Folio – Folger Shakespeare Library – DSC09660, Wikimedia Commons