“The European Commission has identified the opportunity to save €10.2 billion per year by using FAIR data (Findable, Accessible, Interoperable, Reusable). As policies begin to emerge requiring FAIR data, it’s timely to consider the open infrastructure needed to embed FAIRness into the research and research communication workflows and outputs.
Coko recently received a grant from the Sloan Foundation to build DataSeer, a web service that uses Natural Language Processing to identify and call out datasets associated with research articles. Datasets are often not explicitly identified, let alone made FAIR and accessible. The first step is knowing how many datasets were used in a body of work. DataSeer “reads” documents and finds mentions of dataset creation and use. Based on the context, DataSeer can offer recommendations to curate, deposit, add metadata to, or otherwise better handle datasets. DataSeer can fit into the workflows of researchers, publishers, aggregators, funders, and institutions….
Before FAIR compliance can be assessed, the full range of datasets associated with a research project must first be identified. There are often ‘hidden’ datasets mentioned in the text that are not included among the ‘official’ outputs. DataSeer finds these mentions and helps authors identify and share all of the datasets involved in their work. …”