“Therefore, we need better tools for data discovery. But I do not believe that Google Dataset Discovery is the right answer. It represents a proprietary and closed system on top of our own data. This is a system that benefits massively from researchers’ labour, but one in which researchers will have no say. Google is capitalizing on a movement that they have contributed nothing to. Therefore, we need an open alternative. However, at the moment it seems to me that funders, research administrators and infrastructures are content to leave it to Google. This is highly problematic, especially since we have discussed the problems of lock-in effects and other negative outcomes of proprietary infrastructure for years now….”
“Repository Finder, a pilot project of the Enabling FAIR Data Project led by the American Geophysical Union (AGU) in partnership with DataCite and the earth and space sciences community, can help you find an appropriate repository to deposit your research data. The tool is hosted by DataCite and queries the re3data registry of research data repositories….”
“In today’s world, scientists in many disciplines and a growing number of journalists live and breathe data. There are many thousands of data repositories on the web, providing access to millions of datasets; and local and national governments around the world publish their data as well. To enable easy access to this data, we [Google] launched Dataset Search, so that scientists, data journalists, data geeks, or anyone else can find the data required for their work and their stories, or simply to satisfy their intellectual curiosity.
Similar to how Google Scholar works, Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher’s site, a digital library, or an author’s personal web page. To create Dataset Search, we developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages. These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset. Our approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way. We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem….”
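As a rough illustration of the kind of description the excerpt above refers to, the following sketch builds a schema.org Dataset record as JSON-LD and wraps it in the script tag that a dataset landing page could embed. The property names (name, description, url, creator, datePublished, license, measurementTechnique) come from the schema.org vocabulary; the helper function names, the sample values, and the choice of measurementTechnique to carry "how the data was collected" are assumptions made for this example, not something taken from Google's guidelines.

```python
import json


def build_dataset_jsonld(name, description, url, creator,
                         date_published, license_url, collection_method):
    """Return a schema.org Dataset description as a JSON-LD dictionary.

    Hypothetical helper: the argument names are illustrative only.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # Who created the dataset.
        "creator": {"@type": "Person", "name": creator},
        # When it was published (ISO 8601 date string).
        "datePublished": date_published,
        # The terms for using the data.
        "license": license_url,
        # Assumption: used here to record how the data was collected.
        "measurementTechnique": collection_method,
    }


def to_script_tag(jsonld):
    """Serialize the description for embedding in an HTML page."""
    body = json.dumps(jsonld, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values, invented for the example.
example = build_dataset_jsonld(
    name="Example river temperature readings",
    description="Daily water temperature readings from a sample site.",
    url="https://example.org/datasets/river-temperature",
    creator="A. Researcher",
    date_published="2018-09-05",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    collection_method="In-situ temperature logger",
)

if __name__ == "__main__":
    print(to_script_tag(example))
```

Keeping the description as a plain dictionary means the same record can be validated or reused elsewhere before it is serialized into the page.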
“On Wednesday, Google launched Dataset Search, a new search engine specifically geared toward collections of data. The company hopes the platform will help scientists to locate datasets quickly and painlessly.
READABLE DATA. According to a blog post, Google started the project by creating guidelines for dataset providers to ensure the search engine could understand the content of a dataset. For example, they suggested that providers should include particular information in the dataset’s metadata, such as how the provider collected the data and who can use it.
Data that follows these guidelines is easier for Google to index, so that the relevant datasets show up in search queries.
JUST THE BEGINNING. This first version of Dataset Search includes datasets focused on the environmental and social sciences, as well as datasets from government websites and various news organizations focused on other topics.
According to Google, the number and type of datasets included in the search engine will continue to grow as more dataset providers adopt the company’s metadata guidelines. …”
“Google has unveiled a search engine to help researchers locate online data that is freely available for use. The company launched the service on 5 September, saying that it is aimed at “scientists, data journalists, data geeks, or anyone else”.
Dataset Search, now available alongside Google’s other specialized search engines, such as those for news and images — as well as Google Scholar and Google Books — locates files and databases on the basis of how their owners have classified them. It does not read the content of the files themselves in the way search engines do for web pages.
Experts say that it fills a gap and could contribute significantly to the success of the open-data movement, which aims to make data openly available for use and re-use.”
“The progressive RCUK policy on open access has recently come under fire, particularly from humanities scholars, for favouring Gold OA over Green. For various reasons — and I won’t, for now, go into the question of which of these reasons are and aren’t sound — they favour an approach to open access where publishers keep final versions of their papers behind paywalls, but drafts are deposited in institutional repositories (IRs) and people who want to read the paper can have access to the drafts.
It’s appealing to think that this relatively lightweight way of solving the access problem can work. Unfortunately, I’m not convinced it can, for several reasons. I’ll discuss these below, not so much with the intention of persuading people that Gold is a better approach, but with the hope that those of you who are Green advocates have seen things that I’ve missed and you’ll be able to explain why it can work after all….”
“You know you have access and you have access now. However, the discovery process for open access articles isn’t necessarily the same as subscription searching, especially if you do not have access to specific subscription databases.
This guide is meant to help individuals, of any background, search more easily for open access articles….”
Creative Commons is pleased to announce an award of new funding in the amount of $800,000 over two years from Arcadia, a charitable fund of Lisbet Rausing and Peter Baldwin, in support of CC Search, a Creative Commons technology project designed to maximize discovery and use of openly licensed content in the Commons.