“In the early 2000s, my role at Google was running web indexing: the system that crawls the web, making pages and content discoverable and accessible through search. Nowadays, there’s an assumption that looking for something via Google searches everything, but that wasn’t the case in the early days. Part of my role was to expand the index by reaching out to many different types of organizations – government, business, publishers – to make sure their web sites were included in the index.
A key group among these was scholarly publishers hosting journals and conferences. Having grown up on a university campus, I had been surrounded by scholarly articles, and I wanted to make sure that they were as easy to find as everything else.
As a part of this, I reached out to HighWire to explore the possibility of indexing the hosted journals. I remember our first call in the Fall of 2002 with John Sack, Todd McGee and several others. A few quick calls, a couple of meetings in person and we were off….”
Abstract: The pursuit of simple, yet fair, unbiased, and objective measures of researcher performance has occupied bibliometricians and the research community as a whole for decades. However, despite the diversity of available metrics, most are either complex to calculate or not readily applied in the most common assessment exercises (e.g., grant assessment, job applications). The ubiquity of metrics like the h-index (h papers with at least h citations) and its time-corrected variant, the m-quotient (h-index ÷ number of years publishing), therefore reflects their ease of use rather than their capacity to differentiate researchers fairly among disciplines, career stages, or genders. We address this problem here by defining an easily calculated index based on publicly available citation data (Google Scholar) that corrects for most biases and allows assessors to compare researchers at any stage of their career and from any discipline on the same scale. Our ε-index violates fewer statistical assumptions than other metrics when comparing groups of researchers, and can be easily modified to remove inherent gender biases in citation data. We demonstrate the utility of the ε-index using a sample of 480 researchers with Google Scholar profiles, stratified evenly into eight disciplines (archaeology, chemistry, ecology, evolution and development, geology, microbiology, ophthalmology, palaeontology), three career stages (early, mid-, late-career), and two genders. We advocate the use of the ε-index whenever assessors must compare research performance among researchers of different backgrounds, but emphasise that no single index should be used exclusively to rank researcher capability.
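The two baseline metrics this abstract names are defined precisely in its parentheticals, so they are easy to compute. Below is a minimal sketch (not the paper's own code) of the h-index and m-quotient from a list of per-paper citation counts; the citation counts and career length are hypothetical.

```python
def h_index(citations):
    """h-index: the largest h such that at least h papers
    have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def m_quotient(citations, years_publishing):
    """m-quotient: h-index divided by the number of years
    since the researcher's first publication."""
    return h_index(citations) / years_publishing

# Hypothetical researcher: citations per paper, 10 years publishing.
papers = [45, 30, 22, 17, 9, 6, 4, 1, 0]
print(h_index(papers))         # 6 (six papers have at least 6 citations)
print(m_quotient(papers, 10))  # 0.6
```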
Abstract: New sources of citation data have recently become available, such as Microsoft Academic, Dimensions, and the OpenCitations Index of CrossRef open DOI-to-DOI citations (COCI). Although these have been compared to the Web of Science Core Collection (WoS), Scopus, or Google Scholar, there is no systematic evidence of their differences across subject categories. In response, this paper investigates 3,073,351 citations found by these six data sources to 2,515 English-language highly-cited documents published in 2006 from 252 subject categories, expanding and updating the largest previous study. Google Scholar found 88% of all citations, many of which were not found by the other sources, and nearly all citations found by the remaining sources (89–94%). A similar pattern held within most subject categories. Microsoft Academic is the second largest overall (60% of all citations), including 82% of Scopus citations and 86% of WoS citations. In most categories, Microsoft Academic found more citations than Scopus and WoS (182 and 223 subject categories, respectively), but had coverage gaps in some areas, such as Physics and some Humanities categories. After Scopus, Dimensions is fourth largest (54% of all citations), including 84% of Scopus citations and 88% of WoS citations. It found more citations than Scopus in 36 categories, more than WoS in 185, and displays some coverage gaps, especially in the Humanities. Following WoS, COCI is the smallest, with 28% of all citations. Google Scholar is still the most comprehensive source. In many subject categories Microsoft Academic and Dimensions are good alternatives to Scopus and WoS in terms of coverage.
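The coverage and overlap percentages in this abstract reduce to set arithmetic over the citations each source finds for the same cited documents. A minimal illustration of how such figures are derived, using hypothetical toy data rather than the study's dataset:

```python
# Each source maps to the set of citing-document IDs it found
# for the same cited documents (toy data, not the study's).
sources = {
    "Google Scholar":     {1, 2, 3, 4, 5, 6, 7, 8},
    "Microsoft Academic": {1, 2, 3, 4, 5, 9},
    "Scopus":             {1, 2, 3, 4},
    "WoS":                {1, 2, 3},
}

all_citations = set().union(*sources.values())

# Share of all known citations that each source covers.
for name, found in sources.items():
    print(f"{name}: {len(found) / len(all_citations):.0%} of all citations")

# Pairwise overlap: the share of source B's citations that source A also found.
def overlap(a, b):
    return len(sources[a] & sources[b]) / len(sources[b])

print(f"MA finds {overlap('Microsoft Academic', 'Scopus'):.0%} of Scopus citations")
```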
“In this blog post, I will talk specifically about a very important source of data used by academic search engines – the Microsoft Academic Graph (MAG) – and do a brief review of four academic search engines – Microsoft Academic, Lens.org, Semantic Scholar and Scinapse – which use MAG among other sources….
We live in a time where large (>50 million record) scholarly discovery indexes are no longer as hard to create as in the past, thanks to freely available scholarly article index data like Crossref and MAG.”
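The openness the post refers to is concrete in Crossref's case: its REST API is public and requires no key. As a minimal sketch of pulling records from it (the query string is only an example, not one from the post):

```python
import requests

# Crossref's public REST API; see api.crossref.org for documentation.
resp = requests.get(
    "https://api.crossref.org/works",
    params={"query": "open citations", "rows": 5},
    timeout=30,
)
resp.raise_for_status()

# Each work record carries a DOI, a title list, and a citation count.
for item in resp.json()["message"]["items"]:
    title = (item.get("title") or ["(untitled)"])[0]
    doi = item["DOI"]
    cited_by = item.get("is-referenced-by-count", 0)
    print(f"{doi}  cited {cited_by}x  {title}")
```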
Abstract: Rigorous evidence identification is essential for systematic reviews and meta-analyses (evidence syntheses), because the sample selection of relevant studies determines a review’s outcome, validity, and explanatory power. Yet, the search systems allowing access to this evidence provide varying levels of precision, recall, and reproducibility, and also demand different levels of effort. To date, it remains unclear which search systems are most appropriate for evidence synthesis and why. Advice on which search engines and bibliographic databases to choose for systematic searches is limited and lacking systematic, empirical performance assessments.
This study investigates and compares the systematic search qualities of 28 widely used academic search systems, including Google Scholar, PubMed and Web of Science. A novel, query-based method tests how well users are able to interact with and retrieve records from each system. The study is the first to show the extent to which search systems can effectively and efficiently perform (Boolean) searches with regard to precision, recall and reproducibility. We found substantial differences in the performance of search systems, meaning that their usability in systematic searches varies. Indeed, only half of the search systems analysed, and only a few Open Access databases, can be recommended for evidence syntheses without adding substantial caveats. In particular, our findings demonstrate why Google Scholar is inappropriate as a principal search system.
We call for database owners to recognise the requirements of evidence synthesis, and for academic journals to re-assess quality requirements for systematic reviews. Our findings aim to support researchers in conducting better searches for better evidence synthesis.
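Precision and recall, the two retrieval qualities this study measures, have standard definitions: precision is the share of retrieved records that are relevant, and recall is the share of relevant records that were retrieved. A minimal sketch with hypothetical record IDs (not the study's own evaluation code):

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved records that are relevant.
    Recall: fraction of relevant records that were retrieved."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical: records returned by a Boolean query vs. a gold-standard set.
retrieved = {"r1", "r2", "r3", "r4", "r5"}
relevant = {"r1", "r2", "r6", "r7"}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f}  recall={r:.2f}")  # precision=0.40  recall=0.50
```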
Abstract: In the last 3 years, several new (free) sources for academic publication and citation data have joined the now well-established Google Scholar, complementing the two traditional commercial data sources: Scopus and the Web of Science. The most important of these new data sources are Microsoft Academic (2016), Crossref (2017) and Dimensions (2018). Whereas Microsoft Academic has received some attention from the bibliometric community, there are as yet very few studies that have investigated the coverage of Crossref or Dimensions. To address this gap, this brief letter assesses Crossref and Dimensions coverage in comparison to Google Scholar, Microsoft Academic, Scopus and the Web of Science through a detailed investigation of the full publication and citation record of a single academic, as well as six top journals in Business & Economics. Overall, this first small-scale study suggests that, when compared to Scopus and the Web of Science, Crossref and Dimensions have a similar or better coverage for both publications and citations, but a substantively lower coverage than Google Scholar and Microsoft Academic. If our findings can be confirmed by larger-scale studies, Crossref and Dimensions might serve as good alternatives to Scopus and the Web of Science for both literature reviews and citation analysis. However, Google Scholar and Microsoft Academic maintain their position as the most comprehensive free sources for publication and citation data.
Abstract: Closed and proprietary infrastructures limit the accessibility of research, often putting paywalls in front of scientific knowledge. But they also severely limit reuse, preventing other tools from building on top of their software, data, and content. Using the example of Google Scholar, I will show how these characteristics of closed infrastructures impede innovation in the research workflow and create lock-in effects. I will also demonstrate how open infrastructures can help us move beyond this issue and create an ecosystem that is community-driven and community-owned. In this ecosystem, innovation thrives, as entry barriers are removed and systems can make use of each other’s components. Specific consideration will be given to open source services and non-profit frontends, as they are often overlooked by funders, but represent the way researchers engage with open science.
“You know you have access, and you have access now. However, the discovery process for open access articles isn’t necessarily the same as subscription searching, especially if you do not have access to specific subscription databases.
This guide is meant to help individuals, of any background, search more easily for open access articles….”