Free articles – accounting for the timing effect

Abstract: “Various studies have attempted to assess the amount of free full text available on the web, and recent work has suggested that we are close to the 50% mark for freely available articles (Archambault et al. 2013; Björk et al. 2010; Jamali and Nabavi 2015). Our paper contributes to the literature by taking the timing issue into account: we study when the papers were made free. We sampled citations made by researchers who published in 2015 (based on records in the Singapore Management University Institutional Repository), checked the number of cited papers that were free at the time of the study, and then attempted to ‘carbon date’ the freely available papers to determine when they were first made available. This allows us to estimate how long the free cited article had been available before the citing paper was published. We find that in our sample of cited papers in Economics, the median freely available cited paper (oldest variant) was made available 7-8 years before the citing paper was published. Of the papers found free via Google Scholar, the majority, 67% (n=47), were made available via university websites (not including institutional repositories), and 32.8% (n=23) were final published versions.”

Open Access Week 2016: researcher spotlight – science and technology | Library Matters: RGU Library Blog

“OpenAIR was a very early open access repository at a university, putting RGU at the leading edge of what we know now as Green Open Access publishing.”

Scraping Scientific Web Repositories: Challenges and Solutions for Automated Content Extraction

Abstract: “Aside from improving the visibility and accessibility of scientific publications, many scientific Web repositories also assess researchers’ quantitative and qualitative publication performance, e.g., by displaying metrics such as the h-index. These metrics have become important for research institutions and other stakeholders to support impactful decision making processes such as hiring or funding decisions. However, scientific Web repositories typically offer only simple performance metrics and limited analysis options. Moreover, the data and algorithms to compute performance metrics are usually not published. Hence, it is not transparent or verifiable which publications the systems include in the computation and how the systems rank the results. Many researchers are interested in accessing the underlying scientometric raw data to increase the transparency of these systems. In this paper, we discuss the challenges and present strategies to programmatically access such data in scientific Web repositories. We demonstrate the strategies as part of an open source tool (MIT license) that allows research performance comparisons based on Google Scholar data. We would like to emphasize that the scraper included in the tool should only be used if consent was given by the operator of a repository. In our experience, consent is often given if the research goals are clearly explained and the project is of a non-commercial nature.”
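The abstract names the h-index as the canonical example of the metrics these repositories display. As a minimal illustration of what that metric is (this is not code from the paper's MIT-licensed tool, just the standard textbook definition), the h-index can be computed from a list of per-paper citation counts:

```python
def h_index(citations):
    """Return the h-index: the largest h such that the author has
    at least h papers with at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited papers first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:   # this paper still supports an h of `rank`
            h = rank
        else:
            break
    return h

# Example: five papers cited [25, 8, 5, 3, 3] times -> h-index of 3,
# because 3 papers have at least 3 citations, but not 4 with at least 4.
```

The opacity the authors criticise lies upstream of this calculation: the formula itself is trivial, but which publications a repository includes in `citations` (and how it deduplicates or ranks them) is usually not published, which is what motivates scraping the raw data.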