[1707.04207] Incidental or influential? – Challenges in automatically detecting citation importance using publication full texts

“This work looks in depth at several studies that have attempted to automate the process of citation importance classification based on the publications’ full text. We analyse a range of features that have previously been used in this task. Our experimental results confirm that the number of in-text references is highly predictive of influence. Contrary to the work of Valenzuela et al., we find abstract similarity to be one of the most predictive features. Overall, we show that many of the features previously described in the literature are not particularly predictive. Consequently, we discuss challenges and potential improvements in the classification pipeline, provide a critical review of the performance of individual features, and address the importance of constructing a large-scale gold-standard reference dataset.”
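As a rough illustration of the abstract-similarity feature the authors highlight, the sketch below computes the cosine similarity between TF-IDF vectors of a citing and a cited paper's abstracts. This is a minimal sketch assuming scikit-learn; the function name and preprocessing are illustrative, not the paper's actual pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def abstract_similarity(citing_abstract: str, cited_abstract: str) -> float:
    """TF-IDF cosine similarity between two abstracts, in [0, 1].

    Illustrative only: the paper's exact preprocessing and term
    weighting are not reproduced here.
    """
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(
        [citing_abstract, cited_abstract]
    )
    return float(cosine_similarity(tfidf[0:1], tfidf[1:2])[0, 0])
```

The other strongly predictive feature the abstract mentions, the number of in-text references, is simply a count of the citation markers in the citing paper's full text that resolve to the cited work.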

Understanding Open Science: Definitions and framework

  1. Understanding Open Science: Definitions and framework. Dr. Nancy Pontika, Open Access Aggregation Officer, CORE. Twitter: @nancypontika
  2. What is Open Science
  3. Research Lifecycle, as simple as it gets: Idea → Methodology → Data Collection → Analysis → Publish
  4. Research Lifecycle, focus on the steps: Methodology (experiments, interviews, observations, etc.); Data Collection (numbers, code, text, images, sound records, etc.); Analysis (statistics, processes, analysis, documentation, etc.); Publish (journal article, dissertation, book, source code, etc.)

Additional support for RCR: A validated article-level measure of scientific influence

“In their comment, Janssens et al. [1] offer a critique of the Relative Citation Ratio (RCR), objecting to the construction of both the numerator and denominator of the metric. While we strongly agree that any measure used to assess the productivity of research programs should be thoughtfully designed and carefully validated, we believe that the specific concerns outlined in their correspondence are unfounded.

Our original article acknowledged that RCR or, for that matter, any bibliometric measure has limited power to quantify the influence of any very recently published paper, because citation rates are inherently noisy when the absolute number of citations is small [2]. For this reason, in our iCite tool, we have not reported RCRs for papers published in the calendar year previous to the current year [3]. However, while agreeing with our initial assertion that RCR cannot be used to conclusively evaluate recent papers, Janssens et al. also suggest that the failure to report RCRs for new publications might unfairly penalize some researchers. While it is widely understood that it takes time to accurately assess the influence that new papers have on their field, we have attempted to accommodate this concern by updating iCite so that RCRs are now reported for all papers in the database that have at least 5 citations and by adding a visual indicator to flag values for papers published in the last 18 months, which should be considered provisional [3]. This modified practice will be maintained going forward.

Regarding article citation rates of older articles, we have added data on the stability of RCR values to the “Statistics” page of the iCite website [4, 5]. We believe that these new data, which demonstrate that the vast majority of influential papers retain their influence over the period of an investigator’s career, should reassure users that RCR does not unfairly disadvantage older papers. Our analysis of the year-by-year changes in RCR values of National Institutes of Health (NIH)-funded articles published in 1991 reinforces this point (Fig 1). From 1992 to 2014, both on the individual level and in aggregate, RCR values are remarkably stable. For cases in which RCRs change significantly, the values typically increase. That said, we strongly believe that the potential for RCR to decrease over time is necessary and important; as knowledge advances and old models are replaced, publications rooted in those outdated models naturally become less influential….”
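To make the reporting rule in the excerpt concrete, the sketch below encodes the iCite practice described above: an RCR is reported only for papers with at least 5 citations, and values for papers published within the last 18 months are flagged as provisional. This is a toy illustration under those stated assumptions, not iCite's actual implementation; the RCR itself is, roughly, an article's citation rate divided by an expected field citation rate.

```python
from datetime import date

def report_rcr(rcr: float, citations: int, pub_date: date, today: date):
    """Toy version of the iCite reporting rule described above.

    Not iCite's actual code: report an RCR only when a paper has at
    least 5 citations, and mark values for papers published in the
    last 18 months as provisional.
    """
    if citations < 5:
        return None  # too few citations for a stable RCR
    months_old = (today.year - pub_date.year) * 12 + (today.month - pub_date.month)
    return {"rcr": rcr, "provisional": months_old < 18}
```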

Study Suggests Publisher Public Access Outpacing Open Access; Gold OA Decreases Citation Performance – The Scholarly Kitchen

“A recent study, out as a preprint, offers something of a muddled bag of methodological choices and compromises, but presents several surprising data points, namely that voluntary publisher efforts may be providing broader access to the literature than Gold or Green open access (OA), and some confounding shifts in claims of an open access citation advantage.”

[0906.5418] Citing and Reading Behaviours in High-Energy Physics. How a Community Stopped Worrying about Journals and Learned to Love Repositories

Abstract: Contemporary scholarly discourse follows many alternative routes in addition to the three-century-old tradition of publication in peer-reviewed journals. The field of High-Energy Physics (HEP) has explored alternative communication strategies for decades, initially via the mass mailing of paper copies of preliminary manuscripts, then via the inception of the first online repositories and digital libraries.

This field is uniquely placed to answer recurrent questions raised by the current trends in scholarly communication: is there an advantage for scientists to make their work available through repositories, often in preliminary form? Is there an advantage to publishing in Open Access journals? Do scientists still read journals or do they use digital repositories? 

The analysis of citation data demonstrates that free and immediate online dissemination of preprints creates an immense citation advantage in HEP, whereas publication in Open Access journals presents no discernible advantage. In addition, the analysis of clickstreams in the leading digital library of the field shows that HEP scientists seldom read journals, preferring preprints instead.

John Dove, Overcoming Inertia in Green Open Access Adoption

“Here’s a little exercise which I’ve now done looking at research papers in a wide variety of disciplines. Look at the referenced sources in a recently published paper. Unless you are reading….this paper at one of the few fully-funded research libraries, you will find that a significant number of the referenced sources are unavailable to you. Open access is simply not there….Lots of the referenced sources will have to be obtained by inter-library loan or not at all. Your ability to participate in the scholarly inquiries of your field is highly constrained….I think there’s a case to be made that journal publishers may be missing a trick. There is a point in time when a publisher’s self-interest in the quality of their about-to-be-published work would be well-served by encouraging authors of referenced sources to share their past articles. This is also a moment in time at which the authors of referenced sources are missing a trick but are unaware of it…..”

Article visibility: journal impact factor and availability of full text in PubMed Central and open access

Abstract: Both the impact factor of the journal and immediate full-text availability in PubMed Central (PMC) have featured in editorials before [1-3]. In 2004, the editor of the Cardiovascular Journal of Africa (CVJA) lamented, like so many others, the injustice of not having an impact factor, its validity as a tool for measuring science output, and the negative effect of a low perceived impact in drawing attention away from publications from developing countries [1,4].

Since then, after a selection process, we have been indexed by the Web of Science® (WoS) of Thomson Reuters (Philadelphia, PA, USA), and have seen a growing impact factor. In the case of PMC, our acceptance to this database was announced in 2012 [2], and now we are proud that it is active and that full-text articles are available dating back to 2009. The journal opted for immediate full open access (OA), which means that full-text articles are available on the publication date to anybody with access to the internet.

Elsevier Embraces Data-Sharing Standards, in Step Toward Scientific Openness – The Chronicle of Higher Education

“The cause of scientific transparency and accuracy got a boost on Tuesday with the decision by the publishing giant Elsevier to endorse a broad set of standards for open articles and data. Elsevier agreed to add its 1,800 journals to the 3,200 that already accept the “Transparency and Openness Promotion” guidelines drafted in 2015 by a group of university researchers, funders, and publishers. The standards expand article-citation practices so that authors get credit for making clear the data, methods, and materials needed for replicating their work….”