Web analytics for open access academic journals: justification, planning and implementation | BiD: textos universitaris de biblioteconomia i documentació

Abstract: An overview is presented of resources and web analytics strategies useful for implementing solutions that capture usage statistics and assess audiences for open access academic journals. A set of metrics complementary to citations is proposed to help journal editors and managers provide evidence of the performance of the journal as a whole, and of each article in particular, in the web environment. The measurements and indicators selected seek to generate added value for editorial management in order to ensure its sustainability. The proposal covers three areas: counts of visits and downloads, optimization of the website alongside campaigns to attract visitors, and preparation of a dashboard for strategic evaluation. It is concluded that, by creating web performance measurement plans based on the resources and proposals analysed, journals may be better positioned to plan data-driven web optimization that attracts authors and readers, and to offer the accountability that the actors involved in the editorial process need to assess their open access business model.
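
As a rough illustration of the first of those three areas, a measurement plan might begin with a per-article rollup of visits and downloads like the sketch below; the field names and figures are hypothetical, not drawn from the article.

```python
# Hypothetical per-article rollup for a journal dashboard (a sketch,
# not the article's method). Each row is (article_id, metric, count),
# e.g. as exported from a log analyser.
from collections import defaultdict

rows = [
    ("art-001", "visits", 420), ("art-001", "downloads", 97),
    ("art-002", "visits", 130), ("art-002", "downloads", 12),
]

totals = defaultdict(lambda: {"visits": 0, "downloads": 0})
for article, metric, count in rows:
    totals[article][metric] += count

# Rank articles by downloads; the download-to-visit ratio is one
# added-value indicator a dashboard might show alongside raw counts.
for article, t in sorted(totals.items(),
                         key=lambda kv: kv[1]["downloads"], reverse=True):
    ratio = t["downloads"] / t["visits"] if t["visits"] else 0.0
    print(article, t["visits"], t["downloads"], f"{ratio:.2f}")
```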

Library Vendor Platforms Need a Strategic Reboot to Meet Librarian Curriculum Development Needs – The Scholarly Kitchen

“Our industry must create an equal handshake between paid and open content if our platform is to solve the problem that brings a user to the platform. If I am seeking the best aligned and most comprehensive set of resources to design a course, I must have equal access to open and paid content. To achieve this handshake, I propose three key principles:

Platforms need the full text, complete video files, audiobooks, etc. of the relevant content, paid and open, to improve both the metadata searched for discovery and the user experience once an item is selected as appropriate.
The search results pages and content entity pages must clearly display the open access/OER symbol, and the Creative Commons license applied to the content for future uses. In addition, an explanation of the license will often be required to reduce faculty uncertainty about reuse. For example, CC BY-NC 2.0 allows for remixing and re-use but not for commercial gain. A patron may struggle to understand this rights limitation without clear guidance from the platform.
Content providers, publishers, distributors, etc. are the lifeblood of the platform. Platforms invest heavily in services and functionality, but without content there is no user experience. To this end, and especially for providers of open content, we need to deliver robust data and insight into usage, engagement, and impact. Publishers need to see open and paid content usage by account, to include time viewed/pages turned, etc. Publishers need to see how the content is engaged with and when (time of day, device used) and publishers need to see how the content has impacted the recipient, e.g., student performance metrics….”
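
A minimal sketch of the per-account usage rollup this last principle asks platforms to deliver; the UsageEvent fields, institution names and sample events are hypothetical, not any vendor's actual schema.

```python
# Hypothetical usage events and the per-account rollup a publisher
# might receive; fields are illustrative, not a real platform schema.
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass
class UsageEvent:
    account: str          # subscribing/licensing institution
    item_id: str          # book or article identifier
    seconds_viewed: int   # time viewed
    device: str           # e.g. "mobile", "desktop"
    hour_of_day: int      # when the content was engaged with

events = [
    UsageEvent("univ-a", "isbn-123", 540, "desktop", 10),
    UsageEvent("univ-a", "isbn-123", 120, "mobile", 22),
    UsageEvent("univ-b", "isbn-123", 300, "desktop", 14),
]

seconds_by_account = defaultdict(int)
devices = Counter()
for e in events:
    seconds_by_account[e.account] += e.seconds_viewed
    devices[e.device] += 1

print(dict(seconds_by_account))  # engagement per account
print(devices.most_common())     # how the content is engaged with
```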

Coleridge Initiative – Show US the Data | Kaggle

“This competition challenges data scientists to show how publicly funded data are used to serve science and society. Evidence through data is critical if government is to address the many threats facing society, including pandemics, climate change, Alzheimer’s disease and child hunger, as well as increasing food production, maintaining biodiversity, and addressing many other challenges. Yet much of the information about data necessary to inform evidence and science is locked inside publications.

Can natural language processing find the hidden-in-plain-sight data citations? Can machine learning find the link between the words used in research articles and the data referenced in the article?
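
A deliberately naive sketch of the matching task the competition poses: scanning article text for known dataset names. The names and snippet below are invented, and a real entry would need fuzzy matching and machine learning rather than exact string search.

```python
# Naive baseline for spotting data citations: exact (case-insensitive)
# search for known dataset names. Names and text are invented examples.
import re

known_datasets = [
    "National Education Longitudinal Study",
    "Baltimore Longitudinal Study of Aging",
]

text = ("Our cohort draws on the Baltimore Longitudinal Study of Aging "
        "(BLSA) to examine age-related change ...")

for name in known_datasets:
    if re.search(re.escape(name), text, flags=re.IGNORECASE):
        print("possible data citation:", name)
```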

Now is the time for data scientists to help restore trust in data and evidence. In the United States, federal agencies are now mandated to show how their data are being used. The new Foundations of Evidence-based Policymaking Act requires agencies to modernize their data management. New Presidential Executive Orders are pushing government agencies to make evidence-based decisions based on the best available data and science. And the government is working to respond in an open and transparent way.

This competition will build just such an open and transparent approach. …”

An analysis of use statistics of electronic papers in a Korean scholarly information repository

Abstract: Introduction. This study aimed to analyse the current use status of Korean scholarly papers accessible in the repository of the Korea Institute of Science and Technology Information in order to assess the economic validity of the maintenance and operation of the repository.

Method. This study used the modified historical cost method and performed regression analysis on the use of Korean scholarly papers by year and subject area.

Analysis. The development cost of the repository and the use volumes were analysed based on 1,154,549 Korean scholarly papers deposited in the Institute repository.

Results. Approximately 86% of the deposited papers were downloaded at least once and on average, a paper was downloaded over twenty-six times. Regression analysis showed that the ratio of use of currently deposited papers is likely to decrease by 7.6% annually, as new ones are added.

Conclusions. The findings point to the need to manage currently deposited papers for at least thirteen years into the future, and provide empirical proof that the repository has contributed to Korean researchers conducting research and development in the fields of science and technology. The benefit-cost ratio was above nineteen, confirming the economic validity of the repository.
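
A sketch of the kind of trend fit the Results describe, regressing the annual use ratio on year to estimate the yearly decline; the figures are invented for illustration (the study itself reports roughly a 7.6% annual decrease).

```python
# Hypothetical yearly use ratios; these numbers only illustrate the
# method, not the study's data.
from statistics import linear_regression  # Python 3.10+

years = [2015, 2016, 2017, 2018, 2019]
use_ratio = [0.62, 0.55, 0.49, 0.44, 0.40]

slope, intercept = linear_regression(years, use_ratio)
print(f"estimated annual change in use ratio: {slope:+.3f}")
```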

What We Talk About When We Talk About… Book Usage Data

“Over the last two-and-a-half years, we have been working as part of the EU-funded HIRMEOS (High Integration of Research Monographs in the European Open Science Infrastructure) project to create open source software and databases to collectively gather and host usage data from various platforms for multiple publishers. As part of this work, we have been thinking deeply about what the data we collect actually means. Open Access books are read on, and downloaded from, many different platforms – this availability is one of the benefits of making work available Open Access, after all – but each platform has a different way of counting up the number of times a book has been viewed or downloaded.

Some platforms count a group of visits made to a book by the same user within a continuous time frame (known as a session) as one ‘view’ – we measure usage in this way ourselves on our own website – but the length of a session might vary from platform to platform. For example, on our website we use Google Analytics, according to which one session (or ‘view’) lasts until there is thirty minutes of inactivity. But platforms that use COUNTER-compliant figures (the standard that libraries prefer) have a much shorter time frame for a single session – and such a platform would record more ‘views’ than a platform that uses Google Analytics, even if it were measuring the exact same pattern of use.

Other platforms simply count each time a book is accessed (known as a visit) as one ‘view’. There might be multiple visits by the same user within a short time frame – which our site would count as one session, or one ‘view’ – but which a platform counting visits rather than sessions would record as multiple ‘views’.
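
The difference between these counting rules is easy to make concrete. The sketch below runs one hypothetical access log through a session count with a 30-minute timeout (the Google Analytics rule described above), a shorter timeout, and a plain visit count; the timestamps are invented.

```python
# One hypothetical user's hits on one book, in minutes since first access.
hits = [0, 5, 25, 45, 120]

def count_sessions(hits, timeout):
    """A new session starts whenever the gap since the previous
    hit exceeds the timeout."""
    sessions = 1
    for prev, cur in zip(hits, hits[1:]):
        if cur - prev > timeout:
            sessions += 1
    return sessions

print(count_sessions(hits, timeout=30))  # 30-min rule (GA-style): 2 'views'
print(count_sessions(hits, timeout=10))  # shorter timeout: 4 'views'
print(len(hits))                         # visit-counting platform: 5 'views'
```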

Downloads (which we also used to include in the number of ‘views’) also present problems. For example, many sites only allow chapter downloads (e.g. JSTOR), others only whole book downloads (e.g. OAPEN), and some allow both (e.g. our own website). How do you combine these different types of data? Somebody who wants to read the whole book would need only one download from OAPEN, but as many downloads as there are chapters from JSTOR – thus inflating the number of downloads for a book that has many chapters.
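
One (admittedly lossy) way to put chapter-level and book-level downloads on a common footing is to convert chapter downloads into book-equivalents, as in this sketch; the counts and chapter total are hypothetical, and the platform behaviours follow the passage above.

```python
# Hypothetical download counts for one 12-chapter book across platforms.
downloads = {
    "OAPEN (whole book)": 40,
    "JSTOR (per chapter)": 120,
    "own site (whole book)": 25,
}
chapters_in_book = 12

# Convert chapter downloads to book-equivalents before summing.
book_equivalents = (
    downloads["OAPEN (whole book)"]
    + downloads["JSTOR (per chapter)"] / chapters_in_book
    + downloads["own site (whole book)"]
)
print(f"{book_equivalents:.1f} book-equivalent downloads")  # 75.0
```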

So aggregating this data into a single figure for ‘views’ isn’t only comparing apples with oranges – it’s mixing apples, oranges, grapes, kiwi fruit and pears. It’s a fruit salad….”

Visualizing Book Usage Statistics with Metabase · punctum books

“There is an inherent contradiction between publishing open access books and gathering usage statistics. Open access books are meant to be copied, shared, and spread without any limit, and the absence of any Digital Rights Management (DRM) technology in our PDFs makes it indeed impossible to impose one. Nevertheless, we can gather an approximate impression of book usage among certain communities, such as hardcopy readers and those connected to academic infrastructures, by gathering data from various platforms and correlating them. These data are useful for both our authors and supporting libraries to gain insight into the usage of punctum publications.

As there exists no ready-made open-source solution that we know of to accomplish this, for many years we struggled to import these data from various sources into ever-growing spreadsheets, with ever more complicated formulas to extract meaningful data and visualize them. This year, we decided to split up the database and correlation/visualization aspects, by moving the data into a MySQL database managed via phpMyAdmin, while using Metabase for the correlation and visualization part. This allows us to expose our usage data publicly, while also keeping them secure….”
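
A minimal sketch of that split: a SQL table of per-platform usage that a BI tool such as Metabase could chart. Here sqlite3 stands in for the MySQL/phpMyAdmin setup the post describes, and the schema and rows are hypothetical.

```python
# sqlite3 stands in for MySQL; table and rows are hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE usage_stats (
    platform TEXT, book_id TEXT, month TEXT, views INTEGER)""")
con.executemany(
    "INSERT INTO usage_stats VALUES (?, ?, ?, ?)",
    [("OAPEN", "book-1", "2021-01", 310),
     ("JSTOR", "book-1", "2021-01", 95),
     ("OAPEN", "book-2", "2021-01", 120)],
)

# The kind of per-title rollup a Metabase question/card would chart:
query = """SELECT book_id, SUM(views) AS total_views
           FROM usage_stats GROUP BY book_id ORDER BY total_views DESC"""
for book_id, total_views in con.execute(query):
    print(book_id, total_views)
```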

When to Hold Them, When to Fold Them: Reassessing “Big Deals” in 2020: The Serials Librarian: Vol 0, No 0

Abstract: While cancellations of “Big Deals” at research institutions are making the headlines, small- and medium-sized schools are also addressing the issue of managing their journal packages by cancelling or unbundling major publishers’ journal packages. Although “Big Deals” were advantageous when first acquired, as the years passed, large publishers absorbed more publications annually, which brought higher costs and titles of lower relevance to the library. Each year librarians at Pepperdine University have analyzed cost per use, and each year the cost per use increased for many packages until these increases became unsustainable. Coinciding with this tipping point, alternatives to licensing entire packages emerged or became more viable. Libraries across the country realize that they no longer need to own everything. The authors go into detail on each of the publishers’ “Big Deals”, presenting the reasons why they were cancelled or restructured, the alternative solutions implemented, and what the reaction has been.
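
A sketch of the cost-per-use arithmetic the abstract describes, tracked year over year for a single package; the costs and download counts are invented.

```python
# Invented package costs and COUNTER download counts for one 'Big Deal'.
package_cost = {2017: 95_000, 2018: 99_000, 2019: 104_000}
downloads = {2017: 21_000, 2018: 18_500, 2019: 15_200}

for year in sorted(package_cost):
    cost_per_use = package_cost[year] / downloads[year]
    print(year, f"cost per use: ${cost_per_use:.2f}")
# Rising cost per use like this is the tipping-point signal that
# prompted cancelling or restructuring the package.
```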

ResearchGate Partnership and COUNTER 5 Usage Reporting

“In the first section of this 1-hour webinar, Sebastian Bock, Senior Product Manager in the Product & Platform Group at Springer Nature, will introduce details of the finalized agreement between Springer Nature and ResearchGate. In the second section, guest speaker Michael Häusler, Head of Engineering Architecture at ResearchGate, will explain the technical aspects of the partnership, including data exchange and the authentication and authorization processes. Sebastian will finish with a look at the Springer Nature processes that support the agreement, including COUNTER 5 usage statistics reporting.”

[2011.11940] Preprints as accelerator of scholarly communication: An empirical analysis in Mathematics

Abstract: In this study we analyse the key driving factors of preprints in enhancing scholarly communication. To this end we use four groups of metrics: one referring to scholarly communication and based on bibliometric indicators (Web of Science and Scopus citations), while the others reflect usage (usage counts in Web of Science), capture (Mendeley readers) and social media attention (tweets). We measure two effects associated with preprint publishing: publication delay and impact. We define and use several indicators to assess the impact of journal articles with previous preprint versions in arXiv. In particular, the indicators measure several time spans characterizing the publication of arXiv preprints and the reviewing of their journal versions, as well as the ageing patterns of citations to preprints. In addition, we compare the observed patterns between preprints and non-OA articles without any previous preprint versions in arXiv. We observe that the “early-view” and “open-access” effects of preprints contribute to a measurable citation and readership advantage. Articles with preprint versions are more likely to be mentioned in social media and have a shorter Altmetric attention delay. Usage and capture prove to have only moderate correlation with citations, though stronger than tweets. The different slopes of the regression lines between the different indicators reflect the different orders of magnitude of usage, capture and citation data.
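
A sketch of the indicator comparison the abstract describes, correlating citations with usage, Mendeley readers and tweets; the values are invented, and the study's actual analysis uses regression on Web of Science/Scopus data.

```python
# Invented per-article indicator values; the study works with Web of
# Science/Scopus citations, WoS usage counts, Mendeley readers and tweets.
from statistics import correlation  # Pearson's r, Python 3.10+

citations = [12, 3, 45, 8, 20, 1]
usage     = [80, 15, 240, 60, 120, 9]
readers   = [30, 5, 110, 22, 48, 2]
tweets    = [4, 0, 9, 1, 15, 0]

for name, xs in [("usage", usage), ("capture", readers), ("tweets", tweets)]:
    print(name, f"r with citations = {correlation(citations, xs):.2f}")
```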