Deep Learning in Mining Biological Data | SpringerLink

Abstract: Recent technological advancements in data acquisition tools allowed life scientists to acquire multimodal data from different biological application domains. Categorized in three broad types (i.e. images, signals, and sequences), these data are huge in amount and complex in nature. Mining such an enormous amount of data for pattern recognition is a big challenge and requires sophisticated data-intensive machine learning techniques. Artificial neural network-based learning systems are well known for their pattern recognition capabilities, and lately their deep architectures, known as deep learning (DL), have been successfully applied to solve many complex pattern recognition problems. To investigate how DL, especially its different architectures, has contributed and been utilized in the mining of biological data pertaining to those three types, a meta-analysis has been performed and the resulting resources have been critically analysed. Focusing on the use of DL to analyse patterns in data from diverse biological domains, this work investigates different DL architectures’ applications to these data. This is followed by an exploration of available open access data sources pertaining to the three data types along with popular open-source DL tools applicable to these data. Also, comparative investigations of these tools from qualitative, quantitative, and benchmarking perspectives are provided. Finally, some open research challenges in using DL to mine biological data are outlined and a number of possible future perspectives are put forward.

Automated screening of COVID-19 preprints: can we help authors to improve transparency and reproducibility? | Nature Medicine

“Although automated screening is not a replacement for peer review, automated tools can identify common problems. Examples include failure to state whether experiments were blinded or randomized [2], failure to report the sex of participants [2] and misuse of bar graphs to display continuous data [3]. We have been using six tools [4,5,6,7,8] to screen all new medRxiv and bioRxiv COVID-19 preprints (Table 1). New preprints are screened daily [9]. By this means, reports on more than 8,000 COVID preprints have been shared using the web annotation tool hypothes.is (RRID:SCR_000430) and have been tweeted out via @SciScoreReports (https://hypothes.is/users/sciscore). Readers can access these reports in two ways. The first option is to find the link to the report in the @SciScoreReports tweet in the preprint’s Twitter feed, located in the metrics tab. The second option is to download the hypothes.is bookmarklet. In addition, readers and authors can reply to the reports, which also contain information on solutions….”

How data sharing is accelerating railway safety research

“André’s dataset was shortlisted for the Mendeley Data FAIRest Datasets Award, which recognizes researchers who make their data available for the research community in a way that exemplifies the FAIR Data Principles – Findable, Accessible, Interoperable, Reusable. The dataset was applauded for a number of reasons, not least the provision of clear steps to reproduce the data. What’s more, the data was clearly catalogued and stored in subfolders, with additional links to Blender and GitHub, making the dataset easily available and reproducible for all….”

Caltech Open-Sources AI for Solving Partial Differential Equations

“Researchers from Caltech’s DOLCIT group have open-sourced Fourier Neural Operator (FNO), a deep-learning method for solving partial differential equations (PDEs). FNO outperforms other existing deep-learning techniques for solving PDEs and is three orders of magnitude faster than traditional solvers….”

Accessing early scientific findings | Early Evidence Base

“Early Evidence Base (EEB) is an experimental platform that combines artificial intelligence with human curation and expert peer-review to highlight results posted in preprints. EEB is a technology experiment developed by EMBO Press and SourceData.

Preprints provide the scientific community with early access to scientific evidence. For experts, this communication channel is an efficient way to access research without delay and thus to accelerate scientific progress. But for non-experts, navigating preprints can be challenging: in the absence of peer review and journal certification, interpreting the data and evaluating the strength of the conclusions is often impossible; finding specific and relevant information in the rapidly accumulating corpus of preprints is becoming increasingly difficult.

The current COVID-19 pandemic has made this tradeoff even more visible. The urgency in understanding and combatting SARS-CoV-2 viral infection has stimulated an unprecedented rate of preprint posting. It has, however, also revealed the risk resulting from misinterpretation of preliminary results shared in preprints and from the amplification or perpetuation of premature claims by non-experts or the media.

To experiment with ways in which technology and human expertise can be combined to address these issues, EMBO has built the EEB. The platform prioritizes preprints in complementary ways:

Refereed Preprints are preprints that are associated with reviews. EEB prioritizes such preprints and integrates the content of the reviews as well as the authors’ response, when available, to provide rich context and in-depth analyses of the reported research.
To highlight the importance of experimental evidence, EEB automatically highlights and organizes preprints around scientific topics and emergent areas of research.
Finally, EEB provides an automated selection of preprints that are enriched in studies that were peer reviewed, may bridge several areas of research and use a diversity of experimental approaches….”

Machine-generated summaries of three articles are published for the first time as part of a Nature Index supplement | Corporate Affairs Homepage | Springer Nature

“Escalating computing power, expanding data sets, and algorithms of unprecedented sophistication have led to a massive increase in the number of journal and conference papers referring to AI in recent years. The Nature Index AI supplement, published today, draws on Nature Index data and the larger Dimensions* from Digital Science database to analyse this rapidly advancing and controversial topic. For the first time, the supplement also includes summaries of research articles created using AI, and it looks more broadly at how AI is being used in scholarly publishing. …”

ASReview – Active learning for Systematic Reviews

“Anyone who goes through the process of screening large amounts of texts such as newspapers, scientific abstracts for a systematic review, or ancient texts, knows how labor intensive this can be. With the rapidly evolving field of Artificial Intelligence (AI), the large amount of manual work can be reduced or even completely replaced by software using active learning.

By using our AI-aided tool, you can not only save time, but you can also increase the quality of your screening process. ASReview enables you to screen more texts than the traditional way of screening in the same amount of time, which means that you can achieve a higher quality than you would with the traditional approach.

Consider the example of systematic reviews, which are “top of the bill” in research. However, the number of scientific papers on any topic is skyrocketing. Since it is of crucial importance for the advancement of science to produce high-quality systematic review articles, sometimes as quickly as possible in times of crisis, we need to find a way to effectively automate this screening process. Before Elas* was there to help you, systematic reviewing was an exhaustive task, often very boring….”