Research Paper

MRM-BERT: a novel deep neural network predictor of multiple RNA modifications by fusing BERT representation and sequence features

Pages 1-10 | Accepted 02 Feb 2024, Published online: 15 Feb 2024
 

ABSTRACT

RNA modifications play crucial roles in various biological processes and diseases, and accurate prediction of RNA modification sites is essential for understanding their functions. In this study, we propose a hybrid approach that fuses a pre-trained sequence representation with several sequence features to predict multiple types of RNA modifications within a single prediction framework. We developed MRM-BERT, a deep learning method that combines the pre-trained DNABERT deep sequence representation module with a convolutional neural network (CNN) that exploits four traditional sequence feature encodings to improve prediction performance. MRM-BERT was evaluated on multiple datasets covering 12 commonly occurring RNA modifications, including m6A, m5C, m1A and others. The results demonstrate that our hybrid model outperforms other models in terms of the area under the receiver operating characteristic curve (AUC) for all 12 types of RNA modifications. MRM-BERT is available as an online tool (http://117.122.208.21:8501) and as source code (https://github.com/abhhba999/MRM-BERT), both of which allow users to predict RNA modification sites and visualize the results. Overall, our study provides an effective and efficient approach for predicting multiple RNA modifications, contributing to the understanding of RNA biology and the development of therapeutic strategies.
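
To make the two input views and the evaluation metric named in the abstract more concrete, the following minimal sketch shows how a sequence window might be turned into overlapping k-mer tokens (the input style used by DNABERT), how a simple positional encoding could be built for a CNN branch, and how AUC can be computed from labels and scores. It is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the abstract does not list the four feature encodings (one-hot is used here only as a common example), and the U-to-T mapping and k = 3 are assumptions about how an RNA sequence might be prepared for a DNA-trained model.

```python
from typing import List, Sequence


def to_dna_alphabet(rna: str) -> str:
    """Map an RNA sequence onto the DNA alphabet (U -> T).

    Assumption: a model pre-trained on DNA expects A/C/G/T input.
    """
    return rna.upper().replace("U", "T")


def kmer_tokens(seq: str, k: int = 3) -> List[str]:
    """Split a sequence into overlapping k-mers with stride 1."""
    if k <= 0 or k > len(seq):
        raise ValueError("k must be between 1 and len(seq)")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def one_hot(seq: str, alphabet: str = "ACGT") -> List[List[int]]:
    """One row per position; unknown bases (e.g. N) become all zeros."""
    index = {base: i for i, base in enumerate(alphabet)}
    rows = []
    for base in seq:
        row = [0] * len(alphabet)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC: the probability that a random positive site scores
    higher than a random negative site, counting ties as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    window = to_dna_alphabet("ggacu")   # toy 5-nt window, not real data
    print(kmer_tokens(window, k=3))     # tokens for the language-model branch
    print(one_hot(window))              # matrix for the CNN branch
    print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))  # toy scores
```

In a fused model of the kind the abstract describes, the token sequence would feed the pre-trained representation module and the positional matrix would feed the CNN, with the two outputs combined before the final classifier; how MRM-BERT performs that combination is not specified in the abstract.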

Acknowledgments

We thank Leibo Liu at Peking University for his technical assistance in configuring the server environment.

Disclosure statement

The authors report no relevant financial or non-financial conflicts of interest.

Authors’ contributions

YZ and LW conceptualized the study; LW and YZ designed the methodology; LW performed the analysis; LW and YZ established the online software; LW drafted the manuscript; YZ supervised the study and revised the manuscript. All authors read and approved the final manuscript.

Data availability statement

The data that support the findings of this study are openly available in FigShare at doi: 10.6084/m9.figshare.24873195, reference number 24873195. The source code of this study is openly available on GitHub at https://github.com/abhhba999/MRM-BERT.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/15476286.2024.2315384

Additional information

Funding

This study was supported by the National Key Research and Development Program of China (2021YFF1201201 to YZ) and the National Natural Science Foundation of China (32070658 to YZ).