ABSTRACT
RNA modifications play crucial roles in various biological processes and diseases, and accurate prediction of RNA modification sites is essential for understanding their functions. In this study, we propose a hybrid approach that fuses a pre-trained sequence representation with several sequence features to predict multiple types of RNA modifications within a single framework. We developed MRM-BERT, a deep learning method that combines the pre-trained DNABERT deep sequence representation module with a convolutional neural network (CNN) that exploits four traditional sequence feature encodings to improve prediction performance. MRM-BERT was evaluated on multiple datasets covering 12 commonly occurring RNA modifications, including m6A, m5C and m1A. The results demonstrate that our hybrid model outperforms other models in terms of the area under the receiver operating characteristic curve (AUC) for all 12 types of RNA modifications. MRM-BERT is available as an online tool (http://117.122.208.21:8501) and as source code (https://github.com/abhhba999/MRM-BERT), allowing users to predict RNA modification sites and visualize the results. Overall, our study provides an effective and efficient approach for predicting multiple RNA modifications, contributing to the understanding of RNA biology and the development of therapeutic strategies.
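For readers less familiar with the evaluation metric, the AUC can be read as the probability that a randomly chosen modified site receives a higher predicted score than a randomly chosen unmodified site. The following is a minimal Python sketch of that definition, not code from the MRM-BERT repository; the function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for a modified (positive) site, 0 for an unmodified (negative) site.
    scores: predicted scores, where higher means "more likely modified".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores, not results from the study):
# three of the four positive/negative pairs are ordered correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))  # 0.75
```

This quadratic pairwise form is meant only to make the definition concrete; for large datasets a rank-based or library implementation would be the usual choice.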
Acknowledgments
We thank Leibo Liu at Peking University for his technical assistance in configuring the server environment.
Disclosure statement
The authors report no relevant financial or non-financial conflicts of interest.
Authors’ contributions
YZ and LW conceptualized the study; LW and YZ designed the methodology; LW performed the analysis; LW and YZ established the online software; LW drafted the manuscript; YZ supervised the study and revised the manuscript. All authors read and approved the final manuscript.
Data availability statement
The data that support the findings of this study are openly available in FigShare at doi: 10.6084/m9.figshare.24873195, reference number 24873195. The source code of this study is openly available on GitHub at https://github.com/abhhba999/MRM-BERT.
Supplementary material
Supplemental data for this article can be accessed online at https://doi.org/10.1080/15476286.2024.2315384