Research Paper

MRM-BERT: a novel deep neural network predictor of multiple RNA modifications by fusing BERT representation and sequence features

Pages 1-10 | Accepted 02 Feb 2024, Published online: 15 Feb 2024

References

  • Zhao LY, Song J, Liu Y, et al. Mapping the epigenetic modifications of DNA and RNA. Protein Cell. 2020;11(11):792–808. doi: 10.1007/s13238-020-00733-7
  • Frye M, Harada BT, Behm M, et al. RNA modifications modulate gene expression during development. Science. 2018;361(6409):1346–1349. doi: 10.1126/science.aau1646
  • Vu LP, Pickering BF, Cheng Y, et al. The N(6)-methyladenosine (m(6)A)-forming enzyme METTL3 controls myeloid differentiation of normal hematopoietic and leukemia cells. Nat Med. 2017;23(11):1369–1376. doi: 10.1038/nm.4416
  • Jana S, Hsieh AC, Gupta R. Reciprocal amplification of caspase-3 activity by nuclear export of a putative human RNA-modifying protein, PUS10 during TRAIL-induced apoptosis. Cell Death Dis. 2017;8(10):e3093. doi: 10.1038/cddis.2017.476
  • Roundtree IA, Evans ME, Pan T, et al. Dynamic RNA modifications in gene expression regulation. Cell. 2017;169(7):1187–1200. doi: 10.1016/j.cell.2017.05.045
  • Zhang Y, Hua W, Dang Y, et al. Validated impacts of N6-methyladenosine methylated mRNAs on apoptosis and angiogenesis in myocardial infarction based on MeRIP-seq analysis. Front Mol Biosci. 2021;8:789923. doi: 10.3389/fmolb.2021.789923
  • Hocq R, Paternina J, Alasseur Q, et al. Monitored eCLIP: high accuracy mapping of RNA-protein interactions. Nucleic Acids Res. 2018;46(21):11553–11565. doi: 10.1093/nar/gky858
  • Gao Y, Liu X, Wu B, et al. Quantitative profiling of N(6)-methyladenosine at single-base resolution in stem-differentiating xylem of Populus trichocarpa using nanopore direct RNA sequencing. Genome Biol. 2021;22(1):22. doi: 10.1186/s13059-020-02241-7
  • Pandey RR, Pillai RS. Counting the cuts: MAZTER-Seq quantifies m(6)A levels using a methylation-sensitive ribonuclease. Cell. 2019;178(3):515–517. doi: 10.1016/j.cell.2019.07.006
  • Meyer KD. DART-seq: an antibody-free method for global m(6)A detection. Nat Methods. 2019;16(12):1275–1280. doi: 10.1038/s41592-019-0570-0
  • Zhou Y, Zeng P, Li YH, et al. SRAMP: prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features. Nucleic Acids Res. 2016;44(10):e91. doi: 10.1093/nar/gkw104
  • Rehman MU, Tayara H, Chong KT. DL-m6A: identification of N6-methyladenosine sites in mammals using deep learning based on different encoding schemes. IEEE/ACM Trans Comput Biol Bioinform. 2023;20(2):904–911. doi: 10.1109/TCBB.2022.3192572
  • Chen K, Wei Z, Zhang Q, et al. WHISTLE: a high-accuracy map of the human N6-methyladenosine (m6A) epitranscriptome predicted using a machine learning approach. Nucleic Acids Res. 2019;47(7):e41. doi: 10.1093/nar/gkz074
  • Zou Q, Xing P, Wei L, et al. Gene2vec: gene subsequence embedding for prediction of mammalian N(6)-methyladenosine sites from mRNA. RNA. 2019;25(2):205–218. doi: 10.1261/rna.069112.118
  • Wang H, Zhao S, Cheng Y, et al. MTDeepM6A-2S: a two-stage multi-task deep learning method for predicting RNA N6-methyladenosine sites of Saccharomyces cerevisiae. Front Microbiol. 2022;13:999506. doi: 10.3389/fmicb.2022.999506
  • Feng P, Chen W. iRNA-m5U: a sequence based predictor for identifying 5-methyluridine modification sites in Saccharomyces cerevisiae. Methods. 2022;203:28–31. doi: 10.1016/j.ymeth.2021.04.013
  • Xiao X, Shao YT, Luo ZT, et al. m5C-HPromoter: an ensemble deep learning predictor for identifying 5-methylcytosine sites in human promoters. Curr Bioinform. 2022;17(5):452–461. doi: 10.2174/1574893617666220330150259
  • Liu Z, Xiao X, Yu DJ, et al. pRNAm-PC: predicting N(6)-methyladenosine sites in RNA sequences via physical-chemical properties. Anal Biochem. 2016;497:60–67. doi: 10.1016/j.ab.2015.12.017
  • Song Z, Huang D, Song B, et al. Attention-based multi-label neural networks for integrated prediction and interpretation of twelve widely occurring RNA modifications. Nat Commun. 2021;12(1):4011. doi: 10.1038/s41467-021-24313-3
  • Desai A, Zumbo A, Giordano M, et al. Word2vec word embedding-based artificial intelligence model in the triage of patients with suspected diagnosis of major ischemic stroke: a feasibility study. Int J Environ Res Public Health. 2022;19(22):15295. doi: 10.3390/ijerph192215295
  • Zhou J, Troyanskaya OG. Predicting effects of noncoding variants with deep learning–based sequence model. Nat Methods. 2015;12(10):931–934. doi: 10.1038/nmeth.3547
  • Devlin J, Chang MW, Lee K, et al. Pre-training of deep bidirectional transformers for language understanding. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2019); Minneapolis, Minnesota, USA. 2019;1:4171–4186.
  • Spitale G, Biller-Andorno N, Germani F. AI model GPT-3 (dis)informs us better than humans. Sci Adv. 2023;9(26). doi: 10.1126/sciadv.adh1850
  • Ji Y, Zhou Z, Liu H, et al. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics. 2021;37(15):2112–2120. doi: 10.1093/bioinformatics/btab083
  • Tang Y, Chen K, Song B, et al. m6A-Atlas: a comprehensive knowledgebase for unraveling the N6-methyladenosine (m6A) epitranscriptome. Nucleic Acids Res. 2021;49(D1):D134–D143. doi: 10.1093/nar/gkaa692
  • Ma J, Song B, Wei Z, et al. m5C-Atlas: a comprehensive database for decoding and annotating the 5-methylcytosine (m5C) epitranscriptome. Nucleic Acids Res. 2022;50(D1):D196–D203. doi: 10.1093/nar/gkab1075
  • Sun WJ, Li JH, Liu S, et al. RMBase: a resource for decoding the landscape of RNA modifications from high-throughput sequencing data. Nucleic Acids Res. 2016;44(D1):D259–D265. doi: 10.1093/nar/gkv1036
  • Xuan J, Chen L, Chen Z, et al. RMBase v3.0: decode the landscape, mechanisms and functions of RNA modifications. Nucleic Acids Res. 2023;52(D1):D273–D284. doi: 10.1093/nar/gkad1070
  • Liu B, Liu F, Wang X, et al. Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences. Nucleic Acids Res. 2015;43(W1):W65–W71. doi: 10.1093/nar/gkv458
  • Chen Z, Zhao P, Li F, et al. iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data. Brief Bioinform. 2020;21(3):1047–1057. doi: 10.1093/bib/bbz041
  • Lee D, Karchin R, Beer MA. Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011;21(12):2167–2180. doi: 10.1101/gr.121905.111
  • Manavalan B, Basith S, Shin TH, et al. 4mCpred-EL: an ensemble learning framework for identification of DNA N(4)-methylcytosine sites in the mouse genome. Cells. 2019;8(11):1332. doi: 10.3390/cells8111332
  • Chen W, Tran H, Liang Z, et al. Identification and analysis of the N(6)-methyladenosine in the Saccharomyces cerevisiae transcriptome. Sci Rep. 2015;5(1):13859. doi: 10.1038/srep13859
  • Chen Z, Chen YZ, Wang XF, et al. Prediction of ubiquitination sites by using the composition of k-spaced amino acid pairs. PLoS One. 2011;6(7):e22930. doi: 10.1371/journal.pone.0022930
  • Lalovic D, Veljkovic V. The global average DNA base composition of coding regions may be determined by the electron-ion interaction potential. Biosystems. 1990;23(4):311–316. doi: 10.1016/0303-2647(90)90013-Q
  • Chen TQ, Guestrin C. XGBoost: a scalable tree boosting system. KDD'16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; San Francisco, California, USA. 2016:785–794.