Research Paper

Nm-Nano: a machine learning framework for transcriptome-wide single-molecule mapping of 2´-O-methylation (Nm) sites in nanopore direct RNA sequencing datasets

Pages 1-15 | Accepted 01 May 2024, Published online: 17 May 2024
