Review

Use of profile hidden Markov models in viral discovery: current insights

Pages 29-45 | Published online: 14 Jul 2017

References

  • Durbin R, Eddy SR, Krogh A, Mitchison G. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge: Cambridge University Press; 1998.
  • Eisen JA. Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res. 1998;8(3):163–167.
  • Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–3402.
  • Krogh A, Brown M, Mian IS, Sjolander K, Haussler D. Hidden Markov models in computational biology. Applications to protein modeling. J Mol Biol. 1994;235(5):1501–1531.
  • Grose JH, Casjens SR. Understanding the enormous diversity of bacteriophages: the tailed phages that infect the bacterial family Enterobacteriaceae. Virology. 2014;468–470:421–443.
  • Koonin EV. Temporal order of evolution of DNA replication systems inferred by comparison of cellular and viral DNA polymerases. Biol Direct. 2006;1:39.
  • Caston JR, Carrascosa JL. The basic architecture of viruses. Subcell Biochem. 2013;68:53–75.
  • Borderia AV, Stapleford KA, Vignuzzi M. RNA virus population diversity: implications for inter-species transmission. Curr Opin Virol. 2011;1(6):643–648.
  • Simmonds P. Genetic diversity and evolution of hepatitis C virus – 15 years on. J Gen Virol. 2004;85(Pt 11):3173–3188.
  • Jachiet PA, Colson P, Lopez P, Bapteste E. Extensive gene remodeling in the viral world: new evidence for nongradual evolution in the mobilome network. Genome Biol Evol. 2014;6(9):2195–2205.
  • Brenner SE, Chothia C, Hubbard TJ. Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships. Proc Natl Acad Sci U S A. 1998;95(11):6073–6078.
  • Marz M, Beerenwinkel N, Drosten C, et al. Challenges in RNA virus bioinformatics. Bioinformatics. 2014;30(13):1793–1799.
  • Fancello L, Raoult D, Desnues C. Computational tools for viral metagenomics and their application in clinical research. Virology. 2012;434(2):162–174.
  • Park J, Karplus K, Barrett C, Hughey R, Haussler D, Hubbard T, Chothia C. Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J Mol Biol. 1998;284(4):1201–1210.
  • Paez-Espino D, Eloe-Fadrosh EA, Pavlopoulos GA, et al. Uncovering Earth’s virome. Nature. 2016;536(7617):425–430.
  • Sakowski EG, Munsell EV, Hyatt M, et al. Ribonucleotide reductases reveal novel viral diversity and predict biological and ecological features of unknown marine viruses. Proc Natl Acad Sci U S A. 2014;111(44):15786–15791.
  • Schmidt HF, Sakowski EG, Williamson SJ, Polson SW, Wommack KE. Shotgun metagenomics indicates novel family A DNA polymerases predominate within marine virioplankton. ISME J. 2014;8(1):103–114.
  • Rowe JM, Fabre MF, Gobena D, Wilson WH, Wilhelm SW. Application of the major capsid protein as a marker of the phylogenetic diversity of Emiliania huxleyi viruses. FEMS Microbiol Ecol. 2011;76(2):373–380.
  • Hopkins M, Kailasan S, Cohen A, et al. Diversity of environmental single-stranded DNA phages revealed by PCR amplification of the partial major capsid protein. ISME J. 2014;8(10):2093–2103.
  • Dwivedi B, Xue B, Lundin D, Edwards RA, Breitbart M. A bioinformatic analysis of ribonucleotide reductase genes in phage genomes and metagenomes. BMC Evol Biol. 2013;13:33.
  • Hevroni G, Enav H, Rohwer F, Beja O. Diversity of viral photosystem-I psaA genes. ISME J. 2015;9(8):1892–1898.
  • Goldsmith DB, Parsons RJ, Beyene D, Salamon P, Breitbart M. Deep sequencing of the viral phoH gene reveals temporal variation, depth-specific composition, and persistent dominance of the same viral phoH genes in the Sargasso Sea. PeerJ. 2015;3:e997.
  • Greninger AL, Naccache SN, Federman S, et al. Rapid metagenomic identification of viral pathogens in clinical samples by real-time nanopore sequencing analysis. Genome Med. 2015;7:99.
  • Boltz VF, Rausch J, Shao W, et al. Ultrasensitive single-genome sequencing: accurate, targeted, next generation sequencing of HIV-1 RNA. Retrovirology. 2016;13(1):87.
  • Zou X, Tang G, Zhao X, et al. Simultaneous virus identification and characterization of severe unexplained pneumonia cases using a metagenomics sequencing technique. Sci China Life Sci. 2017;60(3):279–286.
  • Rodgers MA, Wilkerson E, Vallari A, et al. Sensitive next generation sequencing method reveals deep genetic diversity of HIV-1 in the Democratic Republic of the Congo. J Virol. 2017;91(6):e01841-16.
  • Trebbien R, Pedersen SS, Vorborg K, Franck KT, Fischer TK. Development of oseltamivir and zanamivir resistance in influenza A(H1N1)pdm09 virus, Denmark, 2014. Euro Surveill. 2017;22(3):30445.
  • Hernandez D, Yu F, Huang X, Kirov S, Pant S, McPhee F. Impact of pre-existing NS5A-L31 or -Y93H minor variants on response rates in patients infected with HCV genotype-1b treated with daclatasvir/asunaprevir. Adv Ther. 2016;33(7):1169–1179.
  • Zhao Q, Wen Y, Jiang Y, et al. Next generation sequencing-based investigation of potential patient-to-patient hepatitis C virus transmission during hemodialytic treatment. PLoS One. 2016;11(1):e0147566.
  • Campo DS, Dimitrova Z, Yamasaki L, et al. Next-generation sequencing reveals large connected networks of intra-host HCV variants. BMC Genomics. 2014;15(Suppl 5):S4.
  • Qiu P, Stevens R, Wei B, et al. HCV genotyping from NGS short reads and its application in genotype detection from HCV mixed infected plasma. PLoS One. 2015;10(4):e0122082.
  • Sharma D, Priyadarshini P, Vrati S. Unraveling the web of viroinformatics: computational tools and databases in virus research. J Virol. 2015;89(3):1489–1501.
  • Brister JR, Ako-Adjei D, Bao Y, Blinkova O. NCBI viral genomes resource. Nucleic Acids Res. 2015;43(Database issue):D571–D577.
  • Finn RD, Coggill P, Eberhardt RY, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44(D1):D279–D285.
  • Skewes-Cox P, Sharpton TJ, Pollard KS, DeRisi JL. Profile hidden Markov models for the detection of viruses within metagenomic sequence data. PLoS One. 2014;9(8):e105067.
  • Grazziotin AL, Koonin EV, Kristensen DM. Prokaryotic Virus Orthologous Groups (pVOGs): a resource for comparative genomics and protein family annotation. Nucleic Acids Res. 2017;45(Database issue):D491–D498.
  • Huerta-Cepas J, Szklarczyk D, Forslund K, et al. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res. 2016;44(D1):D286–D293.
  • Kristensen DM, Cai X, Mushegian A. Evolutionarily conserved orthologous families in phages are relatively rare in their prokaryotic hosts. J Bacteriol. 2011;193(8):1806–1814.
  • Kristensen DM, Waller AS, Yamada T, Bork P, Mushegian AR, Koonin EV. Orthologous gene clusters and taxon signature genes for viruses of prokaryotes. J Bacteriol. 2013;195(5):941–950.
  • Galperin MY, Makarova KS, Wolf YI, Koonin EV. Expanded microbial genome coverage and improved protein family annotation in the COG database. Nucleic Acids Res. 2015;43(Database issue):D261–D269.
  • Alves JM, de Oliveira AL, Sandberg TO, et al. GenSeed-HMM: a tool for progressive assembly using profile HMMs as seeds and its application in Alpavirinae viral discovery from metagenomic data. Front Microbiol. 2016;7:269.
  • Sobreira TJ, Gruber A. Sequence-specific reconstruction from fragmentary databases using seed sequences: implementation and validation on SAGE, proteome and generic sequencing data. Bioinformatics. 2008;24(15):1676–1680.
  • Zhang Y, Sun Y, Cole JR. A scalable and accurate targeted gene assembly tool (SAT-Assembler) for next-generation sequencing data. PLoS Comput Biol. 2014;10(8):e1003737.
  • Gregor I, Schonhuth A, McHardy AC. Snowball: strain aware gene assembly of metagenomes. Bioinformatics. 2016;32(17):i649–i657.
  • Li D, Huang Y, Leung HCM, Luo R, Ting HF, Lam TW. MegaGTA: a sensitive and accurate metagenomic gene-targeted assembler using iterative de Bruijn graphs. Lect Notes Comput Sci. 2016;9683:309.
  • Wang Q, Fish JA, Gilman M, Sun Y, Brown CT, Tiedje JM, Cole JR. Xander: employing a novel method for efficient gene-targeted metagenomic assembly. Microbiome. 2015;3:32.
  • Eddy SR. Accelerated profile HMM searches. PLoS Comput Biol. 2011;7(10):e1002195.
  • Zhong C, Edlund A, Yang Y, McLean JS, Yooseph S. Metagenome and metatranscriptome analyses using protein family profiles. PLoS Comput Biol. 2016;12(7):e1004991.
  • Hunt M, Gall A, Ong SH, et al. IVA: accurate de novo assembly of RNA virus genomes. Bioinformatics. 2015;31(14):2374–2376.
  • Ruby JG, Bellare P, Derisi JL. PRICE: software for the targeted assembly of components of (Meta) genomic sequence data. G3 (Bethesda). 2013;3(5):865–880.
  • Smits SL, Bodewes R, Ruiz-Gonzalez A, Baumgärtner W, Koopmans MP, Osterhaus ADME, Schürch AC. Recovering full-length viral genomes from metagenomes. Front Microbiol. 2015;6:1069.
  • Prosperi MC, Ciccozzi M, Fanti I, et al. A novel methodology for large-scale phylogeny partition. Nat Commun. 2011;2:321.
  • de Juan D, Pazos F, Valencia A. Emerging methods in protein co-evolution. Nat Rev Genet. 2013;14(4):249–261.
  • Yin Y, Fischer D. Identification and investigation of ORFans in the viral world. BMC Genomics. 2008;9:24.
  • Llorens C, Futami R, Covelli L, et al. The Gypsy Database (GyDB) of mobile genetic elements: release 2.0. Nucleic Acids Res. 2011;39(Database issue):D70–D74.
  • Foley B, Leitner T, Apetrei C, et al, editors. HIV Sequence Compendium 2013. New Mexico: Theoretical Biology and Biophysics Group, Los Alamos National Laboratory; 2013.
  • Larkin MA, Blackshields G, Brown NP, et al. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23(21):2947–2948.
  • Gibson T, Higgins D, Thompson J [homepage on the Internet]. General help for CLUSTAL X (2.0). Available from: http://www.clustal.org/download/clustalx_help.html. Accessed May 22, 2017.
