Search in:

Advanced search

Journal of Computational and Graphical Statistics Latest Articles

Submit an article Journal homepage

Views

CrossRef citations to date

Altmetric

Research Article

Variance-Reduced Stochastic Optimization for Efficient Inference of Hidden Markov Models

Evan Sidrowa Department of Statistics, University of British Columbia, Vancouver, CanadaCorrespondence[email protected]

https://orcid.org/0000-0001-5280-8376 View further author information

Nancy Heckmana Department of Statistics, University of British Columbia, Vancouver, CanadaView further author information

Alexandre Bouchard-Côtéa Department of Statistics, University of British Columbia, Vancouver, CanadaView further author information

Sarah M. E. Fortuneb Department of Oceanography, Dalhousie University, Halifax, CanadaView further author information

Andrew W. Tritesc Department of Zoology, Institute for the Oceans and Fisheries, University of British Columbia, Vancouver, CanadaView further author information

Marie Auger-Méthéd Department of Statistics, Institute for the Oceans and Fisheries, University of British Columbia, Vancouver, Canada

https://orcid.org/0000-0003-3550-4930 View further author information

Received 06 Oct 2023, Accepted 24 Apr 2024, Published online: 07 Jun 2024

Cite this article
https://doi.org/10.1080/10618600.2024.2350476
CrossMark

Full Article
Figures & data
References
Supplemental
Citations
Metrics
Reprints & Permissions

References

Adam, T., Griffiths, C., Leos Barajas, V., Meese, E., Lowe, C., et al. (2019), “Joint Modelling of Multi-scale Animal Movement Data using Hierarchical Hidden Markov Models,” Methods in Ecology and Evolution, 10, 1536–1550. DOI: 10.1111/2041-210X.13241.
Web of Science ®Google Scholar
Agarwal, N., Bullins, B., and Hazan, E. (2017), “Second-order Stochastic Optimization for Machine Learning in Linear Time,” The Journal of Machine Learning Research, 18, 4148–4187.
Web of Science ®Google Scholar
Baldi, P., and Chauvin, Y. (1993), “Smooth On-Line Learning Algorithms for Hidden Markov Models,” Neural Computation, 6, 307–318. DOI: 10.1162/neco.1994.6.2.307.
Web of Science ®Google Scholar
Baum, L. E., Petrie, T., Soules, G., and Weiss, N. (1970), “A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains,” The Annals of Mathematical Statistics, 41, 164–171. DOI: 10.1214/aoms/1177697196.
Google Scholar
Bebbington, M. S. (2007), “Identifying Volcanic Regimes using Hidden Markov Models,” Geophysical Journal International, 171, 921–942. DOI: 10.1111/j.1365-246X.2007.03559.x.
Web of Science ®Google Scholar
Boyd, S., and Vandenberghe, L. (2004), Convex Optimization, Cambridge: Cambridge University Press.
Google Scholar
Cade, D. E., Gough, W. T., Czapanskiy, M. F., Fahlbusch, J. A., Kahane-Rapport, S. R., Linsky, J. M. J., et al. (2021), “Tools for Integrating Inertial Sensor Data With Video Bio-Loggers, Including Estimation of Animal Orientation, Motion, and Position,” Animal Biotelemetry, 9, 34. DOI: 10.1186/s40317-021-00256-w.
Google Scholar
Chen, J., and Li, P. (2009), “Hypothesis Test for Normal Mixture Models: The EM Approach,” The Annals of Statistics, 37, 2523–2542. DOI: 10.1214/08-AOS651.
Web of Science ®Google Scholar
Defazio, A., Bach, F., and Lacoste-Julien, S. (2014), “SAGA: A Fast Incremental Gradient Method with Support for Non-Strongly Convex Composite Objectives,” in Advances in Neural Information Processing Systems, NIPS ’14.
Google Scholar
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977), “Maximum Likelihood from Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society, Series B, 39, 1–38. DOI: 10.1111/j.2517-6161.1977.tb01600.x.
Google Scholar
Fang, C., Li, C. J., Lin, Z., and Zhang, T. (2018), “SPIDER: Near-optimal Non-convex Optimization via Stochastic Path-integrated Differential Estimator,” in Advances in Neural Information Processing Systems (Vol. 31).
Google Scholar
Fletcher, R. (2000), Newton-Like Methods, Wiley.
Google Scholar
Fletcher, R., and Reeves, C. M. (1964), “Function Minimization by Conjugate Gradients,” The Computer Journal, 7, 149–154. DOI: 10.1093/comjnl/7.2.149.
Web of Science ®Google Scholar
Gales, M., and Young, S. (2007), “The Application of Hidden Markov Models in Speech Recognition,” Foundations Trends Signal Processing, 1, 195–304.
Google Scholar
Glennie, R., Adam, T., Leos-Barajas, V., Michelot, T., Photopoulou, T., and McClintock, B. T. (2023), “Hidden Markov Models: Pitfalls and Opportunities in Ecology,” Methods in Ecology and Evolution, 14, 43–56. DOI: 10.1111/2041-210X.13801.
Web of Science ®Google Scholar
Gotoh, Y., Hochberg, M., and Silverman, H. (1998), “Efficient Training Algorithms for HMMs Using Incremental Estimation,” IEEE Transactions on Speech and Audio Processing, 6, 539–548. DOI: 10.1109/89.725320.
Google Scholar
Gürbüzbalaban, M., Ozdaglar, A., and Parrilo, P. (2021), “Why Random Reshuffling Beats Stochastic Gradient Descent,” Mathematical Programming, 186, 49–84. DOI: 10.1007/s10107-019-01440-w.
Web of Science ®Google Scholar
Jensen, F. H., Tervo, O. M., Heide-Jørgensen, M. P., and Ditlevsen, S. (2023), “Detecting Narwhal Foraging Behaviour From Accelerometer and Depth Data Using Mixed-Effects Logistic Regression,” Animal Biotelemetry, 11, 14. DOI: 10.1186/s40317-023-00325-2.
Google Scholar
Johnson, R., and Zhang, T. (2013), “Accelerating Stochastic Gradient Descent Using Predictive Variance Reduction,” in Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 1, neurIPS, Red Hook, NY, USA: Curran Associates Inc.
Google Scholar
Karimi, B., Wai, H.-T., Moulines, E., and Lavielle, M. (2019), “On the Global Convergence of (Fast) Incremental Expectation Maximization Methods,” in Advances in Neural Information Processing Systems (Vol. 32), Curran Associates, Inc.
Google Scholar
Khreich, W., Granger, E., Miri, A., and Sabourin, R. (2012), “Survey of Techniques for Incremental Learning Of HMM Parameters,” Information Sciences, 197, 105–130. DOI: 10.1016/j.ins.2012.02.017.
Web of Science ®Google Scholar
Kingma, D., and Ba, J. (2015), “Adam: A Method for Stochastic Optimization,” in International Conference on Learning Representations (ICLR).
Google Scholar
Kleinberg, B., Li, Y., and Yuan, Y. (2018), “An Alternative View: When Does SGD Escape Local Minima?” in Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, eds. J. Dy and A. Krause, PMLR.
Google Scholar
Kottaram, A., Johnston, L., Cocchi, L., Ganella, E., Everall, I., Pantelis, C., et al. (2019), “Brain Network Dynamics in Schizophrenia: Reduced Dynamism of the Default Mode Network,” Human Brain Mapping, 40, 2212–2228. DOI: 10.1002/hbm.24519.
PubMed Web of Science ®Google Scholar
Langrock, R., Adam, T., Leos Barajas, V., Mews, S., Miller, D. L., and Papastamatiou, Y. P. (2018), “Spline-Based Nonparametric Inference in General State-Switching Models,” Statistica Neerlandica, 72, 179–200. DOI: 10.1111/stan.12133.
Web of Science ®Google Scholar
Lawler, E., Whoriskey, K., Aeberhard, W. H., Field, C., and Mills Flemming, J. (2019), “The Conditionally Autoregressive Hidden Markov Model (CarHMM): Inferring Behavioural States from Animal Tracking Data Exhibiting Conditional Autocorrelation,” Journal of Agricultural, Biological and Environmental Statistics, 24, 651–668. DOI: 10.1007/s13253-019-00366-2.
Web of Science ®Google Scholar
Leos Barajas, V., Gangloff, E. J., Adam, T., Langrock, R., van Beest, F. M., Nabe-Nielsen, J., and Morales, J. M. (2017), “Multi-Scale Modeling of Animal Movement and General Behavior Data Using Hidden Markov Models with Hierarchical Structures,” Journal of Agricultural, Biological and Environmental Statistics, 22, 232–248. DOI: 10.1007/s13253-017-0282-9.
Web of Science ®Google Scholar
Li, Y. (2005), “Hidden Markov Models with States Depending on Observations,” Pattern Recognition Letters, 26, 977–984. DOI: 10.1016/j.patrec.2004.09.050.
Web of Science ®Google Scholar
Li, Z., Han, J., and Song, Y. (2020), “On the Forecasting of High-Frequency Financial Time Series Based on Arima Model Improved by Deep Learning,” Journal of Forecasting, 39, 1081–1097. DOI: 10.1002/for.2677.
Web of Science ®Google Scholar
Liu, S., Wu, H., and Meeker, W. Q. (2015), “Understanding and Addressing the Unbounded ‘Likelihood’ Problem,” The American Statistician, 69, 191–200. DOI: 10.1080/00031305.2014.1003968.
Web of Science ®Google Scholar
Mamon, R. S., and Elliott, R. J. (2007), Hidden Markov Models in Finance, New York: Springer.
Google Scholar
McClintock, B. T., Langrock, R., Gimenez, O., Cam, E., Borchers, D. L., Glennie, R., and Patterson, T. A. (2020), “Uncovering Ecological State Dynamics With Hidden Markov Models,” Ecology Letters, 23, 1878–1903. DOI: 10.1111/ele.13610.
PubMed Web of Science ®Google Scholar
McRae, T., Volpov, B. E., Sidrow, S. F., Auger-Méthé, M., Heckman, N., and Trites, A. (2024), “Killer Whale Respiration Rates,” PloS One, 19, 1–26. DOI: 10.1371/journal.pone.0302758.
Google Scholar
Neal, R. M., and Hinton, G. E. (1998), A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants, chapter 12, pp. 355–368, Dordrecht: Springer Netherlands.
Google Scholar
Nguyen, L. M., Liu, J., Scheinberg, K., and Takáč, M. (2017), “SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient,” in International Conference on Machine Learning, PMLR.
Google Scholar
Patterson, T. A., Parton, A., Langrock, R., Blackwell, P. G., Thomas, L., and King, R. (2017), “Statistical Modelling of Individual Animal Movement: An Overview of Key Methods and a Discussion of Practical Challenges,” Advances in Statistical Analysis, 101, 399–438.
Web of Science ®Google Scholar
Pearl, J. (1982), “Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach,” in Proceedings of the Second AAAI Conference on Artificial Intelligence, AAAI’82, AAAI Press.
Google Scholar
———(1988), “Chapter 4: Belief Updating by Network Propagation,” in Probabilistic Reasoning in Intelligent Systems, San Francisco (CA): Morgan Kaufmann, pp. 143–237.
Google Scholar
Pirotta, E., Edwards, E. W. J., New, L., and Thompson, P. M. (2018), “Central Place Foragers and Moving Stimuli: A Hidden-State Model to Discriminate the Processes Affecting Movement,” Journal of Animal Ecology, 87, 1116–1125. DOI: 10.1111/1365-2656.12830.
PubMed Web of Science ®Google Scholar
Pohle, J., Langrock, R., van Beest, M., and Schmidt, N. M. (2017), “Selecting the Number of States in Hidden Markov Models: Pragmatic Solutions Illustrated Using Animal Movement,” Journal of Agricultural, Biological and Environmental Statistics, 22, 1–24. DOI: 10.1007/s13253-017-0283-8.
Web of Science ®Google Scholar
Robbins, H., and Monro, S. (1951), “A Stochastic Approximation Method,” The Annals of Mathematical Statistics, 22, 400–407. DOI: 10.1214/aoms/1177729586.
Google Scholar
Schmidt, M., Le Roux, N., and Bach, F. (2017), “Minimizing Finite Sums With the Stochastic Average Gradient,” Mathematical Programming, 162, 83–112. DOI: 10.1007/s10107-016-1030-6.
Web of Science ®Google Scholar
Shamir, O. (2016), “Without-Replacement Sampling for Stochastic Gradient Methods,” in Advances in Neural Information Processing Systems (Vol. 29), Curran Associates, Inc.
Google Scholar
Sidrow, E., Heckman, N., Fortune, S. M. E., Trites, A. W., Murphy, I., and Auger-Méthé, M. (2022), “Modelling Multi-Scale, State-Switching Functional Data with Hidden Markov Models,” Canadian Journal of Statistics, 50, 327–356. DOI: 10.1002/cjs.11673.
Web of Science ®Google Scholar
Tamposis, I. A., Theodoropoulou, M. C., Tsirigos, K. D., and Bagos, P. G. (2018), “Extending Hidden Markov Models to Allow Conditioning on Previous Observations,” Journal of Bioinformatics and Computational Biology, 16, 1850019. DOI: 10.1142/S0219720018500191.
PubMed Web of Science ®Google Scholar
Tennessen, J. B., Holt, M. M., Hanson, M. B., Emmons, C. K., Giles, D. A., and Hogan, J. T. (2019a), “Kinematic Signatures of Prey Capture From Archival Tags Reveal Sex Differences in Killer Whale Foraging Activity,” Journal of Experimental Biology, 222, jeb191874. DOI: 10.1242/jeb.191874.
PubMed Web of Science ®Google Scholar
Tennessen, J. B., Holt, M. M., Ward, E. J., Hanson, M. B., Emmons, C. K., Giles, D. A., and Hogan, J. T. (2019b), “Hidden Markov Models Reveal Temporal Patterns and Sex Differences in Killer Whale Behavior,” Scientific Reports, 9, 14951. DOI: 10.1038/s41598-019-50942-2.
PubMed Web of Science ®Google Scholar
Tennessen, J. B., Holt, M. M., Wright, B. M., Hanson, M. B., Emmons, C. K., Giles, D. A., Hogan, J. T., Thornton, S. J., and Deecke, V. B. (2023), “Divergent Foraging Strategies Between Populations of Sympatric Matrilineal Killer Whales,” Behavioral Ecology, 34, 373–386. DOI: 10.1093/beheco/arad002.
PubMed Web of Science ®Google Scholar
Thiesson, B., Meek, C., and Heckerman, D. (2001), “Accelerating EM For Large Databases,” Machine Learning, 45, 279–299. DOI: 10.1023/A:1017986506241.
Web of Science ®Google Scholar
Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., et al. (2019), “SciPy 1.0–Fundamental Algorithms for Scientific Computing in Python,” arXiv e-prints, arXiv:1907.10121.
Google Scholar
Viviant, M., Monestiez, P., and Guinet, C. (2014), “Can We Predict Foraging Success in a Marine Predator from Dive Patterns Only? Validation with Prey Capture Attempt Data,” PloS One, 9, 1–10. DOI: 10.1371/journal.pone.0088503.
Web of Science ®Google Scholar
Wright, B. M., Ford, J. K. B., Ellis, G. M., Deecke, V. B., Shapiro, A. D., Battaile, B. C., and Trites, A. W. (2017), “Fine-Scale Foraging Movements by Fish-Eating Killer Whales (Orcinus Orca) Relate to the Vertical Distributions and Escape Responses of Salmonid Prey (Oncorhynchus Spp.),” Movement Ecology, 5, 3. DOI: 10.1186/s40462-017-0094-0.
PubMed Web of Science ®Google Scholar
Wu, C. F. J. (1983), “On the Convergence Properties of the EM Algorithm,” The Annals of Statistics, 11, 95–103. DOI: 10.1214/aos/1176346060.
Web of Science ®Google Scholar
Ye, F. X., Ma, Y., and Qian, H. (2017), “Estimate Exponential Memory Decay in Hidden Markov Model and its Applications,” arXiv preprint arXiv:1710.06078.
Google Scholar
Zhang, G., Poupart, P., and Trimponias, G. (2020), “Comparing EM with GD in Mixture Models of Two Components,” in Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, volume 115 of Proceedings of Machine Learning Research, eds. R. P. Adams and V. Gogate, PMLR.
Google Scholar
Zhu, R., Wang, L., Zhai, C., and Gu, Q. (2017), “High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm,” in Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, PMLR.
Google Scholar
Zinkevich, M., Weimer, M., Li, L., and Smola, A. (2010), “Parallelized Stochastic Gradient Descent,” in Advances in Neural Information Processing Systems (Vol. 23), eds. J. Lafferty, C. Williams, J. Shawe-Taylor, R. Zemel, and A. Culotta, Curran Associates, Inc.
Google Scholar
Zucchini, W., Macdonald, I. L., and Langrock, R. (2016), Hidden Markov Models for Time Series - An Introduction Using R, Boca Raton, FL: CRC Press.
Google Scholar

Reprints and Corporate Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

To request a reprint or corporate permissions for this article, please click on the relevant link below:

Order Reprints Request Corporate Permissions

Academic Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

Obtain permissions instantly via Rightslink by clicking on the button below:

Request Academic Permissions

If you are unable to obtain permissions via Rightslink, please complete and submit this Permissions form. For more information, please visit our Permissions help page.

Share icon
Back to Top

Related research

People also read lists articles that other readers of this article have read.

Recommended articles lists articles that we recommend and is powered by our AI driven recommendation engine.

Cited by lists all citing articles based on Crossref citations.
Articles with the Crossref icon will open in a new tab.

People also read
Recommended articles
Cited by

To cite this article:

Reference style: APA Chicago Harvard

Citation copied to clipboard

Reference styles above use APA (6th edition), Chicago (16th edition) & Harvard (10th edition)

Download citation

Download a citation file in RIS format that can be imported by citation management software including EndNote, ProCite, RefWorks and Reference Manager.

Choose format: RIS BibTex RefWorks Direct Export

Choose options: Citation Citation & abstract Citation & references

Variance-Reduced Stochastic Optimization for Efficient Inference of Hidden Markov Models

References

Information for

Open access

Opportunities

Help and information

Your download is now in progress and you may close this window

Login or register to access this feature

Variance-Reduced Stochastic Optimization for Efficient Inference of Hidden Markov Models

References

Reprints and Corporate Permissions

Academic Permissions

Related research

To cite this article:

Download citation

Information for

Open access

Opportunities

Help and information

Keep up to date