Bayesian Semiparametric Local Clustering of Multiple Time Series Data: Technometrics: Vol 66, No 2

Abstract

In multiple time series data, clustering the component profiles can identify meaningful latent groups while also detecting interesting change points in their trajectories. Conventional time series clustering methods, however, suffer the drawback of requiring the co-clustered units to have the same cluster membership throughout the entire time domain. In contrast to these “global” clustering methods, we develop a Bayesian “local” clustering method that allows the functions to flexibly change their cluster memberships over time. We design a Markov chain Monte Carlo algorithm to implement our method. We illustrate the method in several real-world datasets, where time-varying cluster memberships provide meaningful inferences about the underlying processes. These include a public health dataset to showcase the more detailed inference our method can provide over global clustering alternatives, and a temperature dataset to demonstrate our method’s utility as a flexible change point detection method. Supplemental materials for this article, including R codes implementing the method, are available online.
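To make the distinction between "global" and "local" clustering concrete, the following minimal sketch represents local cluster memberships as a matrix with one row per series and one column per time point. A global clustering would instead assign a single label to each series. This is not the authors' method or their R code: the label values are made up for illustration, and the helper names `change_points` and `co_clustered` are hypothetical. It only shows how time-varying memberships, once estimated by some procedure, can be read as change points and as time-specific co-clustering.

```python
import numpy as np

# Hypothetical toy labels (not from the article): 3 series, 6 time points.
# Entry [i, t] is the cluster label of series i at time index t.
local_labels = np.array([
    [0, 0, 0, 1, 1, 1],   # series 0 moves to cluster 1 at time index 3
    [0, 0, 0, 0, 0, 0],   # series 1 keeps the same membership throughout
    [1, 1, 0, 0, 0, 0],   # series 2 moves to cluster 0 at time index 2
])

def change_points(labels):
    """For each series, list the time indices where its label differs
    from the label at the previous time index."""
    switched = labels[:, 1:] != labels[:, :-1]
    return [(np.flatnonzero(row) + 1).tolist() for row in switched]

def co_clustered(labels, i, j):
    """List the time indices at which series i and j share a cluster."""
    return np.flatnonzero(labels[i] == labels[j]).tolist()

cps = change_points(local_labels)
shared = co_clustered(local_labels, 0, 1)
```

Under a global clustering, series 0 and series 1 would have to be either together or apart over the whole time domain. In this toy matrix they share a cluster only at the first three time indices.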

Supplementary Materials

The supplementary materials detail the choice of hyper-parameters and the MCMC algorithm used to sample from the posterior. We also include additional figures demonstrating the local clustering method’s ability to recover individual-specific curves. The data for our simulation experiment can be accessed as a separate csv file from the online supplementary materials accompanying this article. R codes implementing and demonstrating the methods developed in this article are also included in the online supplementary materials. Manuals for the codes and a ReadMe file providing additional details on how data should be formatted for compatibility with our codes are also included.

Acknowledgments

We thank the Editor, Dr. Robert Gramacy, an anonymous Associate Editor, and three anonymous referees for their thorough review of the originally submitted manuscript and their many constructive comments and suggestions which led to a significantly improved final article.

Disclosure Statement

There are no relevant financial or non-financial competing interests to report here.

Notes

1 This article concentrates specifically on the analysis of multiple time series data where each constituent series pertains to the same variable. Such data may be obtained, for example, as (a) multiple univariate time series from a set of different but comparable sources over the same time period; or (b) multiple records collected from the same source over different recurrent time cycles of the same length. The two real datasets analyzed in Section 3 of the main article each correspond to one of these two scenarios.

Additional information

Funding

This work was supported in part by grant DMS-1953712 from the National Science Foundation.
