Research Articles

Signal Classification in Large-Scale Multi-Sequence Integrative Analysis Under the HMM Dependence

Pages 182-195 | Received 26 Feb 2023, Accepted 06 Sep 2023, Published online: 18 Oct 2023
 

Abstract

The integrative analysis of multiple sequences of multiple tests has become increasingly popular in many applications, especially in large-scale genomics. In large-scale multiple testing, the concept of signal classification was recently developed for cases where the same features are involved in several independent studies, with the goal of classifying each feature into one of several classes. This article considers such signal classification in a generalized compound decision-making framework, where the observed data are assumed to be generated from an underlying four-state Cartesian hidden Markov model (HMM). Two oracle procedures are proposed, one for total and one for set-specific control of misclassification rates, each maximizing the number of correct classifications subject to that control. Optimal data-driven procedures are also proposed, and their asymptotic properties are derived. It is shown that signal classification can be improved significantly by taking into account the dependence structure among features, and that the proposed procedures can outperform competitors that ignore this dependence structure. The proposed methods are applied to a psychiatric genetics study to detect genetic variants that affect bipolar disorder, schizophrenia, or both.
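To make the modeling setup concrete, the following is a minimal sketch, not the article's procedure. It assumes two test sequences whose joint null/non-null status follows a four-state Markov chain over {0,1} x {0,1}, with unit-variance Gaussian observations whose means are 0 under the null and a hypothetical value `mu` otherwise. The function names, the transition matrix, and the plain maximum-posterior classification rule are all illustrative assumptions; the oracle and data-driven procedures with misclassification-rate control are not reproduced here.

```python
import numpy as np

def simulate_cartesian_hmm(m, trans, init, mu=(2.0, 2.0), seed=0):
    """Simulate m features from a four-state HMM (illustrative assumptions).

    Joint state s in {0,1,2,3} encodes (theta1, theta2) = divmod(s, 2),
    where theta_k = 1 means the feature is a signal in study k.
    """
    rng = np.random.default_rng(seed)
    states = np.empty(m, dtype=int)
    states[0] = rng.choice(4, p=init)
    for i in range(1, m):
        states[i] = rng.choice(4, p=trans[states[i - 1]])
    theta = np.stack([states // 2, states % 2], axis=1)
    # Assumed emission model: N(0, 1) under the null, N(mu_k, 1) otherwise.
    x = rng.normal(loc=theta * np.asarray(mu), scale=1.0)
    return states, x

def normal_pdf(x, mean):
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2.0 * np.pi)

def posterior_state_probs(x, trans, init, mu=(2.0, 2.0)):
    """Scaled forward-backward: P(state_i = s | all observations)."""
    m = x.shape[0]
    emis = np.empty((m, 4))
    for s in range(4):
        t1, t2 = divmod(s, 2)
        emis[:, s] = normal_pdf(x[:, 0], t1 * mu[0]) * normal_pdf(x[:, 1], t2 * mu[1])
    alpha = np.empty((m, 4))
    beta = np.empty((m, 4))
    a = init * emis[0]
    alpha[0] = a / a.sum()
    for i in range(1, m):
        a = (alpha[i - 1] @ trans) * emis[i]
        alpha[i] = a / a.sum()  # rescale to avoid numerical underflow
    beta[-1] = 1.0
    for i in range(m - 2, -1, -1):
        b = trans @ (emis[i + 1] * beta[i + 1])
        beta[i] = b / b.sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical "sticky" transition matrix: neighboring features tend to share a state.
trans = np.full((4, 4), 0.1 / 3)
np.fill_diagonal(trans, 0.9)
init = np.full(4, 0.25)

states, x = simulate_cartesian_hmm(500, trans, init)
post = posterior_state_probs(x, trans, init)
labels = post.argmax(axis=1)  # naive maximum-posterior classification
```

The posterior probabilities use the whole sequence, so a feature's classification borrows strength from its neighbors. This is the kind of dependence information that a method treating features as independent would discard.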

Supplementary Materials

Supplementary.pdf: The supplementary file contains the proofs of the theoretical results presented in this article.

CodeAndData.zip: Computer code for implementing the proposed methods and the real data used in Section 5.

Acknowledgments

The authors thank the Editor, the Associate Editor, and the anonymous referees for their constructive comments and suggestions, which significantly improved the quality of the article.

Disclosure Statement

The authors report there are no competing interests to declare.

Additional information

Funding

This work was supported by the National Key R&D Program of China [2022YFA1003801; 2021YFA1000101; 2021YFA1000102], the National Natural Science Foundation of China [12201382; 12071144; 71931004], the Basic Research Project of Shanghai Science and Technology Commission [22JC1400800], and an NSF grant [DMS-1914639].
