Comparison and Bayesian Estimation of Feature Allocations

David B. Dahl, Department of Statistics, Brigham Young University, Provo, UT. Correspondence: [email protected]

https://orcid.org/0000-0002-8173-1547

Devin J. Johnson, Department of Statistical Science, Duke University, Durham, NC

https://orcid.org/0000-0003-2619-6649

R. Jacob Andros, Department of Statistics, Brigham Young University, Provo, UT

https://orcid.org/0000-0002-1289-385X

ABSTRACT

Feature allocation models postulate a sampling distribution whose parameters are derived from shared features. Bayesian models place a prior distribution on the feature allocation, and Markov chain Monte Carlo is typically used for model fitting, which results in thousands of feature allocations sampled from the posterior distribution. Based on these samples, we propose a method to provide a point estimate of a latent feature allocation. First, we introduce FARO loss, a function between feature allocations which satisfies quasi-metric properties and allows for comparing feature allocations with differing numbers of features. The loss involves finding the optimal feature ordering among all possible orderings, but computational feasibility is achieved by framing this task as a linear assignment problem. We also introduce the FANGS algorithm to obtain a Bayes estimate by minimizing the Monte Carlo estimate of the posterior expected FARO loss using the available samples. FANGS can produce an estimate other than those visited in the Markov chain. We provide an investigation of existing methods and our proposed methods. Our loss function and search algorithm are implemented in the fangs package in R.
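As a rough illustration of how the search over feature orderings can be framed as a linear assignment problem, the sketch below computes a permutation-minimized Hamming distance between two binary feature allocations, padding the smaller one with empty features so that allocations with differing numbers of features can be compared. This is a stand-in under stated assumptions, not the FARO loss as defined in the paper and not the API of the fangs package: the cost (plain Hamming distance between feature columns), the function names `solve_assignment` and `permutation_min_hamming`, and the pure-Python Hungarian-style solver are all hypothetical choices made for illustration.

```python
def solve_assignment(cost):
    """Minimum-cost perfect matching for a square cost matrix.

    Hungarian-style algorithm with potentials, O(n^3).
    Returns (total_cost, match) where match[row] is the assigned column.
    """
    n = len(cost)
    inf = float("inf")
    u = [0] * (n + 1)      # row potentials (1-indexed)
    v = [0] * (n + 1)      # column potentials (1-indexed)
    p = [0] * (n + 1)      # p[j] = row currently matched to column j
    way = [0] * (n + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    total = sum(cost[i][match[i]] for i in range(n))
    return total, match


def hamming(f, g):
    """Number of items on which two binary feature columns disagree."""
    return sum(x != y for x, y in zip(f, g))


def permutation_min_hamming(a, b, n_items):
    """Illustrative loss (an assumption, not the paper's FARO definition).

    a and b are lists of features; each feature is a tuple of 0/1
    indicators over n_items items. The shorter list is padded with empty
    (all-zero) features, then features are matched at minimum total cost.
    """
    k = max(len(a), len(b))
    if k == 0:
        return 0
    empty = (0,) * n_items
    a_pad = list(a) + [empty] * (k - len(a))
    b_pad = list(b) + [empty] * (k - len(b))
    cost = [[hamming(f, g) for g in b_pad] for f in a_pad]
    total, _ = solve_assignment(cost)
    return total


# Toy example: b holds a's two features in a different order plus one extra.
a = [(1, 1, 0, 0), (0, 0, 1, 1)]
b = [(0, 0, 1, 1), (1, 1, 0, 0), (0, 1, 0, 0)]
loss_ab = permutation_min_hamming(a, b, n_items=4)
```

In this toy case the two shared features match at zero cost, and the extra feature in `b` is paired with an empty feature, so the value reflects only that one extra membership. Solving the matching as an assignment problem avoids enumerating all K! orderings, which is the point the abstract makes about computational feasibility.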


Disclosure Statement

The authors report there are no competing interests to declare.

Notes

1 This should not be confused with the definition of generalized Hamming distance given by Bookstein, Kulyukin, and Raita (2002) in the computer science literature to compare bitmaps and bitstrings.

2 In the DFA study, $Z$ followed a complex structure with six features. Some information in the first two features was treated as fixed, so we ignore these first two features and assume $K = 4$.
