Abstract
In a Data-Generating Experiment (DGE), the data, Y, is often obtained from a Black-Box and is approximated with a learning machine/sampler, f(θ, X); X is random, f is known. When X has unknown cdf, nonidentifiability of θ cannot be confirmed and may limit the predictive accuracy of the learned model, f(θ̂, X), with θ̂ an estimate of θ.
Using properties of the Expected p-value for the Kolmogorov-Smirnov test, the Empirical Discrimination Index (EDI) and the Proportion of p-Values Index (PPVI) are introduced: (i) to confirm, almost surely, discrimination of θ from θ*, θ* ≠ θ; (ii) to confirm with EDI-graphics identifiability of θ, by repeating (i) for θ* in a fine sieve of the parameter space; and (iii) to compare EDI-graphics and PPVIs of DGEs, and to select the DGE with the greater parameter discrimination and the smaller number of θ* violating identifiability of θ.
Among the applications, EDI and PPVI explain why the g-estimate in Tukey’s g-and-h model is better than that for the g-and-k model, unless the sample size is extremely large; EDI-graphics indicate that Normal learning machines have better parameter discrimination than Sigmoid learning machines, and that their parameters are nonidentifiable. Supplementary materials for this article are available online.
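To make the sieve-based discrimination check concrete, the following is a minimal sketch, not the paper's implementation (the paper's supplementary R-functions are the reference). It simulates Tukey g-and-h data at a hypothetical "true" parameter, compares it via two-sample Kolmogorov-Smirnov p-values against samples generated at each θ* in a coarse sieve over g, and reports the proportion of p-values above a conventional 0.05 cutoff as a PPVI-like summary; the parameter values, the cutoff, and this definition of the proportion are illustrative assumptions, not the paper's exact definitions of EDI or PPVI.

```python
import math
import random

def gh_sample(g, h, n, rng):
    # Tukey g-and-h draw: Y = ((exp(g*Z) - 1)/g) * exp(h*Z^2/2), Z ~ N(0,1)
    out = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        core = (math.exp(g * z) - 1.0) / g if g != 0 else z
        out.append(core * math.exp(h * z * z / 2.0))
    return out

def ks_stat(x, y):
    # two-sample Kolmogorov-Smirnov statistic sup |F_x - F_y|
    x, y = sorted(x), sorted(y)
    i = j = 0
    d = 0.0
    n, m = len(x), len(y)
    while i < n and j < m:
        if x[i] <= y[j]:
            i += 1
        else:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def ks_pvalue(d, n, m):
    # asymptotic p-value from the Kolmogorov distribution series
    lam = d * math.sqrt(n * m / (n + m))
    s = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return max(0.0, min(1.0, s))

rng = random.Random(7)
n = 500
g0, h0 = 0.5, 0.1                         # hypothetical "true" (g, h)
base = gh_sample(g0, h0, n, rng)          # data playing the role of the DGE output
sieve = [0.1 * k for k in range(1, 11)]   # coarse sieve of candidate g-values
pvals = [ks_pvalue(ks_stat(base, gh_sample(g, h0, n, rng)), n, n) for g in sieve]
# PPVI-like summary: proportion of sieve points the KS test fails to discriminate
ppvi = sum(p > 0.05 for p in pvals) / len(pvals)
```

Plotting the p-values against the sieve gives an EDI-graphic-style picture: a sharp dip to near-zero p-values away from g0 indicates good discrimination, while additional regions of large p-values flag candidate θ* violating identifiability.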
Supplementary Materials
Proofs and R-functions used in Examples 4.1–4.6.
Acknowledgments
Many thanks are due to Professor Faming Liang and Professor Galin Jones, Editors, who have handled, respectively, the original submission and the revisions. Thanks are due to the referees, for their comments that improved the presentation of the paper, and to Mr. Yongzhen Feng, Tsinghua University, for the suggestions to improve readability.
Disclosure Statement
The authors report there are no competing interests to declare.
Notes
1 Without preliminary estimation, the sieve has countably infinite elements.
2 Since one DGE is studied.