When are there too many collisions? Variants of the birthday problem

Pages 4487-4497 | Received 22 Jun 2022, Accepted 16 Feb 2023, Published online: 12 Mar 2023
 

Abstract

Due to restrictions on the use of unique identifiers of individuals in data sets, there may be instances in which two or more data sets have some of the individuals in common, with no direct way to detect such occurrences. More generally, a collision occurs when two or more observations are in agreement with respect to variables associated with the observations. This article discusses several possible statistical/probabilistic approaches to determining when the number of collisions (or near-collisions) exceeds what would be expected by chance if in fact the observations are all distinct. The methods and results are related to the Birthday Problem and to Occupancy Problems.
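As a rough illustration of what "expected by chance" means in the classical Birthday Problem setting, the sketch below computes two baseline quantities for n distinct observations whose values fall uniformly and independently into d equally likely categories. The function names, the uniformity assumption, and the default d = 365 are illustrative choices made here; they are not taken from the article, whose own methods may differ.

```python
from math import comb, prod


def prob_at_least_one_collision(n: int, d: int = 365) -> float:
    """Probability that at least two of n observations share a value,
    assuming d equally likely values and independent observations."""
    if n > d:
        # Pigeonhole principle: more observations than values forces a collision.
        return 1.0
    # P(no collision) = (d/d) * ((d-1)/d) * ... * ((d-n+1)/d)
    return 1.0 - prod((d - k) / d for k in range(n))


def expected_colliding_pairs(n: int, d: int = 365) -> float:
    """Expected number of pairs of observations that agree,
    under the same uniform assumption: C(n, 2) / d."""
    return comb(n, 2) / d


if __name__ == "__main__":
    for n in (10, 23, 50):
        print(n, prob_at_least_one_collision(n), expected_colliding_pairs(n))
```

An observed collision count far above such a baseline would be one informal signal that the observations may not all be distinct, although a formal test requires the distribution of the count, not only its mean.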

Notes

1 Note that this formula is valid even when L > N, since 0! = 1 and fi0=1.

2 In fact, birthdays appear not to be evenly distributed. In a sample of 481,040 people listed on insurance applications in the United States, the percentages of people born on each date (excluding February 29) ranged from 0.23% (December 26) to 0.31% (September 15). See: http://www.panix.com/~murphy/bday.html.
