Both the binomial distribution and the hypergeometric distribution are concerned with the number of events of interest in a sample containing n observations. One difference between these two probability distributions is the way the samples are selected. For the binomial distribution, the sample data are selected with replacement from a finite population or without replacement from an infinite population. Thus, the probability of an event of interest is constant over all observations, and the outcome of any particular observation is independent of any other. For the hypergeometric distribution, the sample data are selected without replacement from a finite population. Thus, the outcome of one observation depends on the outcomes of the previous observations.
Consider a population of size N. Let A represent the total number of events of interest in the population. The hypergeometric distribution is then used to find the probability of x events of interest in a sample of size n, selected without replacement, given knowledge of n, N, and A. Because the number of events of interest in the sample, x, cannot be greater than the number of events of interest in the population, A, nor greater than the sample size, n, the largest value the hypergeometric random variable can take is the sample size or the number of events of interest in the population, whichever is smaller.
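The probability described above can be written out explicitly. The block below states the standard hypergeometric probability formula in the symbols already defined (N, A, n, x). The lower bound on x is not stated in the text and is added here for completeness: the sample cannot contain fewer events of interest than n minus the number of items in the population that are not events of interest.

```latex
P(X = x \mid n, N, A) \;=\; \frac{\dbinom{A}{x}\,\dbinom{N-A}{\,n-x\,}}{\dbinom{N}{n}},
\qquad \max\bigl(0,\; n-(N-A)\bigr) \;\le\; x \;\le\; \min(n,\, A)
```

The numerator counts the ways to choose x events of interest from the A available and the remaining n - x items from the N - A others. The denominator counts all possible samples of size n from the population of N.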
The finite population correction factor results from sampling without replacement from a finite population.
Spurious correlation refers to an apparent relationship between variables that either have no true relationship or are related to other variables that have not been measured. One widely publicized stock market indicator in the United States that is an example of spurious correlation is the relationship between the winner of the National Football League Super Bowl and the performance of the Dow Jones Industrial Average in that year.
To illustrate the hypergeometric distribution, suppose that you are forming a team of 8 managers from different departments within your company. Your company has a total of 30 managers, and 10 of these people are from the finance department. Here N = 30, A = 10, and n = 8, and because no manager can be chosen twice, the team is a sample selected without replacement from a finite population.
If there is a fixed number of observations, n, each of which is classified as an event of interest or not an event of interest, you use the binomial or hypergeometric distribution. To decide between the two, ask whether the probability of an event of interest is constant over all trials. If yes, you can use the binomial distribution. If no, you can use the hypergeometric distribution. If there is instead an area of opportunity, you use the Poisson distribution.
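The managers example can be worked through numerically. The following is a minimal sketch, not a definitive implementation: the function names `hypergeom_pmf` and `binom_pmf` are hypothetical helpers introduced here. The question asked (the probability that exactly 2 of the 8 team members come from finance) is an illustrative assumption, since the text does not state a specific value of x. The binomial figure is included only to show how the two distributions differ when the probability is wrongly treated as constant.

```python
from math import comb


def hypergeom_pmf(x, n, N, A):
    """Probability of exactly x events of interest in a sample of size n,
    drawn without replacement from a population of N containing A events
    of interest."""
    # Outside the feasible range the probability is zero.
    if x < max(0, n - (N - A)) or x > min(n, A):
        return 0.0
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)


def binom_pmf(x, n, p):
    """Probability of exactly x events of interest in n independent
    observations, each with constant probability p."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Values from the managers example: 30 managers, 10 in finance, team of 8.
N, A, n = 30, 10, 8
x = 2  # illustrative choice, not given in the text

p_hyper = hypergeom_pmf(x, n, N, A)
p_binom = binom_pmf(x, n, A / N)  # treats p = 10/30 as constant

print(f"hypergeometric P(X = {x}) = {p_hyper:.4f}")
print(f"binomial       P(X = {x}) = {p_binom:.4f}")

# Full distribution over the feasible values of x (0 through min(n, A)).
for k in range(0, min(n, A) + 1):
    print(k, round(hypergeom_pmf(k, n, N, A), 4))
```

Under these assumptions the hypergeometric probability for x = 2 comes out to about 0.298. The binomial value differs because it ignores the dependence between successive selections.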