# Sampling Fraction

Why do we usually care only about the sample size and not the population size?

Let N be the sample size and M the population size. The sampling fraction is f = N / M.

The full formula for the standard error (the standard deviation of the sampling distribution) is:

$$SE = \frac{s}{\sqrt{N}}\,\sqrt{1 - f}$$

where s is the sample standard deviation and f = N / M is the sampling fraction.

But typically M ≫ N, so f → 0. Then 1 − f ≈ 1, and the formula for the standard error reduces to the sample standard deviation s over the square root of N:

$$SE \approx \frac{s}{\sqrt{N}}$$

Comparison of corrected and uncorrected standard error calculations for a hypothetical population of 38,000 (with a standard deviation of 20,000).

| Sample size (N) | Population size (M) | Sampling fraction (f) | Sample standard deviation (s) | Std. error (corrected) | Std. error (uncorrected) | Percent difference between corrected and uncorrected std. error |
|---:|---:|---:|---:|---:|---:|---:|
| 1 | 38,000 | 0.00003 | 20,000 | 19999.7 | 20000.0 | 0.00% |
| 100 | 38,000 | 0.00263 | 20,000 | 1997.4 | 2000.0 | 0.13% |
| 200 | 38,000 | 0.00526 | 20,000 | 1410.5 | 1414.2 | 0.26% |
| 400 | 38,000 | 0.01053 | 20,000 | 994.7 | 1000.0 | 0.53% |
| 800 | 38,000 | 0.02105 | 20,000 | 699.6 | 707.1 | 1.06% |
| 1,600 | 38,000 | 0.04211 | 20,000 | 489.4 | 500.0 | 2.13% |
| 3,200 | 38,000 | 0.08421 | 20,000 | 338.3 | 353.6 | 4.30% |
| 6,400 | 38,000 | 0.16842 | 20,000 | 228.0 | 250.0 | 8.81% |
| 12,800 | 38,000 | 0.33684 | 20,000 | 144.0 | 176.8 | 18.57% |
| 25,600 | 38,000 | 0.67368 | 20,000 | 71.4 | 125.0 | 42.88% |
| 37,999 | 38,000 | 0.99997 | 20,000 | 0.5 | 102.6 | 99.49% |
| 38,000 | 38,000 | 1.00000 | 20,000 | 0.0 | 102.6 | 100.00% |
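
The table rows can be reproduced with a few lines of Python. This is a minimal sketch, assuming the corrected standard error is the uncorrected one multiplied by the square root of (1 − f), which is the form consistent with the tabulated values. The function name `standard_errors` and the constants `M` and `S` are illustrative, not from the original notes.

```python
import math


def standard_errors(s, n, m):
    """Return (f, corrected SE, uncorrected SE, relative difference).

    s: standard deviation, n: sample size, m: population size.
    Assumes corrected SE = (s / sqrt(n)) * sqrt(1 - f), with f = n / m.
    """
    f = n / m                                    # sampling fraction
    uncorrected = s / math.sqrt(n)               # ignores the population size
    corrected = uncorrected * math.sqrt(1 - f)   # shrinks as f grows
    rel_diff = 1 - corrected / uncorrected       # same as 1 - sqrt(1 - f)
    return f, corrected, uncorrected, rel_diff


M = 38_000   # population size used in the table
S = 20_000   # standard deviation used in the table

# Print a few of the table's rows: N, f, corrected, uncorrected, difference.
for n in (1, 100, 800, 6_400, 38_000):
    f, se_c, se_u, diff = standard_errors(S, n, M)
    print(f"{n:>6} {f:.5f} {se_c:>8.1f} {se_u:>8.1f} {diff:>7.2%}")
```

Changing the tuple of sample sizes in the loop gives the remaining rows. The population size enters only through f, which is why it barely matters when f is small.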

Note that the corrected and uncorrected standard errors differ very little until the sampling fraction gets large. For example, even with a sample of 800 (out of a total population of 38,000), the difference is only about 1 percent. The two estimates begin to diverge noticeably only when the sample size reaches several thousand, that is, when the sampling fraction approaches about 10% or more: at a sample size of 3,200 (f ≈ 8%) the difference is about 4%, and at 6,400 (f ≈ 17%) it is about 9%.
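
A short worked equation shows why the gap depends only on the sampling fraction. As a sketch, assume the corrected standard error equals the uncorrected one times the square root of (1 − f), the form consistent with the table. The subscripted names below are labels introduced here for illustration:

$$\frac{SE_{\text{uncorrected}} - SE_{\text{corrected}}}{SE_{\text{uncorrected}}} = 1 - \sqrt{1 - f} \approx \frac{f}{2} \quad \text{for small } f$$

For example, f = 0.10 gives 1 − √0.90 ≈ 0.051, a difference of about 5%, regardless of the standard deviation or the absolute population size.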

Moral of the story: when the sampling fraction is small, it is fine to use the uncorrected estimate. It is easier to calculate and also more conservative, since it is never smaller than the corrected standard error.