UP504
(Scott Campbell)

Sampling Fraction

Sampling Fraction -- or -- why do we usually just care about the sample size, not the population size?
sample size (N) vs. population size (M)
sampling fraction = N/M

the actual formula for the standard error (the standard deviation of the sampling distribution) is:

$$SE = \frac{s}{\sqrt{N}} \sqrt{1 - f}$$

where s = standard deviation of the sample, and f = sampling fraction = N / M

but since typically M >> N
then f --> 0

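so 1 - f is approximately 1, and the formula for the standard error reduces to the sample standard deviation divided by the square root of the sample size:

$$SE \approx \frac{s}{\sqrt{N}}$$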
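The two versions of the formula can be compared with a minimal Python sketch. The function name `standard_error` and its parameters are illustrative choices, not part of the course material; the inputs (sample standard deviation 20,000, sample of 100, population of 38,000) are taken from the table in these notes.

```python
import math


def standard_error(sample_sd, n, population_size=None):
    """Standard error of the mean.

    If population_size is given, the finite-population correction
    sqrt(1 - f) is applied, where f = n / population_size.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    se = sample_sd / math.sqrt(n)  # uncorrected: s / sqrt(N)
    if population_size is None:
        return se
    if n > population_size:
        raise ValueError("sample cannot be larger than the population")
    f = n / population_size  # sampling fraction
    return se * math.sqrt(1.0 - f)  # corrected: (s / sqrt(N)) * sqrt(1 - f)


uncorrected = standard_error(20_000, 100)
corrected = standard_error(20_000, 100, population_size=38_000)
print(f"uncorrected: {uncorrected:.1f}")
print(f"corrected:   {corrected:.1f}")
```

Leaving `population_size` unset corresponds to treating the population as effectively infinite, which is the M >> N case described above.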

Comparison of Corrected and Uncorrected Standard Error Calculations of a Hypothetical Population of 38,000 (and standard deviation of 20,000).
 
| Sample size (N) | Population size (M) | Sampling fraction (f) | Standard deviation of the sample | Std. error (corrected) | Std. error (uncorrected) | Percent difference between corrected and uncorrected standard error |
|---|---|---|---|---|---|---|
| 1 | 38,000 | 0.00003 | 20,000 | 19999.7 | 20000.0 | 0.00% |
| 100 | 38,000 | 0.00263 | 20,000 | 1997.4 | 2000.0 | 0.13% |
| 200 | 38,000 | 0.00526 | 20,000 | 1410.5 | 1414.2 | 0.26% |
| 400 | 38,000 | 0.01053 | 20,000 | 994.7 | 1000.0 | 0.53% |
| 800 | 38,000 | 0.02105 | 20,000 | 699.6 | 707.1 | 1.06% |
| 1,600 | 38,000 | 0.04211 | 20,000 | 489.4 | 500.0 | 2.13% |
| 3,200 | 38,000 | 0.08421 | 20,000 | 338.3 | 353.6 | 4.30% |
| 6,400 | 38,000 | 0.16842 | 20,000 | 228.0 | 250.0 | 8.81% |
| 12,800 | 38,000 | 0.33684 | 20,000 | 144.0 | 176.8 | 18.57% |
| 25,600 | 38,000 | 0.67368 | 20,000 | 71.4 | 125.0 | 42.88% |
| 37,999 | 38,000 | 0.99997 | 20,000 | 0.5 | 102.6 | 99.49% |
| 38,000 | 38,000 | 1.00000 | 20,000 | 0.0 | 102.6 | 100.00% |

Note that there is very little difference between the corrected and uncorrected standard errors until the sampling fraction gets large. For example, even with a sample of 800 (out of a total population of 38,000), the difference is only about 1 percent. The two estimates only begin to diverge substantially when the sample reaches several thousand (that is, when the sampling fraction approaches about 10% or more).
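The percent-difference column can be recomputed with a short Python sketch. The helper name `percent_difference` is hypothetical; the population size of 38,000 and the sample sizes come from the table. The sample standard deviation cancels out of the ratio, so it does not appear in the calculation.

```python
import math

POPULATION = 38_000  # M, from the table


def percent_difference(n, population_size):
    """Percent by which the corrected standard error falls below the uncorrected one.

    (uncorrected - corrected) / uncorrected = 1 - sqrt(1 - f),
    where f = n / population_size.
    """
    f = n / population_size
    return 100.0 * (1.0 - math.sqrt(1.0 - f))


for n in (100, 800, 3_200, 12_800):
    f = n / POPULATION
    diff = percent_difference(n, POPULATION)
    print(f"N = {n:>6,}   f = {f:.5f}   difference = {diff:.2f}%")
```

Because the difference depends only on f, the same percentages would hold for any population in which the sample makes up the same fraction.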

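Moral of the story: when the sampling fraction is small, it is fine -- and more conservative, since the uncorrected standard error is never smaller than the corrected one -- to use the uncorrected estimate, which is easier to calculate anyway.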
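A short sketch of why the uncorrected estimate errs on the safe side, and of how large the gap is. The subscripted labels are introduced here for illustration only. Since the sampling fraction always lies between 0 and 1:

$$0 \le f \le 1 \;\Rightarrow\; 0 \le \sqrt{1 - f} \le 1 \;\Rightarrow\; SE_{\text{corrected}} = SE_{\text{uncorrected}} \sqrt{1 - f} \le SE_{\text{uncorrected}}$$

For small f, the relative gap is roughly half the sampling fraction:

$$\frac{SE_{\text{uncorrected}} - SE_{\text{corrected}}}{SE_{\text{uncorrected}}} = 1 - \sqrt{1 - f} \approx \frac{f}{2}$$

For instance, f = 0.02105 (a sample of 800 out of 38,000) gives f/2 of about 1.05%, close to the 1.06% shown in the table.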