Thursday, July 24, 2008

I will begin my blogging by discussing the Random Groups Standard Error Estimation procedure, which the US Census Bureau used in the 1990 Census (and in previous Censuses), and which I described here: http://losinger.110mb.com/documents/Random_Groups.pdf

In keeping with the example of my paper, suppose that there is a geographic area that has people of only two races: White and Black. You know the total number of people. You want to get an idea of how many are White and how many are Black, without asking everyone in the area. So, you ask a sample of the people in the area whether they are White or Black. Within your sample, you know the proportions of Whites and Blacks. To estimate the total number of Whites in the area, you simply multiply the proportion of Whites in your sample by the total number of people in the area. Similarly for Blacks.

Suppose that P represents the true proportion of Whites in the population, and that Q represents the true proportion of Blacks in the population. The population of this area consists only of Blacks and Whites. Hence, the two proportions add to one (P + Q = 1).

If your sample was a simple random sample of the population, then your best estimate of P is p, i.e. the number of Whites in your sample divided by your total sample size:

p = nw / n

where nw is the number of Whites in your sample, and n is the total number of people (both Black and White) in your sample. Similarly,


q = nb / n,

where nb is the number of Blacks in your sample. Note here that p + q = 1.

Let N be the known total number of people in the area. Your estimate of the total number of Whites (Nw) in the population is N x p, and your estimate of the total number of Blacks (Nb) in the population is N x q.
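
For readers who prefer to see the arithmetic, here is a minimal Python sketch of the estimates above. The numbers (an area of 10,000 people and a sample of 500 containing 300 Whites) are made up for illustration, and the function name estimate_totals is my own, not something from the paper.

```python
def estimate_totals(n_white, n_black, population_size):
    """Estimate population totals from simple random sample counts."""
    n = n_white + n_black   # total sample size
    p = n_white / n         # sample proportion of Whites
    q = n_black / n         # sample proportion of Blacks (p + q = 1)
    # Scale each sample proportion up by the known population size N.
    return population_size * p, population_size * q


# Hypothetical example: N = 10,000 people, sample of 500 with 300 Whites.
est_white, est_black = estimate_totals(300, 200, 10_000)
print(est_white, est_black)
```

Because p + q = 1, the two estimated totals always add back up to N.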

The basic formula for the standard error of p is sqrt(p x q/n), which is the same as the formula for the standard error of q.

Sorry I don't have the ability to create and edit equations on this blog. Take sqrt to mean the square root.

If N is a fixed constant, then the standard error of the estimate of the total number of Whites in the population is:

se(Nw) = N x se(p) = N x sqrt(p x q/n).

The standard error of the estimate of the total number of Blacks in the population turns out to be identical:

se(Nb) = N x se(q) = N x sqrt(p x q/n).
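
As a rough illustration, the two formulas can be sketched in Python. The sample figures are the same made-up ones (N = 10,000, n = 500, 300 Whites in the sample), and the function names are hypothetical. The sketch assumes simple random sampling and ignores the finite population correction.

```python
import math


def se_proportion(p, q, n):
    """Standard error of a sample proportion: sqrt(p x q / n)."""
    return math.sqrt(p * q / n)


def se_total(p, q, n, population_size):
    """Standard error of an estimated total, treating N as a fixed constant."""
    return population_size * se_proportion(p, q, n)


# Hypothetical example: N = 10,000, n = 500, 300 Whites in the sample.
n = 500
p = 300 / n
q = 1 - p
se_white = se_total(p, q, n, 10_000)
se_black = se_total(q, p, n, 10_000)  # swapping p and q leaves p x q unchanged
print(round(se_white, 1), round(se_black, 1))
```

With these numbers both standard errors come out to roughly 219 people, which is why the two formulas above are identical.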

However, one should bear in mind that, in sampling theory, there is something called the finite population correction factor, which comes into play especially when your sample is relatively large compared to your population. If you completed a census of everyone in your population, then you know exactly the number of Whites and Blacks in your population. Your numbers are not based on sampling, and there is no sampling error.
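
To make the point concrete, here is a small Python sketch using one common textbook form of the correction, sqrt((N - n)/(N - 1)), which multiplies the standard error. That particular form is an assumption on my part for this illustration, and the function name is hypothetical.

```python
import math


def se_proportion_fpc(p, n, population_size):
    """Standard error of a proportion with a finite population correction.

    Assumes the textbook correction sqrt((N - n) / (N - 1)).
    """
    srs_se = math.sqrt(p * (1 - p) / n)  # uncorrected standard error
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return srs_se * fpc


# Hypothetical example with N = 10,000 and p = 0.6.
small_sample = se_proportion_fpc(0.6, 500, 10_000)     # n is 5% of N
full_census = se_proportion_fpc(0.6, 10_000, 10_000)   # n equals N
print(small_sample, full_census)
```

With a sample that is a small fraction of the population, the correction barely changes the standard error. When n equals N, the correction is zero, and so is the sampling error.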

More on the Finite Population Correction Factor next time.
