Data Matters with Excel®
Activity 3.2
Section 3.2 makes two claims: that the estimated standard error is a pretty good estimate and that the margin of error will create a confidence interval that will include the populations proportion 95% of the time.
The project in Section 3.2 requires these steps.
- Find a proportion in the population, pick a sample size, and then calculate the standard error. This is the real standard error for that population.
- Take random samples and calculate the estimated standard error for each sample.
- Get a histogram of the estimated standard errors and check how they compare with the actual standard error.
Step 1: Find a proportion in the population, pick a sample size, and then calculate the standard error. This is the real standard error for that population.
Youre calculating the standard error from the populations proportion. Open RepUSSample.xls and use pivotTables to find the proportions. Check Data Matters if you arent sure how to calculate the standard error.
Step 2: Take random samples and calculate the estimated standard error for each sample.
Now that you have a standard error for a proportion, you will take random samples to test the claim.
This project uses the workbook from Section 3.1, so go to that workbook. If you dont have it available, go back and follow the instructions to create it.
In the Proportions worksheet, label the seventh column (G) Standard Error. In the top data cells type =sqrt(F2*(1-F2)/ , then whatever your sample size is, and finally, a closing parenthesis ). For example, if your sample size is 22, then type =sqrt(F2*(1-F2)/22) . Press Enter. Click on that cell to select it, click on the little black square and drag down to fill the cells below.
That column contains the estimated standard errors that appear from random samples of the population. To take new samples, run the macro.
Step 3: Get a histogram of the estimated standard errors and check how they compare with the actual standard error.
Get a histogram of the data in column G. Select Tools, Data Analysis.
In the Data Analysis window, click on Histogram, OK. In the worksheet, select the top estimated standard error and drag down until you have selected all of the estimated standard errors (and only their column). Click OK. Click on the chart button, then Finish.
What do you see? How do the estimated standard errors compare with the actual standard error? How much would the variation in the estimated standard error mess up the confidence interval?
Did you get estimated standard errors at 0? Those show up only if the sample proportion is 0 or 100%, and no one would use them. As you think about whether this estimated standard error does a good job, you might want to ignore the 0s.
Try different sample sizes. Is the estimated standard error better or worse for different sample sizes? When does it work well? When could it cause a lot of trouble?
Label the eighth column (column H) Bottom of Confidence Interval. In the second row of column H, type =F2 2*G2 . (The asterisk means times.) This equation means Take the proportion that is in column F and subtract two of the standard errors that are in column G.
Label the ninth column (column I) Top of Confidence Interval. In the second row of column I, type =F2 + 2 * G2 .
Label the 10th column (column J) In Interval. In the second row of column J, with the appropriate population proportion inside, type the formula =([Population Proportion]>H2)+([Population Proportion]<I2)=2 . For example, if the population proportion is .51 then type =(.51>H2)+(.51<I2)=2 . Excel codes a true statement as 1 and a false statement as 0. So two true statements add up to 2.
Click and drag to select the top data cells of columns H, I, and J. Click on the black square and drag down to fill the columns. Look at the worksheet to see how the equation in column J works.
Get a PivotTable to find the proportion of confidence intervals that include the population proportion.
How are the confidence intervals? Do they include the population proportion roughly 95% of the time? Or to look at it the other way, do the confidence intervals miss the populations proportion 5% of the time?
Try other proportions in the data and other sample sizes. When does the confidence interval do badly?
When the Sample Proportion is 0 or 100%
If you work with a fairly small proportion and a small sample size, the samples too often will fail to include the population proportion. The reason is that the equation above creates a standard error of 0 when the sample proportion is 0. When the sample proportion is 0, we use the maximum possible standard error. To fix things, from the Proportions worksheet, get into the macro editor and run the following macro:
Sub fixStandardError()
For SampleRow = 2 To 2222
If Cells(SampleRow, 6).Value = 0 Then Cells(SampleRow, 7).Value = 0.5 / Sqr(11)
Next SampleRow
End Sub
|
You will have to replace the 11 on the far right with your sample size.
That macro looks at each of the first 2,222 rows (For SampleRow = 2 to 2222). In each row for which the proportion (in column 6) is 0, it sets the standard error to the maximum possible standard error. In the macro language, sqr() asks for the square root. In a worksheet equation, it is sqrt(). Thats kind of like the difference between rnd and RAND().
| ©2008 Key College Publishing. All rights reserved. |
|