Data Matters with Fathom! Dynamic Statistics software
Activity 3.2
Section 3.2 makes two claims: that the estimated standard error is a pretty good estimate of the actual standard error, and that the margin of error will create a confidence interval that includes the population's proportion 95% of the time.
The project in Section 3.2 requires these steps.
- Find a proportion in the population, pick a sample size, and calculate the standard error. This is the real standard error for that population.
- Take random samples and calculate the estimated standard error for each sample.
- Get a histogram of the estimated standard errors and check how they compare with the actual standard error.
Here's how to do each step.
Step 1: Find a proportion in the population, pick a sample size, and calculate the standard error. This is the real standard error for that population.
You're calculating the standard error from the population's proportion. Open RepUSSampleMarch2001.ftm and select Analyze, Estimate Parameters.
Check the Data Matters text if you aren't sure how to calculate the standard error.
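If you want to check your arithmetic outside Fathom, here is a minimal Python sketch of the calculation. The function name standard_error is just an illustrative choice, and the values 0.51, 10, and 55 are the example proportion and sample sizes used elsewhere in this activity.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion, given the population
    proportion p and the sample size n: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Example values from this activity: proportion female 0.51,
# sample sizes 10 and 55.
print(round(standard_error(0.51, 10), 4))  # about 0.1581
print(round(standard_error(0.51, 55), 4))  # about 0.0674
```

Notice that the larger sample size gives the smaller standard error.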
Step 2: Take random samples and calculate the estimated standard error for each sample.
Select Analyze, Sample Cases, Edit, Inspect Collection. Pick a sample size and enter it in the Sample Size box.
Click on the Measures tab and create a measure to calculate the proportion for the attribute you are interested in. If you want to know the proportion of each sample that is female, for example, the formula would be Proportion(Gender="Female"). To make things easy, let's call the proportion measure prop.
Now we're going to add a new measure, the standard error. Click on <new> and enter the name standard_error. Then right-click on the name and select Edit Formula. The formula is Sqrt(prop*(1-prop)/[your sample size]). For example, if your sample size is 10, type in Sqrt(prop*(1-prop)/10). And if your sample size is 55, type in Sqrt(prop*(1-prop)/55).
Fathom will rearrange things a little so they look something like this.
Select Analyze, Collect Measures, then drag a case table onto the workspace. To get more samples, click on the Measures Collection, then select Edit, Inspect Collection. Enter whatever number of measures you would like to work with.
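The same sampling idea can be sketched in Python. This is only a rough stand-in for what Fathom does: it assumes each case is drawn independently from a population with a known proportion p, rather than sampled from the data file, and the function name and seed are hypothetical choices.

```python
import math
import random

def estimated_standard_errors(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return the
    estimated standard error computed from each sample's proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(num_samples):
        # Count how many of the n cases have the attribute.
        successes = sum(rng.random() < p for _ in range(n))
        prop = successes / n
        results.append(math.sqrt(prop * (1 - prop) / n))
    return results

# Assumed example values: population proportion 0.51, sample size 55.
ses = estimated_standard_errors(p=0.51, n=55, num_samples=1000)
true_se = math.sqrt(0.51 * (1 - 0.51) / 55)
print("smallest estimate:", min(ses))
print("largest estimate: ", max(ses))
print("actual value:     ", true_se)
```

Each entry in ses plays the role of one row in the Measures Collection.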
Step 3: Get a histogram of the estimated standard errors and check how they compare with the actual standard error.
Click on the Measures Collection, then drag a case table onto the workspace. Drag a graph onto the workspace and drag standard_error from the Measures Collection to the spot on the graph labeled "Drop an attribute name here."
What do you see? How do the estimated standard errors compare to the actual standard error? How much would the variation in the estimated standard error mess up the confidence interval?
Did you get estimated standard errors at 0? Those show up only if the sample proportion is 0 or 100%, and no one would use them. As you think about whether this estimated standard error does a good job, you might want to ignore the 0s.
Try different sample sizes. Is the estimated standard error better or worse for different sample sizes? When does it work well? When could it cause a lot of trouble?
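To get a feel for how sample size changes the picture, here is a small Python sketch that repeats the experiment for several sample sizes. It makes the same simplifying assumption of independent draws from a population with proportion p. The proportion 0.1 and the sample sizes 5, 20, and 100 are hypothetical values chosen for illustration, not values from the data file.

```python
import math
import random
import statistics

def summarize(p, n, num_samples=2000, seed=1):
    """Simulate estimated standard errors for one sample size and
    report how they compare with the actual standard error."""
    rng = random.Random(seed)
    true_se = math.sqrt(p * (1 - p) / n)
    ses = []
    for _ in range(num_samples):
        prop = sum(rng.random() < p for _ in range(n)) / n
        ses.append(math.sqrt(prop * (1 - prop) / n))
    zeros = sum(1 for s in ses if s == 0)
    return {
        "n": n,
        "true_se": true_se,
        "median_estimate": statistics.median(ses),
        "share_zero": zeros / num_samples,  # samples with prop of 0 or 1
    }

for n in (5, 20, 100):
    print(summarize(0.1, n))
```

Watch the share_zero value: with a small proportion and a small sample, a large share of the estimates are 0.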
To test the 95% confidence interval, we are going to add a measure. Let's call it In_Interval. It will hold whether or not the population proportion was inside the confidence interval around the sample proportion.
Click on the Sample Collection, then select Edit, Inspect Collection. Add the measure In_Interval. Right-click on the new measure and select Edit Formula. This formula will do the trick: InRange([Population Proportion], prop-2*standard_error, prop+2*standard_error). For example, if you are looking at the proportion of the population that is female, the population proportion you would be working with is 0.51, so the formula would be InRange(0.51, prop-2*standard_error, prop+2*standard_error).
Click on the Measures Collection and open the Collection Inspector. Select Collect More Measures on the Collect Measures page. Drag a case table onto the workspace, select Analyze, Estimate Parameters, Empty Estimate, Estimate Proportion. Drag In_Interval onto Attribute (categorical): <unassigned>.
How did the confidence intervals do? Did they include the population proportion roughly 95% of the time? Or, to look at it the other way, did the confidence intervals miss the population's proportion 5% of the time?
Try other proportions in the data and other sample sizes. When does the confidence interval do badly?
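The In_Interval check can also be sketched in Python. This is a rough simulation, not the Fathom procedure: it assumes independent draws from a population with proportion p, and it treats the interval as including its endpoints, which is an assumption about how InRange behaves. The second example (proportion 0.05, sample size 10) is a hypothetical case chosen to show the interval doing badly.

```python
import math
import random

def coverage(p, n, num_samples=2000, seed=2):
    """Share of simulated samples whose interval
    prop - 2*SE .. prop + 2*SE contains the population proportion p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        prop = sum(rng.random() < p for _ in range(n)) / n
        se = math.sqrt(prop * (1 - prop) / n)
        # Same test as In_Interval, with endpoints assumed inclusive.
        if prop - 2 * se <= p <= prop + 2 * se:
            hits += 1
    return hits / num_samples

print("p = 0.51, n = 55:", coverage(0.51, 55))
print("p = 0.05, n = 10:", coverage(0.05, 10))
```

Compare the two results with the 95% you are hoping for.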
When the Sample Proportion Is 0 or 100%
If you work with a fairly small proportion and a small sample size, the confidence intervals will fail to include the population proportion far too often. The reason is that the equation creates a standard error of 0 when the sample proportion is 0 or 100%, so the confidence interval shrinks to a single point. When that happens, we use the maximum possible standard error instead. To fix things, click on the Sample Collection, then select Edit, Inspect Collection. Right-click on standard_error and edit the formula to look like this.
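As a rough illustration of the idea (a Python sketch, not the Fathom formula), the fix swaps in the largest standard error a sample of that size can have whenever the sample proportion is 0 or 100%. The largest value occurs at a proportion of 0.5. The function name adjusted_standard_error is a hypothetical choice.

```python
import math

def adjusted_standard_error(prop, n):
    """Estimated standard error, falling back to the maximum possible
    value (the one at a proportion of 0.5) when prop is 0 or 1."""
    if prop == 0 or prop == 1:
        return math.sqrt(0.5 * 0.5 / n)
    return math.sqrt(prop * (1 - prop) / n)

# With a sample size of 10, a sample proportion of 0 no longer
# gives a standard error of 0.
print(adjusted_standard_error(0, 10))
print(adjusted_standard_error(0.3, 10))
```

With this change, a sample with a proportion of 0 still produces an interval with some width, so it has a chance of including the population proportion.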
©2008 Key College Publishing. All rights reserved.