Data Matters with SPSS®
Activity 3.2
Section 3.2 makes two claims: that the estimated standard error is a pretty good estimate of the actual standard error, and that the margin of error will create a confidence interval that includes the population's proportion 95% of the time.
The project in Section 3.2 requires these steps.
- Find a proportion in the population, pick a sample size, and calculate the standard error. This is the real standard error for that population.
- Take random samples and calculate the estimated standard error for each sample.
- Get a histogram of the estimated standard errors and check how they compare with the actual standard error.
Step 1: Find a proportion in the population, pick a sample size, and calculate the standard error. This is the real standard error for that population.
You're calculating the standard error from the population's proportion. To find that proportion, select Analyze, Descriptive Statistics, Frequencies. Check the Data Matters text if you aren't sure how to calculate the standard error.
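If you want to check your Step 1 arithmetic outside SPSS, here is a minimal Python sketch. It is only an illustration and not part of the SPSS activity. The function name standard_error and the sample size of 25 are assumptions made for the example; .51 is the proportion female used as an example later in this activity.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion, given population proportion p and sample size n."""
    return math.sqrt(p * (1 - p) / n)

# Illustrative values only: p = .51 and a hypothetical sample size of 25.
actual_se = standard_error(0.51, 25)
print(round(actual_se, 4))
```

With these values the actual standard error comes out at roughly .1.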
Step 2: Take random samples and calculate the estimated standard error for each sample.
This project uses the file you ended up with when you finished the project in Section 3.1. You're going to calculate a new variable to test the estimated standard error. In your data file, there is a variable that holds the proportions that were found in the random samples. The calculations have to refer to that variable. In the numeric expression below, I will refer to that variable as Proportion_Variable. As you're following the instructions, replace Proportion_Variable with the name of the variable in your data that holds the proportions.
SPSS saved the proportions as percents. To work with them, we need them as proportions.
Click on Transform, Compute. Name the target variable Prop. For the numeric expression, type Proportion_Variable / 100 . Click OK.
Now we can create the standard errors. Click on Transform, Compute. Name the target variable StandErr. For the numeric expression, type Sqrt(prop*(1-prop)/[Sample Size]) . Replace [Sample Size] with the sample size you used in creating the proportions. Click OK, OK.
Now there is a column of estimated standard errors.
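The two Compute steps above (percent to proportion, then proportion to estimated standard error) can be sketched in Python as follows. This is an illustration under assumptions, not what SPSS does internally: the function name, the list of sample percents, and the sample size of 25 are all made up for the example.

```python
import math

def estimated_standard_errors(percents, n):
    """Mirror the two Compute steps: percent -> proportion -> estimated standard error."""
    props = [pct / 100 for pct in percents]           # like Prop = Proportion_Variable / 100
    return [math.sqrt(p * (1 - p) / n) for p in props]  # like StandErr

# Hypothetical sample percents from four samples of size 25.
sample_percents = [48.0, 56.0, 40.0, 0.0]
stand_errs = estimated_standard_errors(sample_percents, 25)
print([round(se, 4) for se in stand_errs])
```

Notice that the last sample, with a sample percent of 0, gets an estimated standard error of 0.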
Step 3: Get a histogram of the estimated standard errors and check how they compare with the actual standard error.
Click on Graphs and select Histogram. Double-click on StandErr and click OK.
What do you see? How do the estimated standard errors compare with the actual standard error? How much would the variation in the estimated standard error mess up the confidence interval?
Did you get estimated standard errors at 0? Those show up only if the sample proportion is 0 or 100%, and no one would use them. As you think about whether this estimated standard error does a good job, you might want to ignore the 0s.
Try different sample sizes. You will have to redo the steps of Section 3.1. Is the estimated standard error better or worse for different sample sizes? When does it work well? When could it cause a lot of trouble?
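To get a feel for how sample size affects the estimated standard error before redoing the SPSS steps, you could run a rough simulation like the Python sketch below. It rests on assumptions: it draws each sample from an idealized population with proportion .51 rather than from your data file, and the function name, the sample sizes, and the count of 1000 samples are hypothetical choices.

```python
import math
import random
import statistics

def simulate_estimated_ses(p, n, num_samples, rng):
    """Draw num_samples random samples of size n; return each sample's estimated standard error."""
    ses = []
    for _sample in range(num_samples):
        successes = sum(1 for _case in range(n) if rng.random() < p)
        phat = successes / n
        ses.append(math.sqrt(phat * (1 - phat) / n))
    return ses

rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.51                # illustrative population proportion
for n in (10, 100, 400):  # hypothetical sample sizes
    actual = math.sqrt(p * (1 - p) / n)
    ses = simulate_estimated_ses(p, n, 1000, rng)
    # Columns: sample size, actual SE, smallest and largest estimate, spread of the estimates
    print(n, round(actual, 4), round(min(ses), 4), round(max(ses), 4),
          round(statistics.stdev(ses), 4))
```

Compare the smallest and largest estimates with the actual standard error on each row. The spread of the estimates shows how much they vary at that sample size.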
Testing the 95% Confidence Interval
To test the confidence interval, you calculate the top and the bottom and a variable that checks whether the population proportion is inside the confidence interval.
Use Compute to calculate a new variable, bottom. For the numeric expression, use prop - 2*standerr .
Compute a variable, top, using the numeric expression prop + 2*standerr .
Next compute a variable, inside, that indicates whether the population proportion is inside the confidence interval.
First compute a variable, inside, that is equal to 0. (The numeric expression is 0.)
Then get back into Compute, keep inside as the target variable, and set the numeric expression to 1 . Click on If and select Include if case satisfies condition: . In the condition box, type Population_Proportion > bottom AND Population_Proportion < top . (Replace Population_Proportion with the population proportion of the variable you are working with.) For example, if you are working with the proportion of the population that is female, the population proportion is 51%, so the condition would be .51 > bottom AND .51 < top . Click on Continue, OK, OK. Now there is a column of 1s and 0s. The 1s are rows in which the confidence interval included the population proportion. You could find out how many there are of each by selecting Analyze, Descriptive Statistics, Frequencies, and so on.
How did the confidence intervals do? Did they include the population proportion roughly 95% of the time? Or, to look at it the other way, did the confidence intervals miss the population's proportion 5% of the time?
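The bottom, top, and inside calculations can be sketched in Python like this. Treat it as an illustration under assumptions: the function name coverage, the five sample proportions, and the sample size of 25 are hypothetical, and .51 is the example population proportion from above.

```python
import math

def coverage(props, n, population_proportion):
    """Fraction of samples whose interval prop +/- 2*standerr contains the population proportion."""
    inside = []
    for p in props:
        standerr = math.sqrt(p * (1 - p) / n)
        bottom = p - 2 * standerr
        top = p + 2 * standerr
        # Same test as the SPSS condition: population proportion > bottom AND < top
        inside.append(1 if bottom < population_proportion < top else 0)
    return sum(inside) / len(inside)

# Hypothetical sample proportions from five samples of size 25.
sample_props = [0.48, 0.56, 0.40, 0.80, 0.0]
print(coverage(sample_props, 25, 0.51))
```

In this made-up data, the samples with proportions .80 and 0 produce intervals that miss .51. With your real samples, you would hope for a fraction near .95.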
Try other proportions in the data and other sample sizes. You will have to go back and follow the steps of the project in Section 3.1. When does the confidence interval do badly?
When the Sample Proportion Is 0 or 100%
If you work with a fairly small proportion and a small sample size, the confidence intervals will too often fail to include the population proportion. The reason is that the equation creates a standard error of 0 when the sample proportion is 0, so the confidence interval has no width. When the sample proportion is 0, we use the maximum possible standard error instead. To fix things, recalculate the standard error using the equation for the maximum possible standard error, and use If to change the standard error only when the proportion is 0.
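That is, the target variable is StandErr and the numeric expression is .5/sqrt([sampleSize]) (replace [sampleSize] with the sample size you are working with). Before you okay the computation, click on If and enter the condition prop=0 . (If you are working with a large proportion instead, the same problem shows up when the sample proportion is 100%; in that case, use the condition prop=1 .) After changing the standard error, recompute bottom, top, and inside so that they use the new values.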
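Here is a minimal Python sketch of that fix, again as an illustration rather than part of the SPSS steps. The function name adjusted_standard_error, the sample size of 25, and the population proportion of .05 are assumptions chosen for the example.

```python
import math

def adjusted_standard_error(p, n):
    """Estimated standard error, replaced by the maximum .5/sqrt(n) when the sample proportion is 0."""
    if p == 0:
        return 0.5 / math.sqrt(n)  # maximum possible standard error
    return math.sqrt(p * (1 - p) / n)

# Hypothetical case: sample proportion of 0 in a sample of size 25.
n = 25
standerr = adjusted_standard_error(0.0, n)
bottom = 0.0 - 2 * standerr
top = 0.0 + 2 * standerr
# A small population proportion such as .05 now falls inside the interval.
print(standerr, bottom < 0.05 < top)
```

Without the adjustment, the interval for this sample would run from 0 to 0 and would miss every population proportion.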
©2008 Key College Publishing. All rights reserved.