Data Matters with Excel®
Activity 2.2
Section 2.2 claims that if you create random samples from some population and create a histogram of the proportions in the samples, the histograms will show shapes that are roughly bell-shaped. The bell-shaped histograms will have centers at roughly the probabilities, or population proportions, involved in generating the samples proportions. In this project, you will see for yourself whether that is true.
The project in Section 2.2 requires these steps.
- Set the probability where you would like it, set the sample size where you would like it, and then use the random number generator to simulate taking samples.
- Record the proportions that appear in the samples and make a histogram of those proportions. (Check whether the histograms are roughly bell-shaped with centers at the probability that you chose.)
- Repeat the process with larger samples. Youll notice that your histograms are narrower with larger samples. Thats the law of large numbers.
Heres how to do each step.
Step 1: Set the probability where you would like it, set the sample size where you would like it, and then use the random number generator to simulate taking samples.
You are going to create a sample of random draws and calculate the proportions that come out whichever way you care about. Then youll set Excel to do that repeatedly.
Start by typing these column labels into the top row.
Column A: Random_Number
Column B: Outcome
Column C: Number_True
Column D: Sample_Size
Column E: Sample_Proportion
Column F: Proportions
To set up the Random_Number column, type =RAND() into the second row of column A.
To set the Outcome column and your probability, into the second row of column B type = , click on the top random number, then type <.5 . When youre done, the cell will read =A2<.5. Either press Enter to set this equation or edit it to change your probability from .5. If you would like to use 10% for your probability, change the equation to =A2<.1. If you want to use 75%, change it to =A2<.75. Use whatever probability you would like.
To fill the sample, click on the second row of column A and drag to the second row of column B. Let go, and both cells have a thick black border. Click on the small black square on the bottom right and drag down to fill some cells with random numbers. However far down you drag will determine the sample size. For example, if you drag to row 7, there will be six observations in each sample. If you drag to row 101, there will be 100 observations in each sample. But start with small samples so you can check what Excel is doing.
To set the Number_True column, type this equation into the second row: =COUNTIF(B:B,TRUE) . That will fill the cell with the number of TRUEs in column B. Check for yourself that this is what it does.
Type the number of observations in your sample into the second row in the Sample_Size column (column D). The sample size is one less than the row you dragged to.
To calculate the proportion in the Sample_Proportion column, in the second row, type = , click on the number of TRUEs that appears in the Number_True column, type / (a forward slash), click on the number of observations in the Sample_Size column, and press Enter.
Check what you got. From left to right, you should see a sample of random numbers, a collection of TRUE and FALSE indicating whether the random numbers are less than your cutoff, the number of TRUEs, the number of observations, and the proportion of the sample that is TRUE.
To collect a new sample, press F9 to replace all of the random numbers and recalculate all the values.
Calculate
There is another way to take a new sample, and you should see it before we proceed. It uses a macro. Because we will be using macros in later projects, its worth your while to see how a macro would do this task. That way, you get more comfortable with macros.
To create the macro, select Tools, Macro, Visual Basic Editor. That pulls up a new window. Click on Insert and select Module. In the white workspace, type this macro:
Sub Sampling()
Calculate
End Sub
|
Run the macro by clicking on the black triangle at the top-center of the screen. It does the same thing that pressing F9 does; it pulls new random numbers and recalculates everything. Thats what Calculate does.
Step 2: Record the proportions that appear in the samples and make a histogram of those proportions. (Check whether the histograms are roughly bell-shaped with centers at the probability that you chose.)
Were going to expand our macro to calculate the worksheet several times and save the proportions that appear in the samples. Heres the macro that will do the trick:
Sub Sampling()
For recordingRow = 2 To 5
Calculate
Cells(recordingRow, 6).Value = Cells(2, 5).Value
Next recordingRow
End Sub
|
What we added to the macro was a set of For-To/Next statements to do the two lines in the center a bunch of times. The first task on each go-around is Calculate. Thats what we had before. The second task is Cells(recordingRow, 6).Value = Cells(2, 5).Value.
That line starts with the right-hand side of the equation. On the right side, it says Cells(2,5).value. That refers to the value of what is in the cell in the second row of the fifth column. If you look back at the worksheet, youll see that the second row of the fifth column is the sample proportion. The equation above puts that value into a cell in the sixth column. The value gets put into whichever row recordingRow is pointing at. The first time through the For-To/Next loop, recordingRow is 2 and the proportion is stored in the second row. The second time through, recordingRow is 3 and the proportion is stored in the third row.
Run the macro to see it in action.
Now we have five proportions. If you would like more proportions, edit the For-To statement in the macro.
Getting the Histogram
To create histograms in Excel you will need the Data Analysis add-in. To check whether you have the add-in, select a cell in the spreadsheet, then select Tools. If one of the options is Data Analysis, then you can proceed.
If Data Analysis is not an option under Tools, select Add-Ins. Click on the box next to Data Analysis. If that box is not there, you or a computer administrator will have to use the installation discs for Excel to add the Data Analysis tools. Once the Data Analysis tools are available, Data Analysis will be an option under Tools.
In the Data Analysis window, click on Histogram, OK. In the worksheet, select the top saved proportion, then drag down until you have selected all of the proportions (and only their column). Click OK. Excel then presents you with a table in a new worksheet. It looks as if Microsoft thinks that is a histogram. Dont be fooled. You have another step left.
The table is selected. Leave it selected and click on the chart button at the top of the screen (the chart button has a little picture of a blue, yellow, and red bar chart). Click on Finish.
Theres the histogram. At this point, you are welcome to say, Well, thats not a bell shape. Yours might be a little bell-shaped, but I doubt it. To see the bell shape, you will have to use more than five samples.
To increase the number of samples, get back into the macro editor and edit the For-To line of the macro so it looks like this: For recordingRow = 2 To 25 . Run the macro. If it doesnt run completely, click on End, go back to the worksheet that contains your data, select a blank cell, then go back to the macro editor and try running the macro again.
To see the histogram, get back to the chart and click on the red exclamation point in the PivotChart box. That will update the PivotTable and PivotChart. Now you can see a bell-shaped histogram, I bet. Jot down for yourself how wide the histogram is. What proportion is at the bottom (left) of the histogram and what proportion is at the top?
Step 3: Repeat the process with larger samples. Youll notice that your histograms are narrower with larger samples. Thats the law of large numbers.
To have a larger sample size, go back to the worksheet that contains your data. Select some cells in the first two columns by clicking and dragging. Click on the little black square and drag down to use a larger sample. Change the value in the Sample_Size column. Then rerun the macro and update the PivotChart by clicking on the red exclamation point.
Now how wide is the histogram? You used a larger sample size. Was the histogram narrower?
| ©2008 Key College Publishing. All rights reserved. |
|