General Information
  Home
Author Bio
Product/Purchase Info

Instructor Resources
Registration Required
  Register
Download Instructor Resources

Computer Activities
and Data Sets
  Table of Contents
Excel
Fathom
SPSS

Community
  Contact the Author
Ideas/Comments for Publisher
Testimonials
Coming soon!

Other Key Sites
  Key Curriculum Press

Key College Publishing

Data Matters with Excel®

Activity 2.2

Section 2.2 claims that if you create random samples from some population and create a histogram of the proportions in the samples, the histograms will show shapes that are roughly bell-shaped. The bell-shaped histograms will have centers at roughly the probabilities, or population proportions, involved in generating the samples’ proportions. In this project, you will see for yourself whether that is true.

The project in Section 2.2 requires these steps.

  1. Set the probability where you would like it, set the sample size where you would like it, and then use the random number generator to simulate taking samples.
  2. Record the proportions that appear in the samples and make a histogram of those proportions. (Check whether the histograms are roughly bell-shaped with centers at the probability that you chose.)
  3. Repeat the process with larger samples. You’ll notice that your histograms are narrower with larger samples. That’s the law of large numbers.

Here’s how to do each step.

Step 1: Set the probability where you would like it, set the sample size where you would like it, and then use the random number generator to simulate taking samples.

You are going to create a sample of random draws and calculate the proportions that come out whichever way you care about. Then you’ll set Excel to do that repeatedly.

Start by typing these column labels into the top row.

Column A: Random_Number
Column B: Outcome
Column C: Number_True
Column D: Sample_Size
Column E: Sample_Proportion
Column F: Proportions

To set up the Random_Number column, type =RAND() into the second row of column A.

To set the Outcome column and your probability, into the second row of column B type = , click on the top random number, then type <.5 . When you’re done, the cell will read =A2<.5. Either press “Enter” to set this equation or edit it to change your probability from .5. If you would like to use 10% for your probability, change the equation to =A2<.1. If you want to use 75%, change it to =A2<.75. Use whatever probability you would like.

To fill the sample, click on the second row of column A and drag to the second row of column B. Let go, and both cells have a thick black border. Click on the small black square on the bottom right and drag down to fill some cells with random numbers. However far down you drag will determine the sample size. For example, if you drag to row 7, there will be six observations in each sample. If you drag to row 101, there will be 100 observations in each sample. But start with small samples so you can check what Excel is doing.

To set the Number_True column, type this equation into the second row: =COUNTIF(B:B,TRUE) . That will fill the cell with the number of TRUE’s in column B. Check for yourself that this is what it does.

Type the number of observations in your sample into the second row in the Sample_Size column (column D). The sample size is one less than the row you dragged to.

To calculate the proportion in the Sample_Proportion column, in the second row, type = , click on the number of TRUE’s that appears in the Number_True column, type / (a forward slash), click on the number of observations in the Sample_Size column, and press “Enter.”

Check what you got. From left to right, you should see a sample of random numbers, a collection of TRUE and FALSE indicating whether the random numbers are less than your cutoff, the number of TRUE’s, the number of observations, and the proportion of the sample that is TRUE.

To collect a new sample, press “F9” to replace all of the random numbers and recalculate all the values.

Calculate

There is another way to take a new sample, and you should see it before we proceed. It uses a macro. Because we will be using macros in later projects, it’s worth your while to see how a macro would do this task. That way, you get more comfortable with macros.

To create the macro, select Tools, Macro, Visual Basic Editor. That pulls up a new window. Click on Insert and select Module. In the white workspace, type this macro:

Sub Sampling()
  Calculate
End Sub

Run the macro by clicking on the black triangle at the top-center of the screen. It does the same thing that pressing “F9” does; it pulls new random numbers and recalculates everything. That’s what Calculate does.

Step 2: Record the proportions that appear in the samples and make a histogram of those proportions. (Check whether the histograms are roughly bell-shaped with centers at the probability that you chose.)

We’re going to expand our macro to calculate the worksheet several times and save the proportions that appear in the samples. Here’s the macro that will do the trick:

Sub Sampling()
  For recordingRow = 2 To 5
    Calculate
    Cells(recordingRow, 6).Value = Cells(2, 5).Value
  Next recordingRow
End Sub

What we added to the macro was a set of For-To/Next statements to do the two lines in the center a bunch of times. The first task on each go-around is Calculate. That’s what we had before. The second task is Cells(recordingRow, 6).Value = Cells(2, 5).Value.

That line starts with the right-hand side of the equation. On the right side, it says Cells(2,5).value. That refers to the value of what is in the cell in the second row of the fifth column. If you look back at the worksheet, you’ll see that the second row of the fifth column is the sample proportion. The equation above puts that value into a cell in the sixth column. The value gets put into whichever row recordingRow is pointing at. The first time through the For-To/Next loop, recordingRow is 2 and the proportion is stored in the second row. The second time through, recordingRow is 3 and the proportion is stored in the third row.

Run the macro to see it in action.

Now we have five proportions. If you would like more proportions, edit the For-To statement in the macro.

Getting the Histogram

To create histograms in Excel you will need the Data Analysis add-in. To check whether you have the add-in, select a cell in the spreadsheet, then select Tools. If one of the options is Data Analysis, then you can proceed.

If Data Analysis is not an option under Tools, select Add-Ins. Click on the box next to Data Analysis. If that box is not there, you or a computer administrator will have to use the installation discs for Excel to add the Data Analysis tools. Once the Data Analysis tools are available, Data Analysis will be an option under Tools.

In the Data Analysis window, click on Histogram, OK. In the worksheet, select the top saved proportion, then drag down until you have selected all of the proportions (and only their column). Click OK. Excel then presents you with a table in a new worksheet. It looks as if Microsoft thinks that is a histogram. Don’t be fooled. You have another step left.

The table is selected. Leave it selected and click on the chart button at the top of the screen (the chart button has a little picture of a blue, yellow, and red bar chart). Click on Finish.

There’s the histogram. At this point, you are welcome to say, “Well, that’s not a bell shape.” Yours might be a little bell-shaped, but I doubt it. To see the bell shape, you will have to use more than five samples.

To increase the number of samples, get back into the macro editor and edit the For-To line of the macro so it looks like this: For recordingRow = 2 To 25 . Run the macro. If it doesn’t run completely, click on End, go back to the worksheet that contains your data, select a blank cell, then go back to the macro editor and try running the macro again.

To see the histogram, get back to the chart and click on the red exclamation point in the PivotChart box. That will update the PivotTable and PivotChart. Now you can see a bell-shaped histogram, I bet. Jot down for yourself how wide the histogram is. What proportion is at the bottom (left) of the histogram and what proportion is at the top?

Step 3: Repeat the process with larger samples. You’ll notice that your histograms are narrower with larger samples. That’s the law of large numbers.

To have a larger sample size, go back to the worksheet that contains your data. Select some cells in the first two columns by clicking and dragging. Click on the little black square and drag down to use a larger sample. Change the value in the Sample_Size column. Then rerun the macro and update the PivotChart by clicking on the red exclamation point.

Now how wide is the histogram? You used a larger sample size. Was the histogram narrower?


©2008 Key College Publishing. All rights reserved.