Data Matters with Excel®
Activity 4.2
Here are the steps this project requires.
- Pick two of the categorical attributes in the Rep US Sample and create a cross-tabulation.
- Find the marginal totals and proportions. Run a chi-square test to get a chi- square value.
- Use Excel to find the expected counts.
Heres how to do each step.
Step 1: Pick two of the categorical attributes in the Rep US Sample and create a cross-tabulation.
In the RepUSSample.xls workbook, select Data, PivotTable and PivotChart Report, and Next. Select the gray cells at the top of the column of one of your variables and drag to the gray cell above the other variable. Click on Next and Layout. Drag your row variable onto Row and your column variable onto Column, then either one onto Data. Click OK, Finish.
Step 2: Find the marginal totals and proportions. Run a chi-square test to get a chi-square value.
The PivotTable that appeared contains the marginal totals and proportions. You will get the marginal proportions on your way to calculating the chi-square value.
To calculate the chi-square value, change the name of the PivotTable worksheet to Observed (because it has the observed counts). Then add four new worksheets (Insert, Worksheet) and rename them Expected, Difference, Difference2, and RelativeDifference2 (double-click their default names at the bottom of the screen and type in their new names). The expected counts will be on the Expected worksheet, the differences on the Difference worksheet, the squared differences on the Difference2 worksheet, and the relative squared differences on the RelativeDifference2 worksheet.
On the Observed worksheet, click anywhere and press Ctrl-A. That selects the entire sheet. Right-click on the worksheet and select Copy. Open the Expected worksheet, right-click on the upper-left cell (A1), and select Paste Special. Click on Values, then OK. That copies the values in the pivotTable into the Expected worksheet. Repeat those steps to paste the same values into the Difference, Difference2, and RelativeDifference2 worksheets.
Adding the Marginal Proportions
In the Expected worksheet, find the marginal totals. In the column to the right of the row totals, in the cell next to the top row total (in row 5), type = (equal sign), click on the top row total, type /50000 , and press Enter. That puts the row proportion in that cell. Select the cell, click on the little black box, and drag down until every row total has a row proportion next to it.
In the row under the left-most column total (in column B), type =, select the left-most column total, type /50000, and press Enter. Select that column proportion, click on the black box, and drag to the right until every column total has its column proportion.
Expected Counts
Cell B5 is the upper-left cell of the cross-tab. In B5, type = , then click on the row proportion for that row (on row 5), type * (asterisk), then select the column total for that column (in column B). Look at the formula you have just created. There is a cell identified that refers to the row proportion. It is something like F5 or H5 or M5 or something else that ends with a 5. Now click to the left of the letter in that cell identifier and type $ (dollar sign) so that it becomes $F5 or $H5 or $M5 or whatever. The dollar sign keeps the formula pointing at that column as you drag the formula over the table. There is a second cell identifier that points at the column total. It is B8 or B10 or B11 or something else that starts with a B. Click to the left of the number and type a dollar sign. That keeps that part of the formula pointing at that row as you drag the formula. Now press Enter, to enter the formula into cell B5.
I did this for a cross-tab with gender (rows) and marital status (columns). My row proportions were in the H column, and my column totals were in the eighth row. My formula in B5 was =$H5*B$8 . (Remember, the asterisk is Excels way of saying times. So this says Multiply the row proportion by the column total.) Your formula should look similar, but you may have a different letter after the first dollar sign and a different number after the second dollar sign.
Select B5, click on the little black box, and drag to the right to cover the right-most top cell of the cross-tab. Let go, click on the black box again, drag down to cover the bottom row of the cross-tab, and release. There are your expected counts.
Differences
Open the Difference worksheet. In cell B5, type the formula =Observed!B5-Expected!B5 .
In Excel, you refer to worksheets by typing an exclamation point, so this formula says Subtract B5s value on the Expected worksheet from B5s value on the Observed worksheet.
Drag that formula to the right-most cell of the cross-tab. Click on the black box again to drag the formula down to the bottom row of the cross-tab.
Squared Differences
In the Difference2 worksheet, enter this formula into B5: =Difference!B5^2 . The caret (^) means raised to the power of. ^2 means squared. Select and drag to fill the cross-tab with that formula.
Relative Squared Differences
In the RelativeDifference2 worksheet, enter this formula into B5: =Difference2!B5/Expected!B5 .
Select B5 and drag to fill all of the cells of the cross-tab.
Chi-Square
Select any blank cell and click on the summation sign (&Mac183;) at the top of the screen. Select B5 and drag to the lower-right cell of the cross-tab (which is now filled with relative squared differences). That gives you the sum of the relative squared differences, which is the chi-square value.
In which cells do you find the big differences between expected and observed? Can you guess why those differences appear in the data?
Save the Workbook
You will be editing this workbook to do the project in the next section.
| ©2008 Key College Publishing. All rights reserved. |
|