Data Matters with Excel®
Activity 10.1
In 1972, Samuel Bowles found a correlation coefficient of .375 between years of education and income. What do you find in the Rep US Sample data?
To find the correlation, select a blank cell to the right of the data and enter the formula =correl(C1:C50001,J1:J50001) . You might think that you can enter the data by clicking on the columns, but you cant. Excel gets confused by the blank lines beyond the data. You have to type in the data ranges. To explore other measures of income and wealth, replace each J with the column label for other measures of wealth. Karl Pearsons correlation test is based on an assumption that the data are normally distributed. For each numeric variable in the data, get a histogram and make a judgment about whether the data look fairly normally distributed.
To get a histogram, click on Tools, Data Analysis and select Histogram. In the Input range: box, enter the data range. For Education, which is in column C, enter C2:C50001 . For data in column J, enter J2:J50001 . Click OK.
You may get an error box that says Histogram Input range contains non-numeric data. This means that blank cells are confusing Excel. To avoid that, select a cell in the column you want to work with (but do not select the whole column). Click on the sort button at the top of the worksheet that has A on top and Z on the bottom. That will put all of the blank cells at the top of that column. Scroll down until you find the last nonblank cell of that column and use that when you are entering where your data is. For example, if the last row is 5,134, then your data is in C2:C5134.
Now click on the Graph Wizard button, then Finish.
If the histogram has more columns than are helpful, you can edit what appears in the histogram by deleting rows from the worksheet.
Are there correlations in that data that you could test with the significance test of the correlation coefficient?
| ©2008 Key College Publishing. All rights reserved. |
|