STA429/1007 F 2004 Handout 3: The Berkeley Data
Control by Subdivision
/*************************** berkeley.sas *********************************/
options linesize=79 pagesize=35 noovp formdlim='_';
title 'Berkeley Graduate Admissions Data: ';

proc format;
     value sexfmt 1 = 'Female' 0 = 'Male';
     value ynfmt  1 = 'Yes'    0 = 'No';

data berkley;
     input line sex dept $ admit count;
     format sex sexfmt.; format admit ynfmt.;
     datalines;
   1   0   A   1   512
   2   0   B   1   353
   3   0   C   1   120
   4   0   D   1   138
   5   0   E   1    53
   6   0   F   1    22
   7   1   A   1    89
   8   1   B   1    17
   9   1   C   1   202
  10   1   D   1   131
  11   1   E   1    94
  12   1   F   1    24
  13   0   A   0   313
  14   0   B   0   207
  15   0   C   0   205
  16   0   D   0   279
  17   0   E   0   138
  18   0   F   0   351
  19   1   A   0    19
  20   1   B   0     8
  21   1   C   0   391
  22   1   D   0   244
  23   1   E   0   299
  24   1   F   0   317
;
proc freq;
     tables sex*admit / nopercent nocol chisq;
     tables dept*sex / nopercent nocol chisq;
     tables dept*admit / nopercent nocol chisq;
     tables dept*sex*admit / nopercent nocol chisq;
     weight count;

/homes/students/u0/stats/brunner > cat berkeley.lst
_______________________________________________________________________________
Berkeley Graduate Admissions Data:                                            1
                                             09:50 Thursday, September 23, 2004

The FREQ Procedure

Table of sex by admit

sex       admit

Frequency|
Row Pct  |No      |Yes     |  Total
---------+--------+--------+
Male     |   1493 |   1198 |   2691
         |  55.48 |  44.52 |
---------+--------+--------+
Female   |   1278 |    557 |   1835
         |  69.65 |  30.35 |
---------+--------+--------+
Total        2771     1755     4526

Statistics for Table of sex by admit

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1     92.2053    <.0001
Likelihood Ratio Chi-Square    1     93.4494    <.0001
Continuity Adj. Chi-Square     1     91.6096    <.0001
Mantel-Haenszel Chi-Square     1     92.1849    <.0001
Phi Coefficient                      -0.1427
Contingency Coefficient               0.1413
Cramer's V                           -0.1427

       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)      1493
Left-sided Pr <= F       2.854E-22
Right-sided Pr >= F         1.0000

Table Probability (P)    1.314E-22
Two-sided Pr <= P        4.836E-22

Sample Size = 4526


Table of dept by sex

dept      sex

Frequency|
Row Pct  |Male    |Female  |  Total
---------+--------+--------+
A        |    825 |    108 |    933
         |  88.42 |  11.58 |
---------+--------+--------+
B        |    560 |     25 |    585
         |  95.73 |   4.27 |
---------+--------+--------+
C        |    325 |    593 |    918
         |  35.40 |  64.60 |
---------+--------+--------+
D        |    417 |    375 |    792
         |  52.65 |  47.35 |
---------+--------+--------+
E        |    191 |    393 |    584
         |  32.71 |  67.29 |
---------+--------+--------+
F        |    373 |    341 |    714
         |  52.24 |  47.76 |
---------+--------+--------+
Total        2691     1835     4526

Statistics for Table of dept by sex

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     5   1068.3717    <.0001
Likelihood Ratio Chi-Square    5   1220.6148    <.0001
Mantel-Haenszel Chi-Square     1    507.0346    <.0001
Phi Coefficient                       0.4859
Contingency Coefficient               0.4370
Cramer's V                            0.4859

Sample Size = 4526


Table of dept by admit

dept      admit

Frequency|
Row Pct  |No      |Yes     |  Total
---------+--------+--------+
A        |    332 |    601 |    933
         |  35.58 |  64.42 |
---------+--------+--------+
B        |    215 |    370 |    585
         |  36.75 |  63.25 |
---------+--------+--------+
C        |    596 |    322 |    918
         |  64.92 |  35.08 |
---------+--------+--------+
D        |    523 |    269 |    792
         |  66.04 |  33.96 |
---------+--------+--------+
E        |    437 |    147 |    584
         |  74.83 |  25.17 |
---------+--------+--------+
F        |    668 |     46 |    714
         |  93.56 |   6.44 |
---------+--------+--------+
Total        2771     1755     4526

Statistics for Table of dept by admit

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     5    778.9065    <.0001
Likelihood Ratio Chi-Square    5    855.3209    <.0001
Mantel-Haenszel Chi-Square     1    724.8170    <.0001
Phi Coefficient                       0.4148
Contingency Coefficient               0.3832
Cramer's V                            0.4148

Sample Size = 4526


Table 1 of sex by admit
Controlling for dept=A

sex       admit

Frequency|
Row Pct  |No      |Yes     |  Total
---------+--------+--------+
Male     |    313 |    512 |    825
         |  37.94 |  62.06 |
---------+--------+--------+
Female   |     19 |     89 |    108
         |  17.59 |  82.41 |
---------+--------+--------+
Total         332      601      933

Statistics for Table 1 of sex by admit
Controlling for dept=A

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1     17.2480    <.0001
Likelihood Ratio Chi-Square    1     19.0540    <.0001
Continuity Adj. Chi-Square     1     16.3718    <.0001
Mantel-Haenszel Chi-Square     1     17.2295    <.0001
Phi Coefficient                       0.1360
Contingency Coefficient               0.1347
Cramer's V                            0.1360

       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       313
Left-sided Pr <= F          1.0000
Right-sided Pr >= F      1.151E-05

Table Probability (P)    7.672E-06
Two-sided Pr <= P        1.669E-05

Sample Size = 933


Table 2 of sex by admit
Controlling for dept=B

sex       admit

Frequency|
Row Pct  |No      |Yes     |  Total
---------+--------+--------+
Male     |    207 |    353 |    560
         |  36.96 |  63.04 |
---------+--------+--------+
Female   |      8 |     17 |     25
         |  32.00 |  68.00 |
---------+--------+--------+
Total         215      370      585

Statistics for Table 2 of sex by admit
Controlling for dept=B

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      0.2537    0.6145
Likelihood Ratio Chi-Square    1      0.2586    0.6111
Continuity Adj. Chi-Square     1      0.0851    0.7705
Mantel-Haenszel Chi-Square     1      0.2533    0.6148
Phi Coefficient                       0.0208
Contingency Coefficient               0.0208
Cramer's V                            0.0208

       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       207
Left-sided Pr <= F          0.7598
Right-sided Pr >= F         0.3918

Table Probability (P)       0.1516
Two-sided Pr <= P           0.6771

Sample Size = 585


Table 3 of sex by admit
Controlling for dept=C

sex       admit

Frequency|
Row Pct  |No      |Yes     |  Total
---------+--------+--------+
Male     |    205 |    120 |    325
         |  63.08 |  36.92 |
---------+--------+--------+
Female   |    391 |    202 |    593
         |  65.94 |  34.06 |
---------+--------+--------+
Total         596      322      918

Statistics for Table 3 of sex by admit
Controlling for dept=C

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      0.7535    0.3854
Likelihood Ratio Chi-Square    1      0.7510    0.3862
Continuity Adj. Chi-Square     1      0.6332    0.4262
Mantel-Haenszel Chi-Square     1      0.7527    0.3856
Phi Coefficient                      -0.0287
Contingency Coefficient               0.0286
Cramer's V                           -0.0287

       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       205
Left-sided Pr <= F          0.2129
Right-sided Pr >= F         0.8265

Table Probability (P)       0.0394
Two-sided Pr <= P           0.3866

Sample Size = 918


Table 4 of sex by admit
Controlling for dept=D

sex       admit

Frequency|
Row Pct  |No      |Yes     |  Total
---------+--------+--------+
Male     |    279 |    138 |    417
         |  66.91 |  33.09 |
---------+--------+--------+
Female   |    244 |    131 |    375
         |  65.07 |  34.93 |
---------+--------+--------+
Total         523      269      792

Statistics for Table 4 of sex by admit
Controlling for dept=D

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      0.2980    0.5852
Likelihood Ratio Chi-Square    1      0.2979    0.5852
Continuity Adj. Chi-Square     1      0.2216    0.6378
Mantel-Haenszel Chi-Square     1      0.2976    0.5854
Phi Coefficient                       0.0194
Contingency Coefficient               0.0194
Cramer's V                            0.0194

       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       279
Left-sided Pr <= F          0.7328
Right-sided Pr >= F         0.3188

Table Probability (P)       0.0516
Two-sided Pr <= P           0.5995

Sample Size = 792


Table 5 of sex by admit
Controlling for dept=E

sex       admit

Frequency|
Row Pct  |No      |Yes     |  Total
---------+--------+--------+
Male     |    138 |     53 |    191
         |  72.25 |  27.75 |
---------+--------+--------+
Female   |    299 |     94 |    393
         |  76.08 |  23.92 |
---------+--------+--------+
Total         437      147      584

Statistics for Table 5 of sex by admit
Controlling for dept=E

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      1.0011    0.3171
Likelihood Ratio Chi-Square    1      0.9904    0.3196
Continuity Adj. Chi-Square     1      0.8080    0.3687
Mantel-Haenszel Chi-Square     1      0.9994    0.3175
Phi Coefficient                      -0.0414
Contingency Coefficient               0.0414
Cramer's V                           -0.0414

       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       138
Left-sided Pr <= F          0.1841
Right-sided Pr >= F         0.8646

Table Probability (P)       0.0486
Two-sided Pr <= P           0.3604

Sample Size = 584


Table 6 of sex by admit
Controlling for dept=F

sex       admit

Frequency|
Row Pct  |No      |Yes     |  Total
---------+--------+--------+
Male     |    351 |     22 |    373
         |  94.10 |   5.90 |
---------+--------+--------+
Female   |    317 |     24 |    341
         |  92.96 |   7.04 |
---------+--------+--------+
Total         668       46      714

Statistics for Table 6 of sex by admit
Controlling for dept=F

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      0.3841    0.5354
Likelihood Ratio Chi-Square    1      0.3836    0.5357
Continuity Adj. Chi-Square     1      0.2182    0.6404
Mantel-Haenszel Chi-Square     1      0.3836    0.5357
Phi Coefficient                       0.0232
Contingency Coefficient               0.0232
Cramer's V                            0.0232

       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       351
Left-sided Pr <= F          0.7801
Right-sided Pr >= F         0.3198

Table Probability (P)       0.1000
Two-sided Pr <= P           0.5458

To test for an association between Gender and Admission controlling for department, here is a good approach. Pool the chi-square tests by adding the values of the chi-square statistics and also adding the degrees of freedom. Each of the six departmental 2x2 tables contributes one degree of freedom, so we get Chi-square = 17.2480+0.2537+0.7535+0.2980+1.0011+0.3841 = 19.9384 with 6 degrees of freedom. This exceeds the 0.05-level critical value of 12.59 for a chi-square with 6 degrees of freedom, and we conclude that controlling for department, admission is related to gender. A good reference for this method of pooling chi-square values is Stephen Fienberg's (1989) book, The Analysis of Cross-Classified Categorical Data.
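
The pooling arithmetic can be checked with a short DATA step. This is only a sketch and is not part of berkeley.sas: the variable names pooled, df, crit and pvalue are made up for illustration, and the six statistics are typed in by hand from the listing rather than read from PROC FREQ output.

```sas
/* Sketch: pool the six per-department Pearson chi-squares (1 df each). */
/* Variable names are illustrative; values are copied from the listing. */
data _null_;
     pooled = 17.2480 + 0.2537 + 0.7535 + 0.2980 + 1.0011 + 0.3841;
     df     = 6;                        /* one df per 2x2 table          */
     crit   = cinv(0.95, df);           /* 0.95 quantile of chi-square(6) */
     pvalue = 1 - probchi(pooled, df);  /* upper-tail probability         */
     put pooled= 8.4 df= crit= 8.2 pvalue= 8.4;  /* written to the log    */
run;
```

The pooled statistic is compared with crit exactly as in the text; pvalue gives the same decision on the probability scale.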
This overall test does not tell you what the nature of the relationship is. For that, you need to examine the 6 individual 2x2 tables. So look at the chi-square test separately for each department. Use a Bonferroni correction to allow for the fact that you are doing 6 tests. This means declaring the results significant only if p < 0.05/6 = 0.0083. The only results that are significant by this criterion (or with a one-at-a-time test, for that matter) are the results for Department A. There, we see ... what?
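
The Bonferroni screening just described could be sketched along the following lines. Again this is an assumption-laden illustration, not part of the original program: the data set name bonf and the variables alpha and reject are invented here, and each p-value is recomputed from the Pearson chi-square on 1 degree of freedom instead of being taken from the Prob column.

```sas
/* Sketch: Bonferroni screening of the six per-department tests.        */
/* Data set and variable names are illustrative, not from berkeley.sas. */
data bonf;
     input dept $ chisq;
     pvalue = 1 - probchi(chisq, 1);  /* each 2x2 table has 1 df         */
     alpha  = 0.05 / 6;               /* Bonferroni-corrected level      */
     reject = (pvalue < alpha);       /* 1 = significant, 0 = not        */
     datalines;
A 17.2480
B 0.2537
C 0.7535
D 0.2980
E 1.0011
F 0.3841
;
proc print data=bonf noobs;
     format pvalue pvalue6.4 alpha 6.4;
run;
```

The printed p-values should match the Prob column of the listing, and only Department A should show reject = 1. The direction of the difference still has to be read from the row percentages in Table 1.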