STA429/1007 F 2004 Handout 3: The Berkeley Data

Control by Subdivision



/*************************** berkeley.sas *********************************/
options linesize=79 pagesize=35 noovp formdlim='_';
title 'Berkeley Graduate Admissions Data: ';

proc format;
     value sexfmt 1 = 'Female' 0 = 'Male';
     value ynfmt 1 = 'Yes'  0 = 'No';
data berkley;
     input  line sex dept $ admit count;
     format sex sexfmt.; format admit ynfmt.;
     datalines;
   1     0      A      1    512
   2     0      B      1    353
   3     0      C      1    120
   4     0      D      1    138
   5     0      E      1     53
   6     0      F      1     22
   7     1      A      1     89
   8     1      B      1     17
   9     1      C      1    202
  10     1      D      1    131
  11     1      E      1     94
  12     1      F      1     24
  13     0      A      0    313
  14     0      B      0    207
  15     0      C      0    205
  16     0      D      0    279
  17     0      E      0    138
  18     0      F      0    351
  19     1      A      0     19
  20     1      B      0      8
  21     1      C      0    391
  22     1      D      0    244
  23     1      E      0    299
  24     1      F      0    317
;
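proc freq;
     tables sex*admit / nopercent nocol chisq;
     tables dept*sex / nopercent nocol chisq;
     tables dept*admit / nopercent nocol chisq;
     /* sex by admit, one table per department */
     tables dept*sex*admit / nopercent nocol chisq;
     /* each data line is a cell count, not a single applicant */
     weight count;
run;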


/homes/students/u0/stats/brunner > cat berkeley.lst

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                      1
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                             Table of sex by admit

                      sex       admit

                      Frequency|
                      Row Pct  |No      |Yes     |  Total
                      ---------+--------+--------+
                      Male     |   1493 |   1198 |   2691
                               |  55.48 |  44.52 |
                      ---------+--------+--------+
                      Female   |   1278 |    557 |   1835
                               |  69.65 |  30.35 |
                      ---------+--------+--------+
                      Total        2771     1755     4526

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                      2
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                     Statistics for Table of sex by admit

            Statistic                     DF       Value      Prob
            ------------------------------------------------------
            Chi-Square                     1     92.2053    <.0001
            Likelihood Ratio Chi-Square    1     93.4494    <.0001
            Continuity Adj. Chi-Square     1     91.6096    <.0001
            Mantel-Haenszel Chi-Square     1     92.1849    <.0001
            Phi Coefficient                      -0.1427
            Contingency Coefficient               0.1413
            Cramer's V                           -0.1427


                             Fisher's Exact Test
                      ----------------------------------
                      Cell (1,1) Frequency (F)      1493
                      Left-sided Pr <= F       2.854E-22
                      Right-sided Pr >= F         1.0000

                      Table Probability (P)    1.314E-22
                      Two-sided Pr <= P        4.836E-22

                              Sample Size = 4526



_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                      3
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                             Table of dept by sex

                      dept      sex

                      Frequency|
                      Row Pct  |Male    |Female  |  Total
                      ---------+--------+--------+
                      A        |    825 |    108 |    933
                               |  88.42 |  11.58 |
                      ---------+--------+--------+
                      B        |    560 |     25 |    585
                               |  95.73 |   4.27 |
                      ---------+--------+--------+
                      C        |    325 |    593 |    918
                               |  35.40 |  64.60 |
                      ---------+--------+--------+
                      D        |    417 |    375 |    792
                               |  52.65 |  47.35 |
                      ---------+--------+--------+
                      E        |    191 |    393 |    584
                               |  32.71 |  67.29 |
                      ---------+--------+--------+
                      F        |    373 |    341 |    714
                               |  52.24 |  47.76 |
                      ---------+--------+--------+
                      Total        2691     1835     4526

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                      4
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                      Statistics for Table of dept by sex

            Statistic                     DF       Value      Prob
            ------------------------------------------------------
            Chi-Square                     5   1068.3717    <.0001
            Likelihood Ratio Chi-Square    5   1220.6148    <.0001
            Mantel-Haenszel Chi-Square     1    507.0346    <.0001
            Phi Coefficient                       0.4859
            Contingency Coefficient               0.4370
            Cramer's V                            0.4859

                              Sample Size = 4526



_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                      5
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                            Table of dept by admit

                      dept      admit

                      Frequency|
                      Row Pct  |No      |Yes     |  Total
                      ---------+--------+--------+
                      A        |    332 |    601 |    933
                               |  35.58 |  64.42 |
                      ---------+--------+--------+
                      B        |    215 |    370 |    585
                               |  36.75 |  63.25 |
                      ---------+--------+--------+
                      C        |    596 |    322 |    918
                               |  64.92 |  35.08 |
                      ---------+--------+--------+
                      D        |    523 |    269 |    792
                               |  66.04 |  33.96 |
                      ---------+--------+--------+
                      E        |    437 |    147 |    584
                               |  74.83 |  25.17 |
                      ---------+--------+--------+
                      F        |    668 |     46 |    714
                               |  93.56 |   6.44 |
                      ---------+--------+--------+
                      Total        2771     1755     4526

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                      6
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                     Statistics for Table of dept by admit

            Statistic                     DF       Value      Prob
            ------------------------------------------------------
            Chi-Square                     5    778.9065    <.0001
            Likelihood Ratio Chi-Square    5    855.3209    <.0001
            Mantel-Haenszel Chi-Square     1    724.8170    <.0001
            Phi Coefficient                       0.4148
            Contingency Coefficient               0.3832
            Cramer's V                            0.4148

                              Sample Size = 4526



                            Table 1 of sex by admit
                            Controlling for dept=A

                      sex       admit

                      Frequency|
                      Row Pct  |No      |Yes     |  Total
                      ---------+--------+--------+
                      Male     |    313 |    512 |    825
                               |  37.94 |  62.06 |
                      ---------+--------+--------+
                      Female   |     19 |     89 |    108
                               |  17.59 |  82.41 |
                      ---------+--------+--------+
                      Total         332      601      933

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                      7
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                    Statistics for Table 1 of sex by admit
                            Controlling for dept=A

            Statistic                     DF       Value      Prob
            ------------------------------------------------------
            Chi-Square                     1     17.2480    <.0001
            Likelihood Ratio Chi-Square    1     19.0540    <.0001
            Continuity Adj. Chi-Square     1     16.3718    <.0001
            Mantel-Haenszel Chi-Square     1     17.2295    <.0001
            Phi Coefficient                       0.1360
            Contingency Coefficient               0.1347
            Cramer's V                            0.1360


                             Fisher's Exact Test
                      ----------------------------------
                      Cell (1,1) Frequency (F)       313
                      Left-sided Pr <= F          1.0000
                      Right-sided Pr >= F      1.151E-05

                      Table Probability (P)    7.672E-06
                      Two-sided Pr <= P        1.669E-05

                               Sample Size = 933




_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                      8
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                            Table 2 of sex by admit
                            Controlling for dept=B

                      sex       admit

                      Frequency|
                      Row Pct  |No      |Yes     |  Total
                      ---------+--------+--------+
                      Male     |    207 |    353 |    560
                               |  36.96 |  63.04 |
                      ---------+--------+--------+
                      Female   |      8 |     17 |     25
                               |  32.00 |  68.00 |
                      ---------+--------+--------+
                      Total         215      370      585

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                      9
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                    Statistics for Table 2 of sex by admit
                            Controlling for dept=B

            Statistic                     DF       Value      Prob
            ------------------------------------------------------
            Chi-Square                     1      0.2537    0.6145
            Likelihood Ratio Chi-Square    1      0.2586    0.6111
            Continuity Adj. Chi-Square     1      0.0851    0.7705
            Mantel-Haenszel Chi-Square     1      0.2533    0.6148
            Phi Coefficient                       0.0208
            Contingency Coefficient               0.0208
            Cramer's V                            0.0208


                             Fisher's Exact Test
                      ----------------------------------
                      Cell (1,1) Frequency (F)       207
                      Left-sided Pr <= F          0.7598
                      Right-sided Pr >= F         0.3918

                      Table Probability (P)       0.1516
                      Two-sided Pr <= P           0.6771

                               Sample Size = 585



_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                     10
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                            Table 3 of sex by admit
                            Controlling for dept=C

                      sex       admit

                      Frequency|
                      Row Pct  |No      |Yes     |  Total
                      ---------+--------+--------+
                      Male     |    205 |    120 |    325
                               |  63.08 |  36.92 |
                      ---------+--------+--------+
                      Female   |    391 |    202 |    593
                               |  65.94 |  34.06 |
                      ---------+--------+--------+
                      Total         596      322      918

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                     11
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                    Statistics for Table 3 of sex by admit
                            Controlling for dept=C

            Statistic                     DF       Value      Prob
            ------------------------------------------------------
            Chi-Square                     1      0.7535    0.3854
            Likelihood Ratio Chi-Square    1      0.7510    0.3862
            Continuity Adj. Chi-Square     1      0.6332    0.4262
            Mantel-Haenszel Chi-Square     1      0.7527    0.3856
            Phi Coefficient                      -0.0287
            Contingency Coefficient               0.0286
            Cramer's V                           -0.0287


                             Fisher's Exact Test
                      ----------------------------------
                      Cell (1,1) Frequency (F)       205
                      Left-sided Pr <= F          0.2129
                      Right-sided Pr >= F         0.8265

                      Table Probability (P)       0.0394
                      Two-sided Pr <= P           0.3866

                               Sample Size = 918



_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                     12
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                            Table 4 of sex by admit
                            Controlling for dept=D

                      sex       admit

                      Frequency|
                      Row Pct  |No      |Yes     |  Total
                      ---------+--------+--------+
                      Male     |    279 |    138 |    417
                               |  66.91 |  33.09 |
                      ---------+--------+--------+
                      Female   |    244 |    131 |    375
                               |  65.07 |  34.93 |
                      ---------+--------+--------+
                      Total         523      269      792

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                     13
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                    Statistics for Table 4 of sex by admit
                            Controlling for dept=D

            Statistic                     DF       Value      Prob
            ------------------------------------------------------
            Chi-Square                     1      0.2980    0.5852
            Likelihood Ratio Chi-Square    1      0.2979    0.5852
            Continuity Adj. Chi-Square     1      0.2216    0.6378
            Mantel-Haenszel Chi-Square     1      0.2976    0.5854
            Phi Coefficient                       0.0194
            Contingency Coefficient               0.0194
            Cramer's V                            0.0194


                             Fisher's Exact Test
                      ----------------------------------
                      Cell (1,1) Frequency (F)       279
                      Left-sided Pr <= F          0.7328
                      Right-sided Pr >= F         0.3188

                      Table Probability (P)       0.0516
                      Two-sided Pr <= P           0.5995

                               Sample Size = 792



_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                     14
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                            Table 5 of sex by admit
                            Controlling for dept=E

                      sex       admit

                      Frequency|
                      Row Pct  |No      |Yes     |  Total
                      ---------+--------+--------+
                      Male     |    138 |     53 |    191
                               |  72.25 |  27.75 |
                      ---------+--------+--------+
                      Female   |    299 |     94 |    393
                               |  76.08 |  23.92 |
                      ---------+--------+--------+
                      Total         437      147      584

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                     15
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                    Statistics for Table 5 of sex by admit
                            Controlling for dept=E

            Statistic                     DF       Value      Prob
            ------------------------------------------------------
            Chi-Square                     1      1.0011    0.3171
            Likelihood Ratio Chi-Square    1      0.9904    0.3196
            Continuity Adj. Chi-Square     1      0.8080    0.3687
            Mantel-Haenszel Chi-Square     1      0.9994    0.3175
            Phi Coefficient                      -0.0414
            Contingency Coefficient               0.0414
            Cramer's V                           -0.0414


                             Fisher's Exact Test
                      ----------------------------------
                      Cell (1,1) Frequency (F)       138
                      Left-sided Pr <= F          0.1841
                      Right-sided Pr >= F         0.8646

                      Table Probability (P)       0.0486
                      Two-sided Pr <= P           0.3604

                               Sample Size = 584



_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                     16
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                            Table 6 of sex by admit
                            Controlling for dept=F

                      sex       admit

                      Frequency|
                      Row Pct  |No      |Yes     |  Total
                      ---------+--------+--------+
                      Male     |    351 |     22 |    373
                               |  94.10 |   5.90 |
                      ---------+--------+--------+
                      Female   |    317 |     24 |    341
                               |  92.96 |   7.04 |
                      ---------+--------+--------+
                      Total         668       46      714

_______________________________________________________________________________

                      Berkeley Graduate Admissions Data:                     17
                                             09:50 Thursday, September 23, 2004

                              The FREQ Procedure

                    Statistics for Table 6 of sex by admit
                            Controlling for dept=F

            Statistic                     DF       Value      Prob
            ------------------------------------------------------
            Chi-Square                     1      0.3841    0.5354
            Likelihood Ratio Chi-Square    1      0.3836    0.5357
            Continuity Adj. Chi-Square     1      0.2182    0.6404
            Mantel-Haenszel Chi-Square     1      0.3836    0.5357
            Phi Coefficient                       0.0232
            Contingency Coefficient               0.0232
            Cramer's V                            0.0232


                             Fisher's Exact Test
                      ----------------------------------
                      Cell (1,1) Frequency (F)       351
                      Left-sided Pr <= F          0.7801
                      Right-sided Pr >= F         0.3198

                      Table Probability (P)       0.1000
                      Two-sided Pr <= P           0.5458

                               Sample Size = 714




To test for an association between gender and admission while controlling for department, here is a good approach. Pool the chi-square tests by adding the values of the chi-square statistics from the six departmental tables and also adding their degrees of freedom. This gives chi-square = 17.2480 + 0.2537 + 0.7535 + 0.2980 + 1.0011 + 0.3841 = 19.9384 with 6 degrees of freedom (one from each 2x2 table). This exceeds 12.59, the 0.05-level critical value for a chi-square with 6 degrees of freedom, so we conclude that, controlling for department, admission is related to gender. A good reference for this method of pooling chi-square values is Stephen Fienberg's (1989) book, The Analysis of Cross-Classified Categorical Data.
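
The pooling arithmetic can be checked with a short data step. This is only a sketch: the data set name pooled and the variable names are made up for illustration, and the six chi-square values are simply copied from the departmental tables above. CINV returns a chi-square quantile and PROBCHI returns the chi-square distribution function.

```sas
data pooled;
     /* Chi-square values copied from the six "Controlling for dept" tables */
     chisq = 17.2480 + 0.2537 + 0.7535 + 0.2980 + 1.0011 + 0.3841;
     df    = 6;                        /* one df from each 2x2 table       */
     crit  = cinv(0.95, df);           /* 0.05-level critical value        */
     p     = 1 - probchi(chisq, df);   /* upper-tail p-value, pooled test  */
proc print data=pooled;
run;
```

The printed value of crit should agree with the 12.59 quoted above, and comparing p with 0.05 leads to the same conclusion as comparing chisq with crit.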

This overall test does not tell you what the nature of the relationship is. For that, you need to examine the six individual 2x2 tables, so look at the chi-square test separately for each department. Use a Bonferroni correction to allow for the fact that you are doing 6 tests: declare a result significant only if p < 0.05/6 = 0.0083. The only results that are significant by this criterion (or with a one-at-a-time test at the 0.05 level, for that matter) are those for Department A. There, we see ... what?
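
The Bonferroni screening can be sketched the same way. Again the data set name bonf and the variable names are hypothetical, and the chi-square values are copied from the output above. The p-values are recomputed from those statistics, each with 1 degree of freedom.

```sas
data bonf;
     input dept $ chisq;
     p      = 1 - probchi(chisq, 1);   /* each 2x2 table has 1 df          */
     cutoff = 0.05 / 6;                /* Bonferroni cutoff for 6 tests    */
     signif = (p < cutoff);            /* 1 = significant, 0 = not         */
     datalines;
A 17.2480
B  0.2537
C  0.7535
D  0.2980
E  1.0011
F  0.3841
;
proc print data=bonf;
run;
```

Only the line for Department A should have signif = 1. This tells you where the relationship is, not its direction. For that, go back to the row percentages in the Department A table.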