Results: MathLogReg4.sas

Prediction of Performance in First-year Calculus

Logistic regression with more than 2 resp. categories using proc catmod

The FREQ Procedure

Table of outcome by passed (Passed the course)
Cell contents: Frequency

outcome    No     Yes    Total
Fail       90     0      90
Gone       184    0      184
Pass       0      305    305
Total      274    305    579

One at a time cat IVs with proc freq

The FREQ Procedure

Table of course2 by outcome
Cell contents: Frequency (Row Pct)

course2     Fail          Gone           Pass           Total
Catch-up    9 (15.25)     35 (59.32)     15 (25.42)     59
Mainstrm    61 (16.35)    88 (23.59)     224 (60.05)    373
Elite       6 (15.38)     2 (5.13)       31 (79.49)     39
Total       76            125            270            471
Frequency Missing = 108

Statistics for Table of course2 by outcome

Chi-Square Tests

Statistic DF Value Prob
Chi-Square 4 46.2026 <.0001
Likelihood Ratio Chi-Square 4 45.7760 <.0001
Mantel-Haenszel Chi-Square 1 13.4884 0.0002
Phi Coefficient   0.3132  
Contingency Coefficient   0.2989  
Cramer's V   0.2215  

Effective Sample Size = 471
Frequency Missing = 108

WARNING: 19% of the data are missing.
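
The Pearson chi-square and the association measures reported for course2 by outcome can be recomputed from the cell counts alone. The following is a minimal Python sketch, not part of the SAS program; the function and variable names are illustrative.

```python
import math

# Observed counts from the course2-by-outcome table (columns: Fail, Gone, Pass).
observed = {
    "Catch-up": [9, 35, 15],
    "Mainstrm": [61, 88, 224],
    "Elite": [6, 2, 31],
}

def pearson_chi_square(table):
    """Return (chi-square, df, n) for a dict mapping row label -> list of counts."""
    rows = list(table.values())
    n = sum(sum(r) for r in rows)
    col_totals = [sum(col) for col in zip(*rows)]
    chi2 = 0.0
    for r in rows:
        row_total = sum(r)
        for obs, col_total in zip(r, col_totals):
            expected = row_total * col_total / n  # expected count under independence
            chi2 += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df, n

chi2, df, n = pearson_chi_square(observed)
phi = math.sqrt(chi2 / n)
# Cramer's V divides by min(rows - 1, columns - 1), which is 2 for this 3 x 3 table.
cramers_v = math.sqrt(chi2 / (n * min(len(observed) - 1, 3 - 1)))
print(f"chi-square = {chi2:.4f}, df = {df}, phi = {phi:.4f}, V = {cramers_v:.4f}")
```

The results should agree with the Chi-Square, Phi Coefficient and Cramer's V rows above to the printed precision.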

Table of sex by outcome
Cell contents: Frequency (Row Pct)

sex       Fail          Gone           Pass           Total
Female    45 (16.92)    73 (27.44)     148 (55.64)    266
Male      43 (15.09)    95 (33.33)     147 (51.58)    285
Total     88            168            295            551
Frequency Missing = 28

Statistics for Table of sex by outcome

Chi-Square Tests

Statistic DF Value Prob
Chi-Square 2 2.2773 0.3202
Likelihood Ratio Chi-Square 2 2.2828 0.3194
Mantel-Haenszel Chi-Square 1 0.1233 0.7254
Phi Coefficient   0.0643  
Contingency Coefficient   0.0642  
Cramer's V   0.0643  

Effective Sample Size = 551
Frequency Missing = 28

Table of ethnic by outcome
ethnic: Judged Nationality of name
Cell contents: Frequency (Row Pct)

ethnic                          Fail          Gone           Pass           Total
Asian                           21 (16.03)    44 (33.59)     66 (50.38)     131
Eastern European                13 (20.63)    17 (26.98)     33 (52.38)     63
European not Eastern            35 (17.95)    53 (27.18)     107 (54.87)    195
Middle-Eastern and Pakistani    11 (15.28)    22 (30.56)     39 (54.17)     72
East Indian                     6 (7.69)      25 (32.05)     47 (60.26)     78
Other and DK                    4 (10.00)     23 (57.50)     13 (32.50)     40
Total                           90            184            305            579

Statistics for Table of ethnic by outcome

Chi-Square Tests

Statistic DF Value Prob
Chi-Square 10 20.2180 0.0273
Likelihood Ratio Chi-Square 10 19.8317 0.0309
Mantel-Haenszel Chi-Square 1 0.4698 0.4931
Phi Coefficient   0.1869  
Contingency Coefficient   0.1837  
Cramer's V   0.1321  

Sample Size = 579

Table of tongue by outcome
tongue: Mother Tongue (Eng or Other)
Cell contents: Frequency (Row Pct)

tongue     Fail          Gone           Pass           Total
English    74 (18.41)    113 (28.11)    215 (53.48)    402
Other      14 (9.40)     55 (36.91)     80 (53.69)     149
Total      88            168            295            551
Frequency Missing = 28

Statistics for Table of tongue by outcome

Chi-Square Tests

Statistic DF Value Prob
Chi-Square 2 8.2920 0.0158
Likelihood Ratio Chi-Square 2 8.8198 0.0122
Mantel-Haenszel Chi-Square 1 1.6654 0.1969
Phi Coefficient   0.1227  
Contingency Coefficient   0.1218  
Cramer's V   0.1227  

Effective Sample Size = 551
Frequency Missing = 28

Table of hsmiss by outcome
hsmiss: Missing Any High School Data
Cell contents: Frequency (Row Pct)

hsmiss    Fail          Gone           Pass           Total
No        66 (15.17)    112 (25.75)    257 (59.08)    435
Yes       24 (16.67)    72 (50.00)     48 (33.33)     144
Total     90            184            305            579

Statistics for Table of hsmiss by outcome

Chi-Square Tests

Statistic DF Value Prob
Chi-Square 2 33.7946 <.0001
Likelihood Ratio Chi-Square 2 33.3038 <.0001
Mantel-Haenszel Chi-Square 1 14.7239 0.0001
Phi Coefficient   0.2416  
Contingency Coefficient   0.2348  
Cramer's V   0.2416  

Sample Size = 579


Simple logistic regression: Reproduce this

The LOGISTIC Procedure

Model Information
Data Set                     WORK.MATHEX
Response Variable            passed (Passed the course)
Number of Response Levels    2
Model                        binary logit
Optimization Technique       Fisher's scoring

Observations Summary
Number of Observations Read    579
Number of Observations Used    466

Response Profile
Ordered Value    passed    Total Frequency
1                Yes       265
2                No        201

Probability modeled is passed='Yes'.

Note: 113 observations were deleted due to missing values for the response or explanatory variables.

Convergence Status

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Fit Statistics

Model Fit Statistics
Criterion Intercept Only Intercept and Covariates
AIC 639.196 527.627
SC 643.340 535.915
-2 Log L 637.196 523.627

Global Tests

Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 113.5689 1 <.0001
Score 97.2566 1 <.0001
Wald 76.5326 1 <.0001
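
The fit statistics and the likelihood ratio test are tied together by simple arithmetic: the likelihood ratio chi-square is the drop in -2 Log L, AIC adds 2 per parameter, and SC adds log(n) per parameter. The Python sketch below checks this against the printed values; it assumes n is the number of observations used (466), and the helper names are illustrative.

```python
import math

n_used = 466              # Number of Observations Used
neg2logl_null = 637.196   # -2 Log L, intercept only (1 parameter)
neg2logl_full = 523.627   # -2 Log L, intercept and hsgpa (2 parameters)

def aic(neg2logl, k):
    """Akaike information criterion for a model with k parameters."""
    return neg2logl + 2 * k

def sc(neg2logl, k, n):
    """Schwarz criterion: penalty of log(n) per parameter."""
    return neg2logl + k * math.log(n)

lr_chi_square = neg2logl_null - neg2logl_full
print(f"LR chi-square = {lr_chi_square:.3f}")
print(f"AIC: {aic(neg2logl_null, 1):.3f} (null), {aic(neg2logl_full, 2):.3f} (full)")
print(f"SC:  {sc(neg2logl_null, 1, n_used):.3f} (null), {sc(neg2logl_full, 2, n_used):.3f} (full)")
```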

Parameter Estimates

Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept    1     -16.1468    1.8664            74.8415            <.0001
hsgpa        1     0.2089      0.0239            76.5326            <.0001

Odds Ratios

Odds Ratio Estimates
Effect    Point Estimate    95% Wald Confidence Limits
hsgpa     1.232             1.176    1.291
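
The odds ratio and its Wald limits follow directly from the hsgpa estimate and standard error. A short Python sketch of the arithmetic, using the rounded values printed above (so the Wald statistic differs slightly from the 76.5326 that SAS computes at full precision):

```python
import math

estimate = 0.2089    # hsgpa coefficient
std_error = 0.0239   # its standard error
z = 1.959964         # 97.5th percentile of the standard normal

odds_ratio = math.exp(estimate)
lower = math.exp(estimate - z * std_error)
upper = math.exp(estimate + z * std_error)
wald = (estimate / std_error) ** 2   # Wald chi-square from the rounded inputs

print(f"odds ratio = {odds_ratio:.3f}, 95% limits = ({lower:.3f}, {upper:.3f})")
print(f"Wald chi-square from rounded inputs = {wald:.2f}")
```

Read this way, each additional point of high school GPA multiplies the estimated odds of passing by about 1.23.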

Association Statistics

Association of Predicted Probabilities and Observed Responses
Percent Concordant 77.1 Somers' D 0.548
Percent Discordant 22.2 Gamma 0.553
Percent Tied 0.7 Tau-a 0.270
Pairs 53265 c 0.774
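
The association statistics are all functions of the concordant, discordant and tied percentages. The Python sketch below recomputes them from the printed percentages; because those are rounded to one decimal, the last digit can differ from the SAS values.

```python
n_yes, n_no = 265, 201        # response profile frequencies
n_obs = n_yes + n_no          # 466 observations used
pairs = n_yes * n_no          # pairs with different responses

concordant = 77.1 / 100
discordant = 22.2 / 100
tied = 0.7 / 100

somers_d = concordant - discordant
gamma = (concordant - discordant) / (concordant + discordant)
c_statistic = concordant + 0.5 * tied
tau_a = (concordant - discordant) * pairs / (0.5 * n_obs * (n_obs - 1))

print(pairs)
print(f"D = {somers_d:.3f}, Gamma = {gamma:.3f}, c = {c_statistic:.3f}, Tau-a = {tau_a:.3f}")
```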

Hsgpa Reproduce b1 = 0.2089, Wald Chisq = 76.5326

The CATMOD Procedure

Data Summary
Response             passed    Response Levels    2
Weight Variable      None      Populations        137
Data Set             MATHEX    Total Frequency    466
Frequency Missing    113       Observations       466

Convergence Status

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Analysis of Variance

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Intercept 1 74.84 <.0001
hsgpa 1 76.53 <.0001
Likelihood Ratio 135 110.99 0.9353

ML Estimates

Analysis of Maximum Likelihood Estimates
Parameter    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    16.1468     1.8664            74.84         <.0001
hsgpa        -0.2089     0.0239            76.53         <.0001
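
The CATMOD estimates match the LOGISTIC estimates in magnitude but with opposite signs. This is what one would expect if CATMOD forms the logit against the last response level, so that it models log[P(No)/P(Yes)] rather than log[P(Yes)/P(No)]; that reading is an assumption here, since this run does not print its response profile. The Python sketch below shows that both parameterizations give the same probability of passing. The GPA values are hypothetical.

```python
import math

def p_pass_logistic(hsgpa, b0=-16.1468, b1=0.2089):
    """P(passed = 'Yes') from the PROC LOGISTIC fit, which models the log odds of Yes."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * hsgpa)))

def p_pass_catmod(hsgpa, b0=16.1468, b1=-0.2089):
    """P(passed = 'Yes') assuming the CATMOD logit is log[P(No)/P(Yes)]."""
    odds_no = math.exp(b0 + b1 * hsgpa)
    return 1.0 / (1.0 + odds_no)

for gpa in (70, 80, 90):  # hypothetical high school GPA values
    print(gpa, round(p_pass_logistic(gpa), 4), round(p_pass_catmod(gpa), 4))
```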

Hsmiss by outcome

The CATMOD Procedure

Data Summary
Response             outcome    Response Levels    3
Weight Variable      None       Populations        2
Data Set             MATHEX     Total Frequency    579
Frequency Missing    0          Observations       579

Population Profiles
Sample    hsmiss    Sample Size
1         No        435
2         Yes       144

Response Profiles
Response    outcome
1           Fail
2           Gone
3           Pass

Convergence Status

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Analysis of Variance

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Intercept 2 122.46 <.0001
hsmiss 2 32.14 <.0001
Likelihood Ratio 0 . .

ML Estimates

Analysis of Maximum Likelihood Estimates
Parameter    Function Number    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    1                  -1.3594     0.1380            97.05         <.0001
             2                  -0.8306     0.1132            53.81         <.0001
hsmiss       1                  0.6663      0.2856            5.44          0.0196
             2                  1.2360      0.2180            32.14         <.0001

Contrasts of ML Estimates

Contrasts of Maximum Likelihood Estimates
Contrast DF Chi-Square Pr > ChiSq
HS Missing method 1 2 32.14 <.0001
HS Missing method 2 2 32.14 <.0001

Estimate Probabilities using output from proc catmod

The IML Procedure

Fail_Gone_Pass
                        Fail         Gone         Pass
No Missing HS Data:     0.1517278    0.2574661    0.5908062
Yes Missing HS Data:    0.1666786    0.4999798    0.3333416
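
These probabilities can be recovered from the CATMOD estimates for the hsmiss model by inverting the two generalized logits. The Python sketch below is an illustration, not the original IML code. It assumes function 1 is log[P(Fail)/P(Pass)], function 2 is log[P(Gone)/P(Pass)], and hsmiss is coded 0 = No, 1 = Yes; the printed estimates are consistent with that coding.

```python
import math

# Estimates from the CATMOD "Analysis of Maximum Likelihood Estimates" table
# for the hsmiss model, ordered (function 1, function 2).
intercept = (-1.3594, -0.8306)
hsmiss_slope = (0.6663, 1.2360)

def outcome_probabilities(hsmiss):
    """Return (P(Fail), P(Gone), P(Pass)) for hsmiss = 0 (No) or 1 (Yes)."""
    logits = [b0 + b1 * hsmiss for b0, b1 in zip(intercept, hsmiss_slope)]
    odds = [math.exp(value) for value in logits]   # odds relative to Pass
    denom = 1.0 + sum(odds)
    return tuple(o / denom for o in odds) + (1.0 / denom,)

for label, x in (("No Missing HS Data", 0), ("Yes Missing HS Data", 1)):
    fail, gone, passed = outcome_probabilities(x)
    print(f"{label}: {fail:.7f} {gone:.7f} {passed:.7f}")
```

Because this model has as many parameters as free cell probabilities, the fitted probabilities also match the row percentages from proc freq, up to rounding of the estimates.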

Hsmiss by outcome again for comparison

The FREQ Procedure

Table of hsmiss by outcome
hsmiss: Missing Any High School Data
Cell contents: Frequency (Row Pct)

hsmiss    Fail          Gone           Pass           Total
No        66 (15.17)    112 (25.75)    257 (59.08)    435
Yes       24 (16.67)    72 (50.00)     48 (33.33)     144
Total     90            184            305            579

HS variables

The CATMOD Procedure

Data Summary
Response             outcome    Response Levels    3
Weight Variable      None       Populations        434
Data Set             MATHEX     Total Frequency    435
Frequency Missing    144        Observations       435

Convergence Status

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Analysis of Variance

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Intercept 2 56.62 <.0001
hsgpa 2 20.48 <.0001
hscalc 2 22.20 <.0001
hsengl 2 0.88 0.6454
Likelihood Ratio 860 690.18 1.0000
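
The degrees of freedom on the Likelihood Ratio line can be reproduced from the Data Summary: each population contributes one logit per non-reference response level, and the estimated parameters are subtracted. The Python sketch below checks this for the CATMOD runs shown so far; the function name is illustrative.

```python
def lr_goodness_of_fit_df(populations, response_levels, n_parameters):
    """Logits available across populations minus the number of estimated parameters."""
    return populations * (response_levels - 1) - n_parameters

# (run title, populations, response levels, estimated parameters, DF reported by CATMOD)
runs = [
    ("Hsgpa", 137, 2, 2, 135),
    ("Hsmiss by outcome", 2, 3, 4, 0),
    ("HS variables", 434, 3, 8, 860),
]

for title, pops, levels, params, reported in runs:
    df = lr_goodness_of_fit_df(pops, levels, params)
    print(f"{title}: computed DF = {df}, reported DF = {reported}")
```

With 434 populations for 435 observations, almost every population holds a single case, so the chi-square approximation behind this goodness-of-fit p-value is unreliable and the value of 1.0000 should not be read as evidence of good fit.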

ML Estimates

Analysis of Maximum Likelihood Estimates
Parameter    Function Number    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    1                  13.3872     2.5867            26.78         <.0001
             2                  16.5898     2.3505            49.82         <.0001
hsgpa        1                  -0.1487     0.0425            12.23         0.0005
             2                  -0.1502     0.0375            16.03         <.0001
hscalc       1                  -0.0485     0.0161            9.04          0.0026
             2                  -0.0643     0.0141            20.77         <.0001
hsengl       1                  0.00782     0.0214            0.13          0.7152
             2                  -0.0119     0.0184            0.42          0.5171

HS gpa and calc + course2

The CATMOD Procedure

Data Summary
Response             outcome    Response Levels    3
Weight Variable      None       Populations        358
Data Set             MATHEX     Total Frequency    375
Frequency Missing    204        Observations       375

Convergence Status

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Analysis of Variance

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Intercept 2 39.24 <.0001
hsgpa 2 17.64 0.0001
hscalc 2 28.44 <.0001
course2 4 2.43 0.6576
Likelihood Ratio 706 542.29 1.0000

ML Estimates

Analysis of Maximum Likelihood Estimates
Parameter                Function Number    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept                1                  13.4580     2.7733            23.55         <.0001
                         2                  14.3218     2.6073            30.17         <.0001
hsgpa                    1                  -0.1331     0.0388            11.76         0.0006
                         2                  -0.1247     0.0358            12.09         0.0005
hscalc                   1                  -0.0588     0.0169            12.04         0.0005
                         2                  -0.0789     0.0155            26.03         <.0001
course2      Catch-up    1                  -0.2395     0.4966            0.23          0.6296
             Catch-up    2                  0.3724      0.4602            0.65          0.4184
             Mainstrm    1                  0.0766      0.3068            0.06          0.8028
             Mainstrm    2                  0.3559      0.3316            1.15          0.2830

HS gpa and calc + diagnostic test

The CATMOD Procedure

Data Summary
Response             outcome    Response Levels    3
Weight Variable      None       Populations        374
Data Set             MATHEX     Total Frequency    375
Frequency Missing    204        Observations       375

Convergence Status

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Analysis of Variance

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Intercept 2 37.36 <.0001
hsgpa 2 14.57 0.0007
hscalc 2 19.65 <.0001
precalc 2 8.76 0.0126
calc 2 3.57 0.1678
Likelihood Ratio 738 548.17 1.0000

ML Estimates

Analysis of Maximum Likelihood Estimates
Parameter    Function Number    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    1                  13.2650     2.7880            22.64         <.0001
             2                  14.4536     2.6768            29.16         <.0001
hsgpa        1                  -0.1247     0.0388            10.33         0.0013
             2                  -0.1133     0.0365            9.64          0.0019
hscalc       1                  -0.0492     0.0170            8.33          0.0039
             2                  -0.0664     0.0156            18.04         <.0001
precalc      1                  -0.2425     0.1118            4.70          0.0301
             2                  -0.2782     0.1049            7.03          0.0080
calc         1                  -0.0113     0.0804            0.02          0.8886
             2                  -0.1469     0.0802            3.35          0.0670

Try gender, ethnic and mother tongue controlling for good stuff

The CATMOD Procedure

Data Summary
Response             outcome    Response Levels    3
Weight Variable      None       Populations        370
Data Set             MATHEX     Total Frequency    370
Frequency Missing    209        Observations       370

Convergence Status

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Analysis of Variance

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Intercept 2 30.46 <.0001
hsgpa 2 10.43 0.0054
hscalc 2 25.99 <.0001
precalc 2 15.31 0.0005
ethnic 10 8.71 0.5594
gender 2 0.76 0.6851
mtongue 2 13.41 0.0012
Likelihood Ratio 718 512.06 1.0000

ML Estimates

Analysis of Maximum Likelihood Estimates
Parameter                        Function Number    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept                        1                  10.8891     3.0227            12.98         0.0003
                                 2                  14.4497     2.8149            26.35         <.0001
hsgpa                            1                  -0.1176     0.0416            7.97          0.0047
                                 2                  -0.0924     0.0387            5.69          0.0170
hscalc                           1                  -0.0500     0.0177            7.98          0.0047
                                 2                  -0.0816     0.0163            24.91         <.0001
precalc                          1                  -0.2555     0.1096            5.44          0.0197
                                 2                  -0.3940     0.1056            13.91         0.0002
ethnic       Asian               1                  0.4901      0.3799            1.66          0.1970
             Asian               2                  0.3606      0.3744            0.93          0.3354
             Eastern European    1                  0.2206      0.4267            0.27          0.6052
             Eastern European    2                  -0.0339     0.4417            0.01          0.9388
             European not Eas    1                  -0.2127     0.3132            0.46          0.4971
             European not Eas    2                  -0.2316     0.3339            0.48          0.4880
             Middle-Eastern      1                  0.5776      0.4495            1.65          0.1988
             Middle-Eastern      2                  0.4759      0.4455            1.14          0.2854
             East Indian         1                  -0.7675     0.5653            1.84          0.1745
             East Indian         2                  0.4496      0.4094            1.21          0.2721
gender                           1                  -0.1684     0.3421            0.24          0.6227
                                 2                  -0.2690     0.3188            0.71          0.3988
mtongue                          1                  2.1901      0.7748            7.99          0.0047
                                 2                  -0.6472     0.3836            2.85          0.0916

Contrasts of ML Estimates

Contrasts of Maximum Likelihood Estimates
Contrast DF Chi-Square Pr > ChiSq
Demographics 14 25.48 0.0301
Ethnic and Gender 12 10.07 0.6101

hsgpa hscalc precalc calc mtongue

The CATMOD Procedure

Data Summary
Response             outcome    Response Levels    3
Weight Variable      None       Populations        368
Data Set             MATHEX     Total Frequency    370
Frequency Missing    209        Observations       370

Convergence Status

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Analysis of Variance

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Intercept 2 34.45 <.0001
hsgpa 2 14.43 0.0007
hscalc 2 25.02 <.0001
precalc 2 14.16 0.0008
mtongue 2 16.00 0.0003
Likelihood Ratio 726 517.41 1.0000

ML Estimates

Analysis of Maximum Likelihood Estimates
Parameter    Function Number    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    1                  12.0789     2.9530            16.73         <.0001
             2                  14.5402     2.7213            28.55         <.0001
hsgpa        1                  -0.1385     0.0403            11.83         0.0006
             2                  -0.0975     0.0370            6.93          0.0085
hscalc       1                  -0.0430     0.0169            6.46          0.0110
             2                  -0.0772     0.0156            24.43         <.0001
precalc      1                  -0.2372     0.1068            4.93          0.0263
             2                  -0.3679     0.1025            12.89         0.0003
mtongue      1                  1.9594      0.7554            6.73          0.0095
             2                  -0.8614     0.3523            5.98          0.0145

The CATMOD Procedure

Data Summary
Response             outcome    Response Levels    3
Weight Variable      None       Populations        368
Data Set             MATHEX     Total Frequency    370
Frequency Missing    209        Observations       370

Convergence Status

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Analysis of Variance

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Intercept 2 34.45 <.0001
hsgpa 2 14.43 0.0007
hscalc 2 25.02 <.0001
precalc 2 14.16 0.0008
mtongue 2 16.00 0.0003
Likelihood Ratio 726 517.41 1.0000

ML Estimates

Analysis of Maximum Likelihood Estimates
Parameter    Function Number    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    1                  12.0789     2.9530            16.73         <.0001
             2                  14.5402     2.7213            28.55         <.0001
hsgpa        1                  -0.1385     0.0403            11.83         0.0006
             2                  -0.0975     0.0370            6.93          0.0085
hscalc       1                  -0.0430     0.0169            6.46          0.0110
             2                  -0.0772     0.0156            24.43         <.0001
precalc      1                  -0.2372     0.1068            4.93          0.0263
             2                  -0.3679     0.1025            12.89         0.0003
mtongue      1                  1.9594      0.7554            6.73          0.0095
             2                  -0.8614     0.3523            5.98          0.0145

Contrasts of ML Estimates

Contrasts of Maximum Likelihood Estimates
Contrast DF Chi-Square Pr > ChiSq
Diff Relationships Overall 4 17.06 0.0019
Diff Relationships for hsgpa 1 0.83 0.3630
Diff Relationships for hscalc 1 3.50 0.0612
Diff Relationships for precalc 1 1.15 0.2842
Diff Relationships for mtongue 1 13.60 0.0002

Replicate hsgpa hscalc precalc calc mtongue 0.05/8 = .00625

The CATMOD Procedure

Data Summary
Response             outcome    Response Levels    3
Weight Variable      None       Populations        377
Data Set             REPLIC     Total Frequency    385
Frequency Missing    194        Observations       385

Convergence Status

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Analysis of Variance

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Intercept 2 40.43 <.0001
hsgpa 2 17.66 0.0001
hscalc 2 16.72 0.0002
precalc 2 9.69 0.0079
mtongue 2 0.70 0.7051
Likelihood Ratio 744 619.87 0.9997

ML Estimates

Analysis of Maximum Likelihood Estimates
Parameter    Function Number    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    1                  12.8640     2.5073            26.32         <.0001
             2                  12.3389     2.2829            29.21         <.0001
hsgpa        1                  -0.1391     0.0364            14.60         0.0001
             2                  -0.0980     0.0324            9.13          0.0025
hscalc       1                  -0.0317     0.0167            3.61          0.0575
             2                  -0.0609     0.0149            16.67         <.0001
precalc      1                  -0.2068     0.1005            4.23          0.0397
             2                  -0.2615     0.0914            8.18          0.0042
mtongue      1                  0.2323      0.3582            0.42          0.5166
             2                  0.2330      0.3265            0.51          0.4753

Contrasts of ML Estimates

Contrasts of Maximum Likelihood Estimates
Contrast DF Chi-Square Pr > ChiSq
Diff Relationships for mtongue 1 0.00 0.9986
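
The title of the replication run gives a Bonferroni-corrected significance level of 0.05/8 = .00625. The listing does not say which eight tests are meant; the Python sketch below assumes they are the eight slope tests in the replication table (four explanatory variables times two response functions) and enters "<.0001" as the upper bound 0.0001.

```python
alpha = 0.05
n_tests = 8
threshold = alpha / n_tests   # Bonferroni-corrected per-test level

# p-values from the replication "Analysis of Maximum Likelihood Estimates" table,
# keyed by (parameter, function number). "<.0001" is entered as 0.0001.
p_values = {
    ("hsgpa", 1): 0.0001, ("hsgpa", 2): 0.0025,
    ("hscalc", 1): 0.0575, ("hscalc", 2): 0.0001,
    ("precalc", 1): 0.0397, ("precalc", 2): 0.0042,
    ("mtongue", 1): 0.5166, ("mtongue", 2): 0.4753,
}

significant = sorted(key for key, p in p_values.items() if p < threshold)
print(f"threshold = {threshold}")
for parameter, function in significant:
    print(f"{parameter}, function {function}: p = {p_values[(parameter, function)]}")
```

Under that assumption, neither mtongue coefficient comes close to the corrected level in the replication sample.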