1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73         /* mathlogreg3.sas */
74
75         /* Goal: Develop a prediction model that uses all the data and makes a
76            prediction for every case. */
77
78         %include '/home/u1407221/441s24/SAS08/ReadLabelMath2.sas';
NOTE: Format YNFMT is already on the library WORK.FORMATS.
NOTE: Format YNFMT has been output.
NOTE: Format CRSFMT is already on the library WORK.FORMATS.
NOTE: Format CRSFMT has been output.
NOTE: Format NFMT is already on the library WORK.FORMATS.
NOTE: Format NFMT has been output.
NOTE: Format NCFMT is already on the library WORK.FORMATS.
NOTE: Format NCFMT has been output.
NOTE: PROCEDURE FORMAT used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              247.34k
      OS Memory           28836.00k
      Timestamp           03/10/2024 07:57:59 PM
      Step Count                        109  Switch Count  0
      Page Faults                       0
      Page Reclaims                     65
      Page Swaps                        0
      Voluntary Context Switches        0
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           32

NOTE: The infile '/home/u1407221/441s24/data/math.data.txt' is:
      Filename=/home/u1407221/441s24/data/math.data.txt,
      Owner Name=u1407221,Group Name=oda,
      Access Permission=-rw-r--r--,
      Last Modified=10Feb2024:17:04:10,
      File Size (bytes)=90324

NOTE: 1158 records were read from the infile '/home/u1407221/441s24/data/math.data.txt'.
      The minimum record length was 76.
      The maximum record length was 76.
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      180 at 124:24
NOTE: The data set WORK.MATH has 1158 observations and 37 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      user cpu time       0.01 seconds
      system cpu time     0.01 seconds
      memory              1172.81k
      OS Memory           29352.00k
      Timestamp           03/10/2024 07:57:59 PM
      Step Count                        110  Switch Count  2
      Page Faults                       0
      Page Reclaims                     123
      Page Swaps                        0
      Voluntary Context Switches        21
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           776

NOTE: There were 1158 observations read from the data set WORK.MATH.
NOTE: The data set WORK.REPLIC has 579 observations and 37 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1413.65k
      OS Memory           29740.00k
      Timestamp           03/10/2024 07:57:59 PM
      Step Count                        111  Switch Count  2
      Page Faults                       0
      Page Reclaims                     157
      Page Swaps                        0
      Voluntary Context Switches        12
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           520

NOTE: There were 1158 observations read from the data set WORK.MATH.
NOTE: The data set WORK.EXPLORE has 579 observations and 28 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1409.09k
      OS Memory           29740.00k
      Timestamp           03/10/2024 07:57:59 PM
      Step Count                        112  Switch Count  2
      Page Faults                       0
      Page Reclaims                     131
      Page Swaps                        0
      Voluntary Context Switches        14
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           520

239        title2 'Try using missing values for prediction';
240
241        /* Make missing data indicators */
242        data mathex2;
243             set explore;
244             if gender = . then sexmiss = 1; else sexmiss=0; /* Includes mtongue */
245             if course2 = . then coursemiss = 1; else coursemiss=0;
246             if hsgpa = . then hsgpamiss = 1; else hsgpamiss=0;
247             if hscalc = . then hscalcmiss = 1; else hscalcmiss=0;
248             if hsengl = . then hsenglmiss = 1; else hsenglmiss=0;
249             if totscore = . then testmiss = 1; else testmiss=0;
250             nmiss = sum(of sexmiss--testmiss); /* OF is needed for a variable list */
251             if hsgpa+hscalc+precalc = . then missused = 1 ; else missused = 0;
252
253             format sexmiss -- testmiss missused ynfmt.;
254             label sexmiss = 'Gender and mother tongue missing'
255                   coursemiss = 'Course missing'
256                   hsgpamiss = 'HS GPA missing'
257                   hscalcmiss = 'HS Calculus mark missing'
258                   hsenglmiss = 'HS English mark missing'
259                   testmiss = 'Diagnostic test scores missing'
260                   missused = 'Any of hsgpa hscalc precalc missing';
261
262        /* Checks are commented out
263        proc freq;
264             tables gender*sexmiss / norow nocol nopercent missing;
265             tables course*coursemiss / norow nocol nopercent missing;
266             tables hsgpa*hsgpamiss / norow nocol nopercent missing;
267             tables hscalc*hscalcmiss / norow nocol nopercent missing;
268             tables hsengl*hsenglmiss / norow nocol nopercent missing;
269             tables totscore*testmiss / norow nocol nopercent missing;
270             tables (hsgpamiss hscalcmiss testmiss)*missused
271                    / norow nocol nopercent missing;
272        */
273
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      142 at 251:14   204 at 251:21
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: The data set WORK.MATHEX2 has 579 observations and 36 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              1316.68k
      OS Memory           29612.00k
      Timestamp           03/10/2024 07:57:59 PM
      Step Count                        113  Switch Count  2
      Page Faults                       0
      Page Reclaims                     128
      Page Swaps                        0
      Voluntary Context Switches        11
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           528

274        proc freq data=mathex2;
275             title2 'Check usefulness of missing data indicators one at a time';
276             tables (sexmiss -- testmiss nmiss) * passed / nocol nopercent chisq;
277
NOTE: There were 579 observations read from the data set WORK.MATHEX2.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.22 seconds
      user cpu time       0.22 seconds
      system cpu time     0.00 seconds
      memory              4457.56k
      OS Memory           30896.00k
      Timestamp           03/10/2024 07:57:59 PM
      Step Count                        114  Switch Count  5
      Page Faults                       0
      Page Reclaims                     398
      Page Swaps                        0
      Voluntary Context Switches        31
      Involuntary Context Switches      14
      Block Input Operations            0
      Block Output Operations           632

278        proc freq data=mathex2;
279             title2 'Missingness on variables used, and passing the course';
280             tables missused * passed / nocol nopercent chisq;
281        run;
NOTE: There were 579 observations read from the data set WORK.MATHEX2.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.04 seconds
      user cpu time       0.04 seconds
      system cpu time     0.00 seconds
      memory              1416.78k
      OS Memory           30896.00k
      Timestamp           03/10/2024 07:58:00 PM
      Step Count                        115  Switch Count  5
      Page Faults                       0
      Page Reclaims                     257
      Page Swaps                        0
      Voluntary Context Switches        34
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           544

282
283        /* Strategy: If missing on hsgpa, hscalc or precalc, predict they will
284           not pass. If not missing, use the model with hsgpa, hscalc and precalc.
285           Question: Will missingness on Gender/Mother tongue, Course or HS English
286           add to the ability of (hsgpa, hscalc or precalc) to predict?
287
288           However, the following table shows that every student who was missing
289           course was also missing on at least one of the good predictors, so
290           coursemiss is out. */
291
292        proc freq data=mathex2;
293             tables coursemiss*missused / norow nocol nopercent missing;
294
NOTE: There were 579 observations read from the data set WORK.MATHEX2.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.02 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              1303.81k
      OS Memory           31152.00k
      Timestamp           03/10/2024 07:58:00 PM
      Step Count                        116  Switch Count  4
      Page Faults                       0
      Page Reclaims                     191
      Page Swaps                        0
      Voluntary Context Switches        29
      Involuntary Context Switches      1
      Block Input Operations            0
      Block Output Operations           536

295        proc logistic data = mathex2;
296             title3 'HS GPA, HS Calculus and Pre-calculus test';
297             model passed (event='Yes') = hsgpa hscalc precalc
298                                          sexmiss hsenglmiss;
299             MissingVars: test sexmiss=hsenglmiss=0;
300
301        /* If missing on hsgpa, hscalc or precalc, give them an estimated
302           probability of passing = 0.348. If not missing, use the model with
303           hsgpa, hscalc and precalc to calculate the estimated probabilities. */
304
305        quit;
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.MATHEX2.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time           0.08 seconds
      user cpu time       0.08 seconds
      system cpu time     0.00 seconds
      memory              2492.21k
      OS Memory           31928.00k
      Timestamp           03/10/2024 07:58:00 PM
      Step Count                        117  Switch Count  1
      Page Faults                       0
      Page Reclaims                     222
      Page Swaps                        0
      Voluntary Context Switches        10
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           56

306
307
308
309
310
311        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
323
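
The closing comment in the program describes a two-part prediction rule but the logged job stops before applying it. The following is a minimal sketch of how that rule might be carried out, not the author's code. It assumes the data set mathex2 and the variables passed, hsgpa, hscalc, precalc and missused exist as created in the log above. The names scored, scored2, phat and predpass are hypothetical, and the 0.5 cutoff for predicting a pass is an assumption; only the value 0.348 comes from the program's own comment.

```sas
/* Sketch only: fit the three-predictor model and save fitted probabilities.
   Cases missing hsgpa, hscalc or precalc are not used in the fit and get
   a missing value of phat in the output data set. */
proc logistic data=mathex2 noprint;
     model passed (event='Yes') = hsgpa hscalc precalc;
     output out=scored p=phat;      /* scored and phat are hypothetical names */
run;

/* Fill in the fixed estimate for cases missing any of the three predictors */
data scored2;                        /* hypothetical name */
     set scored;
     if missused = 1 then phat = 0.348;   /* value taken from the comment in the program */
     if phat > 0.5 then predpass = 1;     /* 0.5 cutoff is an assumption */
     else predpass = 0;
run;

/* Cross-tabulate predicted against actual outcome for every case */
proc freq data=scored2;
     tables predpass * passed / nocol nopercent;
run;
```

Because 0.348 is below the assumed cutoff, every case with missused = 1 is predicted not to pass, which matches the earlier "Strategy" comment in the program. A different cutoff would change the classification table but not the estimated probabilities.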