1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73         /* MathReg3.sas */
74         %include '/home/u1407221/441s24/SAS08/ReadLabelMath2.sas';
NOTE: Format YNFMT has been output.
NOTE: Format CRSFMT has been output.
NOTE: Format NFMT has been output.
NOTE: Format NCFMT has been output.
NOTE: PROCEDURE FORMAT used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              327.93k
      OS Memory           25252.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        24  Switch Count  2
      Page Faults                       0
      Page Reclaims                     101
      Page Swaps                        0
      Voluntary Context Switches        13
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           56

NOTE: The infile '/home/u1407221/441s24/data/math.data.txt' is:
      Filename=/home/u1407221/441s24/data/math.data.txt,
      Owner Name=u1407221,Group Name=oda,
      Access Permission=-rw-r--r--,
      Last Modified=10Feb2024:16:04:10,
      File Size (bytes)=90324

NOTE: 1158 records were read from the infile '/home/u1407221/441s24/data/math.data.txt'.
      The minimum record length was 76.
      The maximum record length was 76.
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      180 at 120:24
NOTE: The data set WORK.MATH has 1158 observations and 37 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1176.34k
      OS Memory           26536.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        25  Switch Count  3
      Page Faults                       0
      Page Reclaims                     281
      Page Swaps                        0
      Voluntary Context Switches        23
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           776

NOTE: There were 1158 observations read from the data set WORK.MATH.
NOTE: The data set WORK.REPLIC has 579 observations and 37 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1412.75k
      OS Memory           26924.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        26  Switch Count  2
      Page Faults                       0
      Page Reclaims                     157
      Page Swaps                        0
      Voluntary Context Switches        13
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           528

NOTE: There were 1158 observations read from the data set WORK.MATH.
NOTE: The data set WORK.EXPLORE has 579 observations and 28 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1409.87k
      OS Memory           26924.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        27  Switch Count  2
      Page Faults                       0
      Page Reclaims                     131
      Page Swaps                        0
      Voluntary Context Switches        12
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           520

235        title2 'Replication and Prediction';
236
237        /* Plan:
238
239           1. Non-obvious findings from the exploratory analysis (based on Model I,
240              which predicts grade from hsgpa hscalc hsengl totscore mtongue) were
241                 a. HS Engl negative
242                 b. mtongue negative
243                 c. totscore positive (diagnostic test matters controlling for HS)
244              Test these on the replication with a Bonferroni correction for 3 tests.
245              The other two results (HS GPA and HS Calculus) were obvious.
246
247           2. See if prediction intervals work as advertised on replication data.
248
249           3. Try prediction of letter grade.
250
251           4. Try predictions of grade for some imaginary students.
252
253        /* Test the three findings: Point 1 above */
254
255        proc reg data = replic plots = none;
256             title3 'Try to replicate HS Engl neg, mtongue neg, totscore pos';
257             title4 'with a Bonferroni correction (check p < 0.05/3 = 0.01666667)';
258             model grade = hsgpa hscalc hsengl totscore mtongue;
259
260        /* Point 2: Look at prediction intervals */
261
NOTE: PROCEDURE REG used (Total process time):
      real time           0.06 seconds
      user cpu time       0.06 seconds
      system cpu time     0.02 seconds
      memory              5302.25k
      OS Memory           30400.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        28  Switch Count  3
      Page Faults                       0
      Page Reclaims                     1745
      Page Swaps                        0
      Voluntary Context Switches        29
      Involuntary Context Switches      1
      Block Input Operations            0
      Block Output Operations           56

262        data predict;
263             set math; /* Combined data set */
264             keeper = grade+hsgpa+hscalc+hsengl+totscore+mtongue;
265             /* keeper will be missing if any of the vars are missing */
266             if keeper ne .; /* Discards all other cases */
267             grade2 = grade; /* Save value of grade for future use */
268             if sample=2 then grade=. ;
269             /* Response variable is now missing for replication sample.
270                But it is preserved in grade2 */
271
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      486 at 264:21   21 at 264:27   4 at 264:34   65 at 264:41   7 at 264:50
NOTE: There were 1158 observations read from the data set WORK.MATH.
NOTE: The data set WORK.PREDICT has 575 observations and 39 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1424.53k
      OS Memory           30252.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        29  Switch Count  2
      Page Faults                       0
      Page Reclaims                     246
      Page Swaps                        0
      Voluntary Context Switches        13
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           536

272        proc reg plots=none data=predict;
273             /* Data table predict is the default anyway */
274             title3 'Re-running Model I to generate y-hat and prediction intervals';
275             model grade = hsgpa hscalc hsengl totscore mtongue;
276             output out = predataI predicted = Yhat
277                                   L95 = lowpred
278                                   U95 = hipred;
279             /* Data set predataI has everything in predict plus
280                Yhat and the lower and upper 95% prediction limits. */
281
282        /* Does 95 Percent Prediction Interval really contain 95 percent of grades?
283           Recall that the data fail all tests for normality, and the prediction
284           intervals are based on normal theory. */
285
NOTE: The data set WORK.PREDATAI has 575 observations and 42 variables.
NOTE: PROCEDURE REG used (Total process time):
      real time           0.04 seconds
      user cpu time       0.04 seconds
      system cpu time     0.00 seconds
      memory              3192.81k
      OS Memory           31940.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        30  Switch Count  4
      Page Faults                       0
      Page Reclaims                     448
      Page Swaps                        0
      Voluntary Context Switches        34
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           584

286        data predictB;
287             set predataI;
288             if (lowpred < grade2 < hipred) then ininterval='Yes';
289             else ininterval='No';
290
NOTE: There were 575 observations read from the data set WORK.PREDATAI.
NOTE: The data set WORK.PREDICTB has 575 observations and 43 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.01 seconds
      memory              1304.93k
      OS Memory           30636.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        31  Switch Count  2
      Page Faults                       0
      Page Reclaims                     119
      Page Swaps                        0
      Voluntary Context Switches        13
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           520

291        proc freq data=predictB;
292             title3 'Does 95 Percent Prediction Interval Work?';
293             tables sample * ininterval / nocol nopercent;
294
NOTE: There were 575 observations read from the data set WORK.PREDICTB.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.02 seconds
      user cpu time       0.02 seconds
      system cpu time     0.00 seconds
      memory              1489.90k
      OS Memory           30896.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        32  Switch Count  5
      Page Faults                       0
      Page Reclaims                     408
      Page Swaps                        0
      Voluntary Context Switches        31
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           544

295        proc print data=predataI;
296             title3 'Look at predictions for the replication sample';
297             var id sample grade2 Yhat lowpred hipred;
298             where sample = 2;
299        /* Should predicted marks be used to advise students? */
300
301        /* Keep trying. Try to predict letter grade. */
302
NOTE: There were 288 observations read from the data set WORK.PREDATAI.
      WHERE sample=2;
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.31 seconds
      user cpu time       0.31 seconds
      system cpu time     0.00 seconds
      memory              4210.28k
      OS Memory           34220.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        33  Switch Count  6
      Page Faults                       0
      Page Reclaims                     1040
      Page Swaps                        0
      Voluntary Context Switches        23
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           304

303        data predictC;
304             set predictB;
305             if 80 <= grade2 <= 100 then lgrade = 'A';
306                else if 70 <= grade2 <= 79 then lgrade = 'B';
307                else if 60 <= grade2 <= 69 then lgrade = 'C';
308                else if 50 <= grade2 <= 59 then lgrade = 'D';
309                else if 0 <= grade2 <= 49 then lgrade = 'F';
310             label lgrade = 'Letter Grade';
311             pregrade = round(Yhat);
312             if 80 <= pregrade <= 100 then prelgrade = 'A';
313                else if 70 <= pregrade <= 79 then prelgrade = 'B';
314                else if 60 <= pregrade <= 69 then prelgrade = 'C';
315                else if 50 <= pregrade <= 59 then prelgrade = 'D';
316                else if 0 <= pregrade <= 49 then prelgrade = 'F';
317             label prelgrade = 'Predicted Letter Grade';
318
NOTE: There were 575 observations read from the data set WORK.PREDICTB.
NOTE: The data set WORK.PREDICTC has 575 observations and 46 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              1326.59k
      OS Memory           34220.00k
      Timestamp           02/24/2024 05:11:05 PM
      Step Count                        34  Switch Count  2
      Page Faults                       0
      Page Reclaims                     125
      Page Swaps                        0
      Voluntary Context Switches        15
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           520

319        proc freq;
320             title3 'Accuracy of predicting Letter Grades From Model I';
321             tables sample*prelgrade*lgrade / nocol nopercent;
322             /* Will yield separate table for each sample. */
323
324        /* Predict grade for a new student with hsgpa=80 hscalc=90 hsengl=70
325           totscore=15. For just a prediction (no interval), proc glm is easier. */
326
NOTE: There were 575 observations read from the data set WORK.PREDICTC.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.07 seconds
      user cpu time       0.06 seconds
      system cpu time     0.01 seconds
      memory              1375.50k
      OS Memory           34480.00k
      Timestamp           02/24/2024 05:11:06 PM
      Step Count                        35  Switch Count  5
      Page Faults                       0
      Page Reclaims                     203
      Page Swaps                        0
      Voluntary Context Switches        36
      Involuntary Context Switches      1
      Block Input Operations            0
      Block Output Operations           552

327        proc glm data = explore;
328             model grade = hsgpa hscalc hsengl mtongue totscore;
329             estimate 'New Student 1' intercept 1 hsgpa 80 hscalc 90 hsengl 70
330                                      mtongue 1 totscore 15;
331             estimate 'New Student 2' intercept 1 hsgpa 80 hscalc 90 hsengl 0
332                                      mtongue 1 totscore 15;
333
334        /* Prediction for Y_{n+1} is the same as estimate of E[Y|X]. CI from proc glm
335           is for E[Y|X]. PREDICTION interval for Y_{n+1} is wider. */
336
NOTE: PROCEDURE GLM used (Total process time):
      real time           0.06 seconds
      user cpu time       0.06 seconds
      system cpu time     0.00 seconds
      memory              2030.46k
      OS Memory           35000.00k
      Timestamp           02/24/2024 05:11:06 PM
      Step Count                        36  Switch Count  2
      Page Faults                       0
      Page Reclaims                     357
      Page Swaps                        0
      Voluntary Context Switches        14
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           312

337        data student;
338             hsgpa=80; hscalc=90; hsengl=70; mtongue=1; totscore=15; id = -1; output;
339             hsgpa=80; hscalc=90; hsengl=0; mtongue=1; totscore=15; id = -2; output;
NOTE: The data set WORK.STUDENT has 2 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              670.03k
      OS Memory           33960.00k
      Timestamp           02/24/2024 05:11:06 PM
      Step Count                        37  Switch Count  2
      Page Faults                       0
      Page Reclaims                     92
      Page Swaps                        0
      Voluntary Context Switches        14
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           264

340        proc print;
341
NOTE: There were 2 observations read from the data set WORK.STUDENT.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.01 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              667.25k
      OS Memory           33960.00k
      Timestamp           02/24/2024 05:11:06 PM
      Step Count                        38  Switch Count  0
      Page Faults                       0
      Page Reclaims                     69
      Page Swaps                        0
      Voluntary Context Switches        0
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           24

342        data together;
343             set explore student;
344             /* All variables not assigned will be missing for new observations */
345
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: There were 2 observations read from the data set WORK.STUDENT.
NOTE: The data set WORK.TOGETHER has 581 observations and 28 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.01 seconds
      system cpu time     0.00 seconds
      memory              1602.56k
      OS Memory           34480.00k
      Timestamp           02/24/2024 05:11:06 PM
      Step Count                        39  Switch Count  2
      Page Faults                       0
      Page Reclaims                     136
      Page Swaps                        0
      Voluntary Context Switches        13
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           528

346        proc reg noprint data=together;
347             title3 'Fit Model I to predict new student data';
348             model grade = hsgpa hscalc hsengl mtongue totscore;
349             output out = guess predicted = PredictedY
350                                L95 = LowerLimit
351                                U95 = UpperLimit;
352
NOTE: The data set WORK.GUESS has 581 observations and 31 variables.
NOTE: PROCEDURE REG used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              2671.65k
      OS Memory           35780.00k
      Timestamp           02/24/2024 05:11:06 PM
      Step Count                        40  Switch Count  4
      Page Faults                       0
      Page Reclaims                     299
      Page Swaps                        0
      Voluntary Context Switches        39
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           568

353        data newguess;
354             set guess;
355             if id < 0; /* Discard all other cases */
356
NOTE: There were 581 observations read from the data set WORK.GUESS.
NOTE: The data set WORK.NEWGUESS has 2 observations and 31 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1163.09k
      OS Memory           34220.00k
      Timestamp           02/24/2024 05:11:06 PM
      Step Count                        41  Switch Count  2
      Page Faults                       0
      Page Reclaims                     130
      Page Swaps                        0
      Voluntary Context Switches        14
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           264

357        proc print;
358             title3 'Prediction intervals for new students';
359             var id hsgpa hscalc hsengl totscore predictedY LowerLimit UpperLimit;
360
361        quit;
NOTE: There were 2 observations read from the data set WORK.NEWGUESS.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.01 seconds
      user cpu time       0.02 seconds
      system cpu time     0.00 seconds
      memory              687.75k
      OS Memory           33960.00k
      Timestamp           02/24/2024 05:11:06 PM
      Step Count                        42  Switch Count  1
      Page Faults                       0
      Page Reclaims                     67
      Page Swaps                        0
      Voluntary Context Switches        9
      Involuntary Context Switches      0
      Block Input Operations            0
      Block Output Operations           8

362
363
364
365
366
367        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
379
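
Note on source lines 334-335: the comment there says that the point prediction for Y_{n+1} equals the estimate of E[Y|X], while the prediction interval is wider than the confidence interval that proc glm reports. As a sketch in standard normal-theory regression notation (the symbols below are assumptions introduced for illustration and do not appear in the log: x_0 is the new student's predictor vector with a leading 1, X is the n-by-p design matrix, s^2 is the mean squared error, and p = 6 for the intercept plus the five predictors of Model I), the two intervals can be written as:

```latex
% Illustrative notation only; x_0, X, s, n, p are assumed symbols, not from the log.
\begin{align*}
\hat{y}_0 &= x_0^\top \hat{\beta} \\[4pt]
\text{CI for } E[Y \mid x_0]:\quad
  &\hat{y}_0 \;\pm\; t_{\alpha/2,\,n-p}\; s\,\sqrt{x_0^\top (X^\top X)^{-1} x_0} \\[4pt]
\text{PI for } Y_{n+1}:\quad
  &\hat{y}_0 \;\pm\; t_{\alpha/2,\,n-p}\; s\,\sqrt{1 + x_0^\top (X^\top X)^{-1} x_0}
\end{align*}
```

Both intervals share the same center. The extra 1 under the square root is the variance contribution of the new observation's own error term, which is why the prediction interval (L95 and U95 in the proc reg output statement) is always wider than the confidence interval for the mean.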