1    OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
NOTE: ODS statements in the SAS Studio environment may disable some output features.
73
74   /* mathlogreg1.sas */
75   title2 'Logistic Regression on the Exploratory Math data';
76   %include '/home/u1407221/441s24/SAS08/ReadLabelMath2.sas';
NOTE: Format YNFMT is already on the library WORK.FORMATS.
NOTE: Format YNFMT has been output.
NOTE: Format CRSFMT is already on the library WORK.FORMATS.
NOTE: Format CRSFMT has been output.
NOTE: Format NFMT is already on the library WORK.FORMATS.
NOTE: Format NFMT has been output.
NOTE: Format NCFMT is already on the library WORK.FORMATS.
NOTE: Format NCFMT has been output.
NOTE: PROCEDURE FORMAT used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 257.46k
      OS Memory 28580.00k
      Timestamp 02/29/2024 05:56:40 PM
      Step Count 111  Switch Count 0
      Page Faults 0
      Page Reclaims 19
      Page Swaps 0
      Voluntary Context Switches 0
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 32

NOTE: The infile '/home/u1407221/441s24/data/math.data.txt' is:
      Filename=/home/u1407221/441s24/data/math.data.txt,
      Owner Name=u1407221,Group Name=oda,
      Access Permission=-rw-r--r--,
      Last Modified=10Feb2024:16:04:10,
      File Size (bytes)=90324

NOTE: 1158 records were read from the infile '/home/u1407221/441s24/data/math.data.txt'.
      The minimum record length was 76.
      The maximum record length was 76.
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      180 at 122:24
NOTE: The data set WORK.MATH has 1158 observations and 37 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.01 seconds
      system cpu time 0.00 seconds
      memory 1193.96k
      OS Memory 29352.00k
      Timestamp 02/29/2024 05:56:40 PM
      Step Count 112  Switch Count 2
      Page Faults 0
      Page Reclaims 156
      Page Swaps 0
      Voluntary Context Switches 19
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 776

NOTE: There were 1158 observations read from the data set WORK.MATH.
NOTE: The data set WORK.REPLIC has 579 observations and 37 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 1411.37k
      OS Memory 29740.00k
      Timestamp 02/29/2024 05:56:40 PM
      Step Count 113  Switch Count 2
      Page Faults 0
      Page Reclaims 140
      Page Swaps 0
      Voluntary Context Switches 9
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 520

NOTE: There were 1158 observations read from the data set WORK.MATH.
NOTE: The data set WORK.EXPLORE has 579 observations and 28 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 1409.06k
      OS Memory 29740.00k
      Timestamp 02/29/2024 05:56:40 PM
      Step Count 114  Switch Count 2
      Page Faults 0
      Page Reclaims 134
      Page Swaps 0
      Voluntary Context Switches 11
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 520

237
238  proc freq data=explore;
239       title3 'Course by passed with proc freq';
240       tables course2 * passed / nocol nopercent chisq;
241
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.05 seconds
      user cpu time 0.04 seconds
      system cpu time 0.01 seconds
      memory 3862.40k
      OS Memory 30896.00k
      Timestamp 02/29/2024 05:56:40 PM
      Step Count 115  Switch Count 5
      Page Faults 0
      Page Reclaims 382
      Page Swaps 0
      Voluntary Context Switches 29
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 536

242  proc logistic data=explore;
243       title3 'Course by passed with dummy vars: Compare LR Chisq = 34.4171';
244       model passed (event='Yes') = c1 c3; /* Omit c2 so Mainstream is reference category */
245       /* Without proc format, (event='1') would work. */
246       /* Wald chi-squared tests */
247       course: test c1=c3=0;
248       Course1_vs_2: test c1=0;
249       Course1_vs_3: test c1=c3;
250       Course2_vs_3: test c3=0;
251
252  /*
253  Question: The estimated odds of passing the course are ___ times as great for a
254  student in the elite course, compared to a student in the mainstream course.
255
256  Question: With 95% confidence, the chances of a student passing the catch-up
257  course are between ___% and ___% as great as the chances of passing the
258  mainstream course.
259
260  Note the deliberately vague but useful word "chances."
261
262  > exp(-1.4838)
263  [1] 0.2267743
264  > A = -1.4838 - 1.96*0.3171; B = -1.4838 + 1.96*0.3171
265  > exp(c(A,B))
266  [1] 0.1218072 0.4221967
267
268
269  A few details about the output:
270
271  The higher the minus 2 Log Likelihood, the lower the (estimated) maximum
272  probability of observing these responses. It is a measure of lack of
273  model fit. The Akaike information criterion and Schwarz's Bayesian
274  criterion both impose a further penalty for number of explanatory
275  variables. Small is good.
276
277  "Association of Predicted Probabilities and Observed Responses":
278  * Every case has Y=0 or Y=1.
279  * Every case has a p-hat.
280  * Pick a case with Y=0, and another case with Y=1. That's a pair.
281  * If the case with Y=0 has a lower p-hat than the case with Y=1,
282    the pair is concordant.
283  */
284
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time 0.07 seconds
      user cpu time 0.07 seconds
      system cpu time 0.00 seconds
      memory 2649.12k
      OS Memory 32184.00k
      Timestamp 02/29/2024 05:56:40 PM
      Step Count 116  Switch Count 1
      Page Faults 0
      Page Reclaims 364
      Page Swaps 0
      Voluntary Context Switches 10
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 64

285  proc iml;
NOTE: IML Ready
286       title3 'Estimate prob. of passing for course=3: Compare 31/39 = 0.7949';
287       b0 = 0.4077;
287 !     b1 = -1.4838;
287 !     b2 = 0.9468;
288       c1 = 0;
288 !     c3=1;
289       lcombo = b0 + b1*c1 + b2*c3;
290       probpass = exp(lcombo) / (1+exp(lcombo));
291       print "Estimated probability of passing course 3 (Elite) is " probpass;
292
293
NOTE: Exiting IML.
NOTE: PROCEDURE IML used (Total process time):
      real time 0.01 seconds
      user cpu time 0.02 seconds
      system cpu time 0.00 seconds
      memory 558.21k
      OS Memory 31140.00k
      Timestamp 02/29/2024 05:56:40 PM
      Step Count 117  Switch Count 1
      Page Faults 0
      Page Reclaims 97
      Page Swaps 0
      Voluntary Context Switches 9
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 24

294  proc logistic data=explore;
295       title3 'Use the class and contrast statements';
296       class course2 / param=ref; /* This param option makes the ALPHABETICALLY
297                                     last category (Mainstream) the reference
298                                     category. Default is effect coding. */
299       model passed (event='Yes') = course2;
300       contrast 'Catch-up vs Mainstream' course2 1 0;
301       contrast 'Elite vs Mainstream' course2 0 1;
302       contrast 'Catch-up vs Elite' course2 1 -1;
303
304  /* Contrast is a little tricky in proc logistic compared to proc glm.
305     It lets you specify a set of linear combinations of regression
306     coefficients to test against zero. It is essential to know exactly
307     what the dummy variable coding scheme is. This can still be more
308     convenient than defining your own dummy variables in the data step. */
309
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time 0.07 seconds
      user cpu time 0.08 seconds
      system cpu time 0.00 seconds
      memory 2480.21k
      OS Memory 32440.00k
      Timestamp 02/29/2024 05:56:40 PM
      Step Count 118  Switch Count 1
      Page Faults 0
      Page Reclaims 236
      Page Swaps 0
      Voluntary Context Switches 11
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 64

310  proc logistic data=explore;
311       title3 'Course controlling for score on diagnostic test';
312       class course2 / param=ref;
313       model passed (event='Yes') = course2 totscore;
314       contrast 'Course controlling for totscore' course2 1 0,
315                                                  course2 0 1;
316       contrast 'Catch-up vs. Elite controlling for totscore'
317                                                  course2 1 -1;
318  run;
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time 0.09 seconds
      user cpu time 0.08 seconds
      system cpu time 0.00 seconds
      memory 2452.18k
      OS Memory 32440.00k
      Timestamp 02/29/2024 05:56:41 PM
      Step Count 119  Switch Count 1
      Page Faults 0
      Page Reclaims 198
      Page Swaps 0
      Voluntary Context Switches 10
      Involuntary Context Switches 3
      Block Input Operations 0
      Block Output Operations 88

319
320  /* Estimate a probability of passing without typing in the
321   * estimated regression coefficients: Use the Output Delivery
322   * System (ODS). All the tables in a SAS results file have names.
323   * You can find out what they are with a web search on
324   * "proc logistic ods table names", which will take you to the
325   * manual. Easier is to do a preliminary run with ods trace on,
326   * which writes the table names on the log file as they are produced.
327   */
328
329  ods trace on;
330  proc logistic data=explore;
331       title3 'What are the ods table names?';
332       model passed (event='Yes') = c1 c3 totscore;
333  run;

Output Added:
-------------
Name: ModelInfo
Label: Model Information
Template: Stat.Logistic.ModelInfo
Path: Logistic.ModelInfo
-------------

Output Added:
-------------
Name: NObs
Label: Observations Summary
Template: Stat.Logistic.NObs
Path: Logistic.NObs
-------------

Output Added:
-------------
Name: ResponseProfile
Label: Response Profile
Template: Stat.Logistic.ResponseProfile
Path: Logistic.ResponseProfile
-------------
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.

Output Added:
-------------
Name: ConvergenceStatus
Label: Convergence Status
Template: Stat.Logistic.MConvergenceStatus
Path: Logistic.ConvergenceStatus
-------------
NOTE: Convergence criterion (GCONV=1E-8) satisfied.

Output Added:
-------------
Name: FitStatistics
Label: Fit Statistics
Template: Stat.Logistic.FitStatistics
Path: Logistic.FitStatistics
-------------

Output Added:
-------------
Name: GlobalTests
Label: Global Tests
Template: Stat.Logistic.GlobalTests
Path: Logistic.GlobalTests
-------------

Output Added:
-------------
Name: ParameterEstimates
Label: Parameter Estimates
Template: Stat.Logistic.ParameterEstimates
Path: Logistic.ParameterEstimates
-------------

Output Added:
-------------
Name: OddsRatios
Label: Odds Ratios
Template: Stat.Logistic.OddsRatios
Path: Logistic.OddsRatios
-------------

Output Added:
-------------
Name: Association
Label: Association Statistics
Template: Stat.Logistic.Association
Path: Logistic.Association
-------------
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time 0.06 seconds
      user cpu time 0.07 seconds
      system cpu time 0.00 seconds
      memory 2344.28k
      OS Memory 32440.00k
      Timestamp 02/29/2024 05:56:41 PM
      Step Count 120  Switch Count 1
      Page Faults 0
      Page Reclaims 202
      Page Swaps 0
      Voluntary Context Switches 9
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 64

333 ! /* Need run with ods trace */
334  ods trace off;
335
336  ods output ParameterEstimates = estimout;
337  /* The ParameterEstimates table will be written to a SAS data
338     set called estimout. */
339  proc logistic data=explore;
340       title3 'Save parameter estimates in data set estimout using ods';
341       model passed (event='Yes') = c1 c3 totscore;
342
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: The data set WORK.ESTIMOUT has 4 observations and 7 variables.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time 0.06 seconds
      user cpu time 0.06 seconds
      system cpu time 0.01 seconds
      memory 2720.78k
      OS Memory 32700.00k
      Timestamp 02/29/2024 05:56:41 PM
      Step Count 121  Switch Count 3
      Page Faults 0
      Page Reclaims 307
      Page Swaps 0
      Voluntary Context Switches 27
      Involuntary Context Switches 1
      Block Input Operations 0
      Block Output Operations 320

343  proc print data=estimout;
344
NOTE: There were 4 observations read from the data set WORK.ESTIMOUT.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.01 seconds
      user cpu time 0.01 seconds
      system cpu time 0.00 seconds
      memory 687.93k
      OS Memory 31400.00k
      Timestamp 02/29/2024 05:56:41 PM
      Step Count 122  Switch Count 0
      Page Faults 0
      Page Reclaims 65
      Page Swaps 0
      Voluntary Context Switches 1
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 8

345  proc iml;
NOTE: IML Ready
346       title3 'Estimated Probability of Passing';
347       use estimout;
348       read all var {Estimate} into b;
349       print "Estimated regression coefficients";
350       print b;
351       /* Student in the catch-up class who got 10 right out of 20 */
352       x1 = {1, 1, 0, 10};
352 !     /* Rows are separated by commas */
353       pihat1 = exp(x1`*b)/(1+exp(x1`*b));
354       print "Student in the catch-up class who got 10 right out of 20" pihat1;
355       /* Student in the elite class who got all 20 right */
356       x2 = {1, 0, 1, 20};
356 !     /* Rows are separated by commas */
357       Pihat2 = exp(x2`*b)/(1+exp(x2`*b));
358       print "Student in the elite class who got 20 right out of 20" pihat2;
359
360
361  /********************** Output not shown ****************************/
362
NOTE: Exiting IML.
NOTE: PROCEDURE IML used (Total process time):
      real time 0.01 seconds
      user cpu time 0.02 seconds
      system cpu time 0.00 seconds
      memory 873.81k
      OS Memory 31656.00k
      Timestamp 02/29/2024 05:56:41 PM
      Step Count 123  Switch Count 1
      Page Faults 0
      Page Reclaims 98
      Page Swaps 0
      Voluntary Context Switches 12
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 16

363  proc logistic data=explore noprint;
364       title3 'Course controlling for score on diagnostic test';
365       class course2 / param=ref;
366       model passed (event='Yes') = course2 totscore;
367       contrast 'Course controlling for totscore' course2 1 0,
368                                                  course2 0 1 / e;
369       /* The e option gives the "effect" matrix L in H0: L beta = 0 */
370
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 1808.21k
      OS Memory 32696.00k
      Timestamp 02/29/2024 05:56:41 PM
      Step Count 124  Switch Count 1
      Page Faults 0
      Page Reclaims 187
      Page Swaps 0
      Voluntary Context Switches 9
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 32

371  proc logistic data=explore noprint;
372       title3 'Course controlling for score on diagnostic test';
373       class course2 / param=ref;
374       model passed (event='Yes') = course2 totscore;
375       contrast 'Course controlling for totscore' course2 -19.7 0,
376                                                  course2 0 9 / e;
377
378
379
380  quit;
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 1808.00k
      OS Memory 32696.00k
      Timestamp 02/29/2024 05:56:41 PM
      Step Count 125  Switch Count 1
      Page Faults 0
      Page Reclaims 189
      Page Swaps 0
      Voluntary Context Switches 10
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 32

381
382
383  /********************************************************************/
384
385
386
387
388  OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
400
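
/* A minimal sketch, not part of the logged program above. It redoes in a
 * SAS data step the hand calculations that the program comments carry out
 * in R: the odds ratio for catch-up versus mainstream with its 95%
 * confidence limits, plus the odds ratio and estimated probability of
 * passing for the elite course. The coefficients 0.4077, -1.4838 and
 * 0.9468 and the standard error 0.3171 are copied from the program; the
 * variable names (se_b1, odds_catchup, lower, upper, odds_elite, p_elite)
 * are illustrative assumptions, and 1.96 is the usual normal critical
 * value. Results are written to the log with put statements.
 */
data _null_;
     b0 = 0.4077;  b1 = -1.4838;  b2 = 0.9468;   /* intercept, c1 (catch-up), c3 (elite) */
     se_b1 = 0.3171;                             /* standard error of b1 */
     z = 1.96;                                   /* critical value for 95% confidence */
     odds_catchup = exp(b1);                     /* odds ratio, catch-up vs mainstream */
     lower = exp(b1 - z*se_b1);                  /* lower 95% limit for that odds ratio */
     upper = exp(b1 + z*se_b1);                  /* upper 95% limit for that odds ratio */
     odds_elite = exp(b2);                       /* odds ratio, elite vs mainstream */
     p_elite = exp(b0 + b2) / (1 + exp(b0 + b2));  /* estimated P(pass) in elite course */
     put odds_catchup= 8.4 lower= 8.4 upper= 8.4;
     put odds_elite= 8.4 p_elite= 8.4;
run;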