1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
NOTE: ODS statements in the SAS Studio environment may disable some output features.
73
74 /* mathlogreg1.sas */
75 title2 'Logistic Regression on the Exploratory Math data';
76 %include '/home/u1407221/441s24/SAS08/ReadLabelMath2.sas';
NOTE: Format YNFMT is already on the library WORK.FORMATS.
NOTE: Format YNFMT has been output.
NOTE: Format CRSFMT is already on the library WORK.FORMATS.
NOTE: Format CRSFMT has been output.
NOTE: Format NFMT is already on the library WORK.FORMATS.
NOTE: Format NFMT has been output.
NOTE: Format NCFMT is already on the library WORK.FORMATS.
NOTE: Format NCFMT has been output.
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 257.46k
OS Memory 28580.00k
Timestamp 02/29/2024 05:56:40 PM
Step Count 111 Switch Count 0
Page Faults 0
Page Reclaims 19
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 32
NOTE: The infile '/home/u1407221/441s24/data/math.data.txt' is:
Filename=/home/u1407221/441s24/data/math.data.txt,
Owner Name=u1407221,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=10Feb2024:16:04:10,
File Size (bytes)=90324
NOTE: 1158 records were read from the infile '/home/u1407221/441s24/data/math.data.txt'.
The minimum record length was 76.
The maximum record length was 76.
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
180 at 122:24
NOTE: The data set WORK.MATH has 1158 observations and 37 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 1193.96k
OS Memory 29352.00k
Timestamp 02/29/2024 05:56:40 PM
Step Count 112 Switch Count 2
Page Faults 0
Page Reclaims 156
Page Swaps 0
Voluntary Context Switches 19
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 776
NOTE: There were 1158 observations read from the data set WORK.MATH.
NOTE: The data set WORK.REPLIC has 579 observations and 37 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 1411.37k
OS Memory 29740.00k
Timestamp 02/29/2024 05:56:40 PM
Step Count 113 Switch Count 2
Page Faults 0
Page Reclaims 140
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 520
NOTE: There were 1158 observations read from the data set WORK.MATH.
NOTE: The data set WORK.EXPLORE has 579 observations and 28 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 1409.06k
OS Memory 29740.00k
Timestamp 02/29/2024 05:56:40 PM
Step Count 114 Switch Count 2
Page Faults 0
Page Reclaims 134
Page Swaps 0
Voluntary Context Switches 11
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 520
237
238 proc freq data=explore;
239 title3 'Course by passed with proc freq';
240 tables course2 * passed / nocol nopercent chisq;
241
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.05 seconds
user cpu time 0.04 seconds
system cpu time 0.01 seconds
memory 3862.40k
OS Memory 30896.00k
Timestamp 02/29/2024 05:56:40 PM
Step Count 115 Switch Count 5
Page Faults 0
Page Reclaims 382
Page Swaps 0
Voluntary Context Switches 29
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 536
242 proc logistic data=explore;
243 title3 'Course by passed with dummy vars: Compare LR Chisq = 34.4171';
244 model passed (event='Yes') = c1 c3; /* Omit c2 so Mainstream is reference category */
245 /* Without proc format, (event='1') would work. */
246 /* Wald chi-squared tests */
247 course: test c1=c3=0;
248 Course1_vs_2: test c1=0;
249 Course1_vs_3: test c1=c3;
250 Course2_vs_3: test c3=0;
251
252 /*
253 Question: The estimated odds of passing the course are ___ times as great for a
254 student in the elite course, compared to a student in the mainstream course.
255
256 Question: With 95% confidence, the chances of a student passing the catch-up
257 course are between ___% and ___% as great as the chances of passing the
258 mainstream course.
259
260 Note the deliberately vague but useful word "chances."
261
262 > exp(-1.4838)
263 [1] 0.2267743
264 > A = -1.4838 - 1.96*0.3171; B = -1.4838 + 1.96*0.3171
265 > exp(c(A,B))
266 [1] 0.1218072 0.4221967
267
268
269 A few details about the output :
270
271 The higher the minus 2 Log Likelihood, the lower the (estimated) maximum
272 probability of observing these responses. It is a measure of lack of
273 model fit. The Akaike information criterion and Schwarz's Bayesian
274 criterion both impose a further penalty for number of explanatory
275 variables. Small is good.
276
277 "Association of Predicted Probabilities and Observed Responses":
278 * Every case has Y=0 or Y=1.
279 * Every case has a p-hat.
280 * Pick a case with Y=0, and another case with Y=1. That's a pair.
281 * If the case with Y=0 has a lower p-hat than the case with Y=1,
282 the pair is concordant.
283 */
284
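The pair-counting rule described in the comment above can be sketched in Python, outside of SAS. This is a minimal illustration, not PROC LOGISTIC's own implementation: the function name and the toy data are made up, and SAS's handling of near-ties (for example, any binning of predicted probabilities) may differ. A pair is counted as discordant when the Y=0 case has the higher p-hat, and as tied when the two p-hats are equal.

```python
def pair_counts(y, phat):
    """Count concordant, discordant and tied pairs (one Y=0 case, one Y=1 case)."""
    zeros = [p for yi, p in zip(y, phat) if yi == 0]
    ones = [p for yi, p in zip(y, phat) if yi == 1]
    concordant = discordant = tied = 0
    for p0 in zeros:
        for p1 in ones:
            if p0 < p1:
                concordant += 1
            elif p0 > p1:
                discordant += 1
            else:
                tied += 1
    return concordant, discordant, tied

# Toy data, invented for illustration only (not from the math data set).
y = [0, 0, 1, 1, 1]
phat = [0.2, 0.6, 0.6, 0.7, 0.9]
concordant, discordant, tied = pair_counts(y, phat)
n_pairs = concordant + discordant + tied
# Proportion of concordant pairs, counting each tie as half a pair.
c_stat = (concordant + 0.5 * tied) / n_pairs
print(concordant, discordant, tied, round(c_stat, 3))
```

With 2 cases having Y=0 and 3 having Y=1 there are 2 x 3 = 6 pairs in the toy data; the three counts always add up to the number of pairs.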
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.07 seconds
user cpu time 0.07 seconds
system cpu time 0.00 seconds
memory 2649.12k
OS Memory 32184.00k
Timestamp 02/29/2024 05:56:40 PM
Step Count 116 Switch Count 1
Page Faults 0
Page Reclaims 364
Page Swaps 0
Voluntary Context Switches 10
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 64
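The R arithmetic in the comment of the step above can be reproduced in Python as a cross-check. The estimate -1.4838 and standard error 0.3171 are the values typed into that comment; the variable names are illustrative.

```python
import math

b1 = -1.4838      # estimated coefficient of c1 (catch-up vs mainstream), from the comment
se_b1 = 0.3171    # its standard error, from the comment
z = 1.96          # normal critical value for 95% confidence

# Exponentiating a log odds ratio (and the ends of its interval) gives an odds ratio.
odds_ratio = math.exp(b1)
lower = math.exp(b1 - z * se_b1)
upper = math.exp(b1 + z * se_b1)
print(f"odds ratio {odds_ratio:.4f}, 95% CI ({lower:.4f}, {upper:.4f})")
```

Read as percentages, the interval says the odds of passing the catch-up course are between about 12% and 42% of the odds of passing the mainstream course.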
285 proc iml;
NOTE: IML Ready
286 title3 'Estimate prob. of passing for course=3: Compare 31/39 = 0.7949';
287 b0 = 0.4077;
287 ! b1 = -1.4838;
287 ! b2 = 0.9468;
288 c1 = 0;
288 ! c3=1;
289 lcombo = b0 + b1*c1 + b2*c3;
290 probpass = exp(lcombo) / (1+exp(lcombo));
291 print "Estimated probability of passing course 3 (Elite) is " probpass;
292
293
NOTE: Exiting IML.
NOTE: PROCEDURE IML used (Total process time):
real time 0.01 seconds
user cpu time 0.02 seconds
system cpu time 0.00 seconds
memory 558.21k
OS Memory 31140.00k
Timestamp 02/29/2024 05:56:40 PM
Step Count 117 Switch Count 1
Page Faults 0
Page Reclaims 97
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 24
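The calculation in the PROC IML step above can be repeated in Python as a cross-check, extended to all three courses. The coefficients are the ones typed into the IML step; the function name inv_logit and the dictionary of dummy codes are illustrative.

```python
import math

def inv_logit(x):
    """Convert a log odds value to a probability."""
    return math.exp(x) / (1 + math.exp(x))

b0, b1, b2 = 0.4077, -1.4838, 0.9468   # intercept, c1, c3 as typed into the IML step

# (c1, c3) dummy codes for each course; mainstream is the reference category.
codes = {"Catch-up": (1, 0), "Mainstream": (0, 0), "Elite": (0, 1)}
prob = {name: inv_logit(b0 + b1 * c1 + b2 * c3) for name, (c1, c3) in codes.items()}

for name, p in prob.items():
    print(f"{name}: {p:.4f}")
```

The elite value should agree with the observed proportion 31/39 = 0.7949 quoted in the title, up to rounding of the coefficients.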
294 proc logistic data=explore;
295 title3 'Use the class and contrast statements';
296 class course2 / param=ref; /* This param option makes the ALPHABETICALLY
297 last category (Mainstream) the reference
298 category. Default is effect coding. */
299 model passed (event='Yes') = course2;
300 contrast 'Catch-up vs Mainstream' course2 1 0;
301 contrast 'Elite vs Mainstream' course2 0 1;
302 contrast 'Catch-up vs Elite' course2 1 -1;
303
304 /* Contrast is a little tricky in proc logistic compared to proc glm.
305 It lets you specify a set of linear combinations of regression
306 coefficients to test against zero. It is essential to know exactly
307 what the dummy variable coding scheme is. This can still be more
308 convenient than defining your own dummy variables in the data step. */
309
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.07 seconds
user cpu time 0.08 seconds
system cpu time 0.00 seconds
memory 2480.21k
OS Memory 32440.00k
Timestamp 02/29/2024 05:56:40 PM
Step Count 118 Switch Count 1
Page Faults 0
Page Reclaims 236
Page Swaps 0
Voluntary Context Switches 11
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 64
310 proc logistic data=explore;
311 title3 'Course controlling for score on diagnostic test';
312 class course2 / param=ref;
313 model passed (event='Yes') = course2 totscore;
314 contrast 'Course controlling for totscore' course2 1 0,
315 course2 0 1;
316 contrast 'Catch-up vs. Elite controlling for totscore'
317 course2 1 -1;
318 run;
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.09 seconds
user cpu time 0.08 seconds
system cpu time 0.00 seconds
memory 2452.18k
OS Memory 32440.00k
Timestamp 02/29/2024 05:56:41 PM
Step Count 119 Switch Count 1
Page Faults 0
Page Reclaims 198
Page Swaps 0
Voluntary Context Switches 10
Involuntary Context Switches 3
Block Input Operations 0
Block Output Operations 88
319
320 /* Estimate a probability of passing without typing in the
321 * estimated regression coefficients: Use the Output Delivery
322 * System (ODS). All the tables in a SAS results file have names.
323 * You can find out what they are with a web search on
324 * "proc logistic ods table names" , which will take you to the
325 * manual. It is easier to do a preliminary run with ods trace on,
326 * which writes the table names to the log as they are produced.
327 */
328
329 ods trace on;
330 proc logistic data=explore;
331 title3 'What are the ods table names?';
332 model passed (event='Yes') = c1 c3 totscore;
333 run;
Output Added:
-------------
Name: ModelInfo
Label: Model Information
Template: Stat.Logistic.ModelInfo
Path: Logistic.ModelInfo
-------------
Output Added:
-------------
Name: NObs
Label: Observations Summary
Template: Stat.Logistic.NObs
Path: Logistic.NObs
-------------
Output Added:
-------------
Name: ResponseProfile
Label: Response Profile
Template: Stat.Logistic.ResponseProfile
Path: Logistic.ResponseProfile
-------------
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
Output Added:
-------------
Name: ConvergenceStatus
Label: Convergence Status
Template: Stat.Logistic.MConvergenceStatus
Path: Logistic.ConvergenceStatus
-------------
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
Output Added:
-------------
Name: FitStatistics
Label: Fit Statistics
Template: Stat.Logistic.FitStatistics
Path: Logistic.FitStatistics
-------------
Output Added:
-------------
Name: GlobalTests
Label: Global Tests
Template: Stat.Logistic.GlobalTests
Path: Logistic.GlobalTests
-------------
Output Added:
-------------
Name: ParameterEstimates
Label: Parameter Estimates
Template: Stat.Logistic.ParameterEstimates
Path: Logistic.ParameterEstimates
-------------
Output Added:
-------------
Name: OddsRatios
Label: Odds Ratios
Template: Stat.Logistic.OddsRatios
Path: Logistic.OddsRatios
-------------
Output Added:
-------------
Name: Association
Label: Association Statistics
Template: Stat.Logistic.Association
Path: Logistic.Association
-------------
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.06 seconds
user cpu time 0.07 seconds
system cpu time 0.00 seconds
memory 2344.28k
OS Memory 32440.00k
Timestamp 02/29/2024 05:56:41 PM
Step Count 120 Switch Count 1
Page Faults 0
Page Reclaims 202
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 64
333 ! /* Need run with ods trace */
334 ods trace off;
335
336 ods output ParameterEstimates = estimout;
337 /* The ParameterEstimates table will be written to a SAS data
338 set called estimout. */
339 proc logistic data=explore;
340 title3 'Save parameter estimates in data set estimout using ods';
341 model passed (event='Yes') = c1 c3 totscore;
342
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: The data set WORK.ESTIMOUT has 4 observations and 7 variables.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.06 seconds
user cpu time 0.06 seconds
system cpu time 0.01 seconds
memory 2720.78k
OS Memory 32700.00k
Timestamp 02/29/2024 05:56:41 PM
Step Count 121 Switch Count 3
Page Faults 0
Page Reclaims 307
Page Swaps 0
Voluntary Context Switches 27
Involuntary Context Switches 1
Block Input Operations 0
Block Output Operations 320
343 proc print data=estimout;
344
NOTE: There were 4 observations read from the data set WORK.ESTIMOUT.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 687.93k
OS Memory 31400.00k
Timestamp 02/29/2024 05:56:41 PM
Step Count 122 Switch Count 0
Page Faults 0
Page Reclaims 65
Page Swaps 0
Voluntary Context Switches 1
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 8
345 proc iml;
NOTE: IML Ready
346 title3 'Estimated Probability of Passing';
347 use estimout;
348 read all var {Estimate} into b;
349 print "Estimated regression coefficients";
350 print b;
351 /* Student in the catch-up class who got 10 right out of 20 */
352 x1 = {1, 1, 0, 10};
352 ! /* Rows are separated by commas */
353 pihat1 = exp(x1`*b)/(1+exp(x1`*b));
354 print "Student in the catch-up class who got 10 right out of 20" pihat1;
355 /* Student in the elite class who got all 20 right */
356 x2 = {1, 0, 1, 20};
356 ! /* Rows are separated by commas */
357 pihat2 = exp(x2`*b)/(1+exp(x2`*b));
358 print "Student in the elite class who got 20 right out of 20" pihat2;
359
360
361 /********************** Output not shown ****************************/
362
NOTE: Exiting IML.
NOTE: PROCEDURE IML used (Total process time):
real time 0.01 seconds
user cpu time 0.02 seconds
system cpu time 0.00 seconds
memory 873.81k
OS Memory 31656.00k
Timestamp 02/29/2024 05:56:41 PM
Step Count 123 Switch Count 1
Page Faults 0
Page Reclaims 98
Page Swaps 0
Voluntary Context Switches 12
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 16
363 proc logistic data=explore noprint;
364 title3 'Course controlling for score on diagnostic test';
365 class course2 / param=ref;
366 model passed (event='Yes') = course2 totscore;
367 contrast 'Course controlling for totscore' course2 1 0,
368 course2 0 1 / e;
369 /* The e option gives the "effect" matrix L in H0: L beta = 0 */
370
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 1808.21k
OS Memory 32696.00k
Timestamp 02/29/2024 05:56:41 PM
Step Count 124 Switch Count 1
Page Faults 0
Page Reclaims 187
Page Swaps 0
Voluntary Context Switches 9
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 32
371 proc logistic data=explore noprint;
372 title3 'Course controlling for score on diagnostic test';
373 class course2 / param=ref;
374 model passed (event='Yes') = course2 totscore;
375 contrast 'Course controlling for totscore' course2 -19.7 0,
376 course2 0 9 / e;
377
378
379
380 quit;
NOTE: PROC LOGISTIC is modeling the probability that passed='Yes'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 579 observations read from the data set WORK.EXPLORE.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 1808.00k
OS Memory 32696.00k
Timestamp 02/29/2024 05:56:41 PM
Step Count 125 Switch Count 1
Page Faults 0
Page Reclaims 189
Page Swaps 0
Voluntary Context Switches 10
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 32
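The last two steps test the same hypothesis. The second contrast multiplies the rows of L by -19.7 and 9, apparently to illustrate that the Wald statistic (L b)' (L V L')^{-1} (L b) does not change when rows of L are rescaled by nonzero constants. The Python sketch below checks this numerically. The coefficient vector b and covariance matrix V are invented for illustration (the actual estimates are not shown in this log), and the function is written only for an L with two rows.

```python
import math

def wald_statistic(L, b, V):
    """Wald chi-squared statistic for H0: L b = 0, where L has exactly two rows."""
    k = len(b)
    d = [sum(L[i][j] * b[j] for j in range(k)) for i in range(2)]       # L b
    LV = [[sum(L[i][m] * V[m][j] for m in range(k)) for j in range(k)]
          for i in range(2)]                                             # L V
    M = [[sum(LV[i][m] * L[j][m] for m in range(k)) for j in range(2)]
         for i in range(2)]                                              # L V L'
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    Minv = [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]
    return sum(d[i] * Minv[i][j] * d[j] for i in range(2) for j in range(2))

# Hypothetical values, NOT taken from the SAS output.
# Assumed order: intercept, catch-up dummy, elite dummy, totscore.
b = [-1.0, -0.9, 0.6, 0.1]
V = [[0.25, -0.05, -0.05, -0.010],
     [-0.05, 0.10, 0.03, 0.000],
     [-0.05, 0.03, 0.20, 0.000],
     [-0.010, 0.000, 0.000, 0.001]]

L_plain = [[0, 1, 0, 0], [0, 0, 1, 0]]          # course2 1 0, course2 0 1
L_scaled = [[0, -19.7, 0, 0], [0, 0, 9, 0]]     # course2 -19.7 0, course2 0 9

w_plain = wald_statistic(L_plain, b, V)
w_scaled = wald_statistic(L_scaled, b, V)
print(w_plain, w_scaled, math.isclose(w_plain, w_scaled, rel_tol=1e-9))
```

Because only the row space of L matters, any nonzero multipliers would give the same chi-squared value and p-value.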
381
382
383 /********************************************************************/
384
385
386
387