STA313 F 2004 Handout 8
Path Model 1 with SAS
/* path1.sas */
options linesize=79 noovp formdlim='_';
title 'STA313f04 Path 1 Example';

data path1;
     infile 'path1.dat';
     input x1 x2 y1 y2;

proc calis cov;        /* Analyze the covariance matrix (Default is corr) */
     title2 'Full (unrestricted) Model';
     var x1 x2 y1 y2;  /* Manifest vars are in the data set */
     lineqs            /* Simultaneous equations, separated by commas */
          y1 = b1 x1 + e1,
          y2 = b2 y1 + b3 x2 + e2;
     std               /* Variances (not standard deviations) */
          x1 = sigsqx1,   /* Optional starting values in parentheses */
          x2 = sigsqx2,
          e1 = sigsqe1,
          e2 = sigsqe2;
     cov               /* Covariances */
          x1 x2 = sigma12;  /* Unmentioned pairs get covariance zero */
     bounds 0.0 < sigsqx1,
            0.0 < sigsqx2,
            0.0 < sigsqe1,
            0.0 < sigsqe2;

proc calis cov;        /* Analyze the covariance matrix (Default is corr) */
     title2 'Reduced (restricted) Model: b3=0';
     var x1 x2 y1 y2;  /* Manifest vars are in the data set */
     lineqs            /* Simultaneous equations, separated by commas */
          y1 = b1 x1 + e1,
          y2 = b2 y1 + e2;
     std               /* Variances (not standard deviations) */
          x1 = sigsqx1,   /* Optional starting values in parentheses */
          x2 = sigsqx2,
          e1 = sigsqe1,
          e2 = sigsqe2;
     cov               /* Covariances */
          x1 x2 = sigma12;  /* Unmentioned pairs get covariance zero */
     bounds 0.0 < sigsqx1,
            0.0 < sigsqx2,
            0.0 < sigsqe1,
            0.0 < sigsqe2;

proc iml;
     title2 'Compute G two ways';
     print " ";
     print "Based on Fit Function";
     G1 = 300*(3.3328-0.0227);
     pval1 = 1-probchi(G1,1);
     print "G = " G1 ", df = 1, p = " pval1;
     print " ";
     print "Based on chi-square";
     G2 = 300/299 * (996.5153-6.7874);
     pval2 = 1-probchi(G2,1);
     print "G = " G2 ", df = 1, p = " pval2;
Before looking at the list file, here is a little discussion of how the test statistic G is being computed with proc iml. Notice that except for getting the p-value, these calculations could be done with a hand calculator.
Let us use the term "saturated model" for a model with no constraints on the covariance matrix of the manifest variables. This is the language we have been using in class. Any (identified) model with the same number of parameters as the unique elements of the covariance matrix is also saturated, and yields the same -2 Log Likelihood -- that is, any saturated model has a -2 Log likelihood equal to
n p ( 1 +log(2 pi) ) + n log(|Sigma_hat|) .
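Here n is the sample size, p is the number of manifest variables, and Sigma_hat is the sample variance-covariance matrix computed with n (not n-1) in the denominator. The equality of the -2 LL quantities for any saturated model follows from the invariance principle of maximum likelihood estimation, just for the record.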
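If you want to see the saturated-model quantity as a number, here is a minimal proc iml sketch. It is not part of path1.sas. It assumes the data set path1 from the data step above is still available, and the names X, Xc, SigmaHat and m2LL are made up for this illustration.

```sas
proc iml;
     title2 'Minus 2 log likelihood of a saturated model (sketch)';
     use path1;                          /* Data set from the data step   */
     read all var {x1 x2 y1 y2} into X;  /* n by p data matrix            */
     close path1;
     n = nrow(X);                        /* Sample size                   */
     p = ncol(X);                        /* Number of manifest variables  */
     xbar = X[:,];                       /* Row vector of column means    */
     Xc = X - repeat(xbar,n,1);          /* Centered data                 */
     SigmaHat = t(Xc)*Xc/n;              /* MLE: divide by n, not n-1     */
     pi = constant('pi');
     m2LL = n*p*(1+log(2*pi)) + n*log(det(SigmaHat));
     print n p m2LL;
quit;
```

Any saturated model fit to these data by maximum likelihood should reproduce this value of -2 LL, whatever its parameterization.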
Now suppose you fit a non-saturated model. The difference between -2 LL for the model you fit and the quantity above is a reasonable test statistic for the "goodness of fit" of your model. The null hypothesis is that your model holds, versus the alternative that there are no restrictions at all on the variance-covariance matrix of the manifest variables. The difference between the two -2 LL quantities is a G; under the null hypothesis it's asymptotically chi-square, with degrees of freedom equal to the number of parameters (unique elements of the covariance matrix) of the saturated model minus the number of parameters in your model. This "goodness of fit" chi-square will equal zero (with df=0) only if you are fitting a model that is one-to-one with the saturated model.
If you fit an unrestricted model (which may itself be restricted compared to the saturated model) and you also fit a more restricted model, the DIFFERENCE between the two goodness of fit chi-square statistics is exactly our test statistic G for testing the null hypothesis that the restricted model is true versus the alternative that the unrestricted model is true. The saturated -2 LL cancels in the subtraction. There are two ways to get the goodness of fit chi-square statistic from the SAS output for a model. Either way, you need to fit both a restricted and an unrestricted model, and subtract to get G.
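The degrees of freedom for the path1 example can be counted the same way. This is only a sketch, with variable names invented for the illustration; the parameter counts 8 and 7 are the "Parameters" values SAS reports for the Full and Reduced models in the listing.

```sas
proc iml;
     title2 'Degrees of freedom for path1 (sketch)';
     p = 4;                       /* Manifest variables x1 x2 y1 y2        */
     moments = p*(p+1)/2;         /* Unique variances and covariances      */
     dfFull = moments - 8;        /* Full model has 8 parameters           */
     dfRed  = moments - 7;        /* Reduced model has 7 parameters        */
     dfG    = dfRed - dfFull;     /* df for the difference test G          */
     crit   = cinv(0.95,dfG);     /* Chi-square critical value, alpha=0.05 */
     print moments dfFull dfRed dfG crit;
quit;
```

The goodness of fit degrees of freedom here should agree with the "Chi-Square DF" lines of the SAS output (2 and 3), and their difference is the df = 1 used for G.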
The first way is based on the "Fit Function" of the SAS output, which equals 0.0227 for the Full (unrestricted) model in the path1 example, and 3.3328 for the Reduced (restricted) model. Multiply the fit function by n, and you get the goodness of fit chi-square directly. Multiply the difference by n, and you get the test statistic we are seeking. Thus, what we want is G = 300*(3.3328-0.0227) = 993.03. That's G1 in the proc iml above.
The second way to get G is from the SAS "Chi-square" statistic; Chi-square is equal to 6.7874 for the Full (unrestricted) model, and 996.5153 for the Reduced (restricted) model in the SAS output. This is almost the right number. It's what we want, but multiplied by (n-1)/n. Don't ask me why they do this, but of course for very large samples, (n-1)/n has no effect, and the G test is based on large-sample theory. We will multiply by n/(n-1) to get the traditional likelihood ratio test. Thus, G = 300/299 * (996.5153-6.7874) = 993.038. That's G2 in the proc iml; it's equal to G1 except for rounding error.
For comparison, when we did this example with R (in Handout 7) we got G = 993.038.
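The rounding error in G1 comes from copying the Fit Function to only four decimal places. As a sketch (not part of path1.sas), the calculation can be repeated with the unrounded "Objective Function" values that SAS prints under Optimization Results in the listing, 0.0227001914 for the Full model and 3.3328270857 for the Reduced model. The variable names are invented for the illustration.

```sas
proc iml;
     title2 'G from unrounded fit functions (sketch)';
     n = 300;
     Ffull = 0.0227001914;          /* Objective Function, Full model      */
     Fred  = 3.3328270857;          /* Objective Function, Reduced model   */
     G1 = n*(Fred-Ffull);           /* Based on the fit function           */
     chisqFull = (n-1)*Ffull;       /* Compare with SAS Chi-Square, Full   */
     chisqRed  = (n-1)*Fred;        /* Compare with SAS Chi-Square, Reduced */
     G2 = n/(n-1) * (chisqRed-chisqFull);   /* Based on chi-square         */
     pval = 1-probchi(G1,1);
     print G1 G2 chisqFull chisqRed pval;
quit;
```

With the extra digits, the two versions of G are the same quantity algebraically, and (n-1) times the fit function is what SAS reports as Chi-square.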
Now here is path1.lst.
_______________________________________________________________________________

                           STA313f04 Path 1 Example                           1
                          Full (unrestricted) Model
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
           Covariance Structure Analysis: Pattern and Initial Values

                            LINEQS Model Statement

                 Matrix      Rows    Columns    ------Matrix Type-------

     Term 1    1  _SEL_         4          6    SELECTION
               2  _BETA_        6          6    EQSBETA       IMINUSINV
               3  _GAMMA_       6          4    EQSGAMMA
               4  _PHI_         4          4    SYMMETRIC

                          The 2 Endogenous Variables

     Manifest     y1  y2
     Latent

                          The 4 Exogenous Variables

     Manifest     x1  x2
     Latent
     Error        e1  e2
_______________________________________________________________________________

                           STA313f04 Path 1 Example                           2
                          Full (unrestricted) Model
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
           Covariance Structure Analysis: Pattern and Initial Values

               Manifest Variable Equations with Initial Estimates

     y1      =        .*x1       +  1.0000 e1
                        b1
     y2      =        .*y1       +       .*x2       +  1.0000 e2
                        b2                 b3

                       Variances of Exogenous Variables

                    Variable    Parameter      Estimate

                    x1          sigsqx1               .
                    x2          sigsqx2               .
                    e1          sigsqe1               .
                    e2          sigsqe2               .

                     Covariances Among Exogenous Variables

                  Var1    Var2    Parameter      Estimate

                  x1      x2      sigma12               .
_______________________________________________________________________________

                           STA313f04 Path 1 Example                           3
                          Full (unrestricted) Model
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

               Observations     300    Model Terms          1
               Variables          4    Model Matrices       4
               Informations      10    Parameters           8

                    Variable           Mean        Std Dev

                    x1              0.16588        2.27566
                    x2             -0.20103        3.08880
                    y1              0.11331        2.68760
                    y2             -0.41910       10.88369

               Set Covariances of Exogenous Manifest Variables

               x1  x2

     NOTE: Some initial estimates computed by two-stage LS method.
_______________________________________________________________________________

                           STA313f04 Path 1 Example                           4
                          Full (unrestricted) Model
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                          Vector of Initial Estimates

               Parameter      Estimate    Type

          1    b2              2.09453    Matrix Entry: _BETA_[2:1]
          2    b1              1.00190    Matrix Entry: _GAMMA_[1:1]
          3    b3              2.98282    Matrix Entry: _GAMMA_[2:2]
          4    sigsqx1         5.17862    Matrix Entry: _PHI_[1:1]
          5    sigma12         0.44648    Matrix Entry: _PHI_[2:1]
          6    sigsqx2         9.54070    Matrix Entry: _PHI_[2:2]
          7    sigsqe1         2.02490    Matrix Entry: _PHI_[3:3]
          8    sigsqe2         3.23198    Matrix Entry: _PHI_[4:4]
_______________________________________________________________________________

                           STA313f04 Path 1 Example                           5
                          Full (unrestricted) Model
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                       Levenberg-Marquardt Optimization

                         Scaling Update of More (1978)

                    Parameter Estimates                8
                    Functions (Observations)          10
                    Lower Bounds                       4
                    Upper Bounds                       0

                              Optimization Start

  Active Constraints                 0  Objective Function       0.0274155346
  Max Abs Gradient Element  0.2050710365  Radius                            1

                                                                         Actual
                                                        Max Abs            Over
        Rest   Func   Act    Objective    Obj Fun      Gradient            Pred
 Iter   arts  Calls   Con     Function     Change       Element  Lambda  Change

    1      0      2     0      0.02272    0.00469        0.0127       0   0.994
    2      0      3     0      0.02270   0.000025      0.000596       0   0.998
    3      0      4     0      0.02270   3.992E-8      0.000037       0   0.983
    4      0      5     0      0.02270   1.18E-10      1.741E-6       0   0.978

                             Optimization Results

  Iterations                         4  Function Calls                      6
  Jacobian Calls                     5  Active Constraints                  0
  Objective Function      0.0227001914  Max Abs Gradient Element 1.7411686E-6
  Lambda                             0  Actual Over Pred Change  0.9775696387
  Radius                   0.000047076

               ABSGCONV convergence criterion satisfied.
_______________________________________________________________________________

                           STA313f04 Path 1 Example                           6
                          Full (unrestricted) Model
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

          Fit Function                                          0.0227
          Goodness of Fit Index (GFI)                           0.9889
          GFI Adjusted for Degrees of Freedom (AGFI)            0.9446
          Root Mean Square Residual (RMR)                       1.7981
          Parsimonious GFI (Mulaik, 1989)                       0.3296
          Chi-Square                                            6.7874
          Chi-Square DF                                              2
          Pr > Chi-Square                                       0.0336
          Independence Model Chi-Square                         1466.5
          Independence Model Chi-Square DF                           6
          RMSEA Estimate                                        0.0895
          RMSEA 90% Lower Confidence Limit                      0.0215
          RMSEA 90% Upper Confidence Limit                      0.1675
          ECVI Estimate                                         0.0771
          ECVI 90% Lower Confidence Limit                       0.0620
          ECVI 90% Upper Confidence Limit                       0.1176
          Probability of Close Fit                              0.1372
          Bentler's Comparative Fit Index                       0.9967
          Normal Theory Reweighted LS Chi-Square                6.7003
          Akaike's Information Criterion                        2.7874
          Bozdogan's (1987) CAIC                               -6.6202
          Schwarz's Bayesian Criterion                         -4.6202
          McDonald's (1989) Centrality                          0.9921
          Bentler & Bonett's (1980) Non-normed Index            0.9902
          Bentler & Bonett's (1980) NFI                         0.9954
          James, Mulaik, & Brett (1982) Parsimonious NFI        0.3318
          Z-Test of Wilson & Hilferty (1931)                    1.8416
          Bollen (1986) Normed Index Rho1                       0.9861
          Bollen (1988) Non-normed Index Delta2                 0.9967
          Hoelter's (1983) Critical N                              265
_______________________________________________________________________________

                           STA313f04 Path 1 Example                           7
                          Full (unrestricted) Model
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                  Manifest Variable Equations with Estimates

     y1      =   1.0019*x1       +  1.0000 e1
     Std Err     0.0362 b1
     t Value    27.7053

     y2      =   2.0486*y1       +  2.9828*x2       +  1.0000 e2
     Std Err     0.0386 b2          0.0336 b3
     t Value    53.0060            88.6977

                       Variances of Exogenous Variables

                                                   Standard
          Variable  Parameter      Estimate           Error    t Value

          x1        sigsqx1         5.17862         0.42354      12.23
          x2        sigsqx2         9.54070         0.78030      12.23
          e1        sigsqe1         2.02490         0.16561      12.23
          e2        sigsqe2         3.21678         0.26309      12.23

                     Covariances Among Exogenous Variables

                                                   Standard
          Var1  Var2  Parameter      Estimate         Error    t Value

          x1    x2    sigma12         0.44648       0.40732       1.10
_______________________________________________________________________________

                           STA313f04 Path 1 Example                           8
                          Full (unrestricted) Model
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

           Manifest Variable Equations with Standardized Estimates

     y1      =   0.8483*x1       +  0.5295 e1
                        b1
     y2      =   0.4947*y1       +  0.8278*x2       +  0.1611 e2
                        b2                 b3

                         Squared Multiple Correlations

                                    Error         Total
               Variable          Variance      Variance    R-Square

          1    y1                 2.02490       7.22317      0.7197
          2    y2                 3.21678     123.88541      0.9740

                     Correlations Among Exogenous Variables

                  Var1    Var2    Parameter      Estimate

                  x1      x2      sigma12         0.06352
_______________________________________________________________________________

                           STA313f04 Path 1 Example                           9
                       Reduced (restricted) Model: b3=0
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
           Covariance Structure Analysis: Pattern and Initial Values

                            LINEQS Model Statement

                 Matrix      Rows    Columns    ------Matrix Type-------

     Term 1    1  _SEL_         4          6    SELECTION
               2  _BETA_        6          6    EQSBETA       IMINUSINV
               3  _GAMMA_       6          4    EQSGAMMA
               4  _PHI_         4          4    SYMMETRIC

                          The 2 Endogenous Variables

     Manifest     y1  y2
     Latent

                          The 4 Exogenous Variables

     Manifest     x1  x2
     Latent
     Error        e1  e2
_______________________________________________________________________________

                           STA313f04 Path 1 Example                          10
                       Reduced (restricted) Model: b3=0
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
           Covariance Structure Analysis: Pattern and Initial Values

               Manifest Variable Equations with Initial Estimates

     y1      =        .*x1       +  1.0000 e1
                        b1
     y2      =        .*y1       +  1.0000 e2
                        b2

                       Variances of Exogenous Variables

                    Variable    Parameter      Estimate

                    x1          sigsqx1               .
                    x2          sigsqx2               .
                    e1          sigsqe1               .
                    e2          sigsqe2               .

                     Covariances Among Exogenous Variables

                  Var1    Var2    Parameter      Estimate

                  x1      x2      sigma12               .
_______________________________________________________________________________

                           STA313f04 Path 1 Example                          11
                       Reduced (restricted) Model: b3=0
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

               Observations     300    Model Terms          1
               Variables          4    Model Matrices       4
               Informations      10    Parameters           7

                    Variable           Mean        Std Dev

                    x1              0.16588        2.27566
                    x2             -0.20103        3.08880
                    y1              0.11331        2.68760
                    y2             -0.41910       10.88369

               Set Covariances of Exogenous Manifest Variables

               x1  x2

     NOTE: Some initial estimates computed by two-stage LS method.
_______________________________________________________________________________

                           STA313f04 Path 1 Example                          12
                       Reduced (restricted) Model: b3=0
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                          Vector of Initial Estimates

               Parameter      Estimate    Type

          1    b2              2.09622    Matrix Entry: _BETA_[2:1]
          2    b1              1.00190    Matrix Entry: _GAMMA_[1:1]
          3    sigsqx1         5.17862    Matrix Entry: _PHI_[1:1]
          4    sigma12         0.44648    Matrix Entry: _PHI_[2:1]
          5    sigsqx2         9.54070    Matrix Entry: _PHI_[2:2]
          6    sigsqe1         2.02490    Matrix Entry: _PHI_[3:3]
          7    sigsqe2        88.11855    Matrix Entry: _PHI_[4:4]
_______________________________________________________________________________

                           STA313f04 Path 1 Example                          13
                       Reduced (restricted) Model: b3=0
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                       Levenberg-Marquardt Optimization

                         Scaling Update of More (1978)

                    Parameter Estimates                7
                    Functions (Observations)          10
                    Lower Bounds                       4
                    Upper Bounds                       0

                              Optimization Start

  Active Constraints                 0  Objective Function       3.3330031833
  Max Abs Gradient Element  0.0075986595  Radius                            1

                                                                         Actual
                                                        Max Abs            Over
        Rest   Func   Act    Objective    Obj Fun      Gradient            Pred
 Iter   arts  Calls   Con     Function     Change       Element  Lambda  Change

    1      0      2     0      3.33283   0.000176      1.998E-6       0   1.000

                             Optimization Results

  Iterations                         1  Function Calls                      3
  Jacobian Calls                     2  Active Constraints                  0
  Objective Function      3.3328270857  Max Abs Gradient Element 1.9984173E-6
  Lambda                             0  Actual Over Pred Change             1
  Radius                  0.0375337328

               ABSGCONV convergence criterion satisfied.
_______________________________________________________________________________

                           STA313f04 Path 1 Example                          14
                       Reduced (restricted) Model: b3=0
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

          Fit Function                                          3.3328
          Goodness of Fit Index (GFI)                           0.6694
          GFI Adjusted for Degrees of Freedom (AGFI)           -0.1018
          Root Mean Square Residual (RMR)                       8.7264
          Parsimonious GFI (Mulaik, 1989)                       0.3347
          Chi-Square                                          996.5153
          Chi-Square DF                                              3
          Pr > Chi-Square                                       <.0001
          Independence Model Chi-Square                         1466.5
          Independence Model Chi-Square DF                           6
          RMSEA Estimate                                        1.0524
          RMSEA 90% Lower Confidence Limit                      0.9980
          RMSEA 90% Upper Confidence Limit                      1.1079
          ECVI Estimate                                         3.3804
          ECVI 90% Lower Confidence Limit                       3.0430
          ECVI 90% Upper Confidence Limit                       3.7431
          Probability of Close Fit                              0.0000
          Bentler's Comparative Fit Index                       0.3197
          Normal Theory Reweighted LS Chi-Square              295.2477
          Akaike's Information Criterion                      990.5153
          Bozdogan's (1987) CAIC                              976.4040
          Schwarz's Bayesian Criterion                        979.4040
          McDonald's (1989) Centrality                          0.1909
          Bentler & Bonett's (1980) Non-normed Index           -0.3605
          Bentler & Bonett's (1980) NFI                         0.3205
          James, Mulaik, & Brett (1982) Parsimonious NFI        0.1602
          Z-Test of Wilson & Hilferty (1931)                   22.0440
          Bollen (1986) Normed Index Rho1                      -0.3590
          Bollen (1988) Non-normed Index Delta2                 0.3211
          Hoelter's (1983) Critical N                                4
_______________________________________________________________________________

                           STA313f04 Path 1 Example                          15
                       Reduced (restricted) Model: b3=0
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                  Manifest Variable Equations with Estimates

     y1      =   1.0019*x1       +  1.0000 e1
     Std Err     0.0362 b1
     t Value    27.7053

     y2      =   2.0499*y1       +  1.0000 e2
     Std Err     0.2020 b2
     t Value    10.1483

                       Variances of Exogenous Variables

                                                   Standard
          Variable  Parameter      Estimate           Error    t Value

          x1        sigsqx1         5.17862         0.42354      12.23
          x2        sigsqx2         9.54070         0.78030      12.23
          e1        sigsqe1         2.02490         0.16561      12.23
          e2        sigsqe2        88.11855         7.20687      12.23

                     Covariances Among Exogenous Variables

                                                   Standard
          Var1  Var2  Parameter      Estimate         Error    t Value

          x1    x2    sigma12         0.44648       0.40732       1.10
_______________________________________________________________________________

                           STA313f04 Path 1 Example                          16
                       Reduced (restricted) Model: b3=0
                                                 10:17 Friday, November 5, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

           Manifest Variable Equations with Standardized Estimates

     y1      =   0.8483*x1       +  0.5295 e1
                        b1
     y2      =   0.5062*y1       +  0.8624 e2
                        b2

                         Squared Multiple Correlations

                                    Error         Total
               Variable          Variance      Variance    R-Square

          1    y1                 2.02490       7.22317      0.7197
          2    y2                88.11855     118.47013      0.2562

                     Correlations Among Exogenous Variables

                  Var1    Var2    Parameter      Estimate

                  x1      x2      sigma12         0.06352
_______________________________________________________________________________

                           STA313f04 Path 1 Example                          17
                              Compute G two ways
                                                 10:17 Friday, November 5, 2004

                             Based on Fit Function

                               G1                     PVAL1

                   G =     993.03 , df = 1, p =           0

                              Based on chi-square

                               G2                     PVAL2

                   G =  993.03803 , df = 1, p =           0