STA313 F 2004 Handout 10: Simple regression with measurement error
Path2a & b with SAS: Model b is identified, a is not
/* path2a.sas */
options linesize=79 noovp formdlim='_';
title 'STA313f04 Path 2a: Non-identified Simple Regression with Meas Error';
title2 'Just try to fit the model';

data path1;
     infile 'path2.dat';
     input x1 x2 y;

proc calis cov;   /* Analyze the covariance matrix (Default is corr) */
     var x1 y;    /* Manifest vars are in the data set */
     lineqs       /* Simultaneous equations, separated by commas */
          y  = b F + e3,
          x1 = F + e1;
     std          /* Variances (not standard deviations) */
          F  = sigsqF,   /* Optional starting values in parentheses */
          e1 = sigsqe,
          e3 = sigsqe3;
     bounds 0.0 < sigsqF,
            0.0 < sigsqe,
            0.0 < sigsqe3;
The log file has:
WARNING: Problem not identified: More parameters to estimate ( 4 )
         than given values in data matrix ( 3 ).
NOTE: GCONV2 convergence criterion satisfied.
NOTE: Moore-Penrose inverse is used in covariance matrix.
WARNING: Chi square quantile not computable for df= -1.
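The first warning is a counting argument: with k manifest variables there are k(k+1)/2 distinct variances and covariances, and the model cannot be identified if it has more parameters than that. A minimal sketch of the count in Python (not part of the handout's SAS or R code; the function name count_rule is made up for illustration):

```python
def count_rule(n_manifest, n_params):
    """Return (distinct covariance matrix elements, degrees of freedom)."""
    n_moments = n_manifest * (n_manifest + 1) // 2
    return n_moments, n_moments - n_params

# (number of manifest variables, number of free parameters)
models = {
    "path2a": (2, 4),  # manifest x1, y;     parameters b, sigsqF, sigsqe, sigsqe3
    "path2b": (3, 4),  # manifest x1, x2, y; same four parameters
}

for name, (k, p) in models.items():
    moments, df = count_rule(k, p)
    print(f"{name}: {moments} moments, {p} parameters, df = {df}")
```

For path2a this gives 3 moments and df = -1, matching the log file. A non-negative df is necessary for identification but does not by itself guarantee it.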
List file has lots of stuff including:
                 Vector of Initial Estimates

        Parameter      Estimate    Type
      1 b              13.08145    Matrix Entry: _GAMMA_[2:1]
      2 sigsqF          0.01000    Matrix Entry: _PHI_[1:1]
      3 sigsqe3         3.01198    Matrix Entry: _PHI_[2:2]
      4 sigsqe          2.97844    Matrix Entry: _PHI_[3:3]
and
                                                                   Actual
                                                Max Abs              Over
       Rest   Func   Act    Objective  Obj Fun  Gradient             Pred
 Iter  arts  Calls   Con     Function   Change   Element   Lambda  Change
   1*     0      2     0      0.00778   0.0687    4.1649  111E-16   1.176
   2*     0      3     0   1.47279E-8  0.00778   0.00682  111E-16   1.088
   3      0      4     0   3.5527E-14 1.473E-8  0.000011        0   1.000
   4      0      5     0            0 3.55E-14  2.43E-11        0   1.009

                          Optimization Results

Iterations                          4  Function Calls                      6
Jacobian Calls                      5  Active Constraints                  0
Objective Function                  0  Max Abs Gradient Element 2.431098E-11
Lambda                              0  Actual Over Pred Change  1.0085198247
Radius                   0.0001551798

GCONV2 convergence criterion satisfied.

NOTE: Moore-Penrose inverse is used in covariance matrix.
NOTE: Covariance matrix for the estimates is not full rank.
NOTE: The variance of some parameter estimates is zero or some parameter
      estimates are linearly related to other parameter estimates as shown
      in the following equations:

sigsqF = -175608 + 15824 * b - 123.672338 * sigsqe3 + 1.000000 * sigsqe
_______________________________________________________________________________

  STA313f04 Path 2a: Non-identified Simple Regression with Meas Error        6
                        Just try to fit the model
                                            11:38 Friday, November 12, 2004

                          The CALIS Procedure
     Covariance Structure Analysis: Maximum Likelihood Estimation

Fit Function                                          0.0000
Goodness of Fit Index (GFI)                           1.0000
GFI Adjusted for Degrees of Freedom (AGFI)             .
Root Mean Square Residual (RMR)                       0.0000
Parsimonious GFI (Mulaik, 1989)                      -1.0000
Chi-Square                                            0.0000
Chi-Square DF                                             -1
Pr > Chi-Square                                        .
Independence Model Chi-Square                         0.0013
Independence Model Chi-Square DF                           1
RMSEA Estimate                                        0.0000
RMSEA 90% Lower Confidence Limit                       .
RMSEA 90% Upper Confidence Limit                       .
ECVI Estimate                                         0.0000
ECVI 90% Lower Confidence Limit                        .
ECVI 90% Upper Confidence Limit                        .
Probability of Close Fit                               .
Bentler's Comparative Fit Index                        .
Normal Theory Reweighted LS Chi-Square                0.0000
Akaike's Information Criterion                        2.0000
Bozdogan's (1987) CAIC                                6.2983
Schwarz's Bayesian Criterion                          5.2983
McDonald's (1989) Centrality                          0.9975
Bentler & Bonett's (1980) Non-normed Index             .
Bentler & Bonett's (1980) NFI                         1.0000
James, Mulaik, & Brett (1982) Parsimonious NFI       -1.0000
Z-Test of Wilson & Hilferty (1931)                     .
Bollen (1986) Normed Index Rho1                        .
Bollen (1988) Non-normed Index Delta2                 0.0013
Hoelter's (1983) Critical N                            .
Basically it's a disaster and you can sort of tell. But it does give parameter estimates. Do you trust them?
                          The CALIS Procedure
     Covariance Structure Analysis: Maximum Likelihood Estimation

              Manifest Variable Equations with Estimates

x1      =    1.0000 F    +   1.0000 e1
y       =   11.1208*F    +   1.0000 e3
Std Err      0.0189 b
t Value    588.0

                  Variances of Exogenous Variables

                                        Standard
Variable  Parameter       Estimate         Error    t Value
F         sigsqF         0.0007028       0.01942       0.04
e3        sigsqe3          3.02730       2.42011       1.25
e1        sigsqe           2.97834       0.29921       9.95
These are very different from R's estimates of (sigsqF, sigsqe3, sigsqe, b):
$estimate
[1] 0.97578652 1.98931682 3.10719304 0.01092407

> restimate <- c(0.97578652, 1.98931682, 3.10719304, 0.01092407)
> path2a(restimate,xydat)
[1] 1579.282
> sasestimate <- c(0.0007028, 3.02730, 2.97834,11.1208)
> path2a(sasestimate,xydat)
[1] 1579.345
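In the path2a model, the covariance matrix of (x1, y) depends on the parameters only through Var(x1) = sigsqF + sigsqe, Cov(x1,y) = b*sigsqF and Var(y) = b^2*sigsqF + sigsqe3. That is three equations in four unknowns, so different parameter vectors can imply exactly the same covariance matrix. A minimal sketch in Python (not the handout's R function path2a; the function names and the starting values below are illustrative assumptions, not estimates from the data):

```python
def implied_cov_2a(sigsqF, sigsqe3, sigsqe, b):
    """Model-implied covariance matrix of (x1, y) for
    x1 = F + e1,  y = b F + e3  (parameter order as in the handout)."""
    var_x1 = sigsqF + sigsqe
    cov_x1y = b * sigsqF
    var_y = b**2 * sigsqF + sigsqe3
    return [[var_x1, cov_x1y], [cov_x1y, var_y]]

def equivalent_params(sigsqF, sigsqe3, sigsqe, b, new_sigsqF):
    """Pick a different value of sigsqF and solve for the other three
    parameters so that the implied covariance matrix is unchanged."""
    (var_x1, cov_x1y), (_, var_y) = implied_cov_2a(sigsqF, sigsqe3, sigsqe, b)
    new_b = cov_x1y / new_sigsqF
    new_sigsqe = var_x1 - new_sigsqF
    new_sigsqe3 = var_y - new_b**2 * new_sigsqF
    return new_sigsqF, new_sigsqe3, new_sigsqe, new_b

theta1 = (1.0, 3.0, 2.0, -0.10)          # illustrative values only
theta2 = equivalent_params(*theta1, new_sigsqF=0.5)

print("theta1:", theta1, "->", implied_cov_2a(*theta1))
print("theta2:", theta2, "->", implied_cov_2a(*theta2))
```

The two printed covariance matrices agree up to rounding even though theta1 and theta2 differ in every coordinate, so a normal likelihood cannot distinguish them.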
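But the two sets of estimates have about the same -2 log likelihood (1579.282 versus 1579.345). When a model is not identified, very different parameter values can fit the data about equally well. Now fit path2b, which is identified.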
/* path2b.sas */
options linesize=79 noovp formdlim='_';
title 'STA313f04 Path 2b: Identified Simple Regression with Meas Error';
title2 'Test H0: b=0';

data path1;
     infile 'path2.dat';
     input x1 x2 y;

proc calis cov;   /* Analyze the covariance matrix (Default is corr) */
     title3 'Full model';
     var x1 x2 y; /* Manifest vars are in the data set */
     lineqs       /* Simultaneous equations, separated by commas */
          y  = b F + e3,
          x1 = F + e1,
          x2 = F + e2;
     std          /* Variances (not standard deviations) */
          F  = sigsqF,   /* Optional starting values in parentheses */
          e1 = sigsqe,
          e2 = sigsqe,
          e3 = sigsqe3;
     bounds 0.0 < sigsqF,
            0.0 < sigsqe,
            0.0 < sigsqe3;

proc calis cov;   /* Analyze the covariance matrix (Default is corr) */
     title3 'Reduced model with b=0';
     var x1 x2 y; /* Manifest vars are in the data set */
     lineqs       /* Simultaneous equations, separated by commas */
          y  = e3,
          x1 = F + e1,
          x2 = F + e2;
     std          /* Variances (not standard deviations) */
          F  = sigsqF,   /* Optional starting values in parentheses */
          e1 = sigsqe,
          e2 = sigsqe,
          e3 = sigsqe3;
     bounds 0.0 < sigsqF,
            0.0 < sigsqe,
            0.0 < sigsqe3;
Part of the list file:
    STA313f04 Path 2b: Identified Simple Regression with Meas Error          4
                              Test H0: b=0
                                            14:00 Friday, November 12, 2004
                               Full model

                          The CALIS Procedure
     Covariance Structure Analysis: Maximum Likelihood Estimation

                 Vector of Initial Estimates

        Parameter      Estimate    Type
      1 b              -0.29711    Matrix Entry: _GAMMA_[3:1]
      2 sigsqF          0.77385    Matrix Entry: _PHI_[1:1]
      3 sigsqe3         3.04590    Matrix Entry: _PHI_[2:2]
      4 sigsqe          2.27748    Matrix Entry: _PHI_[3:3]
                                                 _PHI_[4:4]
_______________________________________________________________________________

                                                                   Actual
                                                Max Abs              Over
       Rest   Func   Act    Objective  Obj Fun  Gradient             Pred
 Iter  arts  Calls   Con     Function   Change   Element   Lambda  Change
   1      0      2     0      0.00530  0.00267   0.00275        0   0.970
   2      0      3     0      0.00526 0.000036  0.000014        0   1.005
   3      0      4     0      0.00526 9.94E-10  7.98E-17        0   1.000

                          Optimization Results

Iterations                          3  Function Calls                      5
Jacobian Calls                      4  Active Constraints                  0
Objective Function       0.0052594721  Max Abs Gradient Element 7.979728E-17
Lambda                              0  Actual Over Pred Change  1.0000307497
Radius                   0.0000897404

ABSGCONV convergence criterion satisfied.
_______________________________________________________________________________

    STA313f04 Path 2b: Identified Simple Regression with Meas Error          6
                              Test H0: b=0
                                            14:00 Friday, November 12, 2004
                               Full model

                          The CALIS Procedure
     Covariance Structure Analysis: Maximum Likelihood Estimation

Fit Function                                          0.0053
Goodness of Fit Index (GFI)                           0.9965
GFI Adjusted for Degrees of Freedom (AGFI)            0.9895
Root Mean Square Residual (RMR)                       0.0851
Parsimonious GFI (Mulaik, 1989)                       0.6643
Chi-Square                                            1.0466
Chi-Square DF                                              2
Pr > Chi-Square                                       0.5926
Independence Model Chi-Square                         17.049
Independence Model Chi-Square DF                           3
RMSEA Estimate                                        0.0000
RMSEA 90% Lower Confidence Limit                       .
RMSEA 90% Upper Confidence Limit                      0.1162
ECVI Estimate                                         0.0463
ECVI 90% Lower Confidence Limit                        .
ECVI 90% Upper Confidence Limit                       0.0785
Probability of Close Fit                              0.7216
Bentler's Comparative Fit Index                       1.0000
Normal Theory Reweighted LS Chi-Square                1.0439
Akaike's Information Criterion                       -2.9534
Bozdogan's (1987) CAIC                              -11.5500
Schwarz's Bayesian Criterion                         -9.5500
McDonald's (1989) Centrality                          1.0024
Bentler & Bonett's (1980) Non-normed Index            1.1018
Bentler & Bonett's (1980) NFI                         0.9386
James, Mulaik, & Brett (1982) Parsimonious NFI        0.6257
Z-Test of Wilson & Hilferty (1931)                   -0.2491
Bollen (1986) Normed Index Rho1                       0.9079
Bollen (1988) Non-normed Index Delta2                 1.0634
Hoelter's (1983) Critical N                             1141
_______________________________________________________________________________

    STA313f04 Path 2b: Identified Simple Regression with Meas Error          7
                              Test H0: b=0
                               Full model

                          The CALIS Procedure
     Covariance Structure Analysis: Maximum Likelihood Estimation

              Manifest Variable Equations with Estimates

x1      =    1.0000 F    +   1.0000 e1
x2      =    1.0000 F    +   1.0000 e2
y       =   -0.1439*F    +   1.0000 e3
Std Err      0.2095 b
t Value     -0.6869

                  Variances of Exogenous Variables

                                        Standard
Variable  Parameter       Estimate         Error    t Value
F         sigsqF           0.83875       0.22433       3.74
e3        sigsqe3          3.09685       0.31277       9.90
e1        sigsqe           2.21257       0.22181       9.97
e2        sigsqe           2.21257       0.22181       9.97
These estimates are not extremely different from the true values of
sigsqf <- 1 ; sigsqe <- 2 ; sigsqe3 <- 3 ; b <- -0.10
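One way to see why path2b is identified: with two measurements of F, every parameter can be written as a function of the covariance matrix of (x1, x2, y). In particular Cov(x1,x2) = sigsqF, which was the missing piece in path2a. A minimal sketch in Python (not the handout's SAS or R code; the function names are made up for illustration), using the true values above and assuming sigsqF is not zero:

```python
def implied_cov_2b(sigsqF, sigsqe3, sigsqe, b):
    """Model-implied covariance matrix of (x1, x2, y) for
    x1 = F + e1,  x2 = F + e2,  y = b F + e3,  Var(e1) = Var(e2) = sigsqe."""
    var_x = sigsqF + sigsqe
    cov_xy = b * sigsqF
    var_y = b**2 * sigsqF + sigsqe3
    return [[var_x,  sigsqF, cov_xy],
            [sigsqF, var_x,  cov_xy],
            [cov_xy, cov_xy, var_y]]

def solve_2b(S):
    """Recover (sigsqF, sigsqe3, sigsqe, b) from a covariance matrix S
    that satisfies the model. Requires S[0][1] != 0."""
    sigsqF = S[0][1]                   # Cov(x1, x2)
    sigsqe = S[0][0] - sigsqF          # Var(x1) - sigsqF
    b = S[0][2] / sigsqF               # Cov(x1, y) / sigsqF
    sigsqe3 = S[2][2] - b**2 * sigsqF  # Var(y) - b^2 sigsqF
    return sigsqF, sigsqe3, sigsqe, b

true_theta = (1.0, 3.0, 2.0, -0.10)    # sigsqF, sigsqe3, sigsqe, b
recovered = solve_2b(implied_cov_2b(*true_theta))
print(recovered)
```

The recovered values equal the true ones up to rounding. This shows a unique solution exists at the population level; it is not how SAS computes the maximum likelihood estimates.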
Here is some output from the reduced (restricted) model.
                 Vector of Initial Estimates

        Parameter      Estimate    Type
      1 sigsqF          1.00000    Matrix Entry: _PHI_[1:1]
      2 sigsqe3         3.11421    Matrix Entry: _PHI_[2:2]
      3 sigsqe          1.26214    Matrix Entry: _PHI_[3:3]
                                                 _PHI_[4:4]

       Predetermined Elements of the Predicted Moment Matrix

                        x1              x2               y
          x1             .               .               0
          x2             .               .               0
          y              0               0               .

WARNING: The predicted moment matrix has 2 constant elements whose values
         differ from those of the observed moment matrix. The sum of
         squared differences is 0.0621609525.
NOTE: Only 4 elements of the moment matrix are used in the model
      specification.
SAS is complaining because the model implies zero covariance between x1 & y and between x2 & y, yet the sample covariances are not zero. There is no problem here. But when SAS complains, it is important to understand why.
There was just one step in the maximum likelihood search.
                                                                   Actual
                                                Max Abs              Over
       Rest   Func   Act    Objective  Obj Fun  Gradient             Pred
 Iter  arts  Calls   Con     Function   Change   Element   Lambda  Change
   1      0      2     0      0.00767   0.2081  1.32E-16        0   0.689

                          Optimization Results

Iterations                          1  Function Calls                      3
Jacobian Calls                      2  Active Constraints                  0
Objective Function       0.0076671625  Max Abs Gradient Element  1.31839E-16
Lambda                              0  Actual Over Pred Change  0.6890530923
Radius                    1.626914686

ABSGCONV convergence criterion satisfied.

    STA313f04 Path 2b: Identified Simple Regression with Meas Error         14
                              Test H0: b=0
                                            14:00 Friday, November 12, 2004
                         Reduced model with b=0

                          The CALIS Procedure
     Covariance Structure Analysis: Maximum Likelihood Estimation

Fit Function                                          0.0077
Goodness of Fit Index (GFI)                           0.9948
GFI Adjusted for Degrees of Freedom (AGFI)            0.9896
Root Mean Square Residual (RMR)                       0.1100
Parsimonious GFI (Mulaik, 1989)                       0.9948
Chi-Square                                            1.5258
Chi-Square DF                                              3
Pr > Chi-Square                                       0.6763
Independence Model Chi-Square                         17.049
Independence Model Chi-Square DF                           3
RMSEA Estimate                                        0.0000
RMSEA 90% Lower Confidence Limit                       .
RMSEA 90% Upper Confidence Limit                      0.0918
ECVI Estimate                                         0.0384
ECVI 90% Lower Confidence Limit                        .
ECVI 90% Upper Confidence Limit                       0.0715
Probability of Close Fit                              0.8122
Bentler's Comparative Fit Index                       1.0000
Normal Theory Reweighted LS Chi-Square                1.5532
Akaike's Information Criterion                       -4.4742
Bozdogan's (1987) CAIC                              -17.3692
Schwarz's Bayesian Criterion                        -14.3692
McDonald's (1989) Centrality                          1.0037
Bentler & Bonett's (1980) Non-normed Index            1.1049
Bentler & Bonett's (1980) NFI                         0.9105
James, Mulaik, & Brett (1982) Parsimonious NFI        0.9105
Z-Test of Wilson & Hilferty (1931)                   -0.4692
Bollen (1986) Normed Index Rho1                       0.9105
Bollen (1988) Non-normed Index Delta2                 1.1049
Hoelter's (1983) Critical N                             1021
Find the more precise versions of those fit function values (called "objective functions" in the output) and compute

G = 200*(0.0076671625-0.0052594721) = 0.4815381.

With 1 df, the null hypothesis will not be rejected (the 0.05 critical value is 3.84). Compare the so-called t statistic for b in the full model output: t = -0.6869. For large samples, this statistic is approximately standard normal under the null hypothesis, and the square of a standard normal is chi-square with 1 df. As n -> infinity, the difference between t^2 and G goes to zero. Compute (-0.6869)^2 = 0.4718316. Not too far from G.
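The arithmetic can be checked, and a p-value attached, with a few lines of Python (a sketch outside the handout's SAS and R code). The multiplier n = 200 is taken from the calculation above and is assumed here to be the sample size. The helper chisq1_pvalue is a made-up name; it uses the fact that for chi-square with 1 df, P(Z^2 > x) = erfc(sqrt(x/2)).

```python
import math

n = 200                     # multiplier used in the handout (assumed sample size)
f_reduced = 0.0076671625    # objective function, reduced model (b = 0)
f_full = 0.0052594721       # objective function, full model

G = n * (f_reduced - f_full)    # likelihood ratio statistic, 1 df

def chisq1_pvalue(x):
    """Upper tail probability of a chi-square with 1 df:
    P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

t = -0.6869                     # "t Value" for b in the full model output
wald = t**2                     # asymptotically chi-square with 1 df under H0

print(f"G   = {G:.7f}, p = {chisq1_pvalue(G):.4f}")
print(f"t^2 = {wald:.7f}, p = {chisq1_pvalue(wald):.4f}")
```

Both p-values are far above 0.05, so the two tests lead to the same conclusion. Note that the Chi-Square values SAS prints (1.0466 and 1.5258) are 199 times the corresponding objective functions, so SAS multiplies by n-1 rather than n; the difference is negligible here.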