STA313 F 2004 Handout 10: Simple regression with measurement error
Path2a & b with SAS: Model b is identified, a is not
/* path2a.sas */
options linesize=79 noovp formdlim='_';
title 'STA313f04 Path 2a: Non-identified Simple Regression with Meas Error';
title2 'Just try to fit the model';
data path1;
infile 'path2.dat';
input x1 x2 y;
proc calis cov; /* Analyze the covariance matrix (Default is corr) */
var x1 y; /* Manifest vars are in the data set */
lineqs /* Simultaneous equations, separated by commas */
y = b F + e3,
x1 = F + e1;
std /* Variances (not standard deviations) */
F = sigsqF, /* Optional starting values in parentheses */
e1 = sigsqe,
e3 = sigsqe3;
bounds 0.0 < sigsqF,
0.0 < sigsqe,
0.0 < sigsqe3;
The log file has:
WARNING: Problem not identified: More parameters to estimate ( 4 ) than given
values in data matrix ( 3 ).
NOTE: GCONV2 convergence criterion satisfied.
NOTE: Moore-Penrose inverse is used in covariance matrix.
WARNING: Chi square quantile not computable for df= -1.
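The counts in that warning follow a simple rule: k manifest variables supply k(k+1)/2 distinct variances and covariances, and the degrees of freedom are that count minus the number of free parameters. Here is a minimal sketch of the arithmetic in Python (not SAS); the function name is made up for illustration. Keep in mind that a non-negative count is necessary for identification but does not guarantee it.

```python
def degrees_of_freedom(num_manifest, num_free_params):
    """Distinct variances and covariances minus free parameters."""
    num_moments = num_manifest * (num_manifest + 1) // 2
    return num_moments - num_free_params

# Model 2a: x1 and y are observed; b, sigsqF, sigsqe, sigsqe3 are free.
df_2a = degrees_of_freedom(2, 4)

# Model 2b: x1, x2 and y are observed; the same four parameters are free.
df_2b = degrees_of_freedom(3, 4)

print(df_2a, df_2b)
```

The two values agree with the "Chi-Square DF" lines that SAS reports for the 2a fit and for the 2b full model.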
The list file has lots of output, including:
Vector of Initial Estimates
Parameter Estimate Type
1 b 13.08145 Matrix Entry: _GAMMA_[2:1]
2 sigsqF 0.01000 Matrix Entry: _PHI_[1:1]
3 sigsqe3 3.01198 Matrix Entry: _PHI_[2:2]
4 sigsqe 2.97844 Matrix Entry: _PHI_[3:3]
and
                                                                     Actual
                                                  Max Abs              Over
      Rest   Func  Act    Objective    Obj Fun   Gradient              Pred
Iter  arts  Calls  Con     Function     Change    Element   Lambda   Change
  1*     0      2    0      0.00778     0.0687     4.1649  111E-16    1.176
  2*     0      3    0   1.47279E-8    0.00778    0.00682  111E-16    1.088
   3     0      4    0   3.5527E-14   1.473E-8   0.000011        0    1.000
   4     0      5    0            0   3.55E-14   2.43E-11        0    1.009
Optimization Results
Iterations 4 Function Calls 6
Jacobian Calls 5 Active Constraints 0
Objective Function 0 Max Abs Gradient Element 2.431098E-11
Lambda 0 Actual Over Pred Change 1.0085198247
Radius 0.0001551798
GCONV2 convergence criterion satisfied.
NOTE: Moore-Penrose inverse is used in covariance matrix.
NOTE: Covariance matrix for the estimates is not full rank.
NOTE: The variance of some parameter estimates is zero or some parameter
estimates are linearly related to other parameter estimates as shown in
the following equations:
sigsqF = -175608 + 15824 * b -
123.672338 * sigsqe3 + 1.000000
* sigsqe
_______________________________________________________________________________
STA313f04 Path 2a: Non-identified Simple Regression with Meas Error
6
Just try to fit the model
11:38 Friday, November 12,
2004
The CALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation
Fit Function 0.0000
Goodness of Fit Index (GFI) 1.0000
GFI Adjusted for Degrees of Freedom (AGFI) .
Root Mean Square Residual (RMR) 0.0000
Parsimonious GFI (Mulaik, 1989) -1.0000
Chi-Square 0.0000
Chi-Square DF -1
Pr > Chi-Square .
Independence Model Chi-Square 0.0013
Independence Model Chi-Square DF 1
RMSEA Estimate 0.0000
RMSEA 90% Lower Confidence Limit .
RMSEA 90% Upper Confidence Limit .
ECVI Estimate 0.0000
ECVI 90% Lower Confidence Limit .
ECVI 90% Upper Confidence Limit .
Probability of Close Fit .
Bentler's Comparative Fit Index .
Normal Theory Reweighted LS Chi-Square 0.0000
Akaike's Information Criterion 2.0000
Bozdogan's (1987) CAIC 6.2983
Schwarz's Bayesian Criterion 5.2983
McDonald's (1989) Centrality 0.9975
Bentler & Bonett's (1980) Non-normed Index .
Bentler & Bonett's (1980) NFI 1.0000
James, Mulaik, & Brett (1982) Parsimonious NFI -1.0000
Z-Test of Wilson & Hilferty (1931) .
Bollen (1986) Normed Index Rho1 .
Bollen (1988) Non-normed Index Delta2 0.0013
Hoelter's (1983) Critical N .
Basically it's a disaster, and the output shows it: negative degrees of freedom, a "perfect" fit, and missing values all through the table. But it does give parameter estimates. Do you trust them?
The CALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation
Manifest Variable Equations with Estimates
x1 = 1.0000 F + 1.0000 e1
y = 11.1208*F + 1.0000 e3
Std Err 0.0189 b
t Value 588.0
Variances of Exogenous Variables
Standard
Variable Parameter Estimate Error t Value
F sigsqF 0.0007028 0.01942 0.04
e3 sigsqe3 3.02730 2.42011 1.25
e1 sigsqe 2.97834 0.29921 9.95
Very different from R's estimates of (sigsqF, sigsqe3, sigsqe, b):
$estimate
[1] 0.97578652 1.98931682 3.10719304 0.01092407
> restimate <- c(0.97578652, 1.98931682, 3.10719304, 0.01092407)
> path2a(restimate,xydat)
[1] 1579.282
> sasestimate <- c(0.0007028, 3.02730, 2.97834,11.1208)
> path2a(sasestimate,xydat)
[1] 1579.345
But the two sets of estimates give about the same -2 Log Likelihood (1579.282 versus 1579.345). Now fit path2b, which is identified.
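Before fitting path2b, it may help to see why very different parameter values can fit model 2a equally well. Assuming F, e1 and e3 are independent, the model implies only three moments: Var(x1) = sigsqF + sigsqe, Var(y) = b^2 sigsqF + sigsqe3, and Cov(x1,y) = b sigsqF. The Python sketch below (function names are hypothetical, and the numbers are illustrative rather than the sample moments of path2.dat) picks several values of sigsqF and solves for the other three parameters so that the implied moments never change.

```python
def implied_moments_2a(b, sigsq_f, sigsq_e, sigsq_e3):
    """Moments implied by model 2a, assuming F, e1 and e3 are independent."""
    var_x1 = sigsq_f + sigsq_e
    var_y = b ** 2 * sigsq_f + sigsq_e3
    cov_x1_y = b * sigsq_f
    return (var_x1, var_y, cov_x1_y)

def match_moments(target, sigsq_f):
    """Given a chosen sigsqF, solve for b, sigsqe and sigsqe3 that reproduce target."""
    var_x1, var_y, cov_x1_y = target
    b = cov_x1_y / sigsq_f
    sigsq_e = var_x1 - sigsq_f
    sigsq_e3 = var_y - b ** 2 * sigsq_f
    return (b, sigsq_f, sigsq_e, sigsq_e3)

# Illustrative target: the moments implied by b = -0.10, sigsqF = 1,
# sigsqe = 2, sigsqe3 = 3.
target = implied_moments_2a(-0.10, 1.0, 2.0, 3.0)

for sigsq_f in (1.0, 0.5, 0.25):
    params = match_moments(target, sigsq_f)
    print(params, implied_moments_2a(*params))
```

Every row prints a different parameter vector with the same implied moments (up to rounding), so the likelihood cannot tell them apart. With three moments and four unknowns, one parameter can always be chosen freely.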
/* path2b.sas */
options linesize=79 noovp formdlim='_';
title 'STA313f04 Path 2b: Identified Simple Regression with Meas Error';
title2 'Test H0: b=0';
data path1;
infile 'path2.dat';
input x1 x2 y;
proc calis cov; /* Analyze the covariance matrix (Default is corr) */
title3 'Full model';
var x1 x2 y; /* Manifest vars are in the data set */
lineqs /* Simultaneous equations, separated by commas */
y = b F + e3,
x1 = F + e1,
x2 = F + e2;
std /* Variances (not standard deviations) */
F = sigsqF, /* Optional starting values in parentheses */
e1 = sigsqe,
e2 = sigsqe,
e3 = sigsqe3;
bounds 0.0 < sigsqF,
0.0 < sigsqe,
0.0 < sigsqe3;
proc calis cov; /* Analyze the covariance matrix (Default is corr) */
title3 'Reduced model with b=0';
var x1 x2 y; /* Manifest vars are in the data set */
lineqs /* Simultaneous equations, separated by commas */
y = e3,
x1 = F + e1,
x2 = F + e2;
std /* Variances (not standard deviations) */
F = sigsqF, /* Optional starting values in parentheses */
e1 = sigsqe,
e2 = sigsqe,
e3 = sigsqe3;
bounds 0.0 < sigsqF,
0.0 < sigsqe,
0.0 < sigsqe3;
Part of the list file:
STA313f04 Path 2b: Identified Simple Regression with Meas Error
4
Test H0: b=0 14:00 Friday, November 12,
2004
Full model
The CALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation
Vector of Initial Estimates
Parameter Estimate Type
1 b -0.29711 Matrix Entry: _GAMMA_[3:1]
2 sigsqF 0.77385 Matrix Entry: _PHI_[1:1]
3 sigsqe3 3.04590 Matrix Entry: _PHI_[2:2]
4 sigsqe 2.27748 Matrix Entry: _PHI_[3:3] _PHI_[4:4]
_______________________________________________________________________________
                                                                     Actual
                                                  Max Abs              Over
      Rest   Func  Act    Objective    Obj Fun   Gradient              Pred
Iter  arts  Calls  Con     Function     Change    Element   Lambda   Change
   1     0      2    0      0.00530    0.00267    0.00275        0    0.970
   2     0      3    0      0.00526   0.000036   0.000014        0    1.005
   3     0      4    0      0.00526   9.94E-10   7.98E-17        0    1.000
Optimization Results
Iterations 3 Function Calls 5
Jacobian Calls 4 Active Constraints 0
Objective Function 0.0052594721 Max Abs Gradient Element 7.979728E-17
Lambda 0 Actual Over Pred Change 1.0000307497
Radius 0.0000897404
ABSGCONV convergence criterion satisfied.
_______________________________________________________________________________
The CALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation
Fit Function 0.0053
Goodness of Fit Index (GFI) 0.9965
GFI Adjusted for Degrees of Freedom (AGFI) 0.9895
Root Mean Square Residual (RMR) 0.0851
Parsimonious GFI (Mulaik, 1989) 0.6643
Chi-Square 1.0466
Chi-Square DF 2
Pr > Chi-Square 0.5926
Independence Model Chi-Square 17.049
Independence Model Chi-Square DF 3
RMSEA Estimate 0.0000
RMSEA 90% Lower Confidence Limit .
RMSEA 90% Upper Confidence Limit 0.1162
ECVI Estimate 0.0463
ECVI 90% Lower Confidence Limit .
ECVI 90% Upper Confidence Limit 0.0785
Probability of Close Fit 0.7216
Bentler's Comparative Fit Index 1.0000
Normal Theory Reweighted LS Chi-Square 1.0439
Akaike's Information Criterion -2.9534
Bozdogan's (1987) CAIC -11.5500
Schwarz's Bayesian Criterion -9.5500
McDonald's (1989) Centrality 1.0024
Bentler & Bonett's (1980) Non-normed Index 1.1018
Bentler & Bonett's (1980) NFI 0.9386
James, Mulaik, & Brett (1982) Parsimonious NFI 0.6257
Z-Test of Wilson & Hilferty (1931) -0.2491
Bollen (1986) Normed Index Rho1 0.9079
Bollen (1988) Non-normed Index Delta2 1.0634
Hoelter's (1983) Critical N 1141
_______________________________________________________________________________
The CALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation
Manifest Variable Equations with Estimates
x1 = 1.0000 F + 1.0000 e1
x2 = 1.0000 F + 1.0000 e2
y = -0.1439*F + 1.0000 e3
Std Err 0.2095 b
t Value -0.6869
Variances of Exogenous Variables
Standard
Variable Parameter Estimate Error t Value
F sigsqF 0.83875 0.22433 3.74
e3 sigsqe3 3.09685 0.31277 9.90
e1 sigsqe 2.21257 0.22181 9.97
e2 sigsqe 2.21257 0.22181 9.97
Not extremely different from the true values:
sigsqf <- 1 ; sigsqe <- 2 ; sigsqe3 <- 3 ; b <- -0.10
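With the second measurement x2, the six distinct moments pin down all four parameters. Assuming F, e1, e2 and e3 are independent, Cov(x1,x2) = sigsqF, Cov(x1,y) = Cov(x2,y) = b sigsqF, Var(x1) = Var(x2) = sigsqF + sigsqe, and Var(y) = b^2 sigsqF + sigsqe3. The Python sketch below solves these equations directly. It is a method-of-moments illustration with a hypothetical function name, not the maximum likelihood fit that proc calis carries out, and it is fed the moments implied by the true values rather than the sample moments.

```python
def solve_model_2b(var_x1, var_x2, var_y, cov_x1_x2, cov_x1_y, cov_x2_y):
    """Recover the parameters of model 2b from its six distinct moments.

    Moments that share one expression under the model are averaged,
    which is an arbitrary choice made for this sketch.
    """
    sigsq_f = cov_x1_x2
    b = (cov_x1_y + cov_x2_y) / (2.0 * sigsq_f)
    sigsq_e = (var_x1 + var_x2) / 2.0 - sigsq_f
    sigsq_e3 = var_y - b ** 2 * sigsq_f
    return {"b": b, "sigsqF": sigsq_f, "sigsqe": sigsq_e, "sigsqe3": sigsq_e3}

# Moments implied by sigsqF = 1, sigsqe = 2, sigsqe3 = 3, b = -0.10.
solution = solve_model_2b(var_x1=3.0, var_x2=3.0, var_y=3.01,
                          cov_x1_x2=1.0, cov_x1_y=-0.10, cov_x2_y=-0.10)
print(solution)
```

There is exactly one solution, which is what it means for the model to be identified. The two leftover equations are the 2 degrees of freedom that the chi-square test of fit uses.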
Here is some output from the reduced (restricted) model.
Vector of Initial Estimates
Parameter Estimate Type
1 sigsqF 1.00000 Matrix Entry: _PHI_[1:1]
2 sigsqe3 3.11421 Matrix Entry: _PHI_[2:2]
3 sigsqe 1.26214 Matrix Entry: _PHI_[3:3] _PHI_[4:4]
Predetermined Elements of the Predicted Moment Matrix
x1 x2 y
x1 . . 0
x2 . . 0
y 0 0 .
WARNING: The predicted moment matrix has 2 constant elements whose values
differ from those of the observed moment matrix. The sum of squared
differences is 0.0621609525.
NOTE: Only 4 elements of the moment matrix are used in the model
specification.
SAS is complaining because the model implies zero covariance between x1 & y and between x2 & y, yet the sample covariances are not zero. There is no problem here. But when SAS complains, it is important to understand why.
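The zeros in the "Predetermined Elements" table can be read off the covariance matrix the model implies. A small Python sketch, assuming independent F, e1, e2 and e3 and using a hypothetical function name: once b = 0, the (x1,y) and (x2,y) entries are zero whatever the variance parameters are.

```python
def implied_cov_2b(b, sigsq_f, sigsq_e, sigsq_e3):
    """Implied covariance matrix; rows and columns are ordered x1, x2, y."""
    return [
        [sigsq_f + sigsq_e, sigsq_f, b * sigsq_f],
        [sigsq_f, sigsq_f + sigsq_e, b * sigsq_f],
        [b * sigsq_f, b * sigsq_f, b ** 2 * sigsq_f + sigsq_e3],
    ]

# Reduced model: b is fixed at 0. The variance values are the initial
# estimates SAS printed for the reduced model; any positive values would do.
reduced = implied_cov_2b(0.0, sigsq_f=1.0, sigsq_e=1.26214, sigsq_e3=3.11421)
for row in reduced:
    print(row)
```

The sample covariances of x1 and x2 with y are not exactly zero, so these fixed entries cannot match the data. That mismatch is what the warning is reporting.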
There was just one step in the maximum likelihood search.
                                                                     Actual
                                                  Max Abs              Over
      Rest   Func  Act    Objective    Obj Fun   Gradient              Pred
Iter  arts  Calls  Con     Function     Change    Element   Lambda   Change
   1     0      2    0      0.00767     0.2081   1.32E-16        0    0.689
Optimization Results
Iterations 1 Function Calls 3
Jacobian Calls 2 Active Constraints 0
Objective Function 0.0076671625 Max Abs Gradient Element 1.31839E-16
Lambda 0 Actual Over Pred Change 0.6890530923
Radius 1.626914686
ABSGCONV convergence criterion satisfied.
STA313f04 Path 2b: Identified Simple Regression with Meas Error
14
Test H0: b=0 14:00 Friday, November 12,
2004
Reduced model with b=0
The CALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation
Fit Function 0.0077
Goodness of Fit Index (GFI) 0.9948
GFI Adjusted for Degrees of Freedom (AGFI) 0.9896
Root Mean Square Residual (RMR) 0.1100
Parsimonious GFI (Mulaik, 1989) 0.9948
Chi-Square 1.5258
Chi-Square DF 3
Pr > Chi-Square 0.6763
Independence Model Chi-Square 17.049
Independence Model Chi-Square DF 3
RMSEA Estimate 0.0000
RMSEA 90% Lower Confidence Limit .
RMSEA 90% Upper Confidence Limit 0.0918
ECVI Estimate 0.0384
ECVI 90% Lower Confidence Limit .
ECVI 90% Upper Confidence Limit 0.0715
Probability of Close Fit 0.8122
Bentler's Comparative Fit Index 1.0000
Normal Theory Reweighted LS Chi-Square 1.5532
Akaike's Information Criterion -4.4742
Bozdogan's (1987) CAIC -17.3692
Schwarz's Bayesian Criterion -14.3692
McDonald's (1989) Centrality 1.0037
Bentler & Bonett's (1980) Non-normed Index 1.1049
Bentler & Bonett's (1980) NFI 0.9105
James, Mulaik, & Brett (1982) Parsimonious NFI 0.9105
Z-Test of Wilson & Hilferty (1931) -0.4692
Bollen (1986) Normed Index Rho1 0.9105
Bollen (1988) Non-normed Index Delta2 1.1049
Hoelter's (1983) Critical N 1021
Find the more precise versions of those fit functions (called "objective functions" in the output) and compute
G = 200*(0.0076671625-0.0052594721) = 0.4815381. With 1 df, the critical value at the 0.05 level is 3.84, so the null hypothesis is not rejected. Compare the so-called t statistic for b in the full model output: t = -0.6869. For large samples, this statistic is approximately standard normal under the null hypothesis, and the square of a standard normal is chi-square with 1 df. As n -> infinity, the difference between t^2 and G goes to zero. Compute (-0.6869)^2 = 0.4718316. Not too far from G.
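The arithmetic can be checked with a few lines of Python. This sketch takes the multiplier 200 and the two objective function values from the computation above, and uses the fact that the upper tail of a chi-square with 1 df at x equals erfc(sqrt(x/2)). The variable and function names are made up for illustration.

```python
import math

def chisq1_pvalue(stat):
    """Upper tail probability of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(stat / 2.0))

n = 200                      # multiplier used in the handout's computation of G
f_full = 0.0052594721        # objective function, full model
f_reduced = 0.0076671625     # objective function, reduced model (b = 0)

g = n * (f_reduced - f_full)     # likelihood ratio statistic
t_squared = (-0.6869) ** 2       # squared "t value" for b in the full model

print(round(g, 7), round(t_squared, 7))
print(chisq1_pvalue(g), chisq1_pvalue(t_squared))
```

Both p-values come out a little under 0.5, so the two tests lead to the same conclusion about b.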