STA313 F 2004 Handout 10: Simple regression with measurement error

Path 2a and 2b with SAS: Model 2b is identified, 2a is not


/* path2a.sas */
options linesize=79 noovp formdlim='_';
title 'STA313f04 Path 2a: Non-identified Simple Regression with Meas Error';
title2 'Just try to fit the model';

data path1;
     infile 'path2.dat';
     input x1 x2 y;

proc calis cov;          /* Analyze the covariance matrix (Default is corr) */
     var x1 y;           /* Manifest vars are in the data set */
     lineqs              /* Simultaneous equations, separated by commas */
            y =  b F + e3,
            x1 = F + e1;
     std                  /* Variances (not standard deviations) */
            F = sigsqF,   /* Optional starting values in parentheses */
            e1 = sigsqe,
            e3 = sigsqe3;
     bounds 0.0 < sigsqF,
            0.0 < sigsqe,
            0.0 < sigsqe3;


The log file has:

WARNING: Problem not identified: More parameters to estimate ( 4 ) than given
         values in data matrix ( 3 ).
NOTE: GCONV2 convergence criterion satisfied.
NOTE: Moore-Penrose inverse is used in covariance matrix.
WARNING: Chi square quantile not computable for df= -1.
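
The first warning is a counting argument. With k manifest variables, the covariance matrix has k(k+1)/2 distinct elements, and the chi-square degrees of freedom equal that count minus the number of free parameters. Here is a minimal R sketch of the count; the helper name count_df is made up for illustration and is not a SAS or R built-in.

```r
# Hypothetical helper: df = (distinct variances and covariances) - (free parameters)
count_df <- function(k, nparams) {
  nmoments <- k * (k + 1) / 2   # distinct elements of a k x k covariance matrix
  nmoments - nparams
}

count_df(k = 2, nparams = 4)    # path2a: x1, y; parameters b, sigsqF, sigsqe, sigsqe3
count_df(k = 3, nparams = 4)    # path2b full model: x1, x2, y; same four parameters
count_df(k = 3, nparams = 3)    # path2b reduced model: b fixed at 0
```

The three calls give -1, 2 and 3, which are the Chi-Square DF values SAS reports for the three models in this handout. A negative count rules out identification; a non-negative count is necessary but does not by itself guarantee it.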

The list file has a lot of output, including:


                          Vector of Initial Estimates

                 Parameter      Estimate    Type

            1    b              13.08145    Matrix Entry: _GAMMA_[2:1]
            2    sigsqF          0.01000    Matrix Entry: _PHI_[1:1]
            3    sigsqe3         3.01198    Matrix Entry: _PHI_[2:2]
            4    sigsqe          2.97844    Matrix Entry: _PHI_[3:3]

and


                                                                       Actual
                                                      Max Abs             Over
         Rest    Func      Act    Objective  Obj Fun Gradient             Pred
 Iter    arts   Calls      Con     Function   Change  Element  Lambda   Change

    1*      0       2        0      0.00778   0.0687   4.1649 111E-16    1.176
    2*      0       3        0   1.47279E-8  0.00778  0.00682 111E-16    1.088
    3       0       4        0   3.5527E-14 1.473E-8 0.000011       0    1.000
    4       0       5        0            0 3.55E-14 2.43E-11       0    1.009

                             Optimization Results

Iterations                           4  Function Calls                       6
Jacobian Calls                       5  Active Constraints                   0
Objective Function                   0  Max Abs Gradient Element  2.431098E-11
Lambda                               0  Actual Over Pred Change   1.0085198247
Radius                    0.0001551798

GCONV2 convergence criterion satisfied.

           NOTE: Moore-Penrose inverse is used in covariance matrix.


NOTE: Covariance matrix for the estimates is not full rank.


NOTE: The variance of some parameter estimates is zero or some parameter
      estimates are linearly related to other parameter estimates as shown in
      the following equations:



  sigsqF     =        -175608   +          15824   *   b          -
                                 123.672338   *   sigsqe3    +       1.000000
                                *   sigsqe


_______________________________________________________________________________

      STA313f04 Path 2a: Non-identified Simple Regression with Meas Error
                           Just try to fit the model
                                         11:38 Friday, November 12, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

         Fit Function                                          0.0000
         Goodness of Fit Index (GFI)                           1.0000
         GFI Adjusted for Degrees of Freedom (AGFI)                 .
         Root Mean Square Residual (RMR)                       0.0000
         Parsimonious GFI (Mulaik, 1989)                      -1.0000
         Chi-Square                                            0.0000
         Chi-Square DF                                             -1
         Pr > Chi-Square                                            .
         Independence Model Chi-Square                         0.0013
         Independence Model Chi-Square DF                           1
         RMSEA Estimate                                        0.0000
         RMSEA 90% Lower Confidence Limit                           .
         RMSEA 90% Upper Confidence Limit                           .
         ECVI Estimate                                         0.0000
         ECVI 90% Lower Confidence Limit                            .
         ECVI 90% Upper Confidence Limit                            .
         Probability of Close Fit                                   .
         Bentler's Comparative Fit Index                            .
         Normal Theory Reweighted LS Chi-Square                0.0000
         Akaike's Information Criterion                        2.0000
         Bozdogan's (1987) CAIC                                6.2983
         Schwarz's Bayesian Criterion                          5.2983
         McDonald's (1989) Centrality                          0.9975
         Bentler & Bonett's (1980) Non-normed Index                 .
         Bentler & Bonett's (1980) NFI                         1.0000
         James, Mulaik, & Brett (1982) Parsimonious NFI       -1.0000
         Z-Test of Wilson & Hilferty (1931)                         .
         Bollen (1986) Normed Index Rho1                            .
         Bollen (1988) Non-normed Index Delta2                 0.0013
         Hoelter's (1983) Critical N                                .

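This is a disaster, and the output shows it: the degrees of freedom are -1, several fit indices are missing, and the covariance matrix of the estimates is not full rank. But SAS still gives parameter estimates. Do you trust them?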



                             The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                  Manifest Variable Equations with Estimates


                x1      =   1.0000 F        +  1.0000 e1
                y       =  11.1208*F        +  1.0000 e3
                Std Err     0.0189 b
                t Value      588.0


                       Variances of Exogenous Variables

                                                 Standard
           Variable Parameter      Estimate         Error    t Value

           F        sigsqF        0.0007028       0.01942       0.04
           e3       sigsqe3         3.02730       2.42011       1.25
           e1       sigsqe          2.97834       0.29921       9.95

These are very different from R's estimates of (sigsqF, sigsqe3, sigsqe, b):


$estimate
[1] 0.97578652 1.98931682 3.10719304 0.01092407

> restimate <- c(0.97578652, 1.98931682, 3.10719304, 0.01092407)
> path2a(restimate,xydat)
[1] 1579.282
> sasestimate <- c(0.0007028, 3.02730, 2.97834,11.1208)
> path2a(sasestimate,xydat)
[1] 1579.345

But the two sets of estimates give about the same -2 Log Likelihood (1579.282 versus 1579.345). Now fit path2b, which is identified.
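
Why path2a cannot settle on one answer can be checked directly. Under the normal model, the likelihood depends on the parameters only through the covariance matrix they imply, and in path2a many different parameter vectors imply the same matrix. The R sketch below is an illustration only: the function name implied and the choice sigsqF = 1 for the alternative solution are arbitrary assumptions, not part of the course code.

```r
# Covariance matrix of (x1, y) implied by  x1 = F + e1,  y = b*F + e3
implied <- function(sigsqF, sigsqe, sigsqe3, b) {
  matrix(c(sigsqF + sigsqe, b * sigsqF,
           b * sigsqF,      b^2 * sigsqF + sigsqe3),
         nrow = 2, byrow = TRUE,
         dimnames = list(c("x1", "y"), c("x1", "y")))
}

# SAS estimates from the path2a output
sas <- implied(sigsqF = 0.0007028, sigsqe = 2.97834, sigsqe3 = 3.02730, b = 11.1208)

# Arbitrary alternative: fix sigsqF = 1, then solve the three moment
# equations for the remaining parameters.
sigsqF2  <- 1
b2       <- sas["x1", "y"] / sigsqF2
sigsqe2  <- sas["x1", "x1"] - sigsqF2
sigsqe32 <- sas["y", "y"] - b2^2 * sigsqF2
alt <- implied(sigsqF2, sigsqe2, sigsqe32, b2)

print(sas)
print(alt)
all.equal(sas, alt)   # same implied covariance matrix, very different parameters
```

Any positive sigsqF that leaves the two error variances positive works the same way, so there is a whole curve of parameter values with identical fit. Three moment equations cannot pin down four unknowns.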


/* path2b.sas */
options linesize=79 noovp formdlim='_';
title 'STA313f04 Path 2b: Identified Simple Regression with Meas Error';
title2 'Test H0: b=0';

data path1;
     infile 'path2.dat';
     input x1 x2 y;

proc calis cov;          /* Analyze the covariance matrix (Default is corr) */
     title3 'Full model';
     var x1 x2 y;          /* Manifest vars are in the data set */
     lineqs              /* Simultaneous equations, separated by commas */
            y =  b F + e3,
            x1 = F + e1,
            x2 = F + e2;
     std                  /* Variances (not standard deviations) */
            F = sigsqF,   /* Optional starting values in parentheses */
            e1 = sigsqe,
            e2 = sigsqe,
            e3 = sigsqe3;
     bounds 0.0 < sigsqF,
            0.0 < sigsqe,
            0.0 < sigsqe3;

proc calis cov;          /* Analyze the covariance matrix (Default is corr) */
     title3 'Reduced model with b=0';
     var x1 x2 y;        /* Manifest vars are in the data set */
     lineqs              /* Simultaneous equations, separated by commas */
            y =  e3,
            x1 = F + e1,
            x2 = F + e2;
     std                  /* Variances (not standard deviations) */
            F = sigsqF,   /* Optional starting values in parentheses */
            e1 = sigsqe,
            e2 = sigsqe,
            e3 = sigsqe3;
     bounds 0.0 < sigsqF,
            0.0 < sigsqe,
            0.0 < sigsqe3;

Part of the list file:


        STA313f04 Path 2b: Identified Simple Regression with Meas Error
                                 Test H0: b=0
                                          14:00 Friday, November 12, 2004
                                  Full model

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                          Vector of Initial Estimates

            Parameter      Estimate    Type

       1    b              -0.29711    Matrix Entry: _GAMMA_[3:1]
       2    sigsqF          0.77385    Matrix Entry: _PHI_[1:1]
       3    sigsqe3         3.04590    Matrix Entry: _PHI_[2:2]
       4    sigsqe          2.27748    Matrix Entry: _PHI_[3:3]  _PHI_[4:4]

_______________________________________________________________________________


                                                                        Actual
                                                      Max Abs             Over
         Rest    Func      Act    Objective  Obj Fun Gradient             Pred
 Iter    arts   Calls      Con     Function   Change  Element  Lambda   Change

    1       0       2        0      0.00530  0.00267  0.00275       0    0.970
    2       0       3        0      0.00526 0.000036 0.000014       0    1.005
    3       0       4        0      0.00526 9.94E-10 7.98E-17       0    1.000

                             Optimization Results

Iterations                           3  Function Calls                       5
Jacobian Calls                       4  Active Constraints                   0
Objective Function      0.0052594721  Max Abs Gradient Element  7.979728E-17
Lambda                               0  Actual Over Pred Change   1.0000307497
Radius                    0.0000897404

ABSGCONV convergence criterion satisfied.


_______________________________________________________________________________


                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

         Fit Function                                          0.0053
         Goodness of Fit Index (GFI)                           0.9965
         GFI Adjusted for Degrees of Freedom (AGFI)            0.9895
         Root Mean Square Residual (RMR)                       0.0851
         Parsimonious GFI (Mulaik, 1989)                       0.6643
         Chi-Square                                            1.0466
         Chi-Square DF                                              2
         Pr > Chi-Square                                       0.5926
         Independence Model Chi-Square                         17.049
         Independence Model Chi-Square DF                           3
         RMSEA Estimate                                        0.0000
         RMSEA 90% Lower Confidence Limit                           .
         RMSEA 90% Upper Confidence Limit                      0.1162
         ECVI Estimate                                         0.0463
         ECVI 90% Lower Confidence Limit                            .
         ECVI 90% Upper Confidence Limit                       0.0785
         Probability of Close Fit                              0.7216
         Bentler's Comparative Fit Index                       1.0000
         Normal Theory Reweighted LS Chi-Square                1.0439
         Akaike's Information Criterion                       -2.9534
         Bozdogan's (1987) CAIC                              -11.5500
         Schwarz's Bayesian Criterion                         -9.5500
         McDonald's (1989) Centrality                          1.0024
         Bentler & Bonett's (1980) Non-normed Index            1.1018
         Bentler & Bonett's (1980) NFI                         0.9386
         James, Mulaik, & Brett (1982) Parsimonious NFI        0.6257
         Z-Test of Wilson & Hilferty (1931)                   -0.2491
         Bollen (1986) Normed Index Rho1                       0.9079
         Bollen (1988) Non-normed Index Delta2                 1.0634
         Hoelter's (1983) Critical N                             1141


_______________________________________________________________________________


                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                  Manifest Variable Equations with Estimates


                x1      =   1.0000 F        +  1.0000 e1
                x2      =   1.0000 F        +  1.0000 e2
                y       =  -0.1439*F        +  1.0000 e3
                Std Err     0.2095 b
                t Value    -0.6869


                       Variances of Exogenous Variables

                                                 Standard
           Variable Parameter      Estimate         Error    t Value

           F        sigsqF          0.83875       0.22433       3.74
           e3       sigsqe3         3.09685       0.31277       9.90
           e1       sigsqe          2.21257       0.22181       9.97
           e2       sigsqe          2.21257       0.22181       9.97

These estimates are not extremely different from the true values of

sigsqf <- 1 ; sigsqe <- 2 ; sigsqe3 <- 3 ; b <- -0.10

Here is some output from the reduced (restricted) model.


                          Vector of Initial Estimates

            Parameter      Estimate    Type

       1    sigsqF          1.00000    Matrix Entry: _PHI_[1:1]
       2    sigsqe3         3.11421    Matrix Entry: _PHI_[2:2]
       3    sigsqe          1.26214    Matrix Entry: _PHI_[3:3]  _PHI_[4:4]

             Predetermined Elements of the Predicted Moment Matrix

                             x1                x2                 y

           x1                 .                 .                 0
           x2                 .                 .                 0
           y                  0                 0                 .

WARNING: The predicted moment matrix has 2 constant elements whose values
         differ from those of the observed moment matrix.  The sum of squared
         differences is 0.0621609525.


NOTE: Only 4 elements of the moment matrix are used in the model
      specification.

SAS is complaining because the model implies zero covariance between x1 and y and between x2 and y, yet the sample covariances are not zero. That is exactly what the null hypothesis b=0 asserts, so there is no problem here. But when SAS complains, it is important to understand why.
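
The zeros in the "Predetermined Elements" table can be reproduced by writing out the covariance matrix that the path2b equations imply and setting b = 0. This is a sketch under the model as specified in the lineqs and std statements; the function name implied3 is hypothetical, and the parameter values are just the true values listed earlier with b replaced by 0.

```r
# Covariance matrix of (x1, x2, y) implied by
#   x1 = F + e1,  x2 = F + e2,  y = b*F + e3,  Var(e1) = Var(e2) = sigsqe
implied3 <- function(sigsqF, sigsqe, sigsqe3, b) {
  Sigma <- matrix(c(sigsqF + sigsqe, sigsqF,          b * sigsqF,
                    sigsqF,          sigsqF + sigsqe, b * sigsqF,
                    b * sigsqF,      b * sigsqF,      b^2 * sigsqF + sigsqe3),
                  nrow = 3, byrow = TRUE)
  dimnames(Sigma) <- list(c("x1", "x2", "y"), c("x1", "x2", "y"))
  Sigma
}

# Illustrative values only: with b = 0 the covariances with y vanish
Sigma0 <- implied3(sigsqF = 1, sigsqe = 2, sigsqe3 = 3, b = 0)
print(Sigma0)
```

Whatever values the three variances take, the (x1, y) and (x2, y) entries are b*sigsqF = 0, so no choice of parameters can match the nonzero sample covariances. That fixed mismatch is what the warning reports.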

There was just one step in the maximum likelihood search.

                                                                       Actual
                                                      Max Abs             Over
         Rest    Func      Act    Objective  Obj Fun Gradient             Pred
 Iter    arts   Calls      Con     Function   Change  Element  Lambda   Change

    1       0       2        0      0.00767   0.2081 1.32E-16       0    0.689

                             Optimization Results

Iterations                           1  Function Calls                       3
Jacobian Calls                       2  Active Constraints                   0
Objective Function      0.0076671625  Max Abs Gradient Element   1.31839E-16
Lambda                               0  Actual Over Pred Change   0.6890530923
Radius                     1.626914686

ABSGCONV convergence criterion satisfied.



        STA313f04 Path 2b: Identified Simple Regression with Meas Error
                                 Test H0: b=0
                            Reduced model with b=0

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

         Fit Function                                          0.0077
         Goodness of Fit Index (GFI)                           0.9948
         GFI Adjusted for Degrees of Freedom (AGFI)            0.9896
         Root Mean Square Residual (RMR)                       0.1100
         Parsimonious GFI (Mulaik, 1989)                       0.9948
         Chi-Square                                            1.5258
         Chi-Square DF                                              3
         Pr > Chi-Square                                       0.6763
         Independence Model Chi-Square                         17.049
         Independence Model Chi-Square DF                           3
         RMSEA Estimate                                        0.0000
         RMSEA 90% Lower Confidence Limit                           .
         RMSEA 90% Upper Confidence Limit                      0.0918
         ECVI Estimate                                         0.0384
         ECVI 90% Lower Confidence Limit                            .
         ECVI 90% Upper Confidence Limit                       0.0715
         Probability of Close Fit                              0.8122
         Bentler's Comparative Fit Index                       1.0000
         Normal Theory Reweighted LS Chi-Square                1.5532
         Akaike's Information Criterion                       -4.4742
         Bozdogan's (1987) CAIC                              -17.3692
         Schwarz's Bayesian Criterion                        -14.3692
         McDonald's (1989) Centrality                          1.0037
         Bentler & Bonett's (1980) Non-normed Index            1.1049
         Bentler & Bonett's (1980) NFI                         0.9105
         James, Mulaik, & Brett (1982) Parsimonious NFI        0.9105
         Z-Test of Wilson & Hilferty (1931)                   -0.4692
         Bollen (1986) Normed Index Rho1                       0.9105
         Bollen (1988) Non-normed Index Delta2                 1.1049
         Hoelter's (1983) Critical N                             1021

Find the more precise versions of the two fit function values (labelled "Objective Function" in the Optimization Results) and compute

G = 200*(0.0076671625-0.0052594721) = 0.4815381. With 1 df, the critical value at the 0.05 level is 3.84, so the null hypothesis is not rejected. Compare the so-called t statistic for b in the full model output: t = -0.6869. For large samples, this statistic is approximately standard normal under the null hypothesis, and the square of a standard normal is chi-square with 1 df. As n -> infinity, the difference between t^2 and G goes to zero. Compute (-0.6869)^2 = 0.4718316. Not too far from G.
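
The arithmetic can be checked in R, along with the p-values that the handout leaves implicit. This is only a sketch: the numbers are copied from the SAS output above, and the variable names are arbitrary.

```r
n      <- 200            # multiplier used in the handout's formula for G
f_red  <- 0.0076671625   # Objective Function, reduced model (b = 0)
f_full <- 0.0052594721   # Objective Function, full model

G   <- n * (f_red - f_full)                     # likelihood ratio statistic
p_G <- pchisq(G, df = 1, lower.tail = FALSE)    # upper-tail chi-square p-value

t   <- -0.6869                                  # "t Value" for b, full model
p_t <- 2 * pnorm(abs(t), lower.tail = FALSE)    # two-sided normal p-value

print(c(G = G, t_squared = t^2, p_G = p_G, p_t = p_t))
```

Both p-values come out well above 0.05, so the two tests agree. Note that the Chi-Square values SAS prints (1.0466 and 1.5258) appear to be 199 times the fit function rather than 200 times, so their difference, 0.4792, is a third and very slightly different version of the same statistic.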