Structural Equation Models I

STA429/1007 F 2004 Handout 14

Structural Equation Models with proc calis

In the classical structural equation models, where everything is normal with expected value zero (accomplished by centering all variables by subtracting off the mean), there is a "saturated model" -- one that imposes no constraints on the variance-covariance matrix of the manifest variables, other than the obscure constraints implied by the multivariate normal distribution. The saturated model is most easily estimated by just estimating all the population variances and covariances by the corresponding sample quantities. For any data set, there are infinitely many potential saturated models. Each has as many parameters as there are variances and covariances of the manifest variables, and its parameters are one-to-one functions of those variances and covariances. Each member of this class of saturated models has the same -2 log likelihood. In other words, they are all equivalent, and they all fit the data equally well. Rather than minimizing -2 log likelihood (equivalent to maximizing the likelihood), proc calis does something else equivalent; it minimizes

(-2 log likelihood for the specified model
minus
-2 log likelihood for the saturated model) / n

Once this quantity is minimized over the set of parameters, the parameter values at which the minimum occurs are the Maximum Likelihood Estimates.
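For readers who want the minimized quantity written out, here is a sketch under the assumptions already stated above (multivariate normal data, variables centered so the means are zero). The symbols are not in the original handout and are introduced only for illustration: S is the sample variance-covariance matrix computed with divisor n, Sigma(theta) is the variance-covariance matrix implied by the specified model, and p is the number of manifest variables.

```latex
\begin{align*}
-2\log L(\theta) &= n\left[\,p\log(2\pi) + \log|\Sigma(\theta)|
      + \operatorname{tr}\!\left(S\,\Sigma(\theta)^{-1}\right)\right] \\
-2\log L_{\text{sat}} &= n\left[\,p\log(2\pi) + \log|S| + p\,\right] \\
F(\theta) &= \frac{-2\log L(\theta) - \left(-2\log L_{\text{sat}}\right)}{n}
   = \log|\Sigma(\theta)| - \log|S|
      + \operatorname{tr}\!\left(S\,\Sigma(\theta)^{-1}\right) - p
\end{align*}
```

The second line uses the fact that a saturated model can reproduce S exactly, so the trace term equals p. F is zero when Sigma(theta) = S and positive otherwise, which is why a small minimum indicates good fit.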

In the output, the minimum value is called either the "objective function" or the "fit function," depending on where you look. The number called objective function is preferable, because it is displayed to more decimal places of accuracy. Multiply it by n, and you get a likelihood ratio test for goodness of fit, comparing the specified model to the saturated model.

So the likelihood ratio test for goodness of fit is

G = n * Fit Function, or G = n * Objective Function

and the degrees of freedom for the test are

df = p(p+1)/2 - k,

where there are p manifest variables and k parameters in the model.
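As a worked example, the arithmetic can be done in a short data step. This is only a sketch: the values of n and the objective function are copied from the proc calis listing for the first model later in this handout, the count k = 5 covers b1, b2 and the three variances, and the variable names are arbitrary.

```sas
/* Sketch: likelihood ratio goodness-of-fit test computed by hand.   */
/* Numbers are copied from the listing for the first model.          */
data _null_;
        n      = 200;             /* Observations                    */
        objfun = 0.0177065358;    /* Objective Function at minimum   */
        p      = 3;               /* Manifest variables y1 y2 y3     */
        k      = 5;               /* b1 b2 sigy1 sigee1 sigee2       */
        G      = n * objfun;      /* Likelihood ratio statistic      */
        df     = p*(p+1)/2 - k;   /* 6 - 5 = 1                       */
        pvalue = 1 - probchi(G, df);
        put G= df= pvalue=;       /* Results are written to the log  */
run;
```

Here G works out to about 3.54 on 1 degree of freedom, which agrees with the "Chi-Square DF" of 1 in the listing.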

To compare a full and a reduced model, you can use the difference of the chi-square values for goodness of fit, with degrees of freedom equal to the difference in degrees of freedom. The result is a classical likelihood ratio test.

There is another way to get a chi-square test for goodness of fit. In the massive table of fit information, the value labelled "Chi-square" is

                                          (n - 1)
                          Chi-square  =   --------  * G
                                             n

I have no idea why they multiply by (n-1)/n. Of course, since these are large-sample likelihood ratio tests, the effect of this multiplier is negligible, and the result is still a valid chi-square test -- a bit more conservative than the standard one. It's convenient, too, because it comes with a p-value. And as you might expect, the difference between Chi-square values for a full and a reduced model is also a valid chi-square test, with df equal to the difference in degrees of freedom.
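The relationship between the objective function and the printed Chi-square, and the full versus reduced comparison, can be checked by hand. The sketch below uses the objective function values and degrees of freedom from the listings for the first (full) and last (reduced, b2=0) models in this handout; the variable names are made up for the illustration.

```sas
/* Sketch: reproduce the printed Chi-square values and test b2 = 0   */
/* by the difference between the reduced and full models.            */
data _null_;
        n       = 200;
        Ffull   = 0.0177065358;     /* Objective function, full      */
        Fred    = 0.3294830486;     /* Objective function, reduced   */
        chifull = (n-1) * Ffull;    /* Listing shows  3.5236         */
        chired  = (n-1) * Fred;     /* Listing shows 65.5671         */
        diff    = chired - chifull; /* Difference test statistic     */
        df      = 2 - 1;            /* Difference in Chi-Square DF   */
        pvalue  = 1 - probchi(diff, df);
        put chifull= chired= diff= df= pvalue=;
run;
```

The difference is about 62.04 on 1 degree of freedom, so the reduced model fits much worse than the full one. Note that (n-1) times the objective function is the same thing as (n-1)/n times G.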

There's one more useful test that is given by default. The "Independence Model Chi-Square" compares the saturated model to one that has zero covariances (no relationship at all) among all the variables. It is a simultaneous test for all the covariances (or equivalently, all the correlations) among manifest variables. If it's not significant, then there's not much point in fitting a structural equation model.
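The listing reports the Independence Model Chi-Square and its degrees of freedom but no p-value, so it has to be computed separately. A minimal sketch, using the value 65.979 from the listing later in this handout and assuming the degrees of freedom equal the number of covariances set to zero, p(p-1)/2:

```sas
/* Sketch: p-value for the independence model chi-square.            */
data _null_;
        p      = 3;                 /* Manifest variables            */
        chisq0 = 65.979;            /* Independence Model Chi-Square */
        df0    = p*(p-1)/2;         /* 3 covariances set to zero     */
        pvalue = 1 - probchi(chisq0, df0);
        put df0= pvalue=;
run;
```

With 3 degrees of freedom, a value of 65.979 gives a p-value far below 0.0001, so for these data there is something for a structural equation model to explain.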


/* chain3.sas*/
options linesize=79 noovp formdlim='_';
title 'Three-var chain model';

data chain3;
        infile 'chain.dat';
        input y1 y2 y3;

proc calis cov pshort simple;
        title2 'Full (unrestricted) Model: y1 -> y2 -> y3';
        var y1 y2 y3;           /* Manifest vars in the data set */
        lineqs
                y2 = b1 y1 + e1,
                y3 = b2 y2 + e2;

        std                    /* Variances not standard deviation */
                y1 = sigy1,
                e1 = sigee1,
                e2 = sigee2;

        /* cov statement is unnecessary */
        cov                     /* Covariances */
                y1 e1 = 0,
                y1 e2 = 0,
                e1 e2 = 0;

        bounds
                0.0 < sigy1,
                0.0 < sigee1,
                0.0 < sigee2;

proc calis cov pshort;
        title2 'Full (unrestricted) Model: y3 -> y2 -> y1';
        var y1 y2 y3;           /* Manifest vars are in the data set */
        lineqs
                y2 = b1 y1 + e1,
                y3 = b2 y2 + e2;

        std                     /* Variances not standard deviation */
                y1 = sigy1,
                e1 = sigee1,
                e2 = sigee2;

        /* cov statement is unnecessary */
        cov                     /* Covariances */
                y1 e1 = 0,
                y1 e2 = 0,
                e1 e2 = 0;

        bounds
                0.0 < sigy1,
                0.0 < sigee1,
                0.0 < sigee2;


proc calis cov pshort;
        title2 'Reduced Model: y1 -> y2  y3';
        var y1 y2 y3;           /* Manifest vars are in the data set */
        lineqs
                y2 = b1 y1 + e1,
                y3 = b2 y2 + e2;

        std                    /* Variances not standard deviation */
                y1 = sigy1,
                e1 = sigee1,
                e2 = sigee2;

        /* cov statement is unnecessary */
        cov                     /* Covariances */
                y1 e1 = 0,
                y1 e2 = 0,
                e1 e2 = 0;

        bounds
                0.0 < sigy1,
                0.0 < sigee1,
                0.0 < sigee2;

        lincon  b2=0; /* Linear constraints are separated by commas.
                         The independence model would be
                         lincon  b1=0, b2=0;
                      */
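The second proc calis step above is titled y3 -> y2 -> y1, but its lineqs statement repeats the equations of the first step, so the two sets of output in the listing are identical. What follows is a sketch of how a reversed chain might be specified. It was not part of the run that produced the listing, and the parameter name sigy3 is made up for the illustration.

```sas
/* Sketch only: reversed chain, with y3 as the exogenous variable.   */
proc calis cov pshort;
        title2 'Sketch: y3 -> y2 -> y1';
        var y1 y2 y3;           /* Manifest vars are in the data set */
        lineqs
                y2 = b1 y3 + e1,
                y1 = b2 y2 + e2;

        std                     /* Variances not standard deviation  */
                y3 = sigy3,
                e1 = sigee1,
                e2 = sigee2;

        bounds
                0.0 < sigy3,
                0.0 < sigee1,
                0.0 < sigee2;
run;
```

Both chains imply the same single constraint on the variance-covariance matrix: the covariance of y1 and y3 equals the product of the other two covariances divided by the variance of y2. A step like this would therefore be expected to give the same chi-square as the first model, with different parameter estimates.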



_______________________________________________________________________________

                             Three-var chain model                            1
                   Full (unrestricted) Model: y1 -> y2 -> y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
           Covariance Structure Analysis: Pattern and Initial Values

                            LINEQS Model Statement


                       Matrix      Rows    Columns    ------Matrix Type-------

Term 1            1    _SEL_          3          5    SELECTION
                  2    _BETA_         5          5    EQSBETA        IMINUSINV
                  3    _GAMMA_        5          3    EQSGAMMA
                  4    _PHI_          3          3    SYMMETRIC


                          The 2 Endogenous Variables

    Manifest        y2  y3
    Latent


                          The 3 Exogenous Variables

    Manifest        y1
    Latent
    Error           e1  e2

_______________________________________________________________________________

                             Three-var chain model                            2
                   Full (unrestricted) Model: y1 -> y2 -> y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

            Observations         200    Model Terms              1
            Variables              3    Model Matrices           4
            Informations           6    Parameters               5


                     Variable          Mean       Std Dev

                     y1             0.07600       1.07182
                     y2            -0.09045       1.47679
                     y3             0.03195       1.35460

_______________________________________________________________________________

                             Three-var chain model                            3
                   Full (unrestricted) Model: y1 -> y2 -> y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                       Levenberg-Marquardt Optimization

                         Scaling Update of More (1978)

                   Parameter Estimates                    5
                   Functions (Observations)               6
                   Lower Bounds                           3
                   Upper Bounds                           0

                              Optimization Start

Active Constraints                   0  Objective Function        2.2667947436
Max Abs Gradient Element  0.7827946588  Radius                               1


                                                                        Actual
                                                      Max Abs             Over
         Rest    Func      Act    Objective  Obj Fun Gradient             Pred
 Iter    arts   Calls      Con     Function   Change  Element  Lambda   Change

    1       0       2        0      1.42926   0.8375   0.1975   0.338    1.000
    2       0       3        0      0.09884   1.3304   0.1695       0    3.263
    3       0       4        0      0.01771   0.0811 2.93E-16       0    1.320

                             Optimization Results

Iterations                           3  Function Calls                       5
Jacobian Calls                       4  Active Constraints                   0
Objective Function        0.0177065358  Max Abs Gradient Element  2.925504E-16
Lambda                               0  Actual Over Pred Change   1.3197130985
Radius                    0.7012876847

ABSGCONV convergence criterion satisfied.

_______________________________________________________________________________

                             Three-var chain model                            4
                   Full (unrestricted) Model: y1 -> y2 -> y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

         Fit Function                                          0.0177
         Goodness of Fit Index (GFI)                           0.9884
         GFI Adjusted for Degrees of Freedom (AGFI)            0.9306
         Root Mean Square Residual (RMR)                       0.0671
         Parsimonious GFI (Mulaik, 1989)                       0.3295
         Chi-Square                                            3.5236
         Chi-Square DF                                              1
         Pr > Chi-Square                                       0.0605
         Independence Model Chi-Square                         65.979
         Independence Model Chi-Square DF                           3
         RMSEA Estimate                                        0.1126
         RMSEA 90% Lower Confidence Limit                           .
         RMSEA 90% Upper Confidence Limit                      0.2497
         ECVI Estimate                                         0.0690
         ECVI 90% Lower Confidence Limit                            .
         ECVI 90% Upper Confidence Limit                       0.1193
         Probability of Close Fit                              0.1255
         Bentler's Comparative Fit Index                       0.9599
         Normal Theory Reweighted LS Chi-Square                3.4926
         Akaike's Information Criterion                        1.5236
         Bozdogan's (1987) CAIC                               -2.7747
         Schwarz's Bayesian Criterion                         -1.7747
         McDonald's (1989) Centrality                          0.9937
         Bentler & Bonett's (1980) Non-normed Index            0.8798
         Bentler & Bonett's (1980) NFI                         0.9466
         James, Mulaik, & Brett (1982) Parsimonious NFI        0.3155
         Z-Test of Wilson & Hilferty (1931)                    1.5781
         Bollen (1986) Normed Index Rho1                       0.8398
         Bollen (1988) Non-normed Index Delta2                 0.9612
         Hoelter's (1983) Critical N                              218

_______________________________________________________________________________

                             Three-var chain model                            5
                   Full (unrestricted) Model: y1 -> y2 -> y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                  Manifest Variable Equations with Estimates


                  y2      =  -0.0626*y1     +  1.0000 e1
                                     b1
                  y3      =   0.4747*y2     +  1.0000 e2
                                     b2


                       Variances of Exogenous Variables

                       Variable Parameter      Estimate

                       y1       sigy1           1.14880
                       e1       sigee1          2.17641
                       e2       sigee2          1.34344



_______________________________________________________________________________

                             Three-var chain model                            6
                   Full (unrestricted) Model: y1 -> y2 -> y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

            Manifest Variable Equations with Standardized Estimates


                  y2      =  -0.0454*y1     +  0.9990 e1
                                     b1
                  y3      =   0.5175*y2     +  0.8557 e2
                                     b2


                         Squared Multiple Correlations

                                     Error         Total
                    Variable      Variance      Variance    R-Square

               1    y2             2.17641       2.18091     0.00207
               2    y3             1.34344       1.83494      0.2679



_______________________________________________________________________________

                             Three-var chain model                            7
                   Full (unrestricted) Model: y3 -> y2 -> y1
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
           Covariance Structure Analysis: Pattern and Initial Values

                            LINEQS Model Statement


                       Matrix      Rows    Columns    ------Matrix Type-------

Term 1            1    _SEL_          3          5    SELECTION
                  2    _BETA_         5          5    EQSBETA        IMINUSINV
                  3    _GAMMA_        5          3    EQSGAMMA
                  4    _PHI_          3          3    SYMMETRIC


                          The 2 Endogenous Variables

    Manifest        y2  y3
    Latent


                          The 3 Exogenous Variables

    Manifest        y1
    Latent
    Error           e1  e2

_______________________________________________________________________________

                             Three-var chain model                            8
                   Full (unrestricted) Model: y3 -> y2 -> y1
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                       Levenberg-Marquardt Optimization

                         Scaling Update of More (1978)

                   Parameter Estimates                    5
                   Functions (Observations)               6
                   Lower Bounds                           3
                   Upper Bounds                           0

                              Optimization Start

Active Constraints                   0  Objective Function        2.2667947436
Max Abs Gradient Element  0.7827946588  Radius                               1


                                                                        Actual
                                                      Max Abs             Over
         Rest    Func      Act    Objective  Obj Fun Gradient             Pred
 Iter    arts   Calls      Con     Function   Change  Element  Lambda   Change

    1       0       2        0      1.42926   0.8375   0.1975   0.338    1.000
    2       0       3        0      0.09884   1.3304   0.1695       0    3.263
    3       0       4        0      0.01771   0.0811 2.93E-16       0    1.320

                             Optimization Results

Iterations                           3  Function Calls                       5
Jacobian Calls                       4  Active Constraints                   0
Objective Function        0.0177065358  Max Abs Gradient Element  2.925504E-16
Lambda                               0  Actual Over Pred Change   1.3197130985
Radius                    0.7012876847

ABSGCONV convergence criterion satisfied.

_______________________________________________________________________________

                             Three-var chain model                            9
                   Full (unrestricted) Model: y3 -> y2 -> y1
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

         Fit Function                                          0.0177
         Goodness of Fit Index (GFI)                           0.9884
         GFI Adjusted for Degrees of Freedom (AGFI)            0.9306
         Root Mean Square Residual (RMR)                       0.0671
         Parsimonious GFI (Mulaik, 1989)                       0.3295
         Chi-Square                                            3.5236
         Chi-Square DF                                              1
         Pr > Chi-Square                                       0.0605
         Independence Model Chi-Square                         65.979
         Independence Model Chi-Square DF                           3
         RMSEA Estimate                                        0.1126
         RMSEA 90% Lower Confidence Limit                           .
         RMSEA 90% Upper Confidence Limit                      0.2497
         ECVI Estimate                                         0.0690
         ECVI 90% Lower Confidence Limit                            .
         ECVI 90% Upper Confidence Limit                       0.1193
         Probability of Close Fit                              0.1255
         Bentler's Comparative Fit Index                       0.9599
         Normal Theory Reweighted LS Chi-Square                3.4926
         Akaike's Information Criterion                        1.5236
         Bozdogan's (1987) CAIC                               -2.7747
         Schwarz's Bayesian Criterion                         -1.7747
         McDonald's (1989) Centrality                          0.9937
         Bentler & Bonett's (1980) Non-normed Index            0.8798
         Bentler & Bonett's (1980) NFI                         0.9466
         James, Mulaik, & Brett (1982) Parsimonious NFI        0.3155
         Z-Test of Wilson & Hilferty (1931)                    1.5781
         Bollen (1986) Normed Index Rho1                       0.8398
         Bollen (1988) Non-normed Index Delta2                 0.9612
         Hoelter's (1983) Critical N                              218

_______________________________________________________________________________

                             Three-var chain model                           10
                   Full (unrestricted) Model: y3 -> y2 -> y1
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                  Manifest Variable Equations with Estimates


                  y2      =  -0.0626*y1     +  1.0000 e1
                                     b1
                  y3      =   0.4747*y2     +  1.0000 e2
                                     b2


                       Variances of Exogenous Variables

                       Variable Parameter      Estimate

                       y1       sigy1           1.14880
                       e1       sigee1          2.17641
                       e2       sigee2          1.34344



_______________________________________________________________________________

                             Three-var chain model                           11
                   Full (unrestricted) Model: y3 -> y2 -> y1
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

            Manifest Variable Equations with Standardized Estimates


                  y2      =  -0.0454*y1     +  0.9990 e1
                                     b1
                  y3      =   0.5175*y2     +  0.8557 e2
                                     b2


                         Squared Multiple Correlations

                                     Error         Total
                    Variable      Variance      Variance    R-Square

               1    y2             2.17641       2.18091     0.00207
               2    y3             1.34344       1.83494      0.2679



_______________________________________________________________________________

                             Three-var chain model                           12
                          Reduced Model: y1 -> y2  y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
           Covariance Structure Analysis: Pattern and Initial Values

                            LINEQS Model Statement


                       Matrix      Rows    Columns    ------Matrix Type-------

Term 1            1    _SEL_          3          5    SELECTION
                  2    _BETA_         5          5    EQSBETA        IMINUSINV
                  3    _GAMMA_        5          3    EQSGAMMA
                  4    _PHI_          3          3    SYMMETRIC


                          The 2 Endogenous Variables

    Manifest        y2  y3
    Latent


                          The 3 Exogenous Variables

    Manifest        y1
    Latent
    Error           e1  e2

_______________________________________________________________________________

                             Three-var chain model                           13
                          Reduced Model: y1 -> y2  y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

    NOTE: Initial point was changed to be feasible for boundary and linear
                                 constraints.

_______________________________________________________________________________

                             Three-var chain model                           14
                          Reduced Model: y1 -> y2  y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                       Levenberg-Marquardt Optimization

                         Scaling Update of More (1978)

                   Parameter Estimates                    5
                   Functions (Observations)               6
                   Lower Bounds                           4
                   Upper Bounds                           1

                              Optimization Start

Active Constraints                   1  Objective Function        1.4108855332
Max Abs Gradient Element  0.0672113951  Radius                               1


                                                                        Actual
                                                      Max Abs             Over
         Rest    Func      Act    Objective  Obj Fun Gradient             Pred
 Iter    arts   Calls      Con     Function   Change  Element  Lambda   Change

    1       0       2        1      0.32948   1.0814 2.94E-16       0    2.952

                             Optimization Results

Iterations                           1  Function Calls                       3
Jacobian Calls                       2  Active Constraints                   1
Objective Function        0.3294830486  Max Abs Gradient Element  2.944582E-16
Lambda                               0  Actual Over Pred Change     2.95231001
Radius                    1.7118184209

ABSGCONV convergence criterion satisfied.

WARNING: There are 1 active constraints at the solution.  The standard errors
         and Chi-Square test statistic assume the solution is located in the
         interior of the parameter space and hence do not apply if it is
         likely that some different set of inequality constraints could be
         active.


NOTE: The degrees of freedom are increased by the number of active constraints
      (see Dijkstra, 1992). The number of parameters in calculating fit
      indices is decreased by the number of active constraints. To turn off
      the adjustment, use the NOADJDF option.


_______________________________________________________________________________

                             Three-var chain model                           15
                          Reduced Model: y1 -> y2  y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

         Fit Function                                          0.3295
         Goodness of Fit Index (GFI)                           0.8424
         GFI Adjusted for Degrees of Freedom (AGFI)            0.5271
         Root Mean Square Residual (RMR)                       0.4304
         Parsimonious GFI (Mulaik, 1989)                       0.5616
         Chi-Square                                           65.5671
         Chi-Square DF                                              2
         Pr > Chi-Square                                       <.0001
         Independence Model Chi-Square                         65.979
         Independence Model Chi-Square DF                           3
         RMSEA Estimate                                        0.3996
         RMSEA 90% Lower Confidence Limit                      0.3199
         RMSEA 90% Upper Confidence Limit                      0.4855
         ECVI Estimate                                         0.3705
         ECVI 90% Lower Confidence Limit                       0.2548
         ECVI 90% Upper Confidence Limit                       0.5242
         Probability of Close Fit                              0.0000
         Bentler's Comparative Fit Index                      -0.0093
         Normal Theory Reweighted LS Chi-Square               55.8602
         Akaike's Information Criterion                       61.5671
         Bozdogan's (1987) CAIC                               52.9705
         Schwarz's Bayesian Criterion                         54.9705
         McDonald's (1989) Centrality                          0.8531
         Bentler & Bonett's (1980) Non-normed Index           -0.5140
         Bentler & Bonett's (1980) NFI                         0.0062
         James, Mulaik, & Brett (1982) Parsimonious NFI        0.0042
         Z-Test of Wilson & Hilferty (1931)                    6.9349
         Bollen (1986) Normed Index Rho1                      -0.4906
         Bollen (1988) Non-normed Index Delta2                 0.0064
         Hoelter's (1983) Critical N                               20

_______________________________________________________________________________

                             Three-var chain model                           16
                          Reduced Model: y1 -> y2  y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

                  Manifest Variable Equations with Estimates


                  y2      =  -0.0626*y1     +  1.0000 e1
                                     b1
                  y3      =        0*y2     +  1.0000 e2
                                     b2


                       Variances of Exogenous Variables

                       Variable Parameter      Estimate

                       y1       sigy1           1.14880
                       e1       sigee1          2.17641
                       e2       sigee2          1.83494



_______________________________________________________________________________

                             Three-var chain model                           17
                          Reduced Model: y1 -> y2  y3
                                              20:39 Wednesday, December 1, 2004

                              The CALIS Procedure
         Covariance Structure Analysis: Maximum Likelihood Estimation

            Manifest Variable Equations with Standardized Estimates


                  y2      =  -0.0454*y1     +  0.9990 e1
                                     b1
                  y3      =        0*y2     +  1.0000 e2
                                     b2


                         Squared Multiple Correlations

                                     Error         Total
                    Variable      Variance      Variance    R-Square

               1    y2             2.17641       2.18091     0.00207
               2    y3             1.83494       1.83494           0