title 'The Population Variation Method for Selecting Sample Size';

/***********************  popvar.sas ****************************
* This code writes its results to the log file.                 *
/***************************************************************/

/*****************************************************************
Given a population effect size A, what sample size is required  
to detect the effect with a given probability?                  

Suppose we are planning a 2x3x4 analysis of covariance, 
with two covariates, and factors named A, B and C. We 
are setting it up as a regression model, with one dummy 
variable for A, 2 dummy variables for B, and 3 for C. 
Interactions are represented by product terms, and there 
are 2 products for the AxB interaction, 3 for AxC, 6 for 
BxC, and 1*2*3 = 6 for AxBxC. The regression coefficients 
for these plus two for the covariates and one for the 
intercept give us p = 26. The null hypothesis is that of no 
BxC interaction, so s = 6. The "other effects in the 
model" for which we are "controlling" are represented 
by 2 covariates and 17 dummy variables and products of 
dummy variables. 

What sample size is required for a power of 0.80 if the BxC 
interaction explains 10% of the remaining variation IN THE 
POPULATION?  The sample variation method yielded n = 144 for 
this problem.  
*****************************************************************/

                    /*********************************************/
data fpower1;       /* Replace alpha, s, p, A, and wantpow below */
     alpha = 0.05;  /* Significance level                        */
     s = 6;         /* Numerator df = # IVs being tested         */
     p = 26;        /* There are p beta parameters               */
     A = 0.10;      /* POPULATION effect size                    */
     wantpow = .80; /* Find n to yield this power.               */
                    /*********************************************/
     power = 0; n = p; oneminus = 1-alpha; /* Initializing ... */
     do until (power >= wantpow);
        n = n+1 ;
        ncp = (n-p)*A/(1-A);
        df2 = n-p;
        power = 1-probf(finv(oneminus,s,df2),s,df2,ncp);
     end;
     put ' *********************************************************';
     put '   ';
     put '   For a multiple regression model with ' p 'betas, ';
     put '   testing ' s 'explanatory variables using alpha = ' alpha ',';
     put '   a sample size of ' n 'is needed';
     put '   in order to have probability ' wantpow 'of rejecting H0';
     put '   for a POPULATION  effect of size A = ' A ;
     put '   ';
     put ' *********************************************************';
run;
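
/******************************************************************/
/* A minimal sketch, not part of the original program: tabulate   */
/* power over a range of sample sizes for the same design, using  */
/* the same noncentrality formula as the data step above. The     */
/* data set name fpowtab and the range and step of n are          */
/* illustrative assumptions; change them as needed.               */
/******************************************************************/

data fpowtab;
     alpha = 0.05;  /* Significance level                        */
     s = 6;         /* Numerator df = # IVs being tested         */
     p = 26;        /* There are p beta parameters               */
     A = 0.10;      /* POPULATION effect size                    */
     oneminus = 1-alpha;
     do n = 100 to 200 by 10;   /* Assumed range; n must exceed p */
        df2 = n-p;
        ncp = (n-p)*A/(1-A);
        power = 1-probf(finv(oneminus,s,df2),s,df2,ncp);
        put '   n = ' n '  power = ' power 6.4;
     end;
run;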

/******************************************************************/
/* Given sample size, what effect size (population A) is required */
/* to have a specified power?                                     */
/******************************************************************/

/* This example uses the same design as above. Suppose we did have the
   n = 144 for a = 0.10 located by the sample variation method. What 
   POPULATION effect size would be necessary for this sample size to 
   yield a power of 0.80?  */
   
                    /*********************************************/
data fpower2;       /* Replace alpha, s, n, p, and wantpow below */
     alpha = 0.05;  /* Significance level                        */
     s = 6;         /* Numerator df = # IVs being tested         */
     n = 144;       /* Sample size                               */
     p = 26;        /* There are p beta parameters               */
     wantpow = .80; /* Find effect size A to yield this power.   */
                    /*********************************************/
     df2 = n-p; oneminus = 1 - alpha; 
     critval = finv(oneminus,s,df2); 
     A = 0; /* Initializing ... */
     do until (power ge wantpow);
        A = A + .001 ;
        ncp = (n-p)*A/(1-A);
        power = 1-probf(critval,s,df2,ncp);
     end;
     put ' *******************************************************';
     put '   ';
     put '   For a multiple regression model with ' p 'betas, ';
     put '   testing ' s 'explanatory variables at significance level';
     put '   alpha = ' alpha 'controlling for the other variables,';
     put '   and a sample size of ' n ', the variables need to explain';
     put '   A = ' A 'of the remaining POPULATION variation to have a';
     put '   probability of ' wantpow 'of being significant';
     put '   ';
     put ' *******************************************************';
run;
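
/******************************************************************/
/* A minimal sketch, not part of the original program: the search */
/* above moves A in steps of 0.001, so its answer is accurate     */
/* only to that step. Because power increases with A for fixed n, */
/* bisection can narrow the answer further. The data set name     */
/* fpower3, the variable names lo, hi and tol, the starting       */
/* interval and the tolerance are illustrative assumptions.       */
/******************************************************************/

data fpower3;
     alpha = 0.05;  /* Significance level                        */
     s = 6;         /* Numerator df = # IVs being tested         */
     n = 144;       /* Sample size                               */
     p = 26;        /* There are p beta parameters               */
     wantpow = .80; /* Find effect size A to yield this power.   */
     df2 = n-p; oneminus = 1 - alpha;
     critval = finv(oneminus,s,df2);
     lo = 0; hi = 0.999; tol = 1e-6;   /* Assumed search settings */
     do until (hi - lo < tol);
        A = (lo + hi)/2;               /* Midpoint of the interval */
        ncp = (n-p)*A/(1-A);
        power = 1-probf(critval,s,df2,ncp);
        if power ge wantpow then hi = A;  /* A is large enough    */
        else lo = A;                      /* A is too small       */
     end;
     A = hi;        /* Smallest A found with power >= wantpow    */
     put '   Bisection: A = ' A 8.6 ' gives power of at least ' wantpow;
run;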