/****** Panel-Corrected Standard-Errors (Franzese,Mar96)
****"A GAUSS Procedure to Implement Beck-Katz PCSEs in Data Sets with
Non-Rectangular and/or Missing Data (with other Bells and Whistles)"
It is a rare pooled data set whose data starts and ends in the same time periods (t-p's) for each
cross-section (c-s) and is devoid of missing values. Fortunately, this poses no intrinsic problem for
the estimation of the panel-corrected standard-errors (PCSEs) suggested by Beck and Katz (APSR
1995). Unfortunately, the RATS and GAUSS procedures written by Beck and Katz to estimate
PCSEs do assume rectangularity and the absence of missing values. In writing a GAUSS procedure
to handle missing values, I decided to automate the process as much as possible in the hopes that this
would save me some effort in the future (it already has) and that it might be of some use to others
working with this sort of data.
The procedure takes as inputs vectors of variable and c-s names, the dependent and independent
variables, the number of c-s's, the first and last t-p's, and scalar switches for optional (panel) WLS
and/or c-s and/or t-p fixed effects. Organize your data in the usual way (c-s 1, t-p 1 to T; c-s 2, t-p 1
to T; etc.). Always include a constant as your first independent variable and the dependent variable
_name_ as your last variable name. (All this is noted in the procedure's comments.) If the data are
non-rectangular, just insert missing values to rectangularize them. (Try to make your y-vector and
x-matrix as compact as possible by not including any t-p which no c-s uses; this will avoid procedure
crashes.)
The output is similar to the OLS.SRC (Aptech) output in GAUSS, except that standard errors
(and t-stats and p-levels) are from PCSEs and that the last column (correlation of that X with Y) is
replaced with the OLS standard errors.
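The data layout described above can be sketched outside GAUSS. The Python fragment below is only
an illustration: the function name, the record format, and the use of None as the missing-value
marker are assumptions made for the example, not part of the procedure.

```python
def rectangularize(records, units, first_tp, last_tp):
    """Stack an unbalanced panel as c-s 1 (t-p first..last), c-s 2 (...), etc.

    records: iterable of (unit, period, value) for the observations that exist.
    Returns a list of (unit, period, value) with None where nothing was observed.
    """
    lookup = {(u, t): v for u, t, v in records}
    stacked = []
    for u in units:                              # cross-sections in the given order
        for t in range(first_tp, last_tp + 1):   # every period for every unit
            stacked.append((u, t, lookup.get((u, t))))  # None marks a missing cell
    return stacked


observed = [("A", 1990, 1.0), ("A", 1991, 2.0), ("B", 1991, 3.0)]
panel = rectangularize(observed, ["A", "B"], 1990, 1991)
# panel now has 4 rows; the ("B", 1990) row carries None.
```

Choosing first_tp and last_tp as the earliest and latest periods actually used by some c-s keeps
the stacked data as compact as the text recommends.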
The only trick to dealing with missing data is to realize that the cross-products of residuals, i.e., the
sum over t of e(i,t)*e(j,t), do not contain the same number of valid observations for each i and j.
Thus the denominators of the elements of the variance-covariance matrices are (potentially) different.
The procedure offered here handles this by creating a vector of ones for valid and zeros for missing
observations. This vector can be reshaped and/or manipulated (summed, multiplied, ...) as necessary
to divide each element of the variance-covariance matrices by the correct number of valid
observations. (The code is heavily remarked and should be understandable if you know GAUSS and
PCSEs.)
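A minimal sketch of the pairwise-denominator idea, written in Python rather than GAUSS and not a
transcription of the procedure's code. The function name and the use of None for a missing
residual are assumptions for the example; the indicator of ones and zeros plays the role of the
validity vector described above.

```python
def pairwise_residual_cov(resid):
    """resid: N lists (one per c-s), each of length T, with None = missing.

    Element (i, j) is the sum over t of e(i,t)*e(j,t), divided by the number
    of periods in which BOTH i and j are observed.
    """
    n = len(resid)
    t_len = len(resid[0])
    # 1 where the residual is observed, 0 where it is missing.
    valid = [[0 if e is None else 1 for e in row] for row in resid]
    # Missing residuals become zero so they drop out of the cross-products.
    filled = [[0.0 if e is None else e for e in row] for row in resid]
    sigma = [[float("nan")] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            count = sum(valid[i][t] * valid[j][t] for t in range(t_len))
            if count > 0:  # leave NaN when i and j never overlap
                cross = sum(filled[i][t] * filled[j][t] for t in range(t_len))
                sigma[i][j] = cross / count
    return sigma


sigma = pairwise_residual_cov([[1.0, -1.0, 2.0],
                               [0.5, None, 1.0]])
# sigma == [[2.0, 1.25], [1.25, 0.625]]
```

In the example the (1,1) element is divided by 3 while the other elements are divided by 2,
because the second c-s has only two valid periods.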
There are three additional binary (on/off) options I have included in the procedure. First, you can choose
WLS or OLS. If WLS is requested, OLS is run first (this first-stage regression is currently set not to
be output to the screen; you may change it by hardwiring the appropriate _output=0 line to
_output=1). The residuals from that are squared and regressed on c-s dummies (in such a way that
the F-stat for the regression is a test of the panel-weight model against homoskedasticity). (This
regression is currently set to be output to the screen; you may change that by hardwiring the
appropriate line.) The inverses of the square roots of the fitted values from that regression are your
usual panel weights. (An alternative model of residual variance can be substituted by hardwiring
changes to the lines noted in the procedure.) WLS is then performed by running OLS on the
transformed data, and PCSEs are calculated from that weighted regression. Weighted and
unweighted statistics are reported.
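The panel-weight step can be illustrated with a short Python sketch. It relies on one fact:
regressing squared residuals on a full set of c-s dummies gives, as the fitted value for each c-s,
the mean of that c-s's squared residuals. The function name, the label list, and None as the
missing marker are assumptions for the example; the procedure itself does this in GAUSS.

```python
import math


def panel_weights(resid, unit):
    """resid: first-stage OLS residuals (None = missing); unit: matching c-s labels.

    Returns {c-s label: weight}, where weight = 1 / sqrt(mean squared residual).
    """
    sums, counts = {}, {}
    for e, u in zip(resid, unit):
        if e is None:
            continue                      # missing observations carry no information
        sums[u] = sums.get(u, 0.0) + e * e
        counts[u] = counts.get(u, 0) + 1
    # Fitted value of the dummy regression = within-c-s mean of squared residuals.
    fitted = {u: sums[u] / counts[u] for u in sums}
    return {u: 1.0 / math.sqrt(v) for u, v in fitted.items()}


w = panel_weights([2.0, -2.0, None, 1.0, -1.0], ["A", "A", "A", "B", "B"])
# w == {"A": 0.5, "B": 1.0}
```

Each observation's y and x values (including the constant) would then be multiplied by its c-s
weight before OLS is rerun. The sketch does not guard against a c-s whose residuals are all zero.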
The procedure will also automatically create a set of c-s and/or t-p dummies for fixed effect models
if requested to do so. For a recent discussion of when this is appropriate and of interpretation see
Smith (1995). (Keep the constant in your independent variables; it'll be deleted for you. Make sure
you do not otherwise create perfect collinearity, or the procedure will end with an error message from
OLS.)
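As a rough illustration of the c-s fixed-effects option, the Python sketch below drops the leading
constant and appends one dummy per c-s. The function name and the row/label layout are
hypothetical; the actual procedure builds its dummies in GAUSS and may order them differently.

```python
def add_unit_dummies(x_rows, unit_labels):
    """x_rows: rows of X with the constant in column 0; unit_labels: c-s of each row.

    Returns new rows with the constant removed and a full set of c-s dummies
    appended (a full set plus a constant would be perfectly collinear).
    """
    units = list(dict.fromkeys(unit_labels))   # distinct c-s, in order of appearance
    out = []
    for row, u in zip(x_rows, unit_labels):
        dummies = [1.0 if u == k else 0.0 for k in units]
        out.append(list(row[1:]) + dummies)    # row[1:] drops the constant
    return out


x_fe = add_unit_dummies([[1.0, 5.0], [1.0, 6.0], [1.0, 7.0]], ["A", "A", "B"])
# x_fe == [[5.0, 1.0, 0.0], [6.0, 1.0, 0.0], [7.0, 0.0, 1.0]]
```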
Finally, in writing this procedure, I noticed that OLS.SRC ((c) APTECH) generates incorrect DW
statistics in the presence of missing values (it "packs" vectors with missing values, making some of
the "adjacent" observations not truly temporally adjacent; APTECH has been notified of the error).
This has been corrected in my procedure, but be warned that whenever you are using the DW statistic
produced by GAUSS OLS.SRC outside of PCSE.G you are not getting the correct DW. (Incidentally,
the DW is invalid when there are lagged dependent variable(s) in the equation. An alternative test
for first-order serial correlation with a lag is Durbin's h:
h = (approx) (1-DW/2)*sqrt[n/(1-n*V(b1))]
which is distributed asymptotically standard normal (where V(b1) is the estimated variance of
the coefficient on the lagged dependent variable; h cannot be computed when n*V(b1) >= 1).
Q and LM tests are other valid alternatives; see Johnston 1984, sections 8-5 through 10-3.)
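The packing problem and Durbin's h (in its standard form, which takes the square root of
n/(1-n*V(b1))) can both be sketched in Python. The function names and the None missing marker are
assumptions, and the choice to put every observed squared residual in the DW denominator is one
convention among several, not necessarily the one PCSE.G uses.

```python
import math


def durbin_watson(resid):
    """resid: lists of residuals in time order, one list per c-s, None = missing.

    Differences are taken only between truly adjacent periods within a c-s.
    """
    num = 0.0
    den = 0.0
    for row in resid:
        for t, e in enumerate(row):
            if e is None:
                continue
            den += e * e
            if t > 0 and row[t - 1] is not None:   # skip pairs that span a gap
                num += (e - row[t - 1]) ** 2
    return num / den


def durbin_watson_packed(resid):
    """The flawed variant: drop missing values first, then difference neighbors."""
    packed = [[e for e in row if e is not None] for row in resid]
    return durbin_watson(packed)


def durbin_h(dw, n, var_b1):
    """Durbin's h; returns None when n*var_b1 >= 1, where h is undefined."""
    denom = 1.0 - n * var_b1
    if denom <= 0:
        return None
    return (1.0 - dw / 2.0) * math.sqrt(n / denom)


series = [[1.0, None, 3.0, 2.0]]
dw_ok = durbin_watson(series)          # 1/14: only the (3.0, 2.0) pair is adjacent
dw_bad = durbin_watson_packed(series)  # 5/14: 1.0 and 3.0 are wrongly treated as adjacent
```

The two results differ because packing turns the gap between the first and third periods into a
spurious one-period difference.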
Any suggestions on improving this procedure would be gratefully received. The procedure is public
domain and you may use it as you wish (though a citation of the source would be
appropriate/appreciated). I am confident the procedure correctly calculates PCSEs; however, I
cannot accept responsibility for any difficulties, financial or otherwise, arising from its use. (My
lawyer/sister insisted I officially note that.)