CODES 2000 User Forum -- Data Network Note #4
SAS Procedures for Multiple Imputation

Applies to: SAS 8.1 & 8.2.
Last updated: Wednesday November 14, 2001.

SUMMARY
Several separate steps are required to perform multiple imputation using SAS
procedures. This report shows procedures used for the multiple imputation
examples presented at the CODES Data Network tutorial.

PROCEDURE
- Prepare a query or table in CODES 2000.
CODES 2000 Crash to Inpatient linkage results are tabulated in an Access 2000 database.
Crash and inpatient data are in separate tables with missing values
represented as Nulls. You must prepare a query or table that joins crash
to inpatient records, selects motorcycle cases, and converts Null values to
values that can be used by SAS. You might also convert text fields to
numeric fields because we are going to assume a multivariate normal model for
the linked data.
For the tutorial, all five sets of imputed matched record pairs were
concatenated into a single table called ImputedMatchPairs1to5. A field
named Imputation was used to distinguish between the different imputed sets of
matched pairs. Here is an Access query named 'qryMCMINorm' that selects
cases and converts field values for linked crash and inpatient records for
motorcycle crashes. The query was created in the CrashHospital2.mdb
database.
SELECT ImputedMatchPairs1to5.Imputation, MinusOne([HourTruncate]) AS Hour,
MinusOne([BestAge]) AS Age, HelmetOne([Safety1]) AS Helmet, HelmetOne([SafetyEMS]) AS EMS, MinusOne([Charges]) AS Charge
FROM (Crash INNER JOIN ImputedMatchPairs1to5 ON Crash.CODESNbr = ImputedMatchPairs1to5.CODESNbr) INNER JOIN HospitalDb ON ImputedMatchPairs1to5.CODESNbr_B = HospitalDb.SID
WHERE Crash.Vehicle1='MC' AND HospitalDb.Death='N';
The Crash table was joined to the HospitalDb table because charges were in
the this original hospital table but not the imported or standardized Hospital
tables.
MinusOne is a Visual Basic function that converts Null values to -1 and
leaves other values unchanged.
HelmetOne is a Visual Basic functions that converts 'N' to 0, 'Y' to 1, and
Null values to -1.
- Import CODES 2000 data into a SAS dataset.
This SAS data step converts the output from the Access query qryMCMINorm to
a SAS dataset called MCMINorm using a DDE connection to the Access
database. The Access database must be open when you run this step.
Note that -1 values are converted to missing values (.) and that Hour and
Charges values are transformed for greater normality.
FILENAME
MCMINorm DDE 'MSAccess|CrashHospital2.mdb;QUERY
qryMCMINorm!Data';
data
MCMINorm;
infile
MCMINorm;
input
Imputed Hour Age Helmet EMS Charges;
if Hour =
-1 then Hour
= .;
if Age = -1
then Age = .;
if Helmet
= -1 then
Helmet = .;
if EMS = -1
then EMS = .;
if Charges
= -1 then
Charges = .;
Hour4 = Hour;
if Hour4
< 4 then
Hour4=Hour4+24;
LogChgs = log(Charges);
;
- Multiply impute missing values using SAS MI procedure.
This SAS MI procedure imputes five
complete datasets and stores the results in the outmi SAS dataset for future
use. The imputation model includes age, transformed hour, helmet use,
and transformed charges. Imputed values are rounded to agree with
reported values.
proc mi
data=MCMINorm seed=37851 out=outmi nimpute=5
round=1;
title 'MI
Helmet Probability Model 5';
var Age
Hour4 Helmet LogChgs;
run;
- Prepare imputed values for analysis.
This SAS data step corrects out of range imputed values and reverse
transforms imputed values.
data
MI25;
set outmi;
_Imputation_ = _Imputation_ + 5*(Imputed
- 1);
Helmet = max(Helmet,0);
Helmet = min(Helmet,1);
Hour = Hour4;
if Hour
> 23 then
Hour=Hour-4;
Hour = max(Hour,0);
Hour = min(Hour,23);
Age = max(Age,0);
Charges = exp(LogChgs);
run;
proc
print data=MI25 (obs=20);
run;
The imputation calculation is required because the input table contains
five complete linkage imputations. The final result is 25
imputations: five value imputations for each of five linkage
imputations. The SAS print procedure prints a small sample of the first
complete dataset.
- Analyze imputed values.
These SAS procedures perform a separate regression analysis for each of the
25 separate imputations. A very simple regression model is used here to
summarize the relationship between inpatient charges and helmet use.
proc
sort data=MI25;
by
_Imputation_;
run;
proc reg
data=MI25 outest=outreg
covout noprint;
by
_Imputation_;
model
Charges = Helmet;
run;
proc
print data=outreg
(where = (_TYPE_ = 'PARMS'));
var
_Imputation_ _DEPVAR_ Intercept Helmet;
run;
Results of the 25 regression analyses are stored in the outreg SAS
dataset. The SAS print procedure prints key parameter estimates from the
regression results.
- Combine separate regression parameters into a single estimate.
This SAS MIANALYZE procedure combines parameter estimates from 25 separate
regression analysis to produce final parameter estimates and error estimates.
proc
mianalyze data=outreg;
var
Intercept Helmet;
run;
SAS is a registered trademark of SAS Institute, Inc.