CODES 2000 User Forum -- Data Network Note #4

SAS Procedures for Multiple Imputation

Applies to: SAS 8.1 & 8.2.
Last updated: Wednesday November 14, 2001.

SUMMARY

Several separate steps are required to perform multiple imputation using SAS procedures.  This report shows procedures used for the multiple imputation examples presented at the CODES Data Network tutorial.

PROCEDURE

  1. Prepare a query or table in CODES 2000.

CODES 2000 Crash to Inpatient linkage results are tabulated in an Access 2000 database.  Crash and inpatient data are in separate tables with missing values represented as Nulls.  You must prepare a query or table that joins crash to inpatient records, selects motorcycle cases, and converts Null values to values that can be used by SAS.  You might also convert text fields to numeric fields because we are going to assume a multivariate normal model for the linked data.

For the tutorial, all five sets of imputed matched record pairs were concatenated into a single table called ImputedMatchPairs1to5.  A field named Imputation was used to distinguish between the different imputed sets of matched pairs.  Here is an Access query named 'qryMCMINorm' that selects cases and converts field values for linked crash and inpatient records for motorcycle crashes.  The query was created in the CrashHospital2.mdb database.

SELECT ImputedMatchPairs1to5.Imputation, MinusOne([HourTruncate]) AS Hour, MinusOne([BestAge]) AS Age, HelmetOne([Safety1]) AS Helmet, HelmetOne([SafetyEMS]) AS EMS, MinusOne([Charges]) AS Charge
FROM (Crash INNER JOIN ImputedMatchPairs1to5 ON Crash.CODESNbr = ImputedMatchPairs1to5.CODESNbr) INNER JOIN HospitalDb ON ImputedMatchPairs1to5.CODESNbr_B = HospitalDb.SID
WHERE Crash.Vehicle1='MC' AND HospitalDb.Death='N';

The Crash table was joined to the HospitalDb table because charges were in the this original hospital table but not the imported or standardized Hospital tables.

MinusOne is a Visual Basic function that converts Null values to -1 and leaves other values unchanged.

HelmetOne is a Visual Basic functions that converts 'N' to 0, 'Y' to 1, and Null values to -1.

  1. Import CODES 2000 data into a SAS dataset.

This SAS data step converts the output from the Access query qryMCMINorm to a SAS dataset called MCMINorm using a DDE connection to the Access database.  The Access database must be open when you run this step.  Note that -1 values are converted to missing values (.) and that Hour and Charges values are transformed for greater normality.

FILENAME MCMINorm DDE 'MSAccess|CrashHospital2.mdb;QUERY qryMCMINorm!Data';

data MCMINorm;

infile MCMINorm;

input Imputed Hour Age Helmet EMS Charges;

if Hour = -1 then Hour = .;

if Age = -1 then Age = .;

if Helmet = -1 then Helmet = .;

if EMS = -1 then EMS = .;

if Charges = -1 then Charges = .;

Hour4 = Hour;

if Hour4 < 4 then Hour4=Hour4+24;

LogChgs = log(Charges);

;

  1. Multiply impute missing values using SAS MI procedure.

This SAS MI procedure imputes five complete datasets and stores the results in the outmi SAS dataset for future use.  The imputation model includes age, transformed hour, helmet use, and transformed charges.  Imputed values are rounded to agree with reported values.

proc mi data=MCMINorm seed=37851 out=outmi nimpute=5 round=1;

title 'MI Helmet Probability Model 5';

var Age Hour4 Helmet LogChgs;

run;

  1. Prepare imputed values for analysis.

This SAS data step corrects out of range imputed values and reverse transforms imputed values.

data MI25;

set outmi;

_Imputation_ = _Imputation_ + 5*(Imputed - 1);

Helmet = max(Helmet,0);

Helmet = min(Helmet,1);

Hour = Hour4;

if Hour > 23 then Hour=Hour-4;

Hour = max(Hour,0);

Hour = min(Hour,23);

Age = max(Age,0);

Charges = exp(LogChgs);

run;

proc print data=MI25 (obs=20);

run;

The imputation calculation is required because the input table contains five complete linkage imputations.  The final result is 25 imputations:  five value imputations for each of five linkage imputations.  The SAS print procedure prints a small sample of the first complete dataset.

  1. Analyze imputed values.

These SAS procedures perform a separate regression analysis for each of the 25 separate imputations.  A very simple regression model is used here to summarize the relationship between inpatient charges and helmet use. 

proc sort data=MI25;

by _Imputation_;

run;

proc reg data=MI25 outest=outreg covout noprint;

by _Imputation_;

model Charges = Helmet;

run;

proc print data=outreg (where = (_TYPE_ = 'PARMS'));

var _Imputation_ _DEPVAR_ Intercept Helmet;

run;

Results of the 25 regression analyses are stored in the outreg SAS dataset.  The SAS print procedure prints key parameter estimates from the regression results. 

  1. Combine separate regression parameters into a single estimate.

This SAS MIANALYZE procedure combines parameter estimates from 25 separate regression analysis to produce final parameter estimates and error estimates.

proc mianalyze data=outreg;

var Intercept Helmet;

run;

SAS is a registered trademark of SAS Institute, Inc.

 
© Copyright 2000 - 2008 Strategic Matching, Inc. All rights reserved. Microsoft, Windows, and Access are trademarks of Microsoft Corporation. Last modified: Monday January 28, 2008.