CODES 2000 User Forum -- Data Network Note #12

Methodology for Meta-Analysis of Helmet Use Effects

Applies to: CODES Data Network.
Last updated: Tuesday February 12, 2002.

SUMMARY

We assume the effects of motorcycle helmet use on inpatient charges are not identical in all states, nor does each state have a completely unique effect.  Instead, we model the effect of helmet use as normally distributed in each state, with state means also normally distributed across the CODES Data Network.  We estimate the parameters of this hierarchical normal model from regression parameters reported by the Data Network states.  In addition, we assess model fit by simulating the data.  Procedures described here implement the methodology presented by Gelman, Carlin, Stern, and Rubin (1995).  Bayesian Data Analysis.  Boca Raton, Florida:  Chapman & Hall/CRC Press, for their Example 5.5.

METHODOLOGY

  1. Assume a hierarchical statistical model for helmet use effects

Following Gelman et al. (1995), Section 5.4, this meta-analysis constructs:

A simple hierarchical model based on the normal distribution, in which observed [helmet use effects] data are normally distributed with a different mean for each [state], with known observation variance, and a normal population distribution for the [state] means.  This model is sometimes termed the one-way normal random-effects model with known data variance and is widely applicable, being an important special case of the hierarchical normal linear model...

  1. Simulate joint posterior distributions for all model parameters

Our simulation follows Gelman et al. (1995), Section 5.4,

For this model, computation of the posterior distribution of [helmet effects] is most conveniently performed via simulation [using 10,000 random draws], following the factorization [given for the joint posterior distribution of model parameters].  The first step, simulating [the population standard deviation] tau, is easily performed numerically using the [given] inverse cdf method on a grid of [100] uniformly spaced values of tau, with [the given posterior distribution of tau].  The second and third steps, simulating [the population mean] mu and then [the vector of state helmet effects] theta, can both be done easily by sampling from [given] normal distributions.

The following SAS* procedures implement the methodology presented by Gelman et al. (1995), Section 5.4.  We simulate posterior distributions for all model parameters and summarize 10,000 random draws as histograms and quantile tables.

 

*-------- Meta-Analysis of Imputed Helmet Use Effects -------*

| Adapted from Gelman et al. (1995). |

| 5.5 Example: Combining information from educational |

| testing experiments at eight schools (from Rubin (1981)). |

*------------------------------------------------------------*;

 

data Reported TauDensity Imputed;

input (CODES1-CODES13) ($) yBar1-yBar13 Sigma1-Sigma13 @@;

array CODES[13] CODES1-CODES13;

array yBar[13] yBar1-yBar13;

array Sigma[13] Sigma1-Sigma13;

array tauValue[200];

array tauProb[200];

c = 1.3529E-44;

grid0 = 0.1;

grid1 = 10000.1;

step = 100;

trials = 10000;

states = 13;

tauSum = 0;

tauCDF = 0;

index = 0;

do i=1 to states;

state=CODES[i];

effect=yBar[i];

keep state effect;

output Reported;

end;

do tau=grid0 to grid1 by step;

 

* Compute muHat and Vmu for posterior normal distribution of mu;

Sum1 = 0;

Sum2 = 0;

do i=1 to states;

Sum1 = Sum1 + (yBar[i]/(Sigma[i]**2 + tau**2));

Sum2 = Sum2 + (1 /(Sigma[i]**2 + tau**2));

end;

muHat = Sum1 / Sum2;

Vmu = 1 / Sum2;

Product1 = 1;

do i=1 to states;

Product1 = Product1 * (1 / Sqrt(Sigma[i]**2 + tau**2));

Product1 = Product1 * Exp(-(yBar[i] - muHat)**2 / (2 * (Sigma[i]**2 + tau**2)));

end;

tauSum = tauSum + step * (Sqrt(Vmu) * Product1);

tauPost = (1/c) * Sqrt(Vmu) * Product1;

tauCDF = tauCDF + (step * tauPost);

index=index+1;

tauValue[index] = tau;

tauProb[index] = tauCDF;

keep tau tauPost tauCDF tauSum;

output TauDensity;

end;

do draw=1 to trials;

tauMin = 0;

tauMax = 0;

ProbMin = 0;

ProbMax = 0;

tau = 0;

index = 1;

* Draw tau from its posterior distribution;

RanDraw = ranuni(1);

do while (RanDraw > tauProb[index] and index < 200);

ProbMin = tauProb[index];

tauMin = tauValue[index];

index = index + 1;

end;

if index < 200 then do;

ProbMax = tauProb[index];

tauMax = tauValue[index];

tau = tauMin + (tauMax - tauMin) * (RanDraw - ProbMin) / (ProbMax - ProbMin);

end;

* Draw mu from its posterior normal distribution given tau;

* Compute muHat and Vmu for posterior normal distribution of mu;

Sum1 = 0;

Sum2 = 0;

do i=1 to states;

Sum1 = Sum1 + (yBar[i]/(Sigma[i]**2 + tau**2));

Sum2 = Sum2 + (1 /(Sigma[i]**2 + tau**2));

end;

muHat = Sum1 / Sum2;

Vmu = 1 / Sum2;

sigmaMu = sqrt(Vmu);

mu = muHat + sigmaMu * rannor(1);

* Draw thetaJ from its posterior normal distribution given mu and tau;

* Calculate muJ and VJ for thetaJ given mu and tau;

do J=1 to states;

VJ = 1 / ((1 / Sigma[J]**2) + (1 / tau**2));

sigmaJ = sqrt(VJ);

muJ = ((yBar[J] / Sigma[J]**2) + (mu / tau**2)) * VJ;

thetaJ = muJ + sigmaJ * rannor(1);

yJ = thetaJ + Sigma[J] * rannor(1);

State = CODES[J];

keep states trials draw tau mu J thetaJ yJ State;

output Imputed;

end;

end;

 

datalines;

MD NE NH OK UT DE KY PA SD WI MN SC ME

860 16380 -1288 -1736 -3616 -5754 -3059 -2498 6684 -7269 -10375 -3300 -7307

4474 20977 4096 1200 2865 8986 5088 16077 10118 4195 8406 3919 7187

;

proc print data=TauDensity (firstobs=101);

title1 'CODES Helmet Use Effect';

title2 'Integral of tau CDF';

var tau tauPost tauCDF tauSum;

run;

proc gplot data=TauDensity;

title2 'Marginal posterior density p(tau | y)';

plot tauPost*tau;

run;

proc gplot data=TauEstimate;

title2 'CDF of p(tau | y)';

plot tauCDF*tau;

run;

*proc print data=Imputed;

* var trials draw tau mu J sigmaJ muJ thetaJ;

*run;

data Impute1;

Set Imputed;

if J=1;

run;

proc gchart data=Impute1;

title2 'Simulated Population Std Dev';

vbar tau;

run;

proc gchart data=Impute1;

title2 'Simulated Population Mean';

vbar mu;

run;

proc sort data=Impute1;

by mu;

run;

data Quantile;

set Impute1;

quantile = 100*mod(_N_,trials)/trials;

if quantile in (5.0,25.0,50.0,75.0,95.0) then do;

keep quantile mu;

output;

end;

run;

proc print data=Quantile;

title2 'mu Quantiles';

var quantile mu;

run;

proc sort data=Impute1;

by tau;

run;

data Quantile;

set Impute1;

if 100*mod(_N_,trials)/trials in (5.0,25.0,50.0,75.0,95.0) then do;

quantile = 100*mod(_N_,trials)/trials;

keep quantile tau;

output;

end;

run;

proc print data=Quantile;

title2 'tau Quantiles';

var quantile tau;

run;

proc sort data=Imputed;

by State thetaJ;

run;

proc gchart data=Imputed;

by State;

title2 'Simulated Effect thetaJ';

vbar thetaJ;

run;

data Quantile;

set Imputed;

quantile = 100*mod(_N_,trials)/trials;

if quantile in (5.0,25.0,50.0,75.0,95.0) then do;

keep State quantile thetaJ;

output;

end;

run;

proc sort data=Quantile;

by State quantile;

run;

proc print data=Quantile;

by State;

title2 'thetaJ Quantiles';

var quantile thetaJ;

run;

  1. Simulate posterior predictive distributions for new state helmet use effects

Our simulation follows Gelman et al. (1995), Section 5.4,

Obtaining [10,000] samples from the posterior distribution of new data, either from a current [reporting state] or a new [reporting state], is straightforward given [10,000] draws from the posterior distribution of the parameters.  We consider [here] ... future data ... from the current set of [reporting states], with means [simulated as described above].

To obtain a draw from the posterior predictive distribution of new data from the current set of random effects, first obtain a draw from [the joint  posterior distribution of model parameters] and then draw the predictive data [independently] from [their given normal distributions].

The following SAS* procedures implement the methodology presented by Gelman et al. (1995), Section 5.4.  We simulate posterior predictive distributions for new helmet use data from all reporting states and summarize 10,000 random draws as histograms and quantile tables.

Simulate future data in the main SAS data step:

* Draw thetaJ from its posterior normal distribution given mu and tau;

* Calculate muJ and VJ for thetaJ given mu and tau;

do J=1 to states;

VJ = 1 / ((1 / Sigma[J]**2) + (1 / tau**2));

sigmaJ = sqrt(VJ);

muJ = ((yBar[J] / Sigma[J]**2) + (mu / tau**2)) * VJ;

thetaJ = muJ + sigmaJ * rannor(1);

yJ = thetaJ + Sigma[J] * rannor(1);

State = CODES[J];

keep states trials draw tau mu J thetaJ yJ State;

output Imputed;

end;

Summarize future data as histograms and quantiles:

proc sort data=Imputed;

by State yJ;

run;

proc gchart data=Imputed;

by State;

title2 'Simulated Future Data yJ';

vbar yJ;

run;

data Quantile;

set Imputed;

quantile = 100*mod(_N_,trials)/trials;

if quantile in (5.0,25.0,50.0,75.0,95.0) then do;

keep State quantile yJ;

output;

end;

run;

proc sort data=Quantile;

by State quantile;

run;

proc print data=Quantile;

by State;

title2 'yJ Quantiles';

var quantile yJ;

run;

  1. Check model fit by comparing posterior predictive distributions to reported helmet use effects

Our simulation follows Gelman et al. (1995), Section 6.8,

Next we simulate the posterior predictive distribution of a hypothetical replication of the [13 state studies].  Computationally, drawing from the posterior predictive distribution is nearly effortless given all that we have done so far:  from each of the [10,000] simulations from the [joint] posterior distribution of [model parameters], we simulate a hypothetical replicated dataset ... by drawing each [of 13 state effects] from a normal distribution with [simulated] mean and observed variance.  The resulting set of [10,000] vectors [of 13 state effects] summarizes the posterior predictive distribution...

In order to test the fit of the model to the observed data, we examine the posterior predictive distribution of the following test statistics:  the largest of the [13] observed effects, the smallest, and the average...  We approximate the posterior predictive distribution of each test statistic by the histogram of the values from the [10,000] simulations of the parameters and predictive data, and we compare each distribution to the observed value of the test statistic and our substantive knowledge of helmet effects.

The following SAS* procedures implement the model checking methodology presented by Gelman et al. (1995), Section 6.8.  We simulate reported helmet effects for each Data Network state given simulated model parameters and summarize 10,000 random draws as histograms for reported maximum, minimum, and average reported values.

data predict;

set Imputed;

array y[13];

retain;

y[J] = yJ;

if J = states then do;

yMax = y[1];

yMin = y[1];

yMean = y[1];

do I = 2 to states;

if y[I] > yMax then yMax = y[I];

if y[I] < yMin then yMin = y[I];

yMean = yMean + y[I];

end;

yMean = yMean / states;

keep yMax yMin yMean;

output;

end;

run;

proc gchart data=predict;

title2 'Posterior Predictive Distribution';

vbar yMax;

vbar yMin;

vbar yMean;

run;

proc means data=Reported;

title2 'Reported Effects';

var effect;

run;

proc sort data=Reported;

by state;

run;

proc print data=Reported;

var state effect;

run;

 

*SAS is a registered trademark of SAS Institute, Inc.

 

 
© Copyright 2000 - 2008 Strategic Matching, Inc. All rights reserved. Microsoft, Windows, and Access are trademarks of Microsoft Corporation. Last modified: Monday January 28, 2008.