[Vision2020] Body Count in Iraq
David M. Budge
dave at davebudge.com
Sun Feb 13 12:07:26 PST 2005
Joan,
Just FYI re: The Lancet study - not that this will assuage the horror,
but again, one must be cautious when using statistics. The 95%
confidence interval runs from 8,000 to 194,000 deaths. The sample was
small, and the study includes all deaths, violent or natural.
Dave Budge
From The Economist 11/4/04
Money Quote:
The study can be both lauded and criticised for the fact that it
takes into account a general rise in deaths, and not just that
directly caused by violence. Of the increase in deaths (omitting
Fallujah) reported by the study, roughly 60% is due directly to
violence, while the rest is due to a slight increase in accidents,
disease and infant mortality. However, these numbers should be taken
with a grain of salt because the more detailed the data--on causes
of death, for instance, rather than death as a whole--the less
statistical significance can be ascribed to them.
The Entire Article:
The Iraqi war
Counting the casualties
Nov 4th 2004
From The Economist print edition
A statistically based study claims that many more Iraqis have died in
the conflict than previous estimates indicated
THE American armed forces have long stated that they do not keep track
of how many people have been killed in the current conflict in Iraq and,
furthermore, that determining such a number is impossible. Not everybody
agrees. Adding up the number of civilians reported killed in confirmed
press accounts yields a figure of around 15,000. But even that is likely
to be an underestimate, for not every death gets reported. The question
is, how much of an underestimate?
A study published on October 29th in the Lancet, a British medical
journal, suggests the death toll is quite a lot higher than the
newspaper reports suggest. The centre of its estimated range of death
tolls--the most probable number according to the data collected and the
statistics used--is almost 100,000. And even though the limits of that
range are very wide, from 8,000 to 194,000, the study concludes with 90%
certainty that more than 40,000 Iraqis have died.
Numbers, numbers, numbers
This is an extraordinary claim, and so requires extraordinary evidence.
Is the methodology used by Les Roberts of the Johns Hopkins University
School of Public Health, in Baltimore, and his colleagues, sound enough
for reliable conclusions to be drawn from it?
The bedrock on which the study is founded is the same as that on which
opinion polls are built: random sampling. Selecting even a small number
of individuals randomly from a large population allows you to say things
about the whole population. Think of a jar containing a million marbles,
half of them red and half blue. Choose even 100 of these marbles at
random and it is very, very unlikely that all of them would be red. In
fact, the results would be very close to 50 of each colour.
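To make the marble example concrete, here is a minimal simulation
sketch in Python (everything in it is invented for illustration;
nothing comes from the study itself):

    import random

    # A jar of one million marbles, half red and half blue.
    jar = ["red"] * 500_000 + ["blue"] * 500_000

    # Draw 100 at random, without replacement, and count the reds.
    sample = random.sample(jar, 100)
    print(sample.count("red"), "red out of 100")

Run it a few times: the count hovers near 50, and an all-red draw is
astronomically unlikely.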
The best sort of random sampling is one that picks individuals out
directly. This is not possible in Iraq because no reliable census data
exist. For this reason, Dr Roberts used a technique called clustering,
which has been employed extensively in other situations where census
data are lacking, such as studying infectious disease in poor countries.
Clustering works by picking out a number of neighbourhoods at random--33
in this case--and then surveying all the individuals in that
neighbourhood. The neighbourhoods were picked by choosing towns in Iraq
at random (the chance that a town would be picked was proportional to
its population) and then, in a given town, using GPS--the global
positioning system--to select a neighbourhood at random within the town.
Starting from the GPS-selected grid reference, the researchers then
visited the nearest 30 households.
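A rough sketch of that first sampling stage might look like the
following (the town names and populations are invented; only the
population-proportional weighting reflects the article's description):

    import random

    # Invented towns with invented populations.
    towns = {"Town A": 1_200_000, "Town B": 450_000, "Town C": 90_000}
    names = list(towns)
    weights = [towns[n] for n in names]

    # Each draw picks a town with probability proportional to its
    # population, so larger towns host more of the 33 clusters.
    clusters = random.choices(names, weights=weights, k=33)
    print(clusters)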
In each household, the interviewers (all Iraqis fluent in English as
well as Arabic) asked about births and deaths that had occurred since
January 1st 2002 among people who had lived in the house for more than
two months. They also recorded the sexes and ages of people now living
in the house. If a death was reported, they recorded the date, cause and
circumstances. Their deductions about the number of deaths caused by the
war were then made by comparing the aggregate death rates before and
after March 18th 2003.
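The arithmetic behind that comparison is straightforward. As an
illustrative sketch (the rates, population and time window below are
round placeholders, not the study's published figures):

    # Excess deaths = (post-rate - pre-rate) x population x years.
    PRE_RATE = 5.0            # deaths per 1,000 per year (placeholder)
    POST_RATE = 7.9           # deaths per 1,000 per year (placeholder)
    POPULATION = 24_000_000   # rough population of Iraq
    YEARS = 1.5               # rough post-invasion window

    excess = (POST_RATE - PRE_RATE) / 1000 * POPULATION * YEARS
    print(round(excess))      # on the order of 100,000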
They interviewed a total of 7,868 people in 988 households. But the
relevant sample size for many purposes--for instance, measuring the
uncertainty of the analysis--is 33, the number of clusters. That is
because the data from individuals within a given cluster are highly
correlated. Statistically, 33 is a relatively small sample (though it is
the best that could be obtained by a small number of investigators in a
country at war). That is the reason for the large range around the
central value of 98,000, and is one reason why that figure might be
wrong. (Though if this is the case, the true value is as likely to be
larger than 98,000 as it is to be smaller.) It does not, however, mean,
as some commentators have argued in response to this study, that figures
of 8,000 or 194,000 are as likely as one of 98,000. Quite the contrary.
The farther one goes from 98,000, the less likely the figure is.
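One way to see why 33 is the number that matters, and why values far
from the centre are progressively less likely, is a cluster-level
bootstrap: resample whole clusters and watch how much the mean moves.
This is not the study's own method, just a minimal sketch with
invented per-cluster rates:

    import random

    random.seed(1)
    # 33 invented per-cluster excess-death rates (per 1,000 per year).
    cluster_rates = [random.gauss(4.0, 3.0) for _ in range(33)]

    # Resample 33 clusters with replacement, many times; the spread
    # of the resampled means is the uncertainty of the estimate.
    boot = sorted(
        sum(random.choices(cluster_rates, k=33)) / 33
        for _ in range(10_000)
    )
    lo, hi = boot[250], boot[9_749]   # central 95% of bootstrap means
    print(f"95% interval: {lo:.1f} to {hi:.1f}")

With only 33 units the interval is wide, but it stays centred on the
sample mean, which is why 8,000 and 194,000 are far less likely than
98,000.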
The second reason the figure might be wrong is if there are mistakes in
the analysis, and the whole exercise is thus unreliable. Nan Laird, a
professor of biostatistics at the Harvard School of Public Health, who
was not involved with the study, says that she believes both the
analysis and the data-gathering techniques used by Dr Roberts to be
sound. She points out the possibility of "recall bias"--people may have
reported more deaths more recently because they did not recall earlier
ones. However, because most people do not forget about the death of a
family member, she thinks that this effect, if present, would be small.
Arthur Dempster, also a professor of statistics at Harvard, though in a
different department from Dr Laird, agrees that the methodology in both
design and analysis is at the standard professional level. However, he
raises the concern that because violence can be very localised, a sample
of 33 clusters really might be too small to be representative.
This concern is highlighted by the case of one cluster which, as the
luck of the draw had it, ended up being in the war-torn city of
Fallujah. This cluster had many more deaths, and many more violent
deaths, than any of the others. For this reason, the researchers omitted
it from their analysis--the estimate of 98,000 was made without
including the Fallujah data. If it had been included, that estimate
would have been significantly higher.
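A toy calculation shows how much leverage a single extreme cluster has
over a mean (all numbers invented):

    # 32 ordinary clusters and one Fallujah-like outlier.
    ordinary = [3.0] * 32            # modest excess-death rates
    outlier = 52.0                   # one extreme cluster

    without = sum(ordinary) / len(ordinary)
    with_outlier = (sum(ordinary) + outlier) / 33
    print(without, round(with_outlier, 1))   # 3.0 vs 4.5

One cluster out of 33 lifts the mean by roughly half, which is why the
researchers reported the more conservative, Fallujah-free figure.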
The Fallujah data-point highlights how the variable distribution of
deaths in a war can make estimation difficult. But Scott
Zeger, the head of the department of biostatistics at Johns Hopkins, who
performed the statistical analysis in the study, points out that
clustered sampling is the rule rather than the exception in
public-health studies, and that the patterns of deaths caused by
epidemics are also very variable by location.
The study can be both lauded and criticised for the fact that it takes
into account a general rise in deaths, and not just that directly caused
by violence. Of the increase in deaths (omitting Fallujah) reported by
the study, roughly 60% is due directly to violence, while the rest is
due to a slight increase in accidents, disease and infant mortality.
However, these numbers should be taken with a grain of salt because the
more detailed the data--on causes of death, for instance, rather than
death as a whole--the less statistical significance can be ascribed to them.
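The underlying reason is simple counting: the relative uncertainty of
a count of n events shrinks roughly as 1/sqrt(n), so every subdivision
of the data into finer categories inflates it. A quick sketch with
invented event counts:

    import math

    # Invented counts: all deaths, then ever-finer causes of death.
    for n in (300, 180, 40, 10):
        print(n, "events ->",
              f"~{1 / math.sqrt(n):.0%} relative uncertainty")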
So the discrepancy between the Lancet estimate and the aggregated press
reports is not as large as it seems at first. The Lancet figure implies
that 60,000 people have been killed by violence, including insurgents,
while the aggregated press reports give a figure of 15,000, counting
only civilians. Nonetheless, Dr Roberts points out that press reports
are a "passive-surveillance system". Reporters do not actively go out to
many random areas and see if anyone has been killed in a violent attack,
but wait for reports to come in. And, Dr Roberts says,
passive-surveillance systems tend to undercount mortality. For instance,
when he was head of health policy for the International Rescue Committee
in the Congo, in 2001, he found that only 7% of meningitis deaths in an
outbreak were recorded by the IRC's passive system.
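The correction implied by a capture rate that low is stark; dividing a
recorded count by the capture rate gives the implied true count (the
recorded figure below is hypothetical; the 7% is the Congo figure Dr
Roberts cites):

    recorded = 700        # hypothetical deaths logged by a passive system
    capture_rate = 0.07   # 7% capture, as in the Congo meningitis case
    print(round(recorded / capture_rate))   # implies ~10,000 true deaths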
The study is not perfect. But then it does not claim to be. The way
forward is to duplicate the Lancet study independently, and at a larger
scale. Josef Stalin once claimed that a single death is a tragedy, but a
million deaths a mere statistic. Such cynicism should not be allowed to
prevail, especially in a conflict in which many more lives are at stake.
Iraq seems to be a case where more statistics are sorely needed.
Joan Opyr wrote:
>
> I am not as suspicious of The Lancet's figures as you are because The
> Lancet is tracking the body count via Iraqi hospital reports. Not
> coincidentally, we began our bombing campaign -- our shock and awe --
> with targeted hospital bombings. This suggests to me that we (meaning
> the US) didn't want an accurate civilian body count. But why
> quibble? Shall we split the difference between 30,000 and 100,000 and
> call it 50,000? I don't know about you (except that you are hardly
> Hun-like in any of your arguments) but 50,000 doesn't make me feel any
> better. And I don't think it's correct. It is possible that The
> Lancet is counting the victims of car bombings. Those aren't directly
> our fault -- though I suppose that, too, is arguable. (Ted? Tom?
> Feel free to jump in here.)