Philip Morris
Epidemiology and Environmental Tobacco Smoke
Fields
- Type
- SCRT, REPORT, SCIENTIFIC
- BIBL, BIBLIOGRAPHY
- Area
- SCIENTIFIC AFFAIRS/BLACK LATERAL OLD S&T
- Characteristic
- PARE, PARENT
- Named Organization
- Ahf, American Health Foundation
- Arch Environ Health
- Epa, Environmental Protection Agency
- Medical College of Va
- Named Person
- Binder, R.
- Fisher, R.
- Friedman
- Garfinkel
- Kilpatrick, S.J.
- Lee, P.
- Uberla, K.
- Wynder, E.
- Document File
- 2023512309/2023512515/Ets Issue Binder: Epidemiology
- Litigation
- Okag/Privilege Withdrawn
- Okag/Produced
- Master ID
- 2023512310/2514
Related Documents:
- 2023512316-2317 Statistical Significance and Confidence Intervals
- 2023512329-2340 Environmental Tobacco Smoke and Lung Cancer: A Critical Assessment
- 2023512341-2348 What Is the Epidemiologic Evidence for A Passive Smoking - Lung Cancer Association?
- 2023512361-2362
- 2023512364-2440 A Dictionary of Epidemiology
- 2023512442-2514 News & Numbers A Guide to Reporting Statistical Claims and Controversies in Health and Other Fields
- Site
- R529
- Date Loaded
- 24 May 1999
- UCSF Legacy ID
- tjc02a00
Document Images
EPIDEMIOLOGY
AND
ENVIRONMENTAL TOBACCO SMOKE

THIS ISSUE BINDER IS INTENDED TO PROVIDE A BASIC,
COMPREHENSIVE REVIEW OF THE SCIENTIFIC LITERATURE
REGARDING A SPECIFIC TOPIC ON ETS AND THE HEALTH OF
NONSMOKERS.
PRIMARY STUDIES AND REVIEWS HAVE BEEN HIGHLIGHTED
TO IDENTIFY (1) USEFUL OR HELPFUL INFORMATION (YELLOW
HIGHLIGHT) AND (2) ADVERSE RESULTS OR OPINIONS (BLUE
HIGHLIGHT).


TABLE OF CONTENTS
TAB
I. STATISTICS AND EPIDEMIOLOGY . . . . . . . . . . . . . . . . 1
"Statistical Significance and Confidence Intervals"
II. ETS* . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Scientific Method
Inadequacies of ETS Studies
References
III. ESSAYS ON ETS EPIDEMIOLOGY . . . . . . . . . . . . . . . . 3
E. Wynder and G. Kabat
N. Mantel
IV. GLOSSARY OF TERMS . . . . . . . . . . . . . . . . . . . . . 4
Definitions
V. A DICTIONARY OF EPIDEMIOLOGY . . . . . . . . . . . . . . . 5
VI. APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . 6
News and Numbers (excerpts)
* For analysis and criticism of the epidemiologic studies on
ETS and lung cancer, heart disease and respiratory disease in
children, see specific Issues Binders.

STATISTICS AND EPIDEMIOLOGY
"Nevertheless, in a real sense, statistics is
the study of populations, or aggregates of
individuals, rather than of individuals.
Scientific theories which involve the properties
of large aggregates of individuals, and not
necessarily the properties of the individuals
themselves . . . are essentially statistical
arguments, and are liable to misinterpretations
as soon as the statistical nature of the
argument is lost sight of."
Sir Ronald Fisher

THE MEDICAL JOURNAL OF AUSTRALIA, Vol. 144, June 9, 1986
Statistical significance and confidence intervals
Many papers in the Journal use statistical methods and one of the
aims of the review process is to try to ensure that appropriate
methods have been used. Often papers report results of comparative
studies that are designed to answer questions such as whether one
treatment is superior to another for a particular disease, or
whether there is an association between some form of behaviour
(for example, taking regular exercise or smoking) and the
occurrence of some disease. Comparative studies are almost
invariably carried out on a sample of individuals who are chosen
from the population of individuals to whom it is intended to
generalize the results. Data are collected on the sample in order
to make inferences on the population. Valid inferences can only be
drawn if the sample is chosen in such a way that it is
representative of the population. Otherwise a bias could occur;
epidemiological methods are designed to eliminate such biases.
Since the aim of a statistical analysis is to make inferences, it
is paramount to express whatever inferences that can be drawn in
the most informative way. There are several methods of statistical
inference, but the two that are most commonly used are
significance testing and confidence interval estimation. The
former is well known and is featured by quoting P values. Many
authors appear to be under the impression that a profusion of P
values is necessary; regrettably this impression has been
bolstered in the past by editors of biological journals.
Significance testing has its place but, as mentioned by Healy in
1978,1 "it is widely agreed among statisticians (if less so among
the more naive users of statistics) that significance testing is
not the be-all and end-all of the subject". In this leading
article I would like to discuss the characteristics of both
methods of inference, show that a confidence interval contains the
result of a significance test, but not vice versa, and suggest
that confidence intervals are the answers to the more interesting
questions that data can be used to answer.
Any particular study is based on a particular sample; however, it
is useful to imagine that the study is repeated with a different
sample being selected each time. These hypothetical studies will
give different results because they contain different individuals,
and individuals vary in any characteristic because of biological
variability. The differences are termed sampling variability. It
follows then that the results that are obtained from a particular
sample can only be taken as an approximation to the actual
situation in the whole population. Statistical methods are
concerned with assessing the degree of approximation and what may
be reasonably inferred, given that a different sample would have
produced a different result.
The methods are based on the assumption that it is a matter of
chance which particular subjects are in the sample that is being
studied, and the sampling variability is thus random variation
which is determined by the laws of probability. Therefore, the
inferences are expressed in terms of probability. The situation is
illustrated below.
Population
    |  - - - - - - sampling variation
Sample data
    |  - - - - - - uncertainty
Inferences on population
Taking a sample from the population involves sampling variation.
As a consequence of this, inferences from the sample data back to
the population involve uncertainty.
A statistical analysis may be thought of as asking questions of
the data. In an investigation that compares two groups for the
mean value of, for example, blood pressure or the prevalence of
some disease, three questions may be posed: Is there a difference
between the groups?; How large is the difference?; and How
accurately is the size of the difference known?
As expressed, the first question expects the answer "yes" or "no";
although the answer cannot be given in precisely these terms, it
is often reduced to two possibilities. The appropriate methodology
is the significance test. The second question expects a numerical
value to be the answer. This is an estimate and, as it is a single
value, is referred to as a point estimate. In effect, the third
question asks how reliable this point estimate is; the answer is a
range of values which is referred to as an interval estimate or a
confidence interval.
These questions represent two approaches to inference: hypothesis
testing and estimation. Although at first sight they appear to be
quite different, in concept they have much in common. Both make
inferential statements about the value of a parameter. (A
parameter is an unknown quantity which partly or wholly
characterizes a population, for example, a mean or a measure of
association.)
The significance test is an appropriate technique when there is an
a priori hypothesis to test. For the purpose of the statistical
test this hypothesis is expressed in null form - such as where no
difference exists between groups - and the test evaluates whether
the data are consistent with the null hypothesis. If the data
differ markedly from those which would be expected under the null
hypothesis, to the extent that the probability of such an extreme
result is low, then it is said that the result is statistically
significant. Probability is measured on a continuum between 0 and
1, but in significance testing a probability is considered low if
it is less than conventional values such as 0.05 (5%) or 0.01
(1%). A significant result is equated with the rejection of the
null hypothesis or the claim of a real effect. By definition, when
the null hypothesis is true, significant results will occur by
chance with the same relative frequency as the significance
probability. That is, real effects will be claimed when the null
hypothesis is true; however, the probability of this error (type
I) is determined in the data analysis.
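The statement that significant results occur by chance at the significance probability when the null hypothesis is true can be checked with a small simulation. The sketch below is illustrative only: the study count, group size, unit standard deviation and seed are assumptions, not values from the article, and a simple z test with known variance stands in for whatever test a real study would use.

```python
import math
import random


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_studies=5000, n_per_group=20, alpha=0.05, seed=1):
    """Fraction of simulated studies called significant when the null is true."""
    rng = random.Random(seed)
    # Standard error of a difference of two means when each group has SD 1.
    se = math.sqrt(2 / n_per_group)
    significant = 0
    for _ in range(n_studies):
        # Both groups are drawn from the same population: true difference is 0.
        mean_a = sum(rng.gauss(0, 1) for _ in range(n_per_group)) / n_per_group
        mean_b = sum(rng.gauss(0, 1) for _ in range(n_per_group)) / n_per_group
        if two_sided_p((mean_a - mean_b) / se) < alpha:
            significant += 1
    return significant / n_studies


rate = false_positive_rate()
print(f"Share of null studies called significant: {rate:.3f}")
```

With a 5% significance level the printed share should sit close to 0.05, which is the type I error rate fixed in the analysis.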
One disadvantage of a significance test is that it may fail to
detect a real effect; that is, although the null hypothesis is
false, the evidence is not strong enough to reject it. The
probability of this error (type II) can be controlled at the
design stage only, by appropriate selection of the sample size,
and may be quite large. Thus, the trap of equating
non-significance with no effect must be avoided; failure to reject
the null hypothesis is not the same as accepting it.
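How the type II error depends on sample size can be sketched for the simplest case, a two-sided z test comparing two group means with a known common standard deviation. The function name, the true difference of 0.5 and the group sizes are hypothetical choices made for illustration; they do not come from the article.

```python
from statistics import NormalDist


def type_ii_error(true_diff, sd, n_per_group, alpha=0.05):
    """Probability of missing a real difference (two-sided z test, known SD)."""
    std_normal = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5          # SE of the difference in means
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    shift = abs(true_diff) / se                 # mean of the test statistic
    # Power: chance that the statistic lands in either rejection region.
    power = (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)
    return 1 - power


# Hypothetical design: true difference of half a standard deviation.
for n in (10, 50, 100):
    print(n, round(type_ii_error(0.5, 1.0, n), 3))
```

Under these assumptions a study with 10 subjects per group misses the effect most of the time, while 100 per group rarely does, which is why non-significance in a small study says little about the absence of an effect.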
In the approach of confidence interval estimation no particular
hypothesis is considered; rather the emphasis is on estimating
those values of the parameter with which the data are consistent.
These values form a range - the confidence interval. The range is
calculated so that there is a high probability - conventionally
95% or 99% - that it contains the true value of the parameter.
A significance test is essentially a test of whether the data are
consistent with a specified parameter value, and the confidence
interval contains those parameter values with which the data are
consistent. Therefore, a 5% significance test and a 95% confidence
interval contain some information in common: significance implies
that the null hypothesis value is outside the confidence interval;
non-significance implies that the null hypothesis value is within
the confidence interval. However, the confidence interval contains
more information because it is equivalent to performing a
significance test for all values of the parameter, not just a
single value. A confidence interval enables a reader to see how
large the effect may be, not simply whether it is different from
zero.
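The correspondence between a 5% significance test and a 95% confidence interval can be illustrated with a normal-approximation sketch. The estimate (18.1) and standard error (5.5) are illustrative inputs, and the helper names are hypothetical; the check simply confirms that a candidate parameter value is rejected at the 5% level exactly when it lies outside the 95% interval.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def p_value(estimate, se, null_value):
    """Two-sided P value for testing a given null value (normal approximation)."""
    z = abs(estimate - null_value) / se
    return 2 * (1 - _STD_NORMAL.cdf(z))


def confidence_interval(estimate, se, level=0.95):
    """Symmetric confidence interval around the estimate."""
    z_crit = _STD_NORMAL.inv_cdf(0.5 + level / 2)
    return estimate - z_crit * se, estimate + z_crit * se


estimate, se = 18.1, 5.5  # illustrative values
lower, upper = confidence_interval(estimate, se)

# Test every whole-number null value from -10 to 40 against the interval.
for null_value in range(-10, 41):
    rejected = p_value(estimate, se, null_value) < 0.05
    outside = not (lower <= null_value <= upper)
    assert rejected == outside

print(f"95% interval: {lower:.1f} to {upper:.1f}")
```

The interval therefore reports the outcome of the test for every candidate value at once, whereas a single P value reports it for one value only.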
The limitations of the interpretations that are provided by a
significance test may now be considered.
The difference is significant. This means that there is a
difference or, in other words, the size of the difference is not
zero. We know no more than this. The difference may be large and
of great importance or it may be small and of no practical
importance. It is unsatisfactory that the test provides no way of
distinguishing between these quite different possibilities.
The difference is not significant. This means that there is
insufficient evidence to enable us to conclude that there is a
difference. So the difference may well be zero. But this is not
the same as saying that it is zero. The true difference may be
quite large. Again, it is unsatisfactory that this possibility is
not addressed.
The conclusions that may be drawn from a significance test are
considered to be incomplete because it is rarely that one is
interested solely in whether a null hypothesis is or is not true;
indeed in many cases it may be recognized at the outset that the
null hypothesis is unlikely to be true. Rather, the question is
how large is the difference and is it possibly large enough to be
important? The emphasis is on measuring rather than on testing.
The addition of the concept of an important difference to that of
a null hypothesis means that there are four possible
interpretations to an analysis: (a) the difference is significant
and large enough to be of practical importance; (b) the difference
is significant but too small to be of practical importance; (c)
the difference is not significant but may be large enough to be
important; and (d) the difference is not significant and also not
large enough to be of practical importance.
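The four interpretations (a) to (d) can be written as a small decision rule on a confidence interval. This is a simplified sketch under stated assumptions: only positive differences are treated as potentially important, "may be large enough" is read as the upper limit reaching the important difference, and the function name and the threshold of 10 are hypothetical.

```python
def interpret_interval(lower, upper, important_difference, null_value=0.0):
    """Return 'a', 'b', 'c' or 'd' for a confidence interval (lower, upper)."""
    # Significant: the null value lies outside the interval.
    significant = not (lower <= null_value <= upper)
    # Possibly important: the interval reaches the important difference.
    may_be_important = upper >= important_difference
    if significant:
        return "a" if may_be_important else "b"
    return "c" if may_be_important else "d"


# Hypothetical intervals, with 10 taken as the smallest important difference.
for interval in [(7.3, 28.9), (1.0, 4.0), (-5.0, 20.0), (-2.0, 3.0)]:
    print(interval, interpret_interval(*interval, important_difference=10))
```

Cases (b) and (d) differ in significance but lead to the same practical reading, while case (c) leaves the question open.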
The size of difference that is considered to be large enough to be
important is a matter for debate, and genuine differences of
opinion may arise. It is a medical, not a statistical, question,
although a medical statistician who is experienced in the subject
area could contribute to setting a value. The fact that agreement
on a unique value may be impossible in no way detracts from the
argument. In fact, expressing the results as a confidence interval
enables interpretations to be made for any particular value that
is considered appropriate.
These possibilities are illustrated in the Figure where the
confidence intervals are shown. The significant and
non-significant cases are distinguished by the confidence
intervals that exclude or include zero respectively. The main
point is that in each case the confidence interval gives the range
of possible values for the true difference. Of particular concern
is (c). Here there may be no true difference or there may be a
large, important difference. In other words the study is
completely inconclusive. Such a possibility is missed by the
simple expression "not significant" with its lure of equating this
falsely with "no effect". This situation will arise with a study
that is carried out on too small a sample and this is why good
study design demands attention to sample size to try to prevent
the occurrence of an inconclusive result. Altman found that it was
common for undue emphasis to be placed on "negative" findings from
small studies,2 while Freiman et al. noted that "negative" trials
were often too small to constitute a fair test of therapies.3
Similarly, a significance test will contrast (b) as significant
and (d) as not significant but fails to recognize that they give
essentially the same conclusion - that any difference is too small
to be important.
[Figure: confidence intervals for cases (a) to (d) drawn against
the size of the difference, with the null hypothesis value (0) and
the important difference marked. Significant: (a) important, (b)
not important. Not significant: (c) inconclusive, (d) true
negative result.]
FIGURE: Confidence intervals showing four possible conclusions in
terms of statistical significance and practical importance.
As an example, consider some results which were obtained by
Garraway et al. from a clinical trial for the management of acute
stroke in the elderly.4 Of 155 patients who were managed in a
stroke unit, 78 were assessed as independent when they were
discharged from the unit compared with 49 of 152 who were managed
in a medical unit. The simplest analysis shows that the difference
between the success rates of the two units is significant at the
1% level. Therefore, a genuine effect has been established. To
appreciate the importance of this effect the advantage of the
stroke unit may be measured by the difference between the two
units in the percentage of subjects who were discharged as
independent: 50.3% - 32.2% = 18.1%. This is the point estimate.
The accuracy of this estimate is given by its standard error (5.5)
and the 95% confidence limits (7.3% and 28.9%). Thus, the gain
could be as large as 29% or as small as 7%.
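The figures in this example can be reproduced with the usual normal approximation for a difference of two proportions. The formula is an assumption, since the article does not say how the standard error was obtained, and the counts used (78 of 155 and 49 of 152) are the ones consistent with the quoted 50.3% and 32.2%; the function name is hypothetical.

```python
import math


def difference_of_proportions(x1, n1, x2, n2, z=1.96):
    """Point estimate, standard error and confidence limits for p1 - p2."""
    p1 = x1 / n1
    p2 = x2 / n2
    diff = p1 - p2
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, se, diff - z * se, diff + z * se


diff, se, lower, upper = difference_of_proportions(78, 155, 49, 152)
print(f"difference {100 * diff:.1f}%, standard error {100 * se:.1f}, "
      f"95% limits {100 * lower:.1f}% to {100 * upper:.1f}%")
```

Under these assumptions the sketch returns a difference of 18.1 percentage points, a standard error of 5.5, and limits of 7.3% and 28.9%, matching the values in the text.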
Recently, Gardner and Altman have argued against the excessive use
of hypothesis testing and urged a greater use of confidence
intervals.5 In an appendix to their paper they give methods to
calculate confidence intervals for the commonly occurring
two-sample comparisons.
In presenting the main results of a study it is good practice to
provide confidence intervals rather than to restrict the analysis
to significance tests. Only by so doing can authors give readers
sufficient information for a proper conclusion to be drawn;
otherwise readers have to rely upon the authors' own
interpretation. Therefore, intending authors are urged to express
their main conclusions in confidence interval form (possibly with
the addition of a significance test, although strictly that would
provide no extra information). One of the aims of the Journal's
statistical review process will be to ensure that where possible
this is done.
GEOFFREY BERRY
Associate Professor of Biostatistics
School of Public Health and Tropical Medicine
The University of Sydney
1. Healy MJR. Is statistics a science? J R Statist Soc A 1978;
141: 385-393.
2. Altman DG. Statistics in medical journals. Stat Med 1982; 1:
59-71.
3. Freiman JA, Chalmers TC, Smith H Jr, Kuebler RR. The importance
of beta, the type II error and sample size in the design and
interpretation of the randomized control trial. N Engl J Med 1978;
299: 690-694.
4. Garraway WM, Akhtar AJ, Prescott RJ, Hockey L. Management of
acute stroke in the elderly: preliminary results of a controlled
trial. Br Med J 1980; 280: 1040-1043.
5. Gardner MJ, Altman DG. Confidence intervals rather than P
values: estimation rather than hypothesis testing. Br Med J 1986;
292: 746-750.

ENVIRONMENTAL TOBACCO SMOKE
Scientific Method
Scientific inquiry within an epidemiologic study begins
with framing what is called a "null hypothesis." The null
hypothesis states, in this instance, that ETS is not associated
with a given disease state (e.g., lung cancer, heart disease, etc.).
Data are then collected and analyzed in order to test, i.e., reject
or accept, the hypothesis.
One method which is used to assess the relationship of
the collected data to a given hypothesis is the test for statistical
significance.* Simply put, if the data examined yield a
statistically significant result (here, the relationship between
ETS exposure and a disease state), then the scientist is permitted,
on the basis of those data, to reject the null hypothesis. If the
statistical test is not significant, then the data do not support
rejection of the null hypothesis.
* By convention, a 'p' (probability) value less than 0.05 is
deemed statistically significant. A 'p' value less than 0.05
means that the observed results would occur by chance less
than 5 times out of 100.
"Confidence limits" are the values between which the risk
value can be expected to fall 95% of the time based on the
variability of the underlying data. When the 95% confidence
limits straddle 1.00 (one below, one above), the risk value is
considered not statistically significant, i.e., the results
are likely to be due to chance and do not support a judgment
regarding an association between exposure and disease.
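The footnote's rule about confidence limits and 1.00 can be sketched for a relative risk. The counts below are hypothetical and come from no study cited in this binder, and the log-based standard error is a common textbook approximation for cohort-style data that is assumed here; the function name is also hypothetical.

```python
import math


def relative_risk_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Relative risk with approximate confidence limits on the log scale."""
    rr = (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)
    # Approximate standard error of log(RR) for two independent groups.
    se_log_rr = math.sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 cases among 1000 exposed, 20 among 1000 unexposed.
rr, lower, upper = relative_risk_ci(30, 1000, 20, 1000)
includes_one = lower <= 1.0 <= upper
print(f"RR {rr:.2f}, 95% limits {lower:.2f} to {upper:.2f}, includes 1.00: {includes_one}")
```

In this hypothetical case the limits fall on either side of 1.00, so the result is not statistically significant; as the Berry article reproduced earlier in this binder points out, such an interval is consistent both with no difference and with a sizeable one.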

There is no "absolute proof" involved, and there is
nothing immutable about the concept of significance
testing.
Statistical significance is, after all, a convention. But the
concept is illustrative, especially in the case of the association
between ETS exposure and lung cancer. To date, there have been 28
published reports on ETS and lung cancer, and only five have
achieved statistical significance. It is clear that the
preponderance of data do not permit rejection of the null
hypothesis, i.e., there is no association between ETS exposures
and lung cancer. In addition, virtually all of the individual
risks reported in such studies are less than 2, which, to the
epidemiologist, suggests a "weak" association which is probably
the result of bias or confounding of factors unrelated to ETS.
Inadequacies of ETS Studies
Epidemiologic studies are notoriously unreliable
in outcome. An observed relative risk of less
than 1.5-2.0 (some would say up to 3.0) is
inadequate to reject the hypothesis of no
effect. The overall relative risk calculated
across studies is well below a minimal value
for seriously attributing it to the presence
of a real effect, i.e., it is within the range
easily due to the "noise" in epidemiologic
data resulting from the limitations and vagaries
intrinsic to the methodology and its
application. This same conclusion also applies
to nearly all of the studies on an individual
basis. Another reason for conservative
interpretation of the ETS studies is that
several studies are of poor quality (good
textbook examples of how not to do an
epidemiologic study) and some were originally
designed for a different, or broader, purpose
than assessing health risks from ETS exposure.

Sources of bias are present to varying degrees
in most of the studies. Lung cancer patients
may tend to overstate their exposure to spousal
smoking as an explanation for their illness.
Bias may result from depending on memory recall
of a subject's exposure to spousal smoking.
Estimates of relative risk may differ markedly
between data collected from the subjects and
data obtained from a surrogate, such as their
children. Histologic verification of lung
cancer was not conducted in all studies and
the error rate may be substantial, e.g., 13%
of the lung cancer cases in the case-control
study of Garfinkel et al. were found to be
incorrectly diagnosed when the histology was
reviewed by one of the authors. (From:
Summary of Public Docket Comments, Draft Risk
Assessment, U.S. EPA, Dec. 1990.)
Peter Lee, a statistician and epidemiologist from the
United Kingdom, has argued that the increased risks reported in
various epidemiologic studies are the result of an inherent bias
in study design rather than the result of any genuine effect from
exposure to ETS.1-5 Lee presents data which indicate that the re-
ported risks cannot be explained on the basis of either ETS expo-
sure or dose for the nonsmoker. It is Lee's contention that the
reported "risks" are the result of bias caused by a small number
of smokers who are misreported in the studies as nonsmokers.
Other kinds of misclassification may contribute to the
reported increase in lung cancer risks among nonsmokers, according
to several scientists. For example, none of the studies on ETS
and lung cancer provides direct observational information on ETS
exposures. Instead, spouses, next-of-kin or friends are asked to
estimate the amount of ETS to which they think the subject was
exposed. Such estimates may lead to a kind of misclassification,

called exposure misclassification,6 which has been shown by
Garfinkel,7 Friedman8 and others9-12 to lead to improper indices of
exposure and incorrect estimations of risk. In Garfinkel's study,
for example, relative risks varied from 0.83 and 0.77 when the
women with lung cancer or the husband was the respondent, to a
risk of 3.57 when a son or daughter responded.13 That means that
the reported risk for lung cancer in the women exposed to ETS was
less than for women not exposed when either the women's or their
husband's estimates were used.
Dr. S. James Kilpatrick, a biostatistician from the
Medical College of Virginia, has analyzed another form of misclassi-
fication, called differential misclassification, which results
"from the tendency of respondents to inflate the amount of ETS
exposure for lung cancer cases and deflate the report of exposure
for controls."6 Similarly, Dr. Ernst Wynder, President of the
American Health Foundation, notes that "relatives of a nonsmoking
lung cancer patient are more likely to report passive inhalation
exposure on the part of their relative than are relatives of a
control patient."14
A more subtle form of potential bias is known as
"publication bias", which stems from the apparent failure by
journals to publish studies which report negative or weakly positive
results.15,16 Scientists have recently expressed concern over the
growing trend among such journals to overemphasize (and hence to
publish) only those studies which report positive increases in
risk.17,18 Published studies which are combined for meta-analyses

therefore may not truly represent all investigations on the issue
of ETS exposures and lung cancer.
Most of the epidemiological studies on ETS and lung
cancer have failed to consider age differences, diet, occupation
and exposures to indoor or outdoor pollution as potential
confounding elements. The importance of such factors is
underscored by recently published reports from Japan and China.19-24
The reports suggest that indoor pollution generated by kerosene
heaters, coal stoves, liquefied petroleum gas and exposures
to cooking oil vapors may be responsible for the increased risk of
lung cancer among Oriental women. Moreover, in 1989, researchers
in the U.S. reported that nonsmokers living with smokers consumed
less carotene (Vitamin A) than did nonsmokers who lived with other
nonsmokers. They concluded that "dietary beta-carotene intake is
a potential confounder and should be measured whenever possible in
studies of the relation between passive smoking and lung cancer."25
Dr. Karl Uberla of Germany recently explained why any
attempts to generalize about the significance of reported results
of epidemiological studies on ETS and nonsmoker lung cancer will
likely remain unconvincing, due to scientific deficiencies in each
of the studies.26 He wrote:
The majority of criteria for a causal
connection are not fulfilled. There is no
consistency, there is a weak association, there
is no specificity, the dose-effect relation
can be viewed controversially, bias and
confounding are not adequately excluded, there
is no intervention study, significance is only
present under special conditions and the
biologic plausibility can be judged
controversially.

Given these difficulties in interpretation, it is
therefore not surprising that an eminent statistician should
conclude that "it is unlikely that any epidemiological study has
been, or can be, conducted which could permit establishing that
the risk of lung cancer has been raised by passive smoking.
Whether or not the risk is raised remains to be taken as a matter
of faith according to one's choice."15
Thus, proponents of the ETS health issue are confronted
with weak associations and generally statistically nonsignificant
risks in epidemiological studies on ETS. They are nevertheless
forced to posit a causal mechanism for their theoretical model
regarding health risks. They find no support in data from the
actual exposure studies on ETS which suggest that an average
nonsmoker is exposed, for example, to the nicotine equivalent of
one one-hundredth to one one-thousandth (or less) of a single
cigarette per hour. Such exposure data suggest that there is no
conclusive biological plausibility to the ETS health claim, and
that the reported risks in epidemiological studies may be
artefactual, and probably due to bias and unconsidered confounders.

REFERENCES
1. Lee, P., "Passive Smoking and Lung Cancer: Problems in
Interpreting the Epidemiological Data," Presentation to
Toxicology Forum, Washington, D.C., February, 1987.
2. Lee, P., "Does Breathing Other People's Tobacco Smoke Cause
Lung Cancer?," Br Med J 293: 1503-1504, 1986.
3. Lee, P., "Misclassification as a Factor in Passive Smoking
Risk," Lancet II: 867, 1986.
4. Lee, P., "Lung Cancer and Passive Smoking: Association or
Artefact Due to Misclassification of Smoking Habits?,"
Toxicology Letters 35: 157-162, 1987.
5. Lee, P., "Passive Smoking and Lung Cancer Association: A
Result of Bias?," Human Toxicol 6: 517-524, 1987.
6. Kilpatrick, S., "Misclassification of Environmental Tobacco
Smoke Exposure: Its Potential Influence on Studies of
Environmental Tobacco Smoke and Lung Cancer," Toxicology
Letters 35: 163-168, 1987.
7. Garfinkel, L., et al., "Involuntary Smoking and Lung Cancer:
A Case-Control Study," JNCI 75(3): 463-469, 1985.
8. Friedman, G., et al., "Prevalence and Correlates of Passive
Smoking," Am J Public Health 73(4): 401, 1983.
9. Pron, G., et al., "The Reliability of Passive Smoking Histories
Reported in a Case-Control Study of Lung Cancer," Am J
Epidemiol 127(2): 267-273, 1988.
10. Schenker, M., et al., "Assessment of Environmental Tobacco
Smoke Exposure in Epidemiologic Studies," Chest 91(2): 313-
314, 1987. Abstract.
11. Lerchen, L. and J. Samet, "An Assessment of the Validity of
Questionnaire Responses Provided by a Surviving Spouse," Am J
Epidemiol 123(3): 481-489, 1986.
12. Sandler, D. and D. Shore, "Quality of Data on Parents' Smoking
and Drinking Provided by Adult Offspring," Am J Epidemiol
124(5): 768-778, 1986.
13. Mantel, N., "What is the Epidemiologic Evidence for a Passive
Smoking-Lung Cancer Association?" Indoor Air Quality, H. Kasuga
(ed.), Springer-Verlag, (Berlin Heidelberg 1990): 341-347.
14. Wynder, E., "Guidelines to the Epidemiology of Weak
Associations," Prev Med 16: 211-212, 1987.

15. Mantel, N., "Lung Cancer and Passive Smoking," Br Med J 294:
440, 1987.
16. Lee, P., "Deaths in Canada from Lung Cancer Due to Involuntary
Smoking," CMAJ 137: 372-373, 1987.
17. Gordis, L., "Challenges to Epidemiology in the Next Decade,"
Am J Epidemiol 128(1): 1-9, 1988.
18. Newcombe, R., "Towards a Reduction in Publication Bias," Br
Med J 295: 656-659, 1987.
19. Shimizu, H., et al., "A Case-Control Study of Lung Cancer in
Nonsmoking Women," Tohoku J Exp Med 154:389-97, 1988.
20. Wu-Williams, A., et al., "Lung Cancer Among Women in North-
east China," Brit J Can 62:982-87, 1990.
21. Sobue, T., "Association of Indoor Air Pollution and Passive
Smoking With Lung Cancer," Gan no Rinsho 36(3):329-33, 1990.
Translation.
22. Du, Y., "Indoor Air Pollution and Woman Lung Cancer," Indoor
Air 90 Proceedings I, Toronto, CA.: 59-64, 1990.
23. Mumford, J., et al., "Lung Cancer and Indoor Air Pollution in
Xuan Wei, China," Science 235: 217-220, 1987.
24. Gao, Y., et al., "Lung Cancer Among Chinese Women," Int J
Cancer 40: 604-609, 1987.
25. Sidney, S., et al. "Daily Intake of Carotene in Nonsmokers with
and without Passive Smoking at Home," Am J Epi 129(6): 1305-
1309, 1989.
26. Uberla, K., "Epidemiology: Its Scope and Limitations for
Indoor Air Quality," Indoor Air Quality: Symposium (Buenos
Aires, NAS of Buenos Aires, 1989): 45-60.

ESSAYS ON ETS

H. Kasuga (Ed.)
Indoor Air Quality
With 155 Figures and 190 Tables
Springer-Verlag Berlin Heidelberg New York
London Paris Tokyo Hong Kong.

Environmental Tobacco Smoke and Lung Cancer:
A Critical Assessment*
E. L. Wynder and G. C. Kabat
Summary
The possibility that exposure to environmental tobacco smoke (ETS) may increase the
lung cancer risk of nonsmokers has become a cause of public concern. It is unknown
whether the levels of carcinogens in the diluted sidestream smoke of tobacco products
that reach the nonsmoker's lung are sufficient to induce cancer. Available epidemiologic
studies suggest a slight increase in the relative risk of lung cancer in nonsmokers due to
exposure to ETS created by a smoking spouse. However, not all studies have found a
significant association. The epidemiologic studies are examined in the light of the criteria
of judgment of causality, including strength of association, consistency, temporality,
methodological issues, and biological plausibility. Suggestions for further research,
including studies in high-exposure populations and greater attention to histology, are
proposed.
Introduction
Epidemiologists, chemists, biologists, physiologists, physicians, and public health
officials have given much attention to the association of environmental tobacco smoke
(ETS) exposure and the development of lung cancer in nonsmokers. A biological basis
for such an association clearly exists because smoke constituents demonstrated to be
carcinogenic in laboratory animals are inhaled and retained by the nonsmoker.
Metabolites of tobacco-specific smoke constituents have been identified in the saliva,
blood, and urine of nonsmokers after exposure to ETS (Greenberg et al. 1984; Hoffmann
et al. 1984; National Academy of Sciences 1986; USDHHS 1987; Sepkovic et al. 1988).
Several epidemiological studies have found a positive association between ETS exposure
- usually defined as being due to a smoking spouse - and lung cancer (Hirayama 1981;
Trichopoulos et al. 1981; Correa et al. 1983; Sandler et al. 1985; Garfinkel et al. 1985;
Akiba et al. 1986; Dalager et al. 1986; Pershagen et al. 1987). Other studies have found no
significant association (Garfinkel 1981; Chan and Fung 1982; Koo et al. 1983; Kabat and
Wynder 1984; Wu et al. 1985; Lee et al. 1986). No consistent association has been
reported for lung cancer and exposure to ETS in childhood, which might be expected to
exert a greater effect, especially when followed by exposure throughout adulthood. Of
course, recall of ETS exposure in childhood is more difficult than recall of such exposure
in adulthood.
Research described herein was performed under USPHS, National Cancer Institute Program
Project Grant CA-32617.
H. Kasuga (Ed.) Indoor Air Quality
© Springer-Verlag, Berlin Heidelberg 1990

E. L. Wynder and G. C. Kabat
The epidemiological study of weak associations is burdened with problems that may
yield artifactual positive findings or may show negative findings where a real association
exists. The association of ETS and lung cancer risk, even if weak, would still be of concern
as a public health problem in that most people are at one time or another exposed to
smoke from burning tobacco products and the exhaled pollutants of tobacco smokers. A
weak association in epidemiology requires careful examination and an understanding of
the variables in question and all of the factors influencing the association (Wynder 1987).
In this overview we critically examine the published studies on ETS exposure and lung
cancer to determine whether the evidence presented to date permits a sound conclusion as
to causation.
General Exposure to ETS
At the outset we need to emphasize that an association between ETS and lung cancer
must be deemed possible. A recent survey of self-reported exposure in a hospitalized
population revealed that 66% of men and 60% of women had ETS exposure in
childhood; 32% of the men and 61% of the women reported ETS exposure in the home in
adulthood; and 60% of the men and 62% of the women who worked outside the home
reported ETS exposure at work (Kabat and Wynder, unpublished data, 1987).
Critical Assessment
The first Surgeon-General's Report on Smoking and Health, published in 1964 (USPHS
1964), clearly delineated the criteria of judgment for causality. These criteria included:
the magnitude of the association, consistency, temporality, and biological plausibility.
Since these criteria were considered necessary to prove causation for a strong association,
namely, active smoking and lung cancer, they should be equally required to determine the
causality of weak associations (Wynder 1987). Let us examine the epidemiological
evidence linking ETS with lung cancer, in respect to these criteria:
Strength of the Association
An association is generally considered weak if the odds ratio is under 3.0 and particularly
when it is under 2.0, as is the case in the relationship of ETS and lung cancer (Table 1). If
the observed relative risk is small, it is important to determine whether the effect could be
due to biased selection of subjects, confounding, biased reporting, or anomalies of
particular subgroups.
Consistency
If an association is real, internal consistency should be apparent within and between
different studies. The majority, but not all, of the studies of ETS and lung cancer have
shown a positive association for ETS exposure due to a smoking spouse (Table 1). In
most of the studies, the confidence interval includes 1.0. While the prospective study by
Hirayama (1981 a) among Japanese women showed a significant association with the
husband's smoking (largely adenocarcinomas), the prospective study among American

Environmental Tobacco Smoke and Lung Cancer: A Critical Assessment
Table 1. Summary of results of studies relating lung cancer risk in married women to their
husbands' smoking habits
                                Relative risk   95% Confidence interval
Prospective studies
  Hirayama (1981)                   1.63            1.25-2.11
  Garfinkel (1981)                  1.18            0.90-1.54
Case-control studies
  Trichopoulos et al. (1981)        2.1             1.18-3.78
  Chan & Fung (1982)                0.75            0.44-1.30
  Correa et al. (1983)              2.03            0.93-5.03
  Koo et al. (1983)                 1.54            0.90-2.64
  Kabat & Wynder (1984)             0.79            0.26-2.43
  Wu et al. (1985)                  1.2             0.6 -2.5
  Garfinkel et al. (1985)           1.12            0.74-1.69
  Lee et al. (1985)                 1.03            0.41-2.47
  Akiba et al. (1986)               1.48            0.88-2.50
  Pershagen et al. (1987)           1.28            0.75-2.16
Table 2. Distribution of lung cancer by histologic groups in smokers and never-smokers. (From
Kabat and Wynder 1984)
                                          Smokers                  Never-smokers
                                     Males       Females        Males      Females
                                   (N = 1882)   (N = 652)      (N = 37)    (N = 97)
                                      [%]          [%]           [%]         [%]
Kreyberg I                             63           52            35          21
Kreyberg II                            32           43            54          74
Mixed and undifferentiated/anaplastic   5            5            11           5
women by Garfinkel (1981) did not. It has been suggested that Japanese and American
women are exposed to different levels of ETS due to different conditions in the two
countries. Such differences could account for this disparity (Hirayama 1981 b).
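As an illustrative check on the statement that most confidence intervals in Table 1 include 1.0, the tabulated values can be tallied in a few lines of Python. This is a minimal sketch: the figures are transcribed from Table 1, and the names TABLE_1 and includes_unity are assumptions introduced here for illustration only.

```python
# Relative risks and 95% confidence limits transcribed from Table 1.
# TABLE_1 and includes_unity are illustrative names, not from the source.
TABLE_1 = [
    ("Hirayama (1981)", 1.63, 1.25, 2.11),
    ("Garfinkel (1981)", 1.18, 0.90, 1.54),
    ("Trichopoulos et al. (1981)", 2.1, 1.18, 3.78),
    ("Chan & Fung (1982)", 0.75, 0.44, 1.30),
    ("Correa et al. (1983)", 2.03, 0.93, 5.03),
    ("Koo et al. (1983)", 1.54, 0.90, 2.64),
    ("Kabat & Wynder (1984)", 0.79, 0.26, 2.43),
    ("Wu et al. (1985)", 1.2, 0.6, 2.5),
    ("Garfinkel et al. (1985)", 1.12, 0.74, 1.69),
    ("Lee et al. (1985)", 1.03, 0.41, 2.47),
    ("Akiba et al. (1986)", 1.48, 0.88, 2.50),
    ("Pershagen et al. (1987)", 1.28, 0.75, 2.16),
]


def includes_unity(lower, upper):
    """Return True when the interval contains a relative risk of 1.0."""
    return lower <= 1.0 <= upper


# Studies whose point estimate lies above 1.0 (a positive association).
elevated = [name for name, rr, _, _ in TABLE_1 if rr > 1.0]
# Studies whose 95% interval includes 1.0 (not statistically significant).
not_significant = [name for name, _, lo, hi in TABLE_1 if includes_unity(lo, hi)]

print(f"{len(elevated)} of {len(TABLE_1)} studies report a relative risk above 1.0")
print(f"{len(not_significant)} of {len(TABLE_1)} intervals include 1.0")
```

On these transcribed values, only the intervals for Hirayama (1981) and Trichopoulos et al. (1981) exclude 1.0, which agrees with the text.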
Within those studies presenting specific histologic analysis, differences exist in
respect to the type of lung cancer involved. In active smokers, tobacco smoke exposure
has a causative effect predominantly on squamous and small cell types of lung cancer
(Kreyberg I), with a lesser, though still significant, causative effect on the glandular type
(Kreyberg II) (Wynder and Stellman 1977). Among nonsmokers, however, the glandular
type of lung cancer predominates among both men and women (Kabat and Wynder
1984) (Table 2). The effect of ETS would thus be expected to be primarily responsible
for the higher rate of adenocarcinomas among nonsmokers. The studies by Dalager
et al. (1986) and Pershagen et al. (1987), however, suggest that the effect of ETS
exposure is limited to induction of squamous cell lung cancer (Table 3). If this were, in
fact, the case, then only the squamous or small cell type of lung cancer in nonsmokers

Table 3. Histology-specific odds ratios for spouse smoking from two studies
Study                      Histologic type               N    Odds ratio   95% C.I.
Dalager et al. (1986)      Adenocarcinoma                16     1.02       0.33- 3.16
                           Squamous & Small Cell Ca.     14     2.88       0.91- 9.10
                           Other                         18     1.31       0.48- 3.57
Pershagen et al. (1987)    Squamous or Small Cell Ca.    20     3.3        1.1 -11.4
                           Other                         47     0.8        0.4 - 1.5
would be affected by ETS. Clearly, it is important that investigations of the effect of
ETS exposure on lung cancer development in nonsmokers take histology into account,
so as to determine whether an effect of ETS is limited to certain histological types.
Since smoking is more prevalent in lower income groups, at least among men,
lung cancer in nonsmoking women in these groups should have a higher incidence.
Thus the influence of the level of education on smoking habits in the examined
population needs to be considered as a possible confounder. Few studies to date
have done this.
Methodological Issues
A particular concern in weak associations is reporting bias, that is, potentially
differential reporting of exposures between cases and controls. In terms of ETS, does the
lung cancer patient report exposure to tobacco smoke, be it at work, at home, at social
functions, in childhood or adulthood, differently than the control? The case is likely to
have a different attitude toward this question than does the control, a handicap not
applicable to prospective studies. It needs to be determined whether the case's attitude
towards questions on ETS exposure leads to under- or overreporting. Cases are likely to
underreport their own smoking (Lee 1987), and they may tend to overreport their
exposure to ETS and other potential hazards that could account for their illness. In
studies that use proxy reports, different relatives may respond differently. Garfinkel et al.
(1985) provide some insight into this phenomenon by showing that if the response came
from the patient, the odds ratio was 1.0, if from the husband it was 0.92, and if from the
daughter or son, 3.19 (Table 4). More work is needed on the validity of ETS-exposure
information obtained from different relatives before we can evaluate which of these
relative risks is closer to the truth.
In general, possible reporting bias represents a serious problem in case-control studies
because it can produce a systematic artefact. It is particularly worrisome in that it cannot
be effectively measured.
We also need to consider misclassification that can occur in both retrospective and
prospective studies. Lee has proposed (Lee et al. 1986; Lee 1987) that the reported ETS
effect on lung cancer risk can be explained by a misclassification of smokers as
nonsmokers. According to these studies, a substantial percentage of respondents
misrepresent their smoking habits. Using a 10.0% misclassification rate of ex-smokers as
self-reported never-smokers coupled with the concordance of spouses' smoking habits,

Table 4. Data from Garfinkel et al. (1985) by type of respondent
Husband's smoking habits at home
Respondent       N of cases    OR      95% C.I.
Self                 16        1.00    0.55- 1.74
Husband              34        0.92    0.63- 1.34
Daughter/son         48        3.19    0.91-11.19
Other                36        0.77    0.57- 1.03
Fig. 1. Odds ratio of male ex-smokers for Kreyberg I (N = 687) and Kreyberg II (N = 301) lung
cancer by years since quitting (controls = 6534). Source: American Health Foundation data
Lee calculated that an apparent increase in lung cancer risk can be obtained among
nonsmokers married to smokers that approximates the increased risk observed in a
number of epidemiologic studies (Lee 1987). At the extreme, Garfinkel et al. (1985)
showed that 40% of lung cancer cases classified as 'nonsmokers' in the hospital chart
were in fact smokers as determined by interview. Although such a high rate of
misclassification does not occur when cases are interviewed personally, to some extent
denial is likely to occur even then, particularly among ex-smokers who had stopped
smoking ten or more years ago. The risk of lung cancer among long-term ex-smokers,
and even among ex-smokers who quit more than 16 years earlier, does remain elevated
above the rate among those who never smoked (Fig. 1). Denial of past smoking may also
not be uncommon in populations where smoking is or was socially unacceptable, as is the
case among older Japanese women.

Table 5. Percent of lung cancer cases who never smoked by histologic group (A.H.F. data)
                     Males                         Females
                KI           KII              KI           KII
             [%]    N     [%]    N         [%]    N     [%]    N
1969-1973    1.2   488    5.6   142       10.7   103   23.7    76
1974-1976    1.6   987    3.0   305       16.4   263   25.3   146
1977-1980    2.1   628    4.6   390        5.6   231   22.0   245
1981-1985    1.4   725    5.6   463        6.9   311   16.6   294
KI = Kreyberg I; KII = Kreyberg II
Another problem for epidemiologists involves subgroup analysis (Stallones 1987).
Investigators are likely to examine numerous subgroups and then prefer to present those
subgroups that best fit the hypothesis. This tendency represents an inherent problem in
epidemiology. The investigator should at a minimum give an idea of how many
subgroups were originally examined and how many subgroups were discarded.
Temporality
One of the factors that led to the conclusion that active smoking causes lung cancer was
that the increase in cigarette consumption preceded the increase in lung cancer rates, first
in men and later in women. Enstrom (1979) has reported an increase in the lung cancer
rate in nonsmokers over recent years, suggesting that factors in addition to personal
cigarette smoking influence lung cancer mortality rates. The groups examined, however,
are not strictly comparable, and misclassification of smokers as nonsmokers in the
national surveys needs to be considered. Our data from a long-term, hospital-based
case-control study do not indicate an increase in the percentage of male nonsmokers with lung
cancer in either of the two main histologic groupings (Kreyberg I and II) over the last 30
years (Table 5).
In fact, the percentage of nonsmokers with lung cancer among women has declined,
which may be a consequence of the diminishing pool of women who have never smoked.
Biological Plausibility
Several studies have demonstrated that most tumorigenic agents are present in undiluted
sidestream smoke in higher concentrations than in mainstream smoke (Hoffmann et al.
1983; National Academy of Sciences 1986; Hoffmann and Wynder 1986) (Table 6).
Biochemical studies indicate that nonsmokers exposed to ETS have levels of nicotine or
cotinine in the blood or urine that are about 1/100th the level seen in active smokers
(Table 7) (Jarvis et al. 1984; National Academy of Sciences 1986). Some of the nicotine
measured in the blood and urine represents nicotine that is absorbed by the saliva of
nonsmokers and does not reach the lung directly (Jarczyk et al. 1987). It is important to

Table 6. Distribution of compounds in undiluted cigarette mainstream smoke (MS) and sidestream
smoke (SS)
Nonfilter cigarettes
                              MS                 SS/MS
(A) Vapor phase
  Carbon monoxide             10 - 23 mg         2.5- 4.7
  Carbon dioxide              20 - 40 mg         8 - 11
  Benzene                     20 - 50 µg         10
  Formaldehyde                5 - 100 µg         0.1- 50
  Acrolein                    50 - 100 µg        8 - 15
  Acetone                     100 - 250 µg       2 - 5
  Hydrogen cyanide            400 - 500 µg       0.1- 0.25
  Hydrazine                   24 - 41 ng         3.0
  Ammonia                     50 - 170 µg        40 - 170
  Methylamine                 11.5 - 28.7 µg     4.2- 6.4
  Nitrogen oxides             50 - 600 µg        4 - 10
  N-nitrosodimethylamine      10 - 180 ng        20 - 100
  N-nitrosopyrrolidine        2 - 110 ng         6 - 30
(B) Particulate phase
  Particulate matter          15 - 40 mg         1.3- 1.9
  Nicotine                    1 - 2.5 mg         2.6- 3.3
  Phenol                      60 - 140 µg        1.6- 3.0
  Catechol                    100 - 350 µg       0.6- 0.9
  Hydroquinone                110 - 300 µg       0.7- 0.9
  Aniline                     360 ng             30
  2-Toluidine                 30 - 160 ng        19
  2-Naphthylamine             4.3 - 27 ng        30
  4-Aminobiphenyl             2.4 - 4.6 ng       31
  Benz(a)anthracene           40 - 70 ng         2 - 4
  Benzo(a)pyrene              10 - 40 ng         2.5- 3.5
  N'-Nitrosonornicotine       120 - 3,700 ng     0.5- 3
  NNK                         120 - 950 ng       1 - 4
  Cadmium                     100 ng             7.2
  Nickel                      20 - 3,000 ng      13 - 30
  Polonium-210                0.03 - 1.0 pCi     1
note that nicotine occurs in ETS primarily as a vapor phase constituent rather than in the
particulate matter of the aerosol as is the case in mainstream cigarette smoke (Eudy et al.
1987). Measurement of nicotine or its metabolites will, therefore, not reflect the
proportional uptake of particulate matter from ETS. In the light of our present
knowledge of dose-response in carcinogenesis and because the carcinogenic activity of
tobacco smoke as measured in animal systems is relatively low, the question needs to be
raised whether the carcinogenic potential of inhaled ETS suffices to induce lung cancer.
Hoffmann and Hecht (1985) have proposed nicotine-derived nitrosamines in ETS as
organ-specific carcinogens for the lung. It is possible that these chemicals reach the lungs
in sufficient dose to induce neoplastic changes. These carcinogens may also be formed
endogenously from inhaled or ingested nicotine and appropriate nitrosating agents
(Hoffmann and Hecht 1985). Tumor promoters are less likely to play a role in ETS
Table 7. Approximate relations of nicotine as a parameter between non-smokers, passive smokers
and active smokers. (From Jarvis et al. 1984)
                     Non-smokers without        Non-smokers with           Active
                     ETS exposure (No.=46)      ETS exposure (No.=54)      smokers (No.=94)
                     Mean     % of active       Mean     % of active       Mean
                     value    smokers' value    value    smokers' value    value
Nicotine (ng/ml)
  in plasma           1.0       7                 0.8      5.5                14.8
  in saliva           3.8       0.6               5.5      0.8               673
  in urine            3.9       0.2              12.1      0.7             1,750
Cotinine (ng/ml)
  in plasma           0.8       0.3               2.0      0.7               275
  in saliva           0.7       0.2               2.5      0.8               310
  in urine            1.6       0.1               7.7      0.6             1,390
Differences between non-smokers exposed to ETS compared with non-smokers without
exposure: ' p < 0.01; " p < 0.001
carcinogenesis than in active smoking because of their much lower concentration. In
general, tumor promoters are effective only when applied repeatedly in relatively large
amounts.
In considering the existing data on ETS exposure and lung cancer, it is noteworthy
that Auerbach et al. (1961) showed only minor histological changes in the bronchial
epithelium of nonsmokers and found that the ciliated columnar epithelium that covers
their bronchi was largely intact. Deposition of carcinogenic smoke particulates can take
place only upon inhibition of the protective functioning of the lung clearance system.
Squamous cell lung cancer can arise only from ciliated columnar cells that have
undergone squamous metaplasia.
An active smoker with each puff from a cigarette inhales a volume of 35-50 ml of a
concentrated aerosol containing 3-5 billion particles per ml that adversely affect the
protective cilia and mucous defense system of the bronchi (Ferin et al. 1965). The passive
smoker is at no time exposed with such force to such a highly polluted inhalant.
Furthermore, ETS particles are more likely to be deposited in the upper respiratory tract
and not predominantly in the bronchi as is the case in active smoking. Thus, our
respiratory defense system may be able to deal more readily with the relatively lighter
deposition of particles and exposure to volatiles in ETS, as the observation by Auerbach
et al. (1961) would suggest.
Future Studies
Future epidemiological studies on the association of ETS with lung cancer should
attempt to avoid the pitfalls discussed above. The definitive evidence that a factor causes
human cancer requires support from descriptive, metabolic, and molecular epidemiology.
Beyond extension of prospective studies, such as those now in progress by Garfinkel and
Stellman at the American Cancer Society, we suggest:
1) Continuing ongoing case-control studies with special reference to histologic type and
careful consideration of methodological issues.
2) Estimating the relative importance of ETS exposure in different settings - in the
home, in the workplace, in social situations, and during transportation.
3) Further studying lung cancer rates among pipe and cigar smokers, and, if feasible,
among nonsmokers exposed to ETS from these products.
4) Studying lung cancer incidence in groups occupationally exposed to high levels of
ETS at their worksite such as waiters, bartenders, train conductors, airplane
personnel, and office workers.
5) Studying bronchial epithelium in autopsy material of established never-smokers
whose exposure to ETS is known.
6) Determining the incidence of lung cancer by histological type in confirmed
never-smokers.
7) Comparing the presence of adducts of tobacco-specific carcinogens with DNA in
smokers, passive smokers, and "never-smokers" (Hoffmann and Hecht 1985; Hecht et
al. 1987).
In summary, verification of the possible association of ETS and lung cancer represents an
important challenge to epidemiologists, laboratory scientists, and public health
authorities. The public is entitled to inhale the cleanest possible air regardless of whether ETS is
proven to be cancer-inducing. Additional efforts on the part of epidemiologists are
required to firmly establish the nature and significance of the reported associations
between passive smoking and lung cancer.
References
Akiba S, Kato S, Blot WJ (1986) Passive smoking and lung cancer among Japanese women.
Cancer Res 46:4804-4807
Auerbach O, Stout AP, Hammond EC, et al (1961) Changes in bronchial epithelium in relation to
cigarette smoking and in relation to lung cancer. N Engl J Med 265:253-267
Chan WC, Fung SC (1982) Lung cancer in non-smokers in Hong Kong. In: Grundman E (ed)
Cancer campaign, vol 6. Cancer epidemiology. Fischer, Stuttgart, pp 199-202
Correa P, Fontham E, Pickle L, Lin Y, Haenszel W (1983) Passive smoking and lung cancer.
Lancet 2:595-597
Dalager NA, Pickle LW, Mason TJ, et al (1986) The relation of passive smoking to lung cancer.
Cancer Res 46:4808-4811
Enstrom JE (1979) Rising lung cancer mortality among nonsmokers. J Natl Cancer Inst 62:755-760
Eudy LW, Thome FA, Heavner DL, Green CR, Ingebrethsen BJ (1985) Studies on the vapor-phase
distribution of environmental nicotine by selected trapping and detection methods. Pres. 39th
Tobacco Chemists Res Conf, 1985, p 25
Ferin J, Urbankova G, Vlckova A (1965) Influence of tobacco smoke on the elimination of
particles from the lungs. Nature 206:515-516
Garfinkel L (1981) Time trends in lung cancer mortality among non-smokers and a note on passive
smoking. J Natl Cancer Inst 66:1061-1066

Garfinkel L, Auerbach O, Joubert L (1985) Involuntary smoking and lung cancer: A case-control
study. J Natl Cancer Inst 75:463-469
Greenberg RA, Haley NJ, Etzel RA, Loda FA (1984) Measuring the exposure of infants to tobacco
smoke: nicotine and cotinine in urine and saliva. N Engl J Med 310:1075-1078
Hecht SS, Carmella SG, Trushin N, Foiles PG, Lin D, Rubin YM, Chung FL (1987)
Investigations on the molecular dosimetry of tobacco-specific N-nitrosamines. International
Agency for Research on Cancer, Lyon [IARC Sci Publ 84:423-429]
Hirayama T (1981 a) Non-smoking wives of heavy smokers have a higher risk of lung cancer: a
study in Japan. Br Med J 282:183-185
Hirayama T (1981 b) Nonsmoking wives of smokers have a higher risk of lung cancer [Letter]. Br
Med J 283:916-917
Hoffmann D, Hecht S (1985) Nicotine-derived N-nitrosamines and tobacco-related cancer:
Current status and future directions. Cancer Res 45:935-944
Hoffmann D, Wynder EL (1986) Chemical constituents and bioactivity of tobacco smoke. In:
Zaridze DG, Peto R (eds) Tobacco: A major international health hazard. International Agency
for Research on Cancer, Lyon, IARC Sci Publ 74:145-65
Hoffmann D, Haley NJ, Brunnemann KD, Adams JD, Wynder EL (1983) Cigarette sidestream
smoke: formation, analysis and model studies on the uptake by nonsmokers. US-Japan
meeting 'New Etiology of Lung Cancer', Honolulu, Hawaii, March 21-23, 1983
Hoffmann D, Haley NJ, Adams JD, Brunnemann KD (1984) Tobacco sidestream smoke: uptake
by nonsmokers. Prev Med 13:608-613
Jarczyk L, Scherer G, Maltzan C, Luu HT, Adlkofer F (1987) Intake of nicotine from ETS via
different inhalation routes. Proceedings of Third International Conference on Indoor Air
Quality (in press).
Jarvis M, Tunstall-Pedoe H, Feyerabend C, Vesey C, Saloojee Y (1984) Biochemical markers of
smoke absorption and self-reported exposure to passive smoking. J Epidemiol Comm Health
38:335-339
Kabat GC, Wynder EL (1984) Lung cancer in nonsmokers. Cancer 53:1214-1221
Koo LC, Ho JH-C, Saw D (1983) Active and passive smoking among female lung cancer patients
and controls in Hong Kong. J Exp Clin Cancer Res 4:365-375
Lee PN (1987) Lung cancer and passive smoking: association an artefact due to misclassification of
smoking habits? Toxicol Letts 35:157-162
Lee PN, Chamberlain J, Alderson MR (1986) Relationship of passive smoking to risk of lung
cancer and other smoking-associated diseases. Br J Cancer 54:97-105
National Academy of Sciences, National Research Council (1986) Environmental tobacco smoke.
Measuring exposures and assessing health effects. National Academy Press, Washington DC
Pershagen G, Hrubec Z, Svensson C (1987) Passive smoking and lung cancer. Am J Epidemiol
125:17-24
Sandler DP, Everson RB, Wilcox AJ (1985) Passive smoking in adulthood and cancer risk. Am J
Epidemiol 121:37-48
Sepkovic DW, Axelrad CM, Colosimo SG, Haley NJ (1988) Measuring tobacco smoke exposure:
clinical applications and passive smoking. 80th Annual Meeting and Exhibition of the Air
Pollution Control Association, New York, NY, June 21-26, 1987 (in press)
Stallones RA (1987) The use and abuse of subgroup analysis in epidemiological research. Prev Med
16:183-194
Trichopoulos D, Kalandidi A, Sparros L, MacMahon B (1981) Lung cancer and passive smoking.
Int J Cancer 27:1
[USDHHS] US Dept Health and Human Services, Public Health Service, Office of Smoking and
Health (1984) The health consequences of smoking: cardiovascular disease. A Report of the
Surgeon General. USDHHS, US Govt. Printing Office, [DHHS Publ No (PHS) 84-50204]
Washington DC
[USDHHS] US Dept Health and Human Services, Public Health Service, Centers for Disease
Control (1987) The Health Consequences of Involuntary Smoking, a Report of the Surgeon
General. USDHHS, US Govt. Printing Office, [DHHS (CDC) Publ No 87-6398], Washington
DC

[USPHS] US Public Health Service. Smoking and Health. Report of the Advisory Committee to
the Surgeon General of the Public Health Service (1964) US Dept of Health, Education, and
Welfare, Public Health Service, Center for Disease Control (PHS Publ No 1103) US Govt
Printing Office, Washington DC
Wu AH, Henderson BE, Pike MC, Yu MC (1985) Smoking and other risk factors for lung cancer in
women. J Natl Cancer Inst 74:747-751
Wynder EL (ed) (1987) Workshop on guidelines to the epidemiology of weak associations. Prev
Med 16:139-212
Wynder EL, Stellman ST (1977) Comparative epidemiology of tobacco-related cancers. Cancer
Res 37:4608-4622

H. Kasuga (Ed.)
Indoor Air Quality
With 155 Figures and 190 Tables
Springer-Verlag Berlin Heidelberg New York
London Paris Tokyo Hong Kong
What Is the Epidemiologic Evidence for a Passive Smoking-
Lung Cancer Association?
N. Mantel
Summary
Two survey articles of reports on the association of passive smoking with lung cancer
have recently appeared, and also a comprehensive report on the subject of environmental
tobacco smoke by a committee of the National Research Council of the United
States. The observed excess over a relative risk of unity cannot be explained by chance.
Nor can it be fully accounted for by a particular source of bias, the false claims of being
non-smokers by individuals who were active or ex-smokers. That possible source of
bias leads, in one summary survey, to reducing a relative risk of 1.35 to 1.30, but from
1.34 to 1.15 in the National Research Council report. The latter report suggests that
statistical significance would no longer obtain, perhaps, particularly because of other
possible biases. However, to get an estimate of the correct relative risk due to passive
smoking, allowance has to be made for actual exposure to passive smoking of those not
exposed at home. Thus, the 1.30 is adjusted upwards, by 18% in one survey, to 1.53, but
by only 8% in the National Research Council report to 1.24. The National Research
Council report had given an anticipated relative risk of 1.1 based on dosimetric
considerations. But it is suggested here that that could be as low as 1.05, too low to be
detected in an epidemiologic investigation - in any case it would be based on
hypothetical assumptions.
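The upward adjustments quoted in the Summary can be reproduced with simple arithmetic. The sketch below assumes that each adjustment is a multiplicative percentage increase applied to the bias-corrected relative risk (1.30 in the one survey, 1.15 in the National Research Council report); the function name adjust_upward is hypothetical and is used only for illustration.

```python
def adjust_upward(relative_risk, percent):
    """Scale a relative risk upward by the given percentage (assumed multiplicative)."""
    return relative_risk * (1 + percent / 100)


# Bias-corrected relative risks and percentage adjustments as quoted in the Summary.
survey_adjusted = adjust_upward(1.30, 18)  # one summary survey
nrc_adjusted = adjust_upward(1.15, 8)      # National Research Council report

print(round(survey_adjusted, 2))  # compare with the quoted 1.53
print(round(nrc_adjusted, 2))     # compare with the quoted 1.24
```

Under this multiplicative reading, both quoted figures are recovered to two decimal places.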
In November of 1986 there were two near-simultaneous review articles addressing the
subject of passive smoking and lung cancer. One was an invited guest editorial by Blot
and Fraumeni in the Journal of the National Cancer Institute, the other a contemporary
theme discussion by Wald et al. in the British Medical Journal [1, 2].
There was substantial overlapping in the two articles of the various publications on
the subject and on the basis of which the conclusion of a significant positive association
was made. The article by Wald et al. gave, perhaps, more statistical detail about the
results of the several studies covered. But, to my mind, there was uncritical acceptance of
the results of all the studies. Blot and Fraumeni did suggest that there were some flaws in
a particular study, that by Hirayama [3], but decided that any inherent biases in that
investigation could not have given rise to the observed elevated risk.
From their overall evaluation of 10 case-control studies (all 10 gave results for
females, five separately for males as well) and three prospective studies (two of these
covered males separately), which provided 20 separate relative risk (actually odds ratio)
values, Wald et al. came up with a summary relative risk of lung cancer due to passive
smoking of 1.35 (95% limits 1.19 to 1.54). They trim this down to 1.30 on the basis that
some of the presumed non-smokers exposed to passive smoking were actually smokers.
Then, on the added basis that even those unexposed to passive smoking at home may still
have been exposed when away from home, they raise their estimate of relative risk to 1.53.
But note that this last modification presupposes the answer that passive smoking does
elevate the risk. For if it did not, there would be no basis for adjusting the 1.30 or 1.35
upwards to 1.53.
Blot and Fraumeni come up with a similar summary measure of relative risk for
passive smoking of 1.3 (95% limits of 1.1-1.5), but elevated to 1.7 (95% limits of 1.4-2.1)
for heavy passive smoking. These authors suggest that heavy passive smoking is
equivalent, at least in terms of nicotine received, to smoking between 1/2 and 3 cigarettes
daily, and estimate that smoking a few cigarettes daily would give rise to a relative risk of
about 1.5-fold to twofold.
While Blot and Fraumeni do not address the question of correct reporting of
non-smoking status, Wald et al. do, having used this as a basis for lowering the relative risk
estimate from 1.35 to 1.30. Based on reports and communications from others, Wald et
al. estimate that persons reporting themselves as never having smoked (lifelong
non-smokers) comprise 2.1% active smokers plus 4.9% former smokers, for a total of 7% ever
smokers among the self-claimed never smokers. Wald et al. estimate that these 7% have a
combined relative risk of 2, making the assumption in doing this that the active smokers
among the 7% smoked on average only a quarter as much as active smokers generally.
The relative risk of 2 for the 7% is computed as a weighted average of 3 for active
smokers, 1.5 for former smokers, among the 7%.
If 7% of reported never-smokers were actually ex-smokers or active smokers, which
were they - the spouses, say, of smokers or the spouses of non-smokers? In my own
critique of Hirayama, I had suggested that this false reporting of non-smoking status
would preferentially be among those with smoking spouses [4]. If, for example, the 7%
overall misreporting of non-smoking status concentrated among spouses of smokers, it
would be somewhat higher among persons with smoking spouses who, nevertheless,
claimed to be never smokers. Suppose we take it at 20%, in which case the reported
lifelong non-smokers' relative risk would be 1.20. It could be substantially higher but for
the assumption by Wald et al. that the active smokers among the reported never smokers
had sharply reduced levels of smoking. However, Wald et al. were ready to make only a
small reduction in relative risk for this factor, from 1.35 to 1.30. Their speculative
increase, which might have no basis at all, was much greater, from 1.30 to 1.53.
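The two calculations in this passage, the weighted average of about 2 and the apparent relative risk of 1.20, can be sketched as follows. The function names are hypothetical, and the second calculation assumes that true never-smokers have a relative risk of exactly 1.0 (that is, no effect of passive smoking at all), which is the scenario the argument supposes.

```python
def weighted_relative_risk(groups):
    """Share-weighted mean relative risk for (share, relative_risk) pairs."""
    total_share = sum(share for share, _ in groups)
    return sum(share * rr for share, rr in groups) / total_share


def apparent_relative_risk(fraction_misclassified, rr_misclassified=2.0, rr_true=1.0):
    """Relative risk seen in a group of self-reported never-smokers when a
    fraction of them are in fact current or former smokers."""
    return (1 - fraction_misclassified) * rr_true + fraction_misclassified * rr_misclassified


# 2.1% active smokers (relative risk 3) and 4.9% former smokers (relative risk 1.5)
# among the 7% who misreport, as quoted from Wald et al.
combined = weighted_relative_risk([(2.1, 3.0), (4.9, 1.5)])
print(round(combined, 2))  # 1.95, rounded to 2 in the text

# 20% misreporting among claimed never-smokers with smoking spouses.
print(round(apparent_relative_risk(0.20), 2))  # 1.2
```

With no true effect of passive smoking, a 20% rate of misreporting among spouses of smokers is by itself enough to produce the apparent relative risk of 1.20 given in the text.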
The effect of false reporting of smoking status, specifically of non-smoking, could be
much sharper than what Wald et al. have suggested. In a study of biochemical markers of
smoke absorption, Jarvis et al. branded as "deceivers" 21 individuals who claimed to be
non-smokers [5]. These 21 displayed biochemical patterns very similar to those of actual
smokers, not at all like those of accepted non-smokers. The 100 accepted non-smokers
comprised 46 without passive smoking, 54 with. Those 21 would constitute 21/121 or
about 17% of the total, and these would be active smokers, not just former smokers, or
eightfold greater than the 2.1% Wald et al. postulated. Perhaps in the epidemiologic
investigations made, false reporting of non-smoking status is at a much lower level, but it
would not take much false reporting to account fully for the seeming association between
passive smoking and lung cancer.
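The proportion of "deceivers" and its ratio to the 2.1% figure follow directly from the counts quoted above. A minimal sketch, with variable names chosen here for illustration:

```python
# Counts quoted in the text from the study by Jarvis et al.
deceivers = 21            # claimed non-smokers with smoker-like biochemical patterns
accepted = 46 + 54        # accepted non-smokers without and with passive smoking

share_percent = 100 * deceivers / (deceivers + accepted)
fold_over_wald = share_percent / 2.1  # relative to the 2.1% postulated by Wald et al.

print(f"{share_percent:.1f}% of self-reported non-smokers")
print(f"{fold_over_wald:.1f}-fold the 2.1% figure")
```

The share comes to a little over 17%, or roughly eight times the 2.1% of active smokers that Wald et al. assumed among self-reported never-smokers.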
Recently, a colleague expressed to me the thought that if passive smoking played no
role in lung cancer, why are we not finding many negative associations, nor any
significantly negative associations? Actually, six of the 20 relative risks reported in Wald
et al. are at 1.00 or smaller. And some of those reported as in excess of 1.00 conceal rates
of under 1.00. Thus, relative to the rate shown of 1.23 for the study reported by Garfinkel
et al., I have brought out in my own critique that that represented a composite of data for
various classes of respondents [6, 7]. Where the woman with lung cancer was herself the
respondent (as to her husband's level of smoking) the relative risk was 0.93. Using the
husbands' responses, the relative risk was 0.77. It was only on the basis of responses by
t
1
t
I

What Is the Epidemiologic Evidence for a Aassive Smoking-Lung Cancer Awociation? 343
the sons and daughters, at a time long past when they would have left home, that a
relative risk of 3.57'emerged, sufficiently high to raise the overall'estimate of relative risk
to 1.23. As I indicated in my cri'tique, the replies by the children were more accusatory in
nature than revealing of'any true reiationship:
But even so, it would take 40 large studies to get on average a single seemingly
significant negative association of lung cancer with passive smoking, assuming statistical
testing at the 5%, two-tailed, level. But we have only 20 evaluations, with many so small
that they could not possibly yield any apparently significant protective effect, not even in
the unrealistic situation that passive smoking was 100% protective. Suppose a study had
a null expectation of only 2 or 3 passive smokers with lung cancer - then there would be
some observed number, 5 or 6 or 7 or 8 or more, which would be significantly in excess of
expectation. But there would be no number, however small or even zero, which would be
significantly below expectation. Yet just such low expectations characterize several of the
studies reported on by Wald et al. In one study, a relative risk of 2.29 is shown based on
only 2 actual cases of lung cancer in passive smokers, expectation 1.20. Another relative
risk of 2.45 is based on 3 observed, 1.77 expected. For one prospective study, 4 observed
cases have given rise to an estimated relative risk of 3.25, and in another 7 observed cases
gave rise to a relative risk of 2.25, suggestive of an expectation little in excess of 3. On the
other hand, the four reported risks of under 1.00 had expectations variously of 37.67,
34.08, 6.64 and 13.77.
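
The asymmetry described above can be illustrated numerically. The sketch below assumes that the number of cases follows a Poisson distribution with the stated null expectation; that model is an assumption made here for illustration, not something the text specifies, and the function names are hypothetical. Each tail of a 5% two-tailed test is taken as 2.5%.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for a Poisson count with the given mean (0.0 if k < 0)."""
    if k < 0:
        return 0.0
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


def smallest_significant_excess(mean: float, tail: float = 0.025) -> int:
    """Smallest observed count k for which P(X >= k) <= tail."""
    k = 0
    while 1.0 - poisson_cdf(k - 1, mean) > tail:
        k += 1
    return k


def zero_cases_significant(mean: float, tail: float = 0.025) -> bool:
    """True if even zero observed cases would fall in the lower tail."""
    return poisson_cdf(0, mean) <= tail


for expected in (2.0, 3.0):
    print(
        f"expectation {expected}: excess significant from "
        f"{smallest_significant_excess(expected)} cases; "
        f"zero cases significantly low: {zero_cases_significant(expected)}"
    )

# A 2.5% lower tail corresponds to about one seemingly significant
# negative result per 1 / 0.025 = 40 adequately large null studies.
print(round(1 / 0.025))
```

Under this assumption an expectation of 2 needs 6 or more cases, and an expectation of 3 needs 8 or more, for a significant excess, while even zero cases is never significantly low.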
Of concern to Wald et al. was whether the various relative risks were homogeneous.
On this point they cite a chi-square test for heterogeneity of 20.01 on 19 degrees of
freedom, p > 0.2. However, this is not so much evidence of homogeneity of relative risks
as it is reflective of the high unreliability of the individual relative risks. For 8 of the 20
relative risks shown, the upper limit on the relative risk exceeds the lower limit by a factor
of about 10 or more, that factor attaining a value of 57 in one instance.
Blot and Fraumeni express concern about other long term consequences of passive
smoking, particularly in connection with coronary artery disease. They cite a report by
Garland et al. [8] who initially reported a relative risk due to passive smoking of death
from ischemic heart disease of 14.9, but seem unaware that the estimate of 14.9 has been
revised downward to 2.7. In the report of the National Research Council [9], which I will
be discussing below, there is awareness of the downward revision, but not of the fact that
the suggestive significance of p < 0.10 is lost and becomes p < 0.20.
That lung cancer may aggregate in families is also of concern to Blot and Fraumeni,
who cite Ooi et al. on the subject [10]. Elsewhere, and yet to appear, I have suggested that
apparent familial aggregation, in the instance of breast cancer, may be a reflection of an
awareness bias rather than of true familial aggregation [11]. If information about
relatives is not collected more directly, the apparent aggregation based on reports from
the index case may only reflect heightened knowledge by such cases of similar illnesses
among relatives. But the report by Ooi et al. is another instance, like that of Garland et al.,
in which there has been unreliable statistical evaluation. Thus, Ooi et al. initially reported
that the lung cancer risk increased eighteen-fold per 10-year age increase. By letter in the
October, 1986 issue of the Journal of the National Cancer Institute they have revised that
factor downwards, giving separate factors for each 10-year age interval. From age 50 to
age 60, the factor is now reported at only 2.9.

The Report of the Committee on Passive Smoking, Board on Environmental
Studies and Toxicology, National Research Council [9]
I have chosen to discuss the epidemiologic aspects of this Report separately, since it is
essentially the definitive work on current knowledge on environmental tobacco smoke. A
member of the committee was Nicholas Wald, senior author of one of the articles
discussed above. The report contains a technical appendix which largely duplicates the
appendix in the article by Wald et al. and also repeats, with minor variations, the data of
Wald et al. The body of the report itself contains those same data, but recast differently,
and it is the same 13 studies, with 20 relative risk values, which underlie the epidemiologic
aspects of the Committee Report.
There are a great variety of issues which the Committee Report goes into, whether
physiochemistry, toxicology, assessment of exposures, use of questionnaires, exposure-
dose relationships, etc. But my concern at this time is the epidemiology. There could be a
point to estimating the annual number of lung cancer deaths in the United States due to
passive smoking, but that would have to be on the presumption that passive smoking
does play a causative role.
However, the Committee Report is quite restrained in its findings and leaves open the
question of whether anything has been established. If the apparent relative risk is
significantly greater than unity, the excess cannot be fully explained away by certain
biases considered. However, whether there is statistical significance in view of those
biases is not addressed.
From dosimetric considerations, the Report suggests that the excess risk of lung
cancer due to environmental tobacco smoke should be 1% of the excess risk due to active
smoking. This leads to a relative risk of 1.14 for men, perhaps less for women. From the
epidemiologic data, the summary relative risk is 1.34, but it is brought out that for United
States studies only the relative risk would be only 1.14. If only large studies are
considered, the overall relative risk would be 1.32.
Next addressed by the Report is the effect of biases, particularly the bias associated
with the false reporting of individuals that they were not (or never have been) smokers.
This leads to a lowering of the estimated relative risk of 1.34 (or 1.30 to 1.34) to 1.15. But
note that on this same basis, Wald et al. were willing to reduce an apparent relative risk of
1.35 only slightly, to 1.30.
Yet another adjustment is made. If non-smokers are not exposed to environmental
tobacco smoke at home, they might still be exposed to it away from home. An upward
adjustment of 8% on account of this yields 1.15 X 1.08 = 1.24. This contrasts with the
upward adjustment of 18% made by Wald et al., who calculated 1.30 X 1.18 = 1.53. The
Committee Report differs markedly from the separate report made by one of its own
members.
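
The two upward adjustments are plain multiplications and can be reproduced as follows. The figures are those quoted in the text; the helper name `adjust_upward` is illustrative only.

```python
def adjust_upward(relative_risk: float, fraction: float) -> float:
    """Scale a relative risk upward by the given fraction (0.08 means 8%)."""
    return relative_risk * (1.0 + fraction)


# Committee Report: 1.15 after the misreporting correction, then +8%.
committee = adjust_upward(1.15, 0.08)

# Wald et al.: 1.30 after their smaller correction, then +18%.
wald = adjust_upward(1.30, 0.18)

print(f"Committee Report: {committee:.2f}")
print(f"Wald et al.:      {wald:.2f}")
```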
In discussing Wald et al., I suggested that the upward modification they made
presupposed a positive role for passive smoking. This same thing is true for the 8%
upward adjustment in the Committee Report. For purposes of evaluating the statistical
significance of the findings, the relative risk should be taken as 1.15, though the value of
1.24 might be appropriate for assessing the toll in excess lung cancer due to passive
smoking assuming that there is causality. With the United States studies indicating an
unadjusted relative risk of only 1.14 rather than 1.34, both the 1.15 and the 1.24 might be
sharply lowered if intended to apply only to the United States.
But let me stay with the relative risk of 1.15 prior to the 8% upward adjustment. Is that
relative risk significantly in excess of 1.00? I suspect not. And even the question of bias
remains open. Both in the Committee Report and in the article by Wald et al., the only
biases factored in were just those that would fit into neat mathematical formulas. More
subtle biases or ones that had not been thought of did not get in. I gave an example above
of the use by Garfinkel et al. of the responses by sons and daughters of the level of
smoking by the fathers.
I might even speculate about publishing bias. If an investigator got a weakly or
insignificantly negative result for the role of passive smoking in lung cancer, would he
bother submitting it for publication? And if he did, would it be accepted for publication?
Postulating this kind of bias is not necessary for establishing that the 1.15 relative risk is
likely not significant. But I bring it up in connection with a tendency I see towards
accepting uncritically, or less critically, manuscripts which are on the right side of the fence
on the issue of passive smoking. A particular example was the publication of the article by
Garland et al. on passive smoking and ischemic heart disease mortality, the claims of
which fell apart on scrutiny.
Let me bring up now another thought. Some time ago the possibility of subtle or not-
so-subtle biases in case-control or other epidemiologic investigations was so much a
matter of concern that it was suggested that unless the relative risk were at least 2.0, any
increase in risk should not be accepted. Perhaps we can do better now and might employ a
less restrictive criterion.
But I can see no relaxation to the point of accepting the relative risks now observed for
passive smoking in lung cancer. What we must accept is that it is unlikely that any
epidemiologic investigation has been or can be mounted which would establish a causal
role for passive smoking in lung cancer. Those who believe such a role exists should
continue to believe as much, and might even hazard estimates as to the resulting toll in
deaths and disease, with others allowed to hold contrary beliefs. What would be incorrect
would be to claim that epidemiologic studies have established the correctness of the
belief. If epidemiologic investigations cannot establish a role for passive smoking, the best
we can do is to make suppositious estimates of how great that role may be - and such
suppositious estimates can be too high if any of the underlying supposals are false. One
supposal would be that the dosage response curve is linear through the origin, another
that some particular biochemical measure, say level of cotinine, is a proper measure of
the equivalent exposure to cigarettes of passive smoking. And, I point out, there could be
the assumption that the temperature at which tobacco smoke is inhaled is not relevant,
though I would think that fresh hot smoke would be more active than stale smoke.
With this thought in mind, we can pick up some clues from the report of Jarvis et al.
who, after excluding "deceivers", report average cotinine levels in plasma, saliva and
urine of 100 non-smokers to be at 0.55%, 0.55% and 0.36% respectively of those levels
from 94 smokers. Let us take it at 0.5%. If the average cigarette smoker has a relative risk
for lung cancer of 10.0 (enhancement of 900%, though the enhancement may be 1,400%
for very active smokers), this would put the enhanced risk due to environmental tobacco
smoke at 4.5%, for a relative risk of 1.045 (it would be 1.07 using the 1,400%
enhancement for very active smokers). That relative risk, 1.045, would encompass both
passive smoking at home and away from home, including individuals not exposed to
passive smoking at home.
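
The dosimetric calculation above can be written out as a short sketch. It rests on the supposals the text itself names: a dose-response curve that is linear through the origin, and cotinine as a proper measure of equivalent exposure. The function name is illustrative only.

```python
def passive_relative_risk(active_relative_risk: float, dose_fraction: float) -> float:
    """Scale the active smoker's excess risk by the relative dose.

    Assumes excess risk is proportional to dose (linear through the origin).
    """
    excess_active = active_relative_risk - 1.0   # 9.0 corresponds to 900%
    return 1.0 + dose_fraction * excess_active


dose = 0.005  # non-smokers' cotinine taken at 0.5% of smokers' levels

average_smoker = passive_relative_risk(10.0, dose)   # 900% enhancement
heavy_smoker = passive_relative_risk(15.0, dose)     # 1,400% enhancement

print(f"{average_smoker:.3f}")
print(f"{heavy_smoker:.3f}")
```

These reproduce the 1.045 and 1.07 quoted in the text.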
What matters, however, relative to the conduct of epidemiologic studies on the
subject, is the differential in relative risk between those knowingly exposed to passive
smoking and those who believe themselves unexposed. From data available in Jarvis et
al., it would appear that those seemingly not exposed to passive smoke (46 in number)
nevertheless have a relative risk of about 1.02. For the 54 non-smokers claimed to be
actually exposed to passive smoking, the relative risk based on cotinine levels would, in
similar manner, be 1.07. Compared then to the seemingly nonexposed to passive smoking,
the calculated relative risk for the known exposed to passive smoking would be 1.05. That
small increase in relative risk just would not show up on any epidemiologic investigation
and would be submerged, in any case, by other very likely biases. The National Research
Council report had suggested a relative risk, based on dosimetric considerations, of 1.14,
but on the assumption that enhancement in risk due to active smoking was 1,400%.
An enhancement of 900% would have led them to an anticipated relative risk of 1.09. But
whether we use 1.05, 1.09, or 1.14, the effect would still be undetectable.
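
Both figures in this paragraph follow from the numbers already quoted, as the sketch below shows. It assumes, as the surrounding argument does, that excess risk scales in proportion to the assumed enhancement for active smokers; the names are illustrative only.

```python
# Relative risks inferred in the text from cotinine levels in Jarvis et al.
rr_known_exposed = 1.07      # 54 non-smokers reporting passive exposure
rr_seemingly_unexposed = 1.02  # 46 non-smokers reporting none

# What an epidemiologic study compares is the ratio between the two groups.
differential = rr_known_exposed / rr_seemingly_unexposed


def rescale_excess(relative_risk: float, from_enhancement: float,
                   to_enhancement: float) -> float:
    """Rescale the excess risk in proportion to the assumed active-smoking enhancement."""
    excess = relative_risk - 1.0
    return 1.0 + excess * to_enhancement / from_enhancement


# NRC dosimetric figure of 1.14 assumed a 1,400% enhancement; redo it at 900%.
nrc_at_900 = rescale_excess(1.14, from_enhancement=14.0, to_enhancement=9.0)

print(f"differential relative risk: {differential:.2f}")
print(f"NRC figure at 900% enhancement: {nrc_at_900:.2f}")
```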
As a last point, I raise the issue of passive smoking effects on children. If parents can
be shamed into not exposing their children to passive smoking, this is all well and good,
even if the supporting basis is unsound. I note that the ill effects arise mostly in early
childhood, and have two questions. Have the passive smoking effects been isolated from
effects due to mother's smoking prior to the child's birth? To what extent has account
been taken that cigarette smoking concentrates in families with lower socio-economic
status, as evidenced by lower educational level and more unemployment, etc.? Rona et al.
also brought in the factor of overcrowding at home in their report that passive smoking
resulted in some small reduction in the stature of children [12]. But even Rona et al. failed
to take properly into account, as I have suggested, the role of some of these important
factors on smoking rates in their evaluation [13].
What with subtle biases, not so subtle biases, and even extravagant errors, one should
not accept too readily claimed demonstrations of ill effects of passive smoking. Passive
smoking has been the favorite whipping boy of epidemiologists for too long already. The
public is entitled not to be unnecessarily exposed to environmental tobacco smoke, but
any panic is unjustified.
References
1. Blot WJ, Fraumeni JF Jr (1986) Guest editorial. Passive smoking and lung cancer. Journal of
the National Cancer Institute 77:993-1000
2. Wald NJ, Nanchahal K, Thompson SG, Cuckle HS (1986) Contemporary theme. Does
breathing other people's tobacco smoke cause lung cancer? Br Med J 293:1217-22
3. Hirayama T (1981) Non-smoking wives of heavy smokers have a higher risk of lung cancer: A
study from Japan. Br Med J 282:183-5
4. Mantel N (1983) Guest editorial. Epidemiologic investigations: Care in conduct, care in
analysis and care in reporting. J Cancer Res Clin Oncol 105:113-6
5. Jarvis M, Tunstall-Pedoe H, Feyerabend C, Vesey C, Salloojee Y (1984) Biochemical markers
of smoke absorption and self reported exposure to passive smoking. Journal of Epidemiology
and Community Health 38:335-9
6. Garfinkel L, Auerbach O, Joubert L (1985) Involuntary smoking and lung cancer: A case-
control study. Journal of the National Cancer Institute 75:463-9
7. Mantel N (1986) Letter. Involuntary smoking and lung cancer - some object lessons. Journal of
the National Cancer Institute 76:1261-3
8. Garland C, Barrett-Connor E, Suarez L, Criqui M, Wingard D (1985) Effects of passive
smoking on ischemic heart disease mortality of non-smokers: A prospective study. Am J
Epidemiol 121:645-50
9. National Research Council (1986) Committee on passive smoking, board on environmental
studies and toxicology. Environmental tobacco smoke - measuring exposures and assessing
health effects. National Academy Press, Washington, DC
10. Ooi WL, Elston RC, Chen VW, Bailey-Wilson JE, Rothschild H (1986) Increased familial risk
for lung cancer. Journal of the National Cancer Institute 76:217-22
11. Mantel N (1989) Letter: Familial breast cancer and the awareness bias. Am J Epidemiol (in
press)
12. Rona RJ, Chinn S, Florey C (1985) Exposure to cigarette smoking and children's growth. Int J
Epidemiol 14:402-9
13. Mantel N (1986) Letter: Does passive smoking stunt the growth of children? Int J Epidemiol
15:427-8

GLOSSARY
Acute: Having a short course; of short duration.
Animal study: A controlled laboratory experiment in which animals
are exposed to an agent and the biological effects of this
exposure are assessed. The exposure may be via food or water
(ingestion), by injection, by external application or by
inhalation. Typical effects that might be measured are tumor
incidence or tissue and organ changes.
Bias: Regarding epidemiologic studies, the operation of factors
in a study's design or execution that erroneously lead to the
appearance of a stronger or weaker association between the
agent in question and disease than in fact exists.
Bioassay: The determination of the activity of a sample of an
agent by noting its effect on a live animal or an isolated
organ preparation.
Carcinogen: A substance or agent designated as capable of producing
or initiating cancer.
Carcinogen classification system: A system for stratifying the
weight of evidence for human carcinogenicity, for example,
the system followed by the EPA. The EPA system consists of
the following levels: Group A -- carcinogenic to humans;
Group B -- probably carcinogenic to humans; Group C -- possibly
carcinogenic to humans; Group D -- not classifiable as to human
carcinogenicity; and Group E -- evidence of non-carcinogenicity
for humans.1
Case-control study: A type of epidemiologic study which compares
diseased persons (cases) with nondiseased persons (controls)
in association with a common exposure to an agent.
Chronic: Persisting over a long period of time. Regarding animal
studies, refers to administration of the test substance over
a period of several weeks or months.
Cohort study: An epidemiologic study which examines the development
of a disease in a group (cohort) of persons who are currently
free of the disease. May assess exposure either prospectively
or retrospectively.
Confounding: As applied to epidemiologic studies, the situation
in which the relationship between an agent and a disease
appears stronger or weaker than it truly is due to the
influence of another unknown or unrecognized factor. In
confounding, the agent under consideration is associated with
another agent (a confounding factor, or confounder) which is
itself associated with either an increase or decrease in the
incidence of the disease.

1. The definitions for carcinogen classification system, dose-
response assessment, exposure assessment, hazard
identification, risk assessment, risk characterization and
weight of evidence are taken from the EPA's 1986 "Guidelines
for Carcinogen Risk Assessment," 51 Fed. Reg. 185, 33992-34003.

Dose-response assessment: Part of a risk assessment. Defines the
relationship between the dose of an agent and the probability
of induction of a carcinogenic effect.
Environmental tobacco smoke (ETS): Consists of smoke originating
from the smoldering end of a tobacco product between puffs,
e.g., sidestream smoke, and of smoke exhaled by the smoker.
The components are released into the environment where they
are diluted by ambient air and undergo changes related to
aging over time.
Epidemiology: The branch of science concerned with the patterns
of disease in human populations and the various factors that
influence these patterns.
Exposure assessment: Part of a risk assessment. Identifies
populations exposed to the agent, describes their composition
and size, and presents the types, magnitudes, frequencies and
durations of exposure to the agent.

Hazard identification: Part of a risk assessment. A qualitative
assessment of risk, dealing with the process of determining
whether exposure to an agent has the potential to increase
the incidence of cancer. It qualitatively answers the question
of how likely an agent is to be a human carcinogen.
In vitro: Literally, within glass; used to refer to laboratory
procedures conducted in a test tube or similar location, often
involving preparations of cells or tissues.
In vivo: Literally, within the living body; used to refer to
laboratory procedures utilizing live animals.
Mainstream smoke (MS): Tobacco smoke drawn through the butt end
of a cigarette.
Meta-analysis: A statistical technique for combining studies into
a single analysis, designed to increase the ability to
statistically detect an association if such an association is
present.
Mutagen: An agent that tends to increase the frequency or extent
of mutation, i.e., physical or biochemical changes in the
genetic material of an organism.

Binder, R., et al., "Importance of the Indoor Environment in
Air Pollution Exposure," Arch Environ Health 31(6): 277-279,
1976.
Pharmacokinetics: The study of the action of chemical substances
in the body over a period of time, including the processes of
absorption, distribution, metabolism and excretion.
Relative risk: The ratio of the incidence rate of a disease among
individuals exposed to a particular risk factor to the
incidence rate among unexposed individuals.
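
As a minimal illustration of this definition, the sketch below computes a relative risk from two incidence rates. The counts are hypothetical and chosen only to make the arithmetic easy to follow; the function name is likewise illustrative.

```python
def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Ratio of the incidence rate among the exposed to that among the unexposed."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed / incidence_unexposed


# Hypothetical cohort: 30 cases per 1,000 exposed, 20 cases per 1,000 unexposed.
rr = relative_risk(30, 1000, 20, 1000)
print(f"relative risk: {rr:.2f}")
```

A value of 1.00 would mean the same incidence in both groups; values above or below 1.00 indicate higher or lower incidence among the exposed.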
Risk assessment: The determination of adverse health consequences
from exposure to toxic agents. [Will be carried out
independently from considerations of the consequences of
regulatory action.] Includes one or more of the following
components: hazard identification, dose-response assessment,
exposure assessment and risk characterization.
Risk characterization: Part of a risk assessment. Combines the
results of exposure assessment and dose-response assessment
to estimate a carcinogenic risk in quantitative terms.
Risk management: A combination of risk assessment with the
directives of regulatory legislation, together with
socioeconomic, technical, political and other considerations,
to reach a decision as to whether or how much to control future
exposure to suspected toxic agents.
Short-term tests: In vitro (performed on cells or tissue cultures)
tests for mutations, including tests for chromosome
aberrations, DNA damage/repair and other transformations which
provide supportive evidence of cellular changes and may give
information on carcinogenic mechanisms.
Sidestream smoke (SS): Smoke originating from the smoldering end
of a tobacco product between puffs.
Statistical significance: A procedure to quantify the probability
that an observed outcome, e.g., an association between an
exposure and a disease endpoint, arose from random variation
alone. The scientific community often uses 5% as a standard
level at which data are accepted as occurring other than by
chance. This means that there is a 95% probability that the
results are not attributable to chance.
Toxicology: The scientific study of poisons, their actions, their
detection and the treatment of the conditions produced by them.
Weight of evidence: A framework utilized by the EPA for judging
the likelihood that an agent is a human carcinogen. Three
major steps are involved: (1) characterization of evidence
from human studies and from animal studies, individually; (2)
combination of the characterizations of these two types of
data into an indication of the overall weight of evidence;
and (3) evaluation of all supporting information to determine
if the overall weight of evidence should be modified. [See
also definition for carcinogen classification system.]

DEFINITIONS
In medical research, there are two major types of studies:
experimental studies and observational studies.
An Experimental Study requires that the members of a
study population be assigned to either a treatment or control
group. The treated and untreated groups are then followed
prospectively to see whether the two groups subsequently differ in
their disease experience.
Def: An Observational Study is one in which the
treatment or exposure of interest is not assigned but instead occurs
by choice or by happenstance.
The types of observational studies are the case report,
the cross-sectional study, the ecologic study, the case-control
study, and the cohort study (often called prospective).
A Case Report is strictly speaking not a scientific
study but a description of a small number of persons with an unusual
disease or an unusual change in their disease status.
A Cross-sectional Study reports the characteristics
of a group of people at one point in time or a snapshot of their
health picture.

The Ecologic Study uses data that are routinely
collected (such as air pollution data) to study the occurrence of
disease among groups of people. For example, heart disease
incidence may be studied in a group of people where information on
national dietary habits is known.
A Case-control Study or Retrospective Study is one
that begins with study subjects who have the disease of interest
and a comparison group without the disease. The previous exposures
of both groups are investigated.
A Cohort Study or Prospective Study is one in which
the researcher starts with one group of persons exposed to a factor
of interest and another comparable group that is unexposed. These
groups are observed at a later time to see whether they have
developed differences which might be attributable to their
different exposures.
A Confounder is a factor which confuses the correct
interpretation of the data relating to a suspect agent and disease. The
confounding factor acts by being associated both with the exposure
and the disease in a way that makes the exposure and the disease
seem to be related. An example, which was published in 1978,
related jet plane noise with an increased death rate. Upon
reexamination, it was found that persons exposed to jet noise lived
in devalued housing close to airports and were of a less fortunate
socioeconomic stratum. When a proper analysis of these other factors
was performed, jet noise was found to have no association with
increased mortality.
Confounding is the process by which a noncausal
association between two factors is produced by any association
with a third factor known as the confounder.
Bias is nonrandom error. Not related to the word bias
used in the sense of prejudice.
Risk (or absolute risk) is expressed as a death rate or
disease rate.
Relative Risk is the ratio or quotient of two risks or
absolute risks. It is also known as a risk ratio.
Odds Ratio is a measure of risk usually obtained from
case-control studies and mathematically close to relative risk.
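
The closeness of the odds ratio to the relative risk can be seen in a small sketch. The 2 x 2 counts below are hypothetical, and the closeness shown holds when the disease is uncommon in both groups, which is an assumption of this example rather than a statement in the definition.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """a, b: exposed with / without disease; c, d: unexposed with / without disease."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product ratio of the same 2 x 2 table."""
    return (a * d) / (b * c)


# Hypothetical table: 30 of 1,000 exposed and 20 of 1,000 unexposed have the disease.
a, b, c, d = 30, 970, 20, 980

rr_value = risk_ratio(a, b, c, d)
or_value = odds_ratio(a, b, c, d)

print(f"relative risk: {rr_value:.3f}")
print(f"odds ratio:    {or_value:.3f}")
```

With these counts the two measures differ only in the second decimal place; the gap widens as the disease becomes more common.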
p-value is a statistical estimate of the probability
that a finding is due to chance. By convention, a finding with a
p-value less than 5%, or sometimes 1%, is called statistically
significant.

Statistical Association. Two factors are statistically
associated when there is a tendency for the factors to occur
together or to change together. The observed relationship is the
statistical association, measured in many ways.

General Definition of Bias:
Deviation of results or inferences from the truth, or processes leading to such
deviation. Any trend in the collection, analysis, interpretation, publication or
review of data that can lead to conclusions that are systematically different from
the truth.1
Definition of Specific Biases:
Publication Biases:
Reviewer Bias - Systematic error due to failure of journal editors to accept and publish
reports with negative, non-significant, or contrary conclusions.
File-drawer Bias - Systematic error due to failure of authors to submit reports on
negative, non-significant, or contrary conclusions.
Researcher Bias - Systematic error due to failure of authors to include negative, non-
significant, or contrary conclusions in reports documenting multiple-endpoint studies.
Subject Biases:
Recall Bias - Systematic error due to differences between case and control subjects in
accuracy or completeness of recall of prior events or experiences that may be related to
the medical endpoint of concern.
Reporting Bias - Systematic error due to selective suppression (or revealing) by the
subject of information such as past history of other disease that is related to the medical
endpoint of concern.
Misclassification Bias - Systematic error due to inclusion of subjects in case or control
groups who do not meet exposure criteria.
Medical Biases:
Detection Bias - Systematic error due to differing methods of ascertainment, diagnosis,
or verification of cases between exposure groups.
Autopsy Bias - Systematic error resulting from the fact that autopsies represent a
nonrandom sample of deaths.
1 A Dictionary of Epidemiology, Second Edition. Ed: Last JM. Oxford University Press, New York,
1988.

Statistical Biases:
Design Bias - Systematic error due to faulty design of a study, including uncontrolled
confounding, poorly defined populations, and nonsimultaneous comparisons using
historical controls.
Sampling Bias - Systematic error due to nonrandom inclusion of subjects from the
reference population because of availability of subjects, willingness of subjects to
participate, criteria for selection, use of hospital cases and/or controls, and subsequent
follow-up failure, withdrawal or exclusion from the study.

A DICTIONARY OF
EPIDEMIOLOGY
SECOND EDITION
Edited for the
International Epidemiological Association
by
John M. Last
New York Oxford Toronto
OXFORD UNIVERSITY PRESS
1988

Oxford University Press
Oxford New York Toronto
Delhi Bombay Calcutta Madras Karachi
Petaling Jaya Singapore Hong Kong Tokyo
Nairobi Dar es Salaam Cape Town
Melbourne Auckland
and associated companies in
Berlin Ibadan
Copyright © 1988 by International Epidemiological Association, Inc.
Published by Oxford University Press, Inc.,
200 Madison Avenue, New York, New York 10016
Oxford is a registered trademark of Oxford University Press
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise,
without the prior permission of Oxford University Press.
Library of Congress Cataloging-in-Publication Data
A Dictionary of epidemiology.
Includes bibliographies.
1. Epidemiology-Dictionaries. I. Last, John M., 1926-
II. International Epidemiological Association.
[DNLM: 1. Epidemiology-dictionaries.
WA 13 D5551
RA651.D55 1988 611.4'0)'21 B7-_l1109
ISBN 0-19-505400-6
15BN 0-19-505181--! tpbk.l
Printed in the United States of America
on acid-free paper
Foreword
The International Epidemiological Association is extremely
pleased that the Dictionary of Epidemiology has been so successful
that a second edition has been demanded. As one of the Association's
aims is to "spread the message," this work is an example
of "what we call it." Only if we all understand the same
thing when a particular term is used will the aim of the Association
be capable of being fulfilled. This dictionary is fundamental
to this objective.
W. W. Holland, MD FRCGP FRCP FFCM
President, International Epidemiological Association

Preface
This dictionary, appearing now in its second edition, is an attempt
to bring some order to the occasionally chaotic nomenclature
of epidemiology. It is intended for all who are interested
in epidemiology, especially those who are beginning to
study the subject, those whose first language is not English, and
those from other fields who need to know the terms epidemiologists
use.
Like all rapidly expanding sciences, epidemiology has been
confounded by the proliferation of words and phrases to describe
its concepts, principles, methods, and procedures. The
creation of new terms and disagreement about the meaning of
old ones can confuse beginners and established epidemiologists
alike.
Remarks by users of the first edition have reinforced the view
that the boundaries should be wide rather than narrow, that
the language should be simple, that some terms many epide-
miologists think everyone already knows should be included.
The second edition is larger than the first, partly for this rea-
son, and because terms omitted from the first edition have been
included and many old entries expanded.
The dictionary is not an index of permitted and proscribed
usage. I hope that it is authoritative without being authoritar-
ian. Where synonyms exist, the definition appears under the
most commonly used of these, but preference for one term over
another is not necessarily implied. In a few instances, the use
of a term is deprecated. Some terms that are properly de-
scribed as slang or jargon have been included because they are
widely used and their meaning is not always clear from the con-
text. Murphy's description of jargon is worth recalling: "ob-
scure and/or pretentious language, circumlocutions, invented
meanings, and pomposity delighted in for its own sake."
There was disagreement among the contributors to this edi-
tion about including certain acronyms and eponyms. An acro-
nym is a word made up of letters from two or more other words,
e.g. ANOVA for analysis of variance, or from initial letters, e.g.
WHO for World Health Organization. All lay and technical vocabularies
contain acronyms; epidemiology has its fair share.
By convention, acronyms are spelt out the first time they ap-
pear in a text, and, if they are numerous, considerate editors
sometimes supply a glossary, or at least list the acronyms along
with the words for which they stand in an index. Although this
dictionary is not the place for extensive mention of acronyms,
a few appeared in the first edition, and a few more appear
here.
Eponyms, the attachment of personal or place names to con-
cepts, diseases, methods or specific studies, also occur often
enough in published papers and books for us to recognize that
beginners need some guidance to the meaning of those most
widely used. Some appeared in the first edition, and a few have
been added to the second-though again this dictionary is not
the proper place for a full glossary of epidemiological eponyms
(where would such a glossary end!).
As was the case with the first edition, a large number of epidemiologists
from many countries have participated in this revision.
The original modest notices in a couple of journals and
a few casual remarks among friends produced a mailing list of
some forty persons, mainly in North America and the United
Kingdom. The mailing list rapidly grew until, by the fifth round
of correspondence in December 1986, there were 108 corre-
spondents in 25 countries. The list continued to grow after this
fifth and final round; but the published roster of names that
follows this preface is both more and less than the number of
active participants. Some seemingly inquired just from curiosity
and played no further part. Others wrote lengthy and often
vigorously argumentative comments and suggestions express-
ing not only their own views but those of colleagues in their
academic department or institution (in one instance, colleagues
elsewhere in that nation).
In addition to extensive comments from these correspon-
dents, I have made good use of other technical dictionaries and
glossaries in compiling this revision. All of these are listed in
the bibliography, and many are also to be found in footnotes
that follow specific entries.
The compilers of dictionaries must exercise the greatest care
in the choice of words and in their arrangement. Most entries
in this dictionary have been repeatedly discussed with many
contributors, and in nearly all instances the wording has been
agreed upon by all; on the rare occasions when agreement eluded
us, the final decision was mine alone. Therefore, I accept full
responsibility for the deficiencies in the finished product.
The work has been sponsored by the International Epidemiological
Association, which provided partial travel support
for me to attend two meetings in 1986; further support was
provided by the McLean Foundation and the Milbank Memo-
rial Fund. All royalties from the sale of this edition, like those
from the first edition, will go to the International Epidemiol-
ogical Association.
Finally, I thank Jeffrey House of Oxford University Press for
helpful advice and encouragement.
Ottawa, Canada
November 1987
J. M. L.

Contributing Editors
J. H. ABRAMSON
Jerusalem, Israel
URSULA ACKERMAN-LIEBRICH
Basel, Switzerland
ROBERT ALLARD
Montreal, Quebec, Canada
JOHN C. BAILAR III
Washington, DC, USA
CHRISTOPHER BALDOCK
Brisbane, Queensland, Australia
ROBERTO G. BARUZZI
Sao Paulo, Brazil
ABRAM S. BENENSON
San Diego, CA, USA
ROGER BERNARD
Geneva, Switzerland
JEAN-FRANCOIS BOIVIN
Montreal, Quebec, Canada
BERNARD J. BRABIN
Madang, Papua New Guinea
C. RALPH BUNCHER
Cincinnati, OH, USA
BEVERLEY CARLSON
New York, NY, USA
JAMES CHIN
Berkeley, CA, USA
MICHEL COLEMAN
Oxford, England
L. CAYOLLA DA MOTTA
Lisbon, Portugal
GARETH DAVIES
New Haw, Surrey, England
RICHARD DICKER
Atlanta, GA, USA
ALVAN R. FEINSTEIN
New Haven, CT, USA
DAVID FINNEY
Edinburgh, Scotland
JOSEPH L. FLEISS
New York, NY, USA
GARY D. FRIEDMAN
Oakland, CA, USA
MICHAEL GARRAWAY
Edinburgh, Scotland
SANDER GREENLAND
Los Angeles, CA, USA
TEE GUIDOTTI
Edmonton, Alberta, Canada
WALTER W. HOLLAND
London, England
BARBARA HULKA
Chapel Hill, NC, USA
MICHEL IBRAHIM
Chapel Hill, NC, USA
LESLIE M. IRWIG
Sydney, NSW, Australia
MILOS JENICEK
Montreal, Quebec, Canada
L. KARHAUSEN
Luxembourg, Luxembourg

HENK LAMBERTS
Amsterdam, the Netherlands
JOHN M. LAST
Ottawa, Ontario, Canada
DALE LAWRENCE
Atlanta, GA, USA
DAVID E. LILIENFELD
New York, NY, USA
GENEVIEVE LOSLIER
Hudson, Quebec, Canada
ROBERT MACLENNAN
Brisbane, Queensland, Australia
MARGARET F. MCCANN
Chapel Hill, NC, USA
ANTHONY B. MILLER
Toronto, Ontario, Canada
KIUMARSS NASSERI
Teheran, Iran
JOHN S. NEUBERGER
Kansas City, KA, USA
NORMAN D. NOAH
London, England
ROBERT OSEASOHN
San Antonio, TX, USA
HARRIS PASTIDES
Amherst, MA, USA
MIQUEL PORTA
Barcelona, Spain
DAVID ROBINSON
Geneva, Switzerland
GEOFFREY A. ROSE
London, England
KENNETH J. ROTHMAN
Boston, MA, USA
JAMES J. SCHLESSELMAN
Bethesda, MD, USA
B.A. SOUTHGATE
London, England
CLAUDE STROHMENGER
Ottawa, Ontario, Canada
IAN SUTHERLAND
Cambridge, England
MERVYN SUSSER
New York, NY, USA
A.V. SWAN
London, England
RODOLFO SARACCI
Lyon, France
MICHEL THURIAUX
Copenhagen, Denmark
B. TOMA
Maisons-Alfort, France
CARL TYLER
Atlanta, GA, USA
ROBERT B. WALLACE
Iowa City, IA, USA
STEPHEN D. WALTER
Hamilton, Ontario, Canada
KERR L. WHITE
Stanardsville, VA, USA
DONALD WIGLE
Ottawa, Ontario, Canada
CORRESPONDING EDITORS
ERIK ALLANDER
Huddinge, Sweden
ALBERTA ALZATE
Cali, Colombia
GEOFFREY A. ANDERSON
Ottawa, Ontario, Canada
MARY JANE ASHLEY
Toronto, Ontario, Canada
R.S. BHOPAL
Glasgow, Scotland
PATRICIA A. BUFFLER
Houston, TX, USA
ARVIND A. CARPENTER
Oak Ridge, TN, USA
CARL J. CASPERSON
Atlanta, GA, USA
DENIS CHARPIN
Marseilles, France
GERALD R. CHASE
Denver, CO, USA
EMIL E. CRISTOFANO
Akron, OH, USA
ORESTES FAGET CEPERO
Havana, Cuba
ANNE HERSEY COULSON
Los Angeles, CA, USA
DAVID CUNDIFF
Trenton, NJ, USA
ROGER DETELS
Los Angeles, CA, USA
RODGER DOYLE
Buffalo, NY, USA
GERARD DUBOIS
Paris, France
JACQUELINE FABIA
Quebec City, Quebec, Canada
G.I. FORBES
Edinburgh, Scotland
LINA FORCIER
Sydney, NSW, Australia
EDUARDO FRANCO
Sao Paulo, Brazil
JACK FROOM
Stony Brook, NY, USA
TRULS GEDDE-DAHL
Oslo, Norway
O.N. GILL
London, England
DAVID GOLDSMITH
Edmonton, Alberta, Canada
PATRICIA GRAVES
Madang, Papua New Guinea
VINCENT GUINEE
Houston, TX, USA
MATTI HAKAMA
Tampere, Finland
A. SCOTT HENDERSON
Canberra, ACT, Australia
CATHERINE HILL
Villejuif, France
ERNEST B. HOOK
Albany, NY, USA
J. JANCAR
Bristol, England
FINN KAMPER-JORGENSEN
Copenhagen, Denmark
H.W. KANIS
Lelystad, the Netherlands
JENNIFER KELSEY
New York, NY, USA
MARY CLAIRE KING
Berkeley, CA, USA
MAURICE KING
Leeds, England
GEORGE KNOX
Birmingham, England
JESS KRAUS
Los Angeles, CA, USA
H. OLIVER LANCASTER
Sydney, NSW, Australia
F.D.K. LIDDELL
Montreal, Quebec, Canada
IAN MCDOWELL
Ottawa, Ontario, Canada
H. MICHAEL MAETZ
Birmingham, AL, USA
Luls MACJIo
Lisbon, Portugal
J.S. MALHI
Brighton, England
DILEEP V. MAVALANKAR
Ahmedabad, India
DAVID MORRIS
London, England
C.S. MUIR
Lyon, France
ENRIQUE NAJERA
Seville, Spain
KATE O'CONNOR
Ottawa, Ontario, Canada
H. TUNSTALL PEDOE
Dundee, Scotland
I. PLESKO
Bratislava, Czechoslovakia
PAULA RANTAKALLIO
Oulu, Finland
R.A. ROBINSON
St. Paul, MN, USA
ROGER ROCHAT
Atlanta, GA, USA
JEFFREY ROSEMAN
Birmingham, AL, USA
PHILIP ROSS
Honolulu, HI, USA
M. SARIC
Zagreb, Yugoslavia
BJORN SMEDBY
Uppsala, Sweden
ANDREU SEGURA
Barcelona, Spain
COLIN SOSKOLNE
Edmonton, Alberta, Canada
PAULA STEWART
Ottawa, Ontario, Canada
ROBERT SPASOFF
Ottawa, Ontario, Canada
D. STRACHILOV
Sofia, Bulgaria
HUGH TILSON
Research Triangle Park, NC, USA
TOSHIO TOYAMA
Tokyo, Japan
EDWARD J. TRAPIDO
Miami, FL, USA
STJEPAN VIDACEK
Zagreb, Yugoslavia
ANNE WALLING
Wichita, KA, USA
ROBERT WEST
Cardiff, Wales
JAN WIENPAHL
Los Angeles, CA, USA
WU XI-KE
Hefei, Anhui, China
M.J. WYSOCKI
Warsaw, Poland
FARAT YUSUF
North Ryde, NSW, Australia
FRED ZERFAS
Los Angeles, CA, USA
SHUXIAN ZU
Hefei, Anhui, China
A Dictionary of Epidemiology
ABORTION RATE The estimated annual number of abortions per 1000 women of reproductive
age (usually defined as age 15-44).
ABORTION RATIO The estimated number of abortions per 100 live births in a given year.
ABSCISSA The distance along the horizontal coordinate, or x axis, of a point P from the
vertical or y axis of a graph. See also AXIS, GRAPH, ORDINATE.
ABSOLUTE RISK Usually this term means the observed or calculated risk of an event in
a population under study, as contrasted with the relative risk. Sometimes, however,
it is a synonym for attributable fraction, excess risk, or risk difference; because of
the inconsistency, this term should be avoided. See also RISK.
ACCEPTABLE RISK The risk that has minimal detrimental effects, or for which the benefits
outweigh the potential hazards. Epidemiologic study has provided data for
calculation of risks associated with many medical procedures and also with occupational
and environmental exposures; these data are used, for instance, in CLINICAL
DECISION ANALYSIS.
ACCURACY The degree to which a measurement, or an estimate based on measurements,
represents the true value of the attribute that is being measured. See also
MEASUREMENT, PROBLEMS WITH TERMINOLOGY.
ACQUAINTANCE NETWORK Group of persons in contact or communication among whom
transmission of an infectious agent and of knowledge, attitudes, and values is possible,
and whose social interaction may have health implications. See also TRANSMISSION
OF INFECTION.
ACQUIRED IMMUNODEFICIENCY SYNDROME (Syn: acquired immune deficiency syndrome)
(AIDS) For surveillance purposes, the Centers for Disease Control, Atlanta, Georgia,¹
define a case of AIDS as an illness characterized by (1) one or more of a group
of opportunistic or indicator diseases that are indicative of underlying cellular
immunodeficiency; (2) absence of all known underlying causes of cellular
immunodeficiency and absence of all other causes of reduced resistance to opportunistic or
indicator diseases. Additional criteria are serum positive for HIV antibody, positive
culture for HIV, and reduction of T4 "helper" lymphocytes.
The opportunistic or indicator diseases associated with AIDS include certain protozoal
and helminth infections, notably Pneumocystis carinii pneumonia and toxoplasmosis;
fungal infections, notably candidiasis of esophagus, trachea, bronchi or
lungs and cryptococcosis, especially affecting the central nervous system; bacterial
infections, notably with certain mycobacteria; viral infections, notably cytomegalovirus
and herpes simplex; and cancer, notably Kaposi's sarcoma and lymphoma
limited to the brain.
AIDS-related complex (ARC) is the combination of HIV positive test with lymphadenopathy
and persistent low fever but without immunodeficiency or opportunistic
diseases.
¹ 1987 Revision of case definition of AIDS for surveillance purposes. MMWR 36, 1S:4S-9S, 1987.
ACTIVITIES OF DAILY LIVING (ADL) SCALE A scale devised by Katz and others¹ to score
physical ability/disability; used to measure outcomes of interventions for various
chronic disabling conditions such as arthritis. The scale is based on scores for responses
to questions about mobility, self-care, grooming, etc. This was the first widely
used scale of this type; others, mostly refinements or variations of the ADL scale,
have since been developed.
¹ Katz S, Ford AB, Moskowitz RW, Jackson BA, Jaffe MW: Studies of illness in the aged. The
index of ADL, a standardized measure of biological function. JAMA 185:914-919, 1963.
ACTUARIAL RATE See FORCE OF MORTALITY.
ACTUARIAL TABLE See LIFE TABLE.
ACUTE
1. Referring to a health effect, brief; sometimes loosely used to mean severe.
2. Referring to exposure, brief, intense, or short-term; sometimes specifically referring
to brief exposure of high intensity. See also CHRONIC.
ADAPTATION A heritable component of the phenotype which confers an advantage in
survival and reproductive success. The process by which organisms adapt to environmental
conditions.
ADDITIVE MODEL A model in which the combined effect of several factors is the sum of
the effects that would be produced by each of the factors in the absence of the
others. For example, if factor X adds x% to risk in the absence of Y, and if factor Y
adds y% to risk in the absence of X, an additive model states that the two factors
together will add (x+y)% to risk. See also INTERACTION; LINEAR MODEL; MATHEMATICAL
MODEL; MULTIPLICATIVE MODEL.
ADJUSTMENT A summarizing procedure for a statistical measure in which the effects of
differences in composition of the populations being compared have been minimized
by statistical methods. Examples are adjustment by regression analysis and
by standardization. Adjustment often is performed on rates or relative risks, commonly
because of differing age distributions in populations that are being compared.
The mathematical procedure commonly used to adjust rates for age differences
is direct or indirect standardization.
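The direct form of standardization mentioned in the ADJUSTMENT entry can be sketched in a few lines. This is a minimal illustration, not a procedure given in the dictionary: the function name, the age bands, the rates, and the standard population are all hypothetical, and the sketch assumes each age-specific rate is simply weighted by the standard population's share of that age group.

```python
def directly_standardized_rate(age_specific_rates, standard_population):
    """Weight each age-specific rate by the standard population's share of that age group."""
    total = sum(standard_population.values())
    return sum(
        rate * standard_population[age] / total
        for age, rate in age_specific_rates.items()
    )

# Hypothetical age-specific death rates per 1000 per year in a study population.
study_rates = {"0-39": 1.0, "40-64": 5.0, "65+": 40.0}
# Hypothetical standard population (counts per age group).
standard = {"0-39": 600, "40-64": 300, "65+": 100}

adjusted_rate = directly_standardized_rate(study_rates, standard)
print(round(adjusted_rate, 2))  # rate per 1000 that the standard population would show
```

Two populations adjusted against the same standard can then be compared without their differing age structures driving the difference.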
ADVERSE REACTION, SIDE EFFECT Any undesirable or unwanted consequence of a preventive,
diagnostic, or therapeutic procedure.
AETIOLOGY, AETIOLOGIC See ETIOLOGY, ETIOLOGIC.
AGE DEPENDENCY RATIO See DEPENDENCY RATIO.
AGENT (OF DISEASE) A factor, such as a microorganism, chemical substance, or form of
radiation, whose presence, excessive presence, or (in deficiency diseases) relative
absence is essential for the occurrence of a disease. A disease may have a single
agent, a number of independent alternative agents (at least one of which must be
present), or a complex of two or more factors whose combined presence is essential
for the development of the disease. See also CAUSALITY; NECESSARY AND SUFFICIENT
CAUSE.
AGE-PERIOD COHORT ANALYSIS See COHORT ANALYSIS.
AGE-SEX PYRAMID See POPULATION PYRAMID.
AGE-SEX REGISTER List of all clients or patients of a medical practice or service, classified
by age (birthdate) and sex; provides denominator for calculating age- and sex-specific
rates.
AGE-SPECIFIC FERTILITY RATE The number of births occurring during a specified period
to women of a specified age group, divided by the number of person-years
lived during that period by women of that age group. When an age-specific fertility
rate is calculated for a calendar year, the number of births to women of the specified
age is usually divided by the midyear population of women of that age.
AGE-SPECIFIC RATE A rate for a specified age group. The numerator and denominator
refer to the same age group.
Example:
\[
\text{Age-specific death rate (age 25-34)} =
\frac{\text{Number of deaths among residents age 25-34 in an area in a year}}
     {\text{Average (or midyear) population age 25-34 in the area in that year}}
\times 100{,}000
\]
The multiplier (usually 100,000 or 1,000,000) is chosen to produce a rate that can
be expressed as a convenient number.
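As a worked illustration of the age-specific death rate example above, the calculation might be sketched as follows. The function name and the counts are hypothetical; the only assumption beyond the entry is that the midyear population is used as the denominator.

```python
def age_specific_rate(events, midyear_population, multiplier=100_000):
    """Events in one age group divided by that group's midyear population, times a multiplier."""
    if midyear_population <= 0:
        raise ValueError("midyear population must be positive")
    return events / midyear_population * multiplier

# Hypothetical figures: 30 deaths among 20,000 residents aged 25-34 in one year.
death_rate_25_34 = age_specific_rate(events=30, midyear_population=20_000)
print(round(death_rate_25_34, 1))  # deaths per 100,000 persons aged 25-34
```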
AGE STANDARDIZATION A procedure for adjusting rates, e.g. death rates, designed to
minimize the effects of differences in age composition when comparing rates for
different populations. See also ADJUSTMENT, STANDARDIZATION.
AGGREGATION BIAS (Syn: ecological bias) See ECOLOGICAL FALLACY.
AGING OF THE POPULATION A demographic term, meaning an increase over time in the
proportion of older persons in the population. It does not necessarily imply an
increase in life expectancy or that "people are living longer than they used to." The
principal determinant of aging in the population has been a decline in the birth
rate: when fewer children are born than in prior years, the result, in the absence
of a rise in the death rate at higher ages, has been an increase in the proportion of
older persons in the population. In developed societies, however, mortality change
is becoming a factor: little further mortality reduction can occur in the first half of
life, so reductions are beginning to occur in the third and fourth quarters of life,
leading to a rise in the proportion of older persons from this cause.
AIRBORNE INFECTION A mechanism of transmission of an infectious agent by particles,
dust, or DROPLET NUCLEI suspended in the air. See also TRANSMISSION OF INFECTION.
ALGORITHM Any systematic process that consists of an ordered sequence of steps with
each step depending on the outcome of the previous one. The term is commonly
used to describe a structured process, for instance, relating to computer programming
or to health planning. See also DECISION TREE.
ALGORITHM, CLINICAL (Syn: clinical protocol) An explicit description of steps to be taken
in patient care in specified circumstances. This approach makes use of branching
logic and of all pertinent data, both about the patient and from epidemiologic and
other sources, to arrive at decisions that yield maximum benefit and minimum risk.
ALLELE Alternative forms of a gene, occupying the same locus on a chromosome.
ALPHA ERROR See ERROR, TYPE I.
ALPHA LEVEL See SIGNIFICANCE LEVEL.
ANALYSIS OF VARIANCE A statistical technique that isolates and assesses the contribution
of categorical independent variables to variation in the mean of a continuous dependent
variable. The observations are classified according to their categories for
each of the independent variables, and the differences between the categories in
their mean values on the dependent variable are estimated and tested for statistical
significance.
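For the simplest case, a single categorical independent variable, the comparison of category means described above reduces to the one-way F statistic. The sketch below is an illustration under that assumption only; the function name and the data are hypothetical, and turning F into a significance level would additionally require the F distribution, which is not shown.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square) for one categorical factor."""
    k = len(groups)                       # number of categories
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical measurements of a continuous variable in three exposure categories.
observations = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_statistic = one_way_anova_f(observations)
print(round(f_statistic, 2))
```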
ANALYTIC STUDY A study designed to examine associations, commonly putative or hypothesized
causal relationships. An analytic study is usually concerned with identifying
or measuring the effects of risk factors, or is concerned with the health effects
of specific exposure(s). Contrast descriptive study, which does not test hypotheses.
The common types of analytic study are CROSS-SECTIONAL, COHORT, and CASE-CONTROL.
In an analytic study, individuals in the study population may be classified
according to absence or presence (or future development) of specific disease and
according to "attributes" that may influence disease occurrence. Attributes may include
age, race, sex, other disease(s), genetic, biochemical, and physiological characteristics,
economic status, occupation, residence, and various aspects of the environment
or personal behavior. See also CASE CONTROL STUDY; COHORT STUDY; CROSS-SECTIONAL
STUDY; STUDY DESIGN.
ANIMAL MODEL Study in a population of laboratory animals that uses conditions of animals
analogous to conditions of humans to model processes comparable to those
that occur in human populations. See also EXPERIMENTAL EPIDEMIOLOGY.
ANTAGONISM Opposite of SYNERGISM. The situation in which the combined effect of two
or more factors is smaller than the solitary effect of any one of the factors. In
BIOASSAY, the term may be used to refer to the situation when a specified response
is produced by exposure to either of two factors but not by exposure to both together.
ANTHROPOMETRY The technique that deals with the measurement of the size, weight,
and proportions of the human body.
ANTHROPOPHILIC (adj.) Pertaining to an insect's preference for feeding on humans even
when nonhuman hosts are available.
ANTIBODY Protein molecule formed by exposure to a "foreign" or extraneous substance,
e.g., invading microorganisms responsible for infection, or active immunization. May
also be present as a result of passive transfer from mother to infant, via immune
globulin, etc. Antibody has the capacity to bind specifically to the foreign substance
(antigen) that elicited its production, thus supplying a mechanism for protection
against infectious diseases. Antibody is epidemiologically important because its concentration
(titer) can be measured in individuals, and, therefore, in populations.
See also SEROEPIDEMIOLOGY.
ANTIGEN A substance (protein, polysaccharide, glycolipid, tissue transplant, etc.) that is
capable of inducing specific immune response. Introduction of antigen may be by
the invasion of infectious organisms, immunization, inhalation, ingestion, etc.
ANTIGENIC DRIFT This term describes the "evolutionary" changes that take place in the
molecular structure of DNA/RNA in micro-organisms during their passage from
one host to another. It may be due to recombination, deletion or insertion of genes,
to point mutations, or to several of these events. This process has been studied in
common viruses, notably the influenza virus.¹ It leads to alteration (usually slow
and progressive) in the antigenic composition, and thus in the immunologic responses
of individuals and populations to exposure to the micro-organisms concerned.
See also ANTIGENIC SHIFT.
¹ Palese P, Young JF: Variation of Influenza A, B, and C Viruses. Science 215:1468-1473, 1982.
ANTIGENIC SHIFT This term describes mutation, i.e., a sudden change in molecular
structure of DNA/RNA in micro-organisms, especially viruses, which produces new
strains of the microorganism. Hosts previously exposed to other strains have little
or no acquired immunity. Antigenic shift is believed to be the explanation for the
occurrence of strains of the influenza A virus associated with large-scale epidemic
and pandemic spread. Antigenic shift is responsible for the susceptibility of host
populations to a new strain of influenza virus. See also ANTIGENIC DRIFT.
ANTIGENICITY (Syn: immunogenicity) The ability of agent(s) to produce a systemic or a
local immunologic reaction in the host.
ARBOVIRUS A group of taxonomically diverse animal viruses that are unified by an epidemiologic
concept, i.e., transmission between vertebrate host organisms by blood-feeding
(hematophagous) arthropod vectors such as mosquitoes, ticks, sand flies,
and midges. The term is a contraction of arthropod-borne virus.
The interaction of arbovirus, vertebrate host(s), and arthropod vector gives this
class of infections several unique epidemiologic features. See VECTOR-BORNE INFECTION
for definition of terms used to describe these features.
AREA SAMPLING A method of sampling that can be used when the numbers in the population
are unknown. The total area to be sampled is divided into subareas, e.g., by
means of a grid that produces squares on a map; these subareas are then numbered
and sampled, using a table of random numbers. Depending upon circumstances,
the population in the sampled areas may first be enumerated, then a second stage
of sampling may be conducted.
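The first stage of area sampling, numbering the grid squares and drawing some of them at random, might be sketched as below. This is an illustration only: the function name and grid dimensions are hypothetical, and a pseudo-random number generator stands in for the table of random numbers the entry describes.

```python
import random

def sample_grid_squares(n_rows, n_cols, n_sample, seed=None):
    """Number the squares of an n_rows x n_cols map grid and draw n_sample of them without replacement."""
    rng = random.Random(seed)  # seeded generator, used here in place of a random number table
    squares = [(row, col) for row in range(n_rows) for col in range(n_cols)]
    return rng.sample(squares, n_sample)

# Hypothetical map divided into a 10 x 8 grid; 12 squares are selected for enumeration.
selected_squares = sample_grid_squares(n_rows=10, n_cols=8, n_sample=12, seed=1987)
print(sorted(selected_squares))
```

The population of each selected square could then be enumerated and, if needed, sampled again in a second stage.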
ARITHMETIC MEAN The sum of all the values in a set of measurements, divided by the
number of values in the set.
ARTIFICIAL INTELLIGENCE A branch of computer science in which attempts are made to
duplicate human intellectual functions. One application is in diagnosis, in which
computer programs are often based upon epidemiologic analyses of data in hospital
charts or other clinical records.
ASCERTAINMENT The process of determining what is happening in a population or study
group, e.g., family and household composition, occurrence of cases of specific diseases;
the latter is also known as case-finding.
ASCERTAINMENT BIAS Systematic failure to represent equally all classes of cases or persons
supposed to be represented in a sample. This bias may arise because of the
nature of the sources from which persons come, e.g., a specialized clinic; from a
diagnostic process influenced by culture, custom, or idiosyncrasy; or, for example,
in genetic studies, from the statistical chance of selecting from large or small families.
ASSAY The quantitative or qualitative evaluation of a hazardous substance; the results
of such an evaluation.
ASSOCIATION (Syn: correlation, [statistical] dependence, relationship) Statistical dependence
between two or more events, characteristics, or other variables. An association
is present if the probability of occurrence of an event or characteristic, or the
quantity of a variable, depends upon the occurrence of one or more other events,
the presence of one or more other characteristics, or the quantity of one or more
other variables. The association between two variables is described as positive when
the occurrence of higher values of a variable is associated with the occurrence of
higher values of another variable. In a negative association, the occurrence of higher
values of one variable is associated with lower values of the other variable. An association
may be fortuitous or may be produced by various other circumstances;
the presence of an association does not necessarily imply a causal relationship. If
the use of the term "association" is confined to situations in which the relationship
between two variables is statistically significant, the terms "statistical association" and
"statistically significant association" become tautological. However, ordinary usage
is seldom so precise as this. The terms "association" and "relationship" are often
used interchangeably.
Associations can be broadly grouped under two headings, symmetrical or noncausal
(see below) and asymmetrical or causal.
ASSOCIATION, ASYMMETRICAL (Syn: asymmetrical relationship) The definitive conditions
of asymmetrical associations are direction and time. Independent variable X must
cause changes in dependent variable Y, and the "causal" variable must precede its
"effects." Bradford Hill¹ and others²,³ have pointed out that the (subjective) likelihood
of a causal relationship is increased by the presence of the following attributes.
However, temporality is the only indispensable condition among these.
1. Consistency: The association is consistent if the results are replicated when
studied in different settings and by different methods.
2. Strength: This is an expression of the disparity between the frequency with
which a factor is found in the disease and the frequency with which it occurs
in the absence of the disease. Not to be confused with statistical significance.
3. Specificity: This is established with the limitation of the association to a single
putative cause and single effect.
4. Dose-response relationship: This is established when an increased risk or severity
in disease occurs with an increased quantity ("dose") or duration of exposure
to a factor.
5. Temporality: The exposure to a putative cause always precedes, never follows,
the outcome.
6. Biological plausibility: It is desirable that the association agree with current
understanding of the response of cells, tissues, organs, and systems to stimuli.
This criterion should not be applied rigidly. The association may be new to
science or medicine. As Sherlock Holmes advised Dr. Watson, "When you have
eliminated the impossible, whatever remains, however improbable, must be
the truth."
7. Coherence: The associations should not conflict with the generally known facts
of the natural history and biology of disease.
8. Experiment: It is sometimes possible to appeal to experimental, or quasi-experimental
evidence, e.g., an observed association leads to some preventive
action. Does this action in fact prevent?
See also CAUSALITY; EVANS'S POSTULATES; KOCH'S POSTULATES.
¹ Bradford Hill A: The environment and disease: Association or causation. Proc Roy Soc Med 58:295-300, 1965.
² Susser MW: Judgment and causal inference. Am J Epidemiol 105:1-15, 1977.
³ Rothman KJ (Ed): Causal Inference. Chestnut Hill, MA: Epidemiology Resources Inc., 1988.
ASSOCIATION, DIRECT Directly associated, i.e., not via a known third variable: A → B. Refers
only to causality.
ASSOCIATION, INDIRECT CAUSAL Two types are distinguished:
1. Association of a factor C with disease A only because both are related to a
common underlying factor B.
A ← B → C
Alteration of factor C will not produce an alteration in the frequency of disease
A unless an alteration in C affects B. It has been suggested that to avoid
confusion with the alternative meaning of indirect association, this type should
be called "secondary association."
2. Association of a factor C with disease A by means of an intermediate or intervening
factor B.
C → B → A
Alteration of factor C would produce an alteration in the frequency of disease
A. To avoid confusion, this type should be called "indirect causal association."
ASSOCIATION, SPURIOUS A term, preferably avoided, used with different meanings by
different authors. It may refer to artifactual, fortuitous, false secondary, or to all
kinds of noncausal associations due to chance, bias, failure to control for extraneous
variables, etc.
ASSOCIATION, SYMMETRICAL An association is noncausal if it is symmetrical, as in the
statement F=MA (force equals mass times acceleration). This is a noncausal, nondirectional
expression of the mathematical relationship between the physical properties
of force, mass, and acceleration. If one side of the equation is changed, then the
other must also change to maintain equilibrium.
Although epidemiologists are usually most interested in asymmetrical statements
that have direction, the symmetrical equation can be useful. For instance, prevalence
can be expressed in terms of incidence and duration in the simple equation,
P = I × D. If two of these three elements are known, the third can be derived. See
also SYMMETRICAL RELATIONSHIP.
ASSORTATIVE MATING Selection of a mate with preference (or aversion) for a particular
genotype, i.e., nonrandom mating.
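The symmetrical relation between prevalence, incidence, and duration (P = I × D) given under ASSOCIATION, SYMMETRICAL can be illustrated with a short sketch that derives whichever of the three quantities is missing. The function name and the figures are hypothetical, and the sketch assumes incidence and duration are expressed in matching time units.

```python
def solve_prevalence_relation(prevalence=None, incidence=None, duration=None):
    """Given exactly two of P, I, and D in P = I x D, return the third."""
    supplied = [prevalence, incidence, duration]
    if sum(value is None for value in supplied) != 1:
        raise ValueError("supply exactly two of prevalence, incidence, duration")
    if prevalence is None:
        return incidence * duration
    if incidence is None:
        return prevalence / duration
    return prevalence / incidence

# Hypothetical figures: incidence of 0.02 per person-year, mean duration of 5 years.
derived_prevalence = solve_prevalence_relation(incidence=0.02, duration=5)
derived_incidence = solve_prevalence_relation(prevalence=0.10, duration=5)
print(round(derived_prevalence, 3), round(derived_incidence, 3))
```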
ASYMMETRICAL ASSOCIATION See ASSOCIATION, ASYMMETRICAL.
ASYMPTOTIC Pertaining to a limiting value, for example, of a dependent variable, when
the independent variable approaches zero or infinity. See LARGE SAMPLE METHOD.
ASYMPTOTIC METHOD See LARGE SAMPLE METHOD.
ATTACK RATE Attack rate, or case rate, is a CUMULATIVE INCIDENCE RATE often used for
particular groups, observed for limited periods and under special circumstances, as
in an epidemic.
The secondary attack rate is the number of cases among contacts occurring within
the accepted incubation period following exposure to a primary case, in relation to
the total of exposed contacts; the denominator may be restricted to susceptible contacts
when determinable.
Infection rate is the incidence of manifest plus inapparent infections, which can be
identified, e.g., by SEROEPIDEMIOLOGY.
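A minimal sketch of the secondary attack rate calculation follows. The function name and the household figures are hypothetical, and expressing the result per 100 contacts is an assumption made for readability; the entry itself specifies only the numerator and denominator.

```python
def secondary_attack_rate(secondary_cases, exposed_contacts, per=100):
    """Cases among contacts within the incubation period, relative to all exposed (or susceptible) contacts."""
    if exposed_contacts <= 0:
        raise ValueError("there must be at least one exposed contact")
    if secondary_cases > exposed_contacts:
        raise ValueError("cases among contacts cannot exceed the number of contacts")
    return secondary_cases / exposed_contacts * per

# Hypothetical outbreak: 6 secondary cases among 40 susceptible household contacts.
household_sar = secondary_attack_rate(secondary_cases=6, exposed_contacts=40)
print(round(household_sar, 1))  # secondary cases per 100 contacts
```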
ATTRIBUTABLE FRACTION (AF) (Syn: attributable proportion) A term sometimes used to
refer to the attributable fraction in the population, and sometimes to the attributable
fraction among the exposed. See also ATTRIBUTABLE FRACTION (EXPOSED);
ATTRIBUTABLE FRACTION (POPULATION).
ATTRIBUTABLE FRACTION (EXPOSED) (Syn: attributable proportion (exposed), attributable
risk, etiologic fraction (exposed)). With a given outcome, exposure factor and
population, the attributable fraction among the exposed is the proportion by which
the incidence rate of the outcome among those exposed would be reduced if the
exposure were eliminated. It may be estimated by the formula
$$ AF_e = \frac{I_e - I_u}{I_e} $$
where $I_e$ is the incidence rate among the exposed, $I_u$ is the incidence rate among
the unexposed; or by the formula
$$ AF_e = \frac{RR - 1}{RR} $$
where RR is the rate ratio, $I_e/I_u$. It is assumed that causes other than the one under
investigation have had equal effects on the exposed and unexposed groups.
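The two formulas in this entry, AF_e = (I_e - I_u)/I_e and AF_e = (RR - 1)/RR with RR = I_e/I_u, should agree for the same data. The following Python sketch checks this on made-up incidence rates; the function names and figures are assumptions for illustration only.

```python
def af_exposed_from_rates(i_e, i_u):
    """Attributable fraction among the exposed from the two incidence rates."""
    return (i_e - i_u) / i_e


def af_exposed_from_rr(rr):
    """Attributable fraction among the exposed from the rate ratio RR = I_e / I_u."""
    return (rr - 1) / rr


# Hypothetical rates per 1000 person-years: exposed 30, unexposed 10.
i_e, i_u = 30.0, 10.0
rr = i_e / i_u
print(af_exposed_from_rates(i_e, i_u))
print(af_exposed_from_rr(rr))
```

Both calls return the same proportion because the second formula is the first one divided through by I_u.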
ATTRIBUTABLE FRACTION (POPULATION) (Syn: attributable proportion (population),
etiologic fraction (population), attributable risk). With a given outcome, exposure
factor, and population, the attributable fraction among the population is the
proportion by which the incidence rate of the outcome in the entire population would be
reduced if exposure were eliminated. It may be estimated by the formula
$$ AF_p = \frac{I_p - I_u}{I_p} $$
where $I_p$ is the incidence rate in the total population and $I_u$ is the incidence rate
among the unexposed; or by the formula
$$ AF_p = \frac{P_e(RR - 1)}{1 + P_e(RR - 1)} $$
where RR is the rate ratio, $I_e/I_u$, and $P_e$ is the proportion of the population
exposed. It is assumed that causes other than the one under
investigation have had equal effects on the exposed and unexposed groups.
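As a hedged illustration, the Python sketch below evaluates both population formulas, AF_p = (I_p - I_u)/I_p and AF_p = P_e(RR - 1)/(1 + P_e(RR - 1)), on invented figures. It assumes that P_e denotes the proportion of the population exposed and that the total-population rate is the exposure-weighted average of the exposed and unexposed rates; the function names are hypothetical.

```python
def af_population_from_rates(i_p, i_u):
    """Attributable fraction in the population from total and unexposed rates."""
    return (i_p - i_u) / i_p


def af_population_from_rr(p_e, rr):
    """Attributable fraction in the population from exposure prevalence and RR."""
    excess = p_e * (rr - 1)
    return excess / (1 + excess)


# Hypothetical inputs: 25% exposed, rates per 1000 of 30 (exposed) and 10 (unexposed).
p_e, i_e, i_u = 0.25, 30.0, 10.0
i_p = p_e * i_e + (1 - p_e) * i_u  # assumed weighted average: total-population rate
print(af_population_from_rates(i_p, i_u))
print(af_population_from_rr(p_e, i_e / i_u))
```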
ATTRIBUTABLE NUMBER The number of new occurrences of a specific outcome
attributable to an exposure; it may be estimated using the formula
$$ AN = N_e (I_e - I_u) $$
where $I_e$ is the incidence rate among the exposed, $I_u$ is the incidence rate among
the unexposed, and $N_e$ is the number of persons in the exposed population. It is
assumed that causes other than the one under investigation have had equal effects
on the exposed and unexposed groups.
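A brief Python sketch of the attributable number, AN = N_e(I_e - I_u), follows. The function name and the cohort figures are invented for illustration, and the rates are expressed per person over the same period so that the product is a count of cases.

```python
def attributable_number(n_exposed, i_e, i_u):
    """New occurrences attributable to exposure: N_e * (I_e - I_u)."""
    return n_exposed * (i_e - i_u)


# Hypothetical cohort: 2000 exposed persons, annual rates 0.030 versus 0.010.
cases = attributable_number(2000, 0.030, 0.010)
print(f"About {cases:.0f} cases per year attributable to the exposure")
```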
ATTRIBUTABLE RISK The rate of a disease or other outcome in exposed individuals that
can be attributed to the exposure. This measure is derived by subtracting the rate
of the outcome (usually incidence or mortality) among the unexposed from the rate
among the exposed individuals; it is assumed that causes other than the one under
investigation have had equal effects on the exposed and unexposed groups.
Unfortunately, this term has been used to denote a number of different concepts,
including the attributable fraction in the population, the attributable fraction among the
exposed, the population excess rate, and the rate difference. Therefore, it should
be defined carefully by all who use it. See also ATTRIBUTABLE FRACTION (EXPOSED);
POPULATION EXCESS RATE; ATTRIBUTABLE FRACTION (POPULATION); POPULATION
ATTRIBUTABLE RISK; RATE DIFFERENCE.
ATTRIBUTABLE RISK (EXPOSED) This term has been used with different connotations to
denote the attributable fraction among the exposed and the excess risk among the
exposed. See also ATTRIBUTABLE FRACTION (EXPOSED); RATE DIFFERENCE.
ATTRIBUTABLE RISK (POPULATION) This term has been used with different connotations
to denote the attributable fraction in the population and the population excess risk.
See also ATTRIBUTABLE FRACTION (POPULATION); POPULATION EXCESS RATE.
ATTRIBUTABLE RISK PERCENT Attributable fraction expressed as a percentage rather
than as a proportion.
ATTRIBUTABLE RISK PERCENT (EXPOSED) This is the attributable fraction among the
exposed, expressed as a percentage. See also ATTRIBUTABLE FRACTION (EXPOSED).
ATTRIBUTABLE RISK PERCENT (POPULATION) This is the attributable fraction in the
population, expressed as a percentage. See also ATTRIBUTABLE FRACTION (POPULATION).
ATTRIBUTE A qualitative characteristic of an individual or item.
AUDIT An examination or review that establishes the extent to which a condition, pro-
cess, or performance conforms to predetermined standards or criteria.
AUTOPSY DATA Data derived from autopsied deaths, e.g., for study of natural history of
disease and trends in frequency of disease. Autopsies are done on nonrandomly
selected persons in the population and findings should therefore be generalized
only with great caution.
AVERAGE Kendall and Buckland's Dictionary of Statistical Terms (4th Edition, 1982) has
this to say: "A familiar but elusive concept. Generally an 'average' value purports
to represent or to summarize the relevant features of a set of values; and in this
sense the term would include the median and the mode. In a more limited sense
an 'average' compounds all the values of the set, e.g., in the case of the arithmetic
or geometric means. In ordinary usage, 'the average' is often understood to refer
to the arithmetic mean." See also MEASURES OF CENTRAL TENDENCY.
AVERAGE LIFE EXPECTANCY See EXPECTATION OF LIFE.
AXIS
1. One of the dimensions of a graph. A two-dimensional graph has two axes, the
horizontal or x axis, and the vertical or y axis. Mathematically, there may be
more than two axes, and graphs are sometimes drawn with a third dimension;
the eye cannot comprehend more than three dimensions.
2. In NOSOLOGY, an axis of classification is the conceptual framework, e.g.,
etiologic, topographic, psychologic, sociologic. The International Classification of
Disease, for example, is multiaxial; the primary axis is topographic (i.e., body
systems); secondary axes relate to etiology, manifestations of disease, detail of
sites affected, severity, etc.
BACKGROUND LEVEL, RATE The concentration, often low, at which some substance, agent,
or event is present or occurs at a particular time and place in the absence of a
specific hazard or set of hazards under investigation. An example is the background
level of naturally occurring forms of ionizing radiation to which we are all exposed.
BAR DIAGRAM A graphic technique for presenting DISCRETE DATA organized in such a
way that each observation can fall into one and only one category of the variable.
Frequencies are listed along one axis and categories of the variable along the other
axis. The frequencies of each group of observations are represented by the lengths
of the corresponding bars. See also HISTOGRAM.
[Figure: Bar diagram. From Susser, Watson, Hopper, 1985.]
BAYES' THEOREM A theorem in probability theory named for Thomas Bayes (1702-
1761), an English clergyman and mathematician; his Essay Towards Solving a Problem
in the Doctrine of Chances (1763, published posthumously), contained this theorem.
In epidemiology, it is used to obtain the probability of disease in a group of people
with some characteristic on the basis of the overall rate of that disease (the prior
probability of disease) and of the likelihoods of that characteristic in healthy and
diseased individuals. The most familiar application is in CLINICAL DECISION ANALYSIS
where it is used for estimating the probability of a particular diagnosis given the
appearance of some symptoms or test result. A simplified version of the theorem is
$$ P(D \mid S) = \frac{P(S \mid D)\,P(D)}{P(S \mid D)\,P(D) + P(S \mid \bar{D})\,P(\bar{D})} $$
where $D$ = disease, $S$ = symptom, and $\bar{D}$ = no disease. The formula emphasizes what
clinical intuition often overlooks, namely, that the probability of disease given this
symptom depends not only on how characteristic that symptom is of the disease but
also on how frequent the disease is among the population being served. "If you
hear hoof beats in the street, do not look for zebra."
The theorem can also be used for estimating exposure-specific rates from case
control studies if there is added information about the overall rate of disease in that
population.
Some of the terms in the theorem have special names. The probability of disease
given the symptom is called the "posterior probability." It is an estimate of the
probability of disease posterior to knowing whether or not the symptom was pres-
ent. The overall probability of disease among the population or our guess of the
probability of disease before knowing of the presence or absence of the symptom
is called the "prior probability." The theorem is sometimes presented in terms of
the odds of disease before knowing the symptom (prior odds) and after knowing
the symptom (posterior odds).
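To illustrate how strongly the prior probability drives the result, here is a minimal Python sketch of the simplified theorem, P(D|S) = P(S|D)P(D) / [P(S|D)P(D) + P(S|no D)P(no D)]. The function name and the probabilities are hypothetical, chosen only to show a characteristic symptom in a rare disease.

```python
def posterior_probability(prior, p_symptom_given_disease, p_symptom_given_no_disease):
    """Probability of disease given the symptom, by Bayes' theorem."""
    with_disease = p_symptom_given_disease * prior
    without_disease = p_symptom_given_no_disease * (1 - prior)
    return with_disease / (with_disease + without_disease)


# Hypothetical figures: the disease affects 1% of the population served; the
# symptom occurs in 90% of diseased and 5% of healthy individuals.
posterior = posterior_probability(0.01, 0.90, 0.05)
print(f"Posterior probability of disease: {posterior:.3f}")

# The same symptom in a population where the disease is common (prior 30%).
print(f"With a 30% prior: {posterior_probability(0.30, 0.90, 0.05):.3f}")
```

Even a symptom seen in 90% of cases leaves the posterior probability low when the disease is rare, which is the point of the hoof beats remark.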
BEHAVIORAL EPIDEMIC An epidemic originating in behavioral patterns (as opposed to
invading microorganisms or physical agents). Examples include the dancing manias
of the Middle Ages, episodes of mass fainting or convulsions ("hysterical
epidemics"), crowd panic, or waves of fashion or enthusiasm. The communicable nature of
the behavior is dependent not only on person-to-person transmission of the
behavioral pattern but also on group reinforcement (as with smoking, alcohol, or drug
use). Behavioral epidemics may be difficult to differentiate from, or may
complicate, outbreaks of organic disease, for example, due to contamination of the
environment by a toxic substance.
BEHAVIORAL RISK FACTOR A characteristic or behavior that is associated with increased
probability of a specified outcome; the term does not imply a causal relationship.
BENCHMARK A slang or jargon term, usually meaning a measurement taken at the
outset of a series of measurements of the same variable, sometimes meaning the best
or most desirable value of the variable. Because of uncertainty about meaning, the
term should not be used.
BENEFIT-COST RATIO The ratio of net present value of measurable benefits to costs.
Calculation of a benefit-cost ratio is used to determine the economic feasibility or
success of a program.
BERNOULLI DISTRIBUTION The probability distribution associated with two mutually
exclusive and exhaustive outcomes, e.g., death or survival; a Bernoulli variable is one
that has only two possible values, e.g., death or survival. See also BINOMIAL
DISTRIBUTION.
BERKSON'S BIAS See BIAS, SELECTION.
BETA ERROR See ERROR, TYPE II.
BIAS Deviation of results or inferences from the truth, or processes leading to such
deviation. Any trend in the collection, analysis, interpretation, publication, or
review of data that can lead to conclusions that are systematically different from the
truth. Among the ways in which deviation from the truth can occur, are the
following:
1. Systematic (one-sided) variation of measurements from the true values (syn:
systematic error).
2. Variation of statistical summary measures (means, rates, measures of
association, etc.) from their true values as a result of systematic variation of
measurements, other flaws in data collection, or flaws in study design or analysis.
3. Deviation of inferences from the truth as a result of flaws in study design,
data collection, or the analysis or interpretation of results.
4. A tendency of procedures (in study design, data collection, analysis,
interpretation, review or publication) to yield results or conclusions that depart from
the truth.
5. Prejudice leading to the conscious or unconscious selection of study
procedures that depart from the truth in a particular direction, or to one-sidedness
in the interpretation of results.
The term "bias" does not necessarily carry an imputation of prejudice or other
subjective factor, such as the experimenter's desire for a particular outcome. This
differs from conventional usage in which bias refers to a partisan point of view.
Many varieties of bias have been described.¹
¹ Sackett DL: Bias in analytic research. J Chron Dis 32:51-63, 1979.
BIAS, ASCERTAINMENT Systematic error, arising from the kind of individuals or patients
(e.g., slightly ill, moderately ill, acutely ill) that the individual observer is seeing.
Also systematic error arising from the diagnostic process (which may be determined
by the culture, customs, or individual idiosyncrasy of the person providing care for
the patient).
BIAS IN ASSUMPTION (Syn: conceptual bias) Error arising from faulty logic or premises
or mistaken beliefs on the part of the investigator. False conclusions about the
explanation for associations between variables. Example: Having correctly deduced
the mode of transmission of cholera, John Snow concluded that yellow fever was
transmitted by similar means. In fact, the "miasma" theory would better fit the facts
of yellow fever transmission.
BIAS IN AUTOPSY SERIES Systematic error resulting from the fact that autopsies
represent a nonrandom sample of all deaths.
BIAS, BERKSON'S See BIAS, SELECTION.
BIAS DUE TO CONFOUNDING See CONFOUNDING.
BIAS, DESIGN The difference between a true value and that actually obtained, occurring
as a result of faulty design of a study. Some examples are (1) uncontrolled studies
where the effects of two processes cannot be separated (confounding), (2)
controlled studies where observations are based on a poorly defined population, and
(3) nonsimultaneous comparisons, e.g., use of historical controls.
BIAS, DETECTION Due to systematic error(s) in methods of ascertainment, diagnosis, or
verification of cases in an epidemiologic survey, study, or investigation. Example:
Verification of diagnosis by laboratory tests in hospital cases, but failure to apply
the same tests to cases outside the hospital.
BIAS DUE TO DIGIT PREFERENCE See DIGIT PREFERENCE.
BIAS IN HANDLING OUTLIERS Error arising from a failure to discard an unusual value
occurring in a small sample, or due to exclusion of unusual values that should be
included.
BIAS, INFORMATION (Syn: observational bias) A flaw in measuring exposure or outcome
that results in differential quality (accuracy) of information between compared groups.
BIAS DUE TO INSTRUMENTAL ERROR Systematic error due to faulty calibration,
inaccurate measuring instruments, contaminated reagents, incorrect dilution or mixing of
reagents, etc.
BIAS OF INTERPRETATION Error arising from inference and speculation. Sources of the
error include (1) failure of the investigator to consider every interpretation
consistent with the facts and to assess the credentials of each, and (2) mishandling of
cases that constitute exceptions to some general conclusion.
BIAS, INTERVIEWER Systematic error due to interviewers' subconscious or even
conscious gathering of selective data.
BIAS, "LEAD-TIME" A systematic error arising when follow-up of two groups does not
begin at strictly comparable times. Occurs especially when one group has been
diagnosed earlier in the natural history of the disease than the other group. See also
ZERO TIME SHIFT.
BIAS, LENGTH A systematic error due to the selection of a disproportionate number of
long-duration cases (cases who survive longest) in one group and not in the other.
Can occur when prevalent cases, rather than incident cases, are included in a case
control study.
BIAS, MEASUREMENT Systematic error arising from inaccurate measurement (or
classification) of subjects on the study variables.
BIAS, OBSERVER Systematic difference between a true value and that actually observed
due to observer variation. Observer variation may be due to differences among
observers (interobserver variation) or to variation in readings by the same observer
on separate occasions (intraobserver variation). See also OBSERVER VARIATION.
BIAS IN THE PRESENTATION OF DATA Error due to irregularities produced by DIGIT
PREFERENCE, incomplete data, poor techniques of measurement, or technically poor
laboratory standards.
BIAS IN PUBLICATION An editorial predilection for publishing particular findings, e.g.,
positive results, which leads to the failure of authors to submit negative findings for
publication or failure of journal editors to accept and publish reports with negative
findings. This can distort the general belief about what has been demonstrated in a
particular situation.
BIAS OF AN ESTIMATOR The difference between the expected value of an estimator of a
parameter and the true value of this parameter. See also UNBIASED ESTIMATOR.
BIAS, RECALL Systematic error due to differences in accuracy or completeness of recall
to memory of prior events or experiences. Example: Mothers whose children have
had or have died of leukemia are more likely than mothers of healthy living
children to remember details of diagnostic x-ray examinations to which these children
were exposed in utero.
BIAS, REPORTING Selective suppression or revealing of information such as past history
of sexually transmitted disease.
BIAS, RESPONSE Systematic error due to difference in characteristics between those who
choose or volunteer to participate in a study and those who do not.
BIAS, SAMPLING Unless the sampling method ensures that all members of the "universe"
or reference population have a known chance of selection in the sample, bias is
possible. The best way to ensure a known chance of selection for all is to use a
probability sampling method such as a table of random numbers.
BIAS, SELECTION Error due to systematic differences in characteristics between those
who are selected for study and those who are not. Examples include hospital cases
or cases under a physician's care, excluding those who die before admission to
hospital because the course of their disease is so acute, those not sick enough to require
hospital care, or those excluded by distance, cost, or other factors. Selection bias
also invalidates generalizable conclusions from surveys that would include only
volunteers from a healthy population.
A special example is BERKSON'S BIAS,¹ which Berkson characterized as the set of
selective factors that lead hospital cases and controls in a case control study to be
systematically different from one another. This occurs when the combination of
exposure and disease under study increases the risk of hospital admission, thus
leading to a systematically higher exposure rate among the hospital cases than the
hospital controls. This in turn results in systematic distortion of the ODDS RATIO.
¹ Berkson J: Limitations of the application of fourfold table analysis to hospital data.
Biometrics Bull 2:47-59, 1946.
BIAS DUE TO WITHDRAWALS A difference between the true value and that actually
observed in a study due to the characteristics of those subjects who choose to
withdraw.
BILLS OF MORTALITY Weekly and annual abstracts of christenings and burials,
distinguishing deaths from the plague, compiled for London (and some other cities),
especially in times of plague, from the English parish registers that started in 1538.
From 1629, the annual bill was published regularly and included a breakdown of
deaths by cause. These records were the basis for the earliest vital statistics,
compiled, analyzed, and discussed by John Graunt in Natural and Political Observations
on the Bills of Mortality (1662).
BIMODAL DISTRIBUTION A distribution with two regions of high frequency separated by
a region of low frequency of observations. A two-peak distribution.
BINARY VARIABLE A variable having only two possible values, e.g., on or off, 0 or 1. See
also BIT.
BINOMIAL DISTRIBUTION A probability distribution associated with two mutually
exclusive outcomes, e.g., presence or absence of a clinical or laboratory sign, death or
survival. The probability distribution of the number of occurrences of a binary
event in a sample of n independent observations. The binomial distribution is used
to model CUMULATIVE INCIDENCE RATES and PREVALENCE RATES. The BERNOULLI
DISTRIBUTION is a special case of the binomial distribution with n = 1.
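The relation between the binomial and Bernoulli distributions can be sketched in a few lines of Python. The probability mass function below is the standard textbook form, which the entry does not spell out; the function name and the figures are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k occurrences of a binary event in n
    independent observations, each with occurrence probability p."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical sample: 10 persons, each with a 20% chance of the outcome.
for k in range(4):
    print(k, round(binomial_pmf(k, 10, 0.2), 4))

# With n = 1 the distribution reduces to the Bernoulli case.
print(binomial_pmf(1, 1, 0.2), binomial_pmf(0, 1, 0.2))
```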
BIOASSAY The quantitative evaluation of the potency of a substance by assessing its
effects on tissues, cells, live experimental animals, or humans.
Bioassay may be a direct method of estimating relative potency: groups of
subjects are assigned to each of two (or more) preparations; the dose that is just
sufficient to produce a specified response is measured, and the estimate is the ratio of
the mean doses for the two (or more) groups. In this method, the death of the
subject may be used as the "response."
The indirect method (more commonly used) requires study of the relationship
between the magnitude of a dose and the magnitude of a quantitative response
produced by it.
BIOLOGICAL PLAUSIBILITY The criterion that an observed, presumably or putatively causal
ASSOCIATION fits previously existing biological or medical knowledge. This judgment
should be used cautiously since it could impede development of new knowledge
that does not fit existing ideas.
BIOLOGICAL TRANSMISSION See VECTOR-BORNE INFECTION.
BIOMETRY (literally, the measurement of life) The application of statistical methods to the
study of numerical data based on biological observations and phenomena. The term
was coined by W. F. R. Weldon (1860-1906), a zoologist at University College,
London. FRANCIS GALTON has been called "the father of biometry" for his
application of statistical methods to the analysis of biological variation. However, others
preceded him, e.g., QUETELET and LOUIS.
BIOSTATISTICS Application of STATISTICS to biological problems. The term is considered
by many biomedical scientists to mean the application of statistics specifically to
medical problems, but its real meaning is broader.
BIRAUD, YVES (1900-1965) French physician and statistician. He served the League of
Nations and later WHO as Director of Epidemiological and Statistical Services from
1925 to 1960. In 1960, he founded the first chair of Health Statistics in France, at
the École de santé publique in Rennes.
BIRTH CERTIFICATE Official, legal document recording details of a live birth, usually
comprising name, date, place, identity of parents, and sometimes additional
information such as birth weight. It provides the basis for vital statistics of birth and
birthrates in a political or administrative jurisdiction, and for the denominator for
infant mortality and certain other vital rates.
BIRTH COHORT See COHORT.
BIRTH COHORT ANALYSIS See COHORT ANALYSIS.
BIRTH INTERVAL Interval between termination of one completed pregnancy and the
termination of the next.
BIRTH ORDER The ranking of siblings according to age, starting with the eldest in a
family. The ordinal number of a given live birth in relation to all previous live
births of the same woman. Thus, 4 is the birth order of the fourth live birth
occurring to the same woman. This strict demographic definition may be loosened to
include all births, i.e., still-births as well as live births.
BIRTH RATE A summary rate based on the number of live births in a population over a
given period, usually one year.
$$ \text{Birth rate} = \frac{\text{Number of live births to residents in an area in a calendar year}}{\text{Average or midyear population in the area in that year}} \times 1000 $$
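As a small worked example of the birth rate formula (live births to residents in a calendar year divided by the average or midyear population, times 1000), the Python sketch below uses an invented district; the function name and the numbers are assumptions for illustration.

```python
def birth_rate_per_1000(live_births, midyear_population):
    """Live births in a calendar year per 1000 of the midyear population."""
    if midyear_population <= 0:
        raise ValueError("midyear_population must be positive")
    return 1000.0 * live_births / midyear_population


# Hypothetical district: 1500 live births, midyear population 100,000.
print(birth_rate_per_1000(1500, 100_000))
```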
BIRTH WEIGHT Infant's weight recorded at the time of birth and, in some countries,
entered on the birth certificate. Certain variants of birth weight are precisely
defined. Low birth weight (LBW) is below 2500 g. Very low birth weight (VLBW) is
below 1500 g. Ultralow birth weight (ULBW) is below 1000 g. Large for gestational
age (LGA) is birth weight above the 90th percentile. Average weight for gestational
age (AGA) (Syn: appropriate or adequate): birth weight between 10th and 90th
percentiles. Small for gestational age (SGA) (Syn: small for dates): birth weight
below 10th percentile.
BIT Acronym for binary digit; the signal in computing. See also BYTE.
"BLACK BOX" A jargon term, meaning a method of reasoning or studying a problem,
in which the methods, procedures, etc., as such are not described, explained, or
perhaps even understood. Nothing is stated or inferred about the method;
discussion and conclusions relate solely to the empirical relationships observed. An
alternative definition is the following: A method of formally relating an input, e.g.,
quantity of a drug absorbed over a period or a putative causal factor, to an output,
e.g., the amount of the drug eliminated in a given period, or an observed effect,
without making detailed assumptions about the mechanisms that have contributed
to the transformation of input to output within the organism (the "black box").
BLIND(ED) STUDY (Syn: masked study) A study in which observer(s) and/or subjects are
kept ignorant of the group to which the subjects are assigned, as in an experiment,
or of the population from which the subjects come, as in a nonexperimental study.
When both observer and subjects are kept ignorant, we refer to a double-blind
study. If the statistical analysis is also done in ignorance of the group to which
subjects belong, the study is sometimes described as triple-blind. The intent of keeping
subjects and/or investigators blinded, i.e., unaware of knowledge that might
introduce a bias, is to eliminate the effects of such biases. To avoid confusion about the
meaning of the word "blind" some authors prefer to describe such studies as
"masked."
BLOCKED RANDOMIZATION See STRATIFIED RANDOMIZATION. The analogue in a
randomized experiment of individual matching in an observational study.
BODY MASS INDEX (Syn: Quetelet's index) One of the anthropometric measures of body
mass. Defined as (weight)/(height)². This measure has the highest correlation
with skinfold thickness or body density and in this respect is superior to the
PONDERAL INDEX.
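A minimal Python sketch of the index, weight divided by the square of height, follows. The entry does not state units; kilograms and meters are assumed here as the conventional choice, and the function name is hypothetical.

```python
def body_mass_index(weight_kg, height_m):
    """Quetelet's index: weight divided by height squared.

    Units are an assumption (kilograms and meters); the dictionary
    entry gives only the form (weight)/(height)^2.
    """
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m**2


# Hypothetical adult: 70 kg, 1.75 m.
print(round(body_mass_index(70.0, 1.75), 1))
```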
BOOTSTRAP A technique for estimating the variance and the bias of an estimator by
repeatedly drawing random samples with replacement from the observations at hand.
One applies the estimator to each sample drawn, thus obtaining a set of estimates.
The observed variance of this set is the bootstrap estimate of variance. The differ-
ence between the average of the set of estimates and the original estimate is the
bootstrap estimate of bias.
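The procedure described in this entry can be sketched in Python using only the standard library. This is an illustrative sketch, not a reference implementation: the function name, the number of resamples, the fixed seed, and the sample data are all assumptions.

```python
import random
import statistics


def bootstrap(data, estimator, n_resamples=2000, seed=0):
    """Return (original estimate, bootstrap variance, bootstrap bias).

    Each resample draws len(data) observations with replacement from
    the observations at hand, and the estimator is applied to it.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    estimates = [estimator(rng.choices(data, k=n)) for _ in range(n_resamples)]
    original = estimator(data)
    variance = statistics.variance(estimates)      # observed variance of the set
    bias = statistics.fmean(estimates) - original  # average of set minus original
    return original, variance, bias


# Hypothetical observations; the estimator here is the sample median.
observations = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 10.0, 12.0, 15.0, 21.0]
estimate, variance, bias = bootstrap(observations, statistics.median)
print(estimate, variance, bias)
```

Any function that maps a list of observations to a single number can be passed as the estimator, which is the main practical appeal of the technique.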
BREAKPOINT In helminth epidemiology, the critical mean wormload in a community,
below which the helminth mating frequency is too low to maintain reproduction. A
value exceeding the breakpoint of a wormload means that the wormload will
increase until equilibrium is reached; a value less than or equal to the breakpoint
means that the wormload will decrease progressively.
BYTE A group of adjacent bits, commonly 4, 6, or 8, operating as a unit for storage and
manipulation of data in a computer. See also BIT.
C
CALIPER MATCHING See MATCHING.
CANADIAN MORTALITY DATA BASE A large set of computer-stored death statistics;
personal identifiers and causes of all deaths in Canada since 1950 have been
computer-stored, and the death certificates have been preserved on microfiche. This data base
and record linkage have been used in some important historical cohort studies. See
also NATIONAL DEATH INDEX.
CANCER REGISTRY See REGISTER.
CARRIER
1. A person or animal that harbors a specific infectious agent in the absence of
discernible clinical disease and serves as a potential source of infection. The
carrier state may occur in an individual with an infection that is inapparent
throughout its course (known as healthy or asymptomatic carrier), or during
the incubation period, convalescence, and postconvalescence of an individual
with a clinically recognizable disease (known as incubatory carrier or
convalescent carrier). The carrier state may be of short or long duration (temporary
or transient carrier or chronic carrier).¹
¹ Adapted from Control of Communicable Diseases in Man, 14th ed. Washington, DC: American Public
Health Association, 1985.
CARRYING CAPACITY An estimate of the numbers of people that a nation, region, or the
planet can sustain.
CASE In epidemiology, a person in the population or study group identified as having
the particular disease, health disorder, or condition under investigation. A variety
of criteria may be used to identify cases, e.g., individual physicians' diagnoses, re-
gistries and notifications, abstracts of clinical records, surveys of the general popu-
lation, population screening, and reporting of defects such as in a dental record.
The epidemiologic definition of a case is not necessarily the same as the ordinary
clinical definition.
CASE-BASE STUDY A study that starts with the identification and sampling of persons
with the disease of interest, and then samples the entire base population (of cases
and noncases) from which the original cases arose. This design is similar to a CASE
CONTROL STUDY in most respects, but cases may appear in the comparison (base)
sample as well as in the case sample.
CASE, COLLATERAL A case occurring in the immediate vicinity of a case which has been
the subject of an epidemiological investigation; a term used mainly in malaria
control programs, equivalent to the term contact as used in infectious disease
epidemiology.
CASE COMPARISON STUDY See CASE CONTROL STUDY.
CASE COMPEER STUDY See CASE CONTROL STUDY.
CASE CONTROL STUDY (Syn: case comparison study, case compeer study, case history
study, case referent study, retrospective study) A study that starts with the
identification of persons with the disease (or other outcome variable) of interest, and a
suitable control (comparison, reference) group of persons without the disease. The
relationship of an attribute to the disease is examined by comparing the diseased
and nondiseased with regard to how frequently the attribute is present or, if
quantitative, the levels of the attribute, in each of the groups.
Such a study can be called "retrospective" because it starts after the onset of
disease and looks back to the postulated causal factors. Cases and controls in a case
control study may be accumulated "prospectively;" that is, as each new case is
diagnosed it is entered in the study. Nevertheless, such a study may still be called
"retrospective" because it looks back from the outcome to its causes. The terms
"cases" and "controls" are sometimes used to describe subjects in a RANDOMIZED
CONTROLLED TRIAL but the term "case control study" should not be used to describe
such a study.
The terms "case control study" and "retrospective study" have been used most
often to describe this method. Other terms also used are listed above. The concept
of the case-control study is to be found in the works of P.C.A. Louis;¹ the first
explicit description of the method is contained in a paper by William Augustus Guy,
who reported his analysis of the relationship between prior occupational exposure
and the occurrence of pulmonary consumption to the Statistical Society of London
in 1843.² The evolution of the case-control study thereafter has been described by
Lilienfeld and Lilienfeld.³ The first modern use of the method was a case-control
study of breast cancer, reported by Lane-Claypon⁴ in 1926; from that time onward,
case-control studies became increasingly popular and widely used.
¹ Louis PCA: Researches on Phthisis: Anatomical, Pathological and Therapeutical. (Trans. W.H.
Walshe). London: Sydenham Society, 1844.
² Guy WA: Contributions to a knowledge of the influence of employments on health. J Roy Stat Soc
6:197-211, 1843.
³ Lilienfeld AM, Lilienfeld D: A century of case-control studies: progress? J Chron Dis 32:5-13,
1979.
⁴ Lane-Claypon JE: A further report on cancer of the breast. Rep Pub Hlth Med Subj 32. London:
HMSO, 1926.
CASE iATALfII' RATL The proportion of cases of a specified condition which are fatal
within a specified time.
Number of deaths from a disease
Case fatality rate (usually (in a given period) x I(10
expressed as a percentage) ~ Number of diagnosed ;es oT that diseas_e
(in the same period)
This definition can lead to paradox when more persons die of the disease than
develop it during a given period. For instance, chemical poisoning that is slowly but
inexorably fatal may cause many persons to develop the disease over a relatively
short period of time, but the deaths may not occur until some years later and may
be spread over a period of years during which there are no new cases. Thus, in
calculating the case fatality rate, it is necessary to acknowledge that the time dimension
varies: it may be brief, e.g., covering only the period of stay in a hospital, of
finite duration, e.g., one year, or of longer duration still. The term "case fatality
rate" is then better replaced by a term such as "survival rate" or by the use of a
SURVIVORSHIP TABLE. See also ATTACK RATE.
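As a worked illustration of the ratio defined above, the following Python sketch computes a case fatality rate as a percentage. The function name and the counts are hypothetical and chosen only for illustration; the second call reproduces the paradox described above, in which deaths in a period exceed the cases diagnosed in that period.

```python
def case_fatality_rate(deaths: int, diagnosed_cases: int) -> float:
    """Return deaths / diagnosed cases for the same period, as a percentage."""
    if diagnosed_cases <= 0:
        raise ValueError("diagnosed_cases must be positive")
    if deaths < 0:
        raise ValueError("deaths must be non-negative")
    return 100.0 * deaths / diagnosed_cases


# Hypothetical counts, not taken from any study.
print(case_fatality_rate(12, 150))  # 8.0
# Deaths from earlier cases can exceed new diagnoses in the period,
# which yields a "rate" above 100 percent.
print(case_fatality_rate(30, 20))   # 150.0
```

A value above 100 is a signal that the time window for deaths and the time window for diagnoses do not describe the same group of cases.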
CASE HISTORY STUDY
1. Synonym for CASE CONTROL STUDY.
2. In clinical medicine, a case report, or a report on a series of cases.
CASE REFERENT STUDY See CASE CONTROL STUDY.
CATASTROPHE THEORY A branch of mathematics dealing with large changes in the total
system that may result from small changes in a critical variable in the system. An
example is the sudden change in the physical state of water into steam or ice with
rise or fall of temperature beyond a critical level. Certain epidemics, gene frequencies,
and behavioral phenomena in populations may abide by the same mathematical
rule. Herd immunity is an example.
CATCHMENT AREA Region, which may be well- or ill-defined, from which the clients of
a particular health facility are drawn.
CAUSALITY The relating of causes to the effects they produce. Most of epidemiology
concerns causality and several types of causes can be distinguished. It should be
clearly stated, however, that epidemiologic evidence by itself is insufficient to establish
causality.
A cause is termed "necessary" when it must always precede an effect. This effect
need not be the sole result of the one cause. A cause is termed "sufficient" when it
inevitably initiates or produces an effect. Any given cause may be necessary, sufficient,
neither, or both. These possibilities are explained below.
Four conditions under which independent variable X may cause Y

        X is         X is
        necessary    sufficient
1.      +            +
2.      +            -
3.      -            +
4.      -            -
1. X is necessary and sufficient to cause Y. Both X and Y are always present
together, and nothing but X is needed to cause Y; X → Y.
2. X is necessary but not sufficient to cause Y. X must be present when Y is present,
but Y is not always present when X is. Some additional factor(s) must also
be present; X and Z → Y.
3. X is not necessary but is sufficient to cause Y. Y is present when X is, but X
may or may not be present when Y is present, because Y has other causes and
can occur without X. For example, an enlarged spleen can have many separate
causes that are unconnected with each other; X → Y; Z → Y.
4. X is neither necessary nor sufficient to cause Y. Again, X may or may not be
present when Y is present. Under these conditions, however, if X is present
with Y, some additional factor must also be present. Here X is a contributory
cause of Y in some causal sequences; X and Z → Y; W and Z → Y. These relationships
and the logic of causal inference are discussed in Causal Inference.1
1 Rothman KJ (Ed): Causal Inference. Chestnut Hill, MA: Epidemiology Resources Inc., 1988.
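The four conditions can be checked mechanically against a set of observations. The Python sketch below is illustrative only: the function name and the data are hypothetical, and it tests whether observed presence/absence pairs are consistent with X being "necessary" or "sufficient" in the sense defined above. Consistency in a data set is not proof of causation, as the entry itself cautions.

```python
def classify_cause(observations):
    """Classify X from (x_present, y_present) pairs.

    Returns (necessary, sufficient):
      necessary  -> Y was never observed without X
      sufficient -> X was never observed without Y
    """
    obs = list(observations)
    necessary = all(x for x, y in obs if y)
    sufficient = all(y for x, y in obs if x)
    return necessary, sufficient


# Hypothetical observations for condition 2 (necessary, not sufficient):
# Y occurs only with X, but X sometimes occurs without Y.
data = [(True, True), (True, False), (False, False)]
print(classify_cause(data))  # (True, False)
```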
CAUSATION OF DISEASE, FACTORS IN The following factors have been differentiated (but
they are not mutually exclusive):
Predisposing factors are those that prepare, sensitize, condition, or otherwise create
a situation such as a level of immunity or state of susceptibility so that the host
tends to react in a specific fashion to a disease agent, personal interaction, environmental
stimulus, or specific incentive. Examples include age, sex, marital status,
family size, educational level, previous illness experience, presence of concurrent
illness, dependency, working environment, and attitudes toward the use of health
services. These factors may be "necessary" but are rarely "sufficient" to cause the
phenomenon under study.
Enabling factors are those that facilitate the manifestation of disease, disability,
ill-health, or the use of services or conversely those that facilitate recovery from illness,
maintenance or enhancement of health status, or more appropriate use of health
services. Examples include income, health insurance coverage, nutrition, climate,
housing, personal support systems, and availability of medical care. These factors
may be "necessary" but are rarely "sufficient" to cause the phenomenon under study.
Precipitating factors are those associated with the definitive onset of a disease, illness,
accident, behavioral response, or course of action. Usually one factor is more
important or more obviously recognizable than others if several are involved and
one may often be regarded as "necessary." Examples include exposure to specific
disease, amount or level of an infectious organism, drug, noxious agent, physical
trauma, personal interaction, occupational stimulus, or new awareness or knowledge.
Reinforcing factors are those tending to perpetuate or aggravate the presence of a
disease, disability, impairment, attitude, pattern of behavior, or course of action.
They may tend to be repetitive, recurrent, or persistent and may or may not necessarily
be the same or similar to those categorized as predisposing, enabling, or
precipitating. Examples include repeated exposure to the same noxious stimulus (in
the absence of an appropriate immune response) such as an infectious agent, work,
household, or interpersonal environment, presence of financial incentive or disincentive,
personal satisfaction, or deprivation.
CAUSES OF DEATH See DEATH CERTIFICATE.
CAUSE-DELETED LIFE TABLE A life table constructed using death rates lowered by eliminating
the risk of dying from a specified cause; its most common use is to calculate
the gain in life expectancy that would result from the elimination of one cause.
CAUSE-SPECIFIC RATE A rate that specifies events, such as deaths, according to their
cause.
CENSORING This term refers to the loss of subjects from a follow-up study; the occurrence
of the event of interest among such subjects is uncertain after a specified time
when it was known that the event of interest had not occurred; it is not known,
however, if or when the event of interest occurred subsequently. Such subjects are
described as censored. For example, in a follow-up study with myocardial infarction
as the outcome of interest, a subject who has not had an infarct but is killed in a
traffic crash in year 6 is described as censored as of year 6, since it cannot be known
when, if ever, he might have had an infarct at a later year of follow-up. This is
censoring by competing risk; other varieties include loss to follow-up and termination
of the study. Examination of data for censoring requires the use of special
analytic methods, such as life table analysis.
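One way to see how censored subjects are handled is the product-limit (Kaplan-Meier) estimator, a close relative of life table analysis; it is used here as an illustration and is not named in the entry. In this Python sketch the function name and the follow-up data are hypothetical. Censored subjects count as at risk up to their censoring time and then leave the risk set without counting as events.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each subject
    events -- True if the event of interest was observed, False if censored
    Returns a list of (time, estimated survival) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all subjects whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                observed += 1
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve


# Hypothetical follow-up in years; the subject censored at year 6
# (e.g., killed in a traffic crash) is not counted as an infarct.
times = [2, 3, 6, 6, 8]
events = [True, False, True, False, True]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```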
CENSUS An enumeration of a population, originally intended for purposes of taxation
and military service. Census enumeration of a population usually records identities
of all persons in every place of residence, with age, or birth date, sex, occupation,
national origin, language, marital status, income, and relationship to head of household,
in addition to information on the dwelling place. Many other items of information
may be included, e.g., educational level (or literacy), and health-related data
such as permanent disability. A de facto census allocates persons according to their
location at the time of enumeration. A de jure census assigns persons according to
their usual place of residence at the time of enumeration.
CENSUS TRACT An area for which details of population structure are separately tabulated
at a periodic census; normally it is the smallest unit of analysis of (published)
census tabulations. Census tracts are chosen because they have well-defined bound-
aries, sometimes the same as local political jurisdictions, sometimes defined by con-
spicuous geographical features such as main roads, rivers. In urban areas census
tracts may be further subdivided, e.g., into city blocks, but published tables do not
contain details to this level.
CENTILE See QUANTILES.
CESSATION EXPERIMENT Controlled study in which an attempt is made to evaluate the
termination of an exposure to risk such as a living habit that is considered to be of
etiologic importance.
CHART The medical dossier of a patient. See also INFORMATION SYSTEM; MEDICAL RECORD.
CHECK DIGIT A single digit, derived from a multidigit number such as a case identification
number, that is used as a screening test for transcription errors.
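The entry does not say how the digit is derived. As an illustration only, the Python sketch below uses the Luhn (mod 10) scheme, one widely used check-digit method; the function names are hypothetical, and any other scheme could stand in its place.

```python
def luhn_check_digit(number: str) -> int:
    """Return the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left, doubling every second digit starting with the rightmost.
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def is_valid(number_with_check: str) -> bool:
    """Screen an identification number whose last digit is the check digit."""
    payload, check = number_with_check[:-1], int(number_with_check[-1])
    return luhn_check_digit(payload) == check


print(luhn_check_digit("7992739871"))  # 3
print(is_valid("79927398713"))         # True
print(is_valid("79927388713"))         # False (one digit mistyped)
```

A scheme of this kind catches every single-digit transcription error but not every transposition, which is why the entry describes it as a screening test.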
CHEMOPROPHYLAXIS The administration of a chemical, including antibiotics, to prevent
the development of an infection or the progression of an infection to active manifest
disease.
CHEMOTHERAPY The use of a chemical to treat a clinically recognizable disease or to
limit its further progress.
CHILD DEATH RATE The number of deaths of children aged 1-4 years in a given year
per 1000 children in this age group. This is a useful measure of the burden of
preventable communicable diseases in the child population.
CHI-SQUARE (χ²) DISTRIBUTION A variable is said to have a chi-square distribution with
k degrees of freedom if it is distributed like the sum of the squares of k independent
random variables, each of which has a normal distribution with mean zero and
variance one.
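The definition can be illustrated by simulation. In this Python sketch (function names, seed, and sample size are arbitrary choices, not part of the entry), each draw is the sum of the squares of k independent standard normal values; the average of many such draws should lie close to k, the mean of a chi-square distribution with k degrees of freedom.

```python
import random


def chi_square_draw(k: int, rng: random.Random) -> float:
    """One draw: the sum of squares of k independent N(0, 1) values."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def simulate_mean(k: int, n_draws: int = 20000, seed: int = 0) -> float:
    """Average of n_draws simulated chi-square values with k degrees of freedom."""
    rng = random.Random(seed)
    return sum(chi_square_draw(k, rng) for _ in range(n_draws)) / n_draws


# The simulated mean should be close to k (here, 3).
print(simulate_mean(3))
```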
CHI-SQUARE (χ²) TEST Any statistical test based on comparison of a test statistic to a
chi-square distribution. The oldest and most common chi-square tests are for detecting
whether two or more population distributions differ from one another; these tests
usually involve counts of data, and may involve comparison of samples from the
distributions under study, or the comparison of a sample to a theoretically expected
distribution. The Pearson chi-square test is probably the best known; another is the
Mantel-Haenszel test. (Statisticians disagree about the terminal letter; a bare majority
of those who contributed to the discussion of this entry prefer "chi-square"
rather than "chi-squared." Either usage is acceptable.)
CHRISOMS This word, which appears in BILLS OF MORTALITY, means infants who die
before formal baptism; therefore, the number recorded in Bills of Mortality can be
used to estimate (albeit inaccurately) neonatal death rates in studies of historical
demography and epidemiology.
CHRONIC 1. Referring to a health-related state, lasting a long time. 2. Referring to exposure,
prolonged or long-term, often with specific reference to low-intensity. 3.
The U.S. National Center for Health Statistics defines a "chronic" condition as one
of three months' duration or longer.
CLASS A term used in the theory of frequency distributions. The total number of observations
made upon a particular variate may be grouped into classes according to
convenient divisions of the variate range in order to make subsequent analyses less
laborious, or for other reasons. A group so determined is called a "class." The
variate values that determine the upper and lower limits of a class are called "class
boundaries," the interval between them is the class interval, and the frequency falling
into the class is the class frequency.
CLASSIFICATION (Syn: categorization) Assignment to predesignated classes on the basis
of perceived common characteristics. A means of giving order to a group of disconnected
facts. Ideally, a classification should be characterized by (1) naturalness: the
classes correspond to the nature of the thing being classified, (2) exhaustiveness:
every member of the group will fit into one (and only one) class in the system, (3)
usefulness: the classification is practical, (4) simplicity: the subclasses are not excessive,
and (5) constructability: the set of classes can be constructed by a demonstrably
systematic procedure.
CLASSIFICATION OF DISEASES Arrangement of diseases into groups having common
characteristics. Useful in efforts to achieve standardization, and therefore comparability,
in the methods of presentation of mortality and morbidity data from different
sources. May include a systematic numerical notation for each disease entry.
Examples include the INTERNATIONAL CLASSIFICATION OF DISEASES, INJURIES, AND
CAUSES OF DEATH (ICD) and the INTERNATIONAL CLASSIFICATION OF HEALTH PROBLEMS
IN PRIMARY CARE (ICHPPC).
CLASS, SOCIAL A method of socially stratifying populations, e.g., according to education,
income, or occupation. See also SOCIOECONOMIC CLASSIFICATION.
CLINICAL DECISION ANALYSIS Application of DECISION ANALYSIS in a clinical setting with
the aim of applying epidemiologic and other data on probability of outcomes when
alternative decisions can be made, e.g., surgical intervention or drug treatment for
myocardial ischemia.
CLINICAL EPIDEMIOLOGIST A practitioner of clinical epidemiology.
CLINICAL EPIDEMIOLOGY While some epidemiologists deplore any adjectival qualification
of the discipline, a subspecialty of clinical epidemiology is sufficiently demarcated
to justify definition. There are plenty of suggested definitions. John R. Paul1
proposed "A marriage between quantitative concepts used by epidemiologists to
study disease in populations and decision-making in the individual case which is the
daily fare of clinical medicine." Patient care is central to Sackett's definition2: "The
application, by a physician who provides direct patient care, of epidemiologic and
biometric methods to the study of diagnostic and therapeutic processes in order to
effect an improvement in health." While limiting the discipline to medical graduates
in clinical practice, this definition is conceptually close to the definition of clinical
decision analysis: the proper distinction between clinical epidemiology and clinical
decision analysis may be that the epidemiologist works with a defined population,
even if it is a population of patients rather than a community-based population with
numerator and denominator in the conventional epidemiologic sense; clinical decision
analysis can be applied to a single patient. Abramson's definition3 is "The use
of epidemiological principles, methods and findings in personal health care or
community-oriented primary care, with special reference to applications in diagnostic
and prognostic appraisal, decisions concerning care and the evaluation of
care. The term sometimes refers to any epidemiological study conducted in a clinical
setting." Weiss4 defines clinical epidemiology as "The study of variation in the
outcome of illness and of the reasons for that variation." The existence of the above
and other subtly different definitions suggests that this branch of epidemiology
remains inchoate.
1 J Clin Invest 17:539-541, 1938.
2 Am J Epidemiol 89:125-128, 1969.
3 Personal communication, 1986.
4 Clinical Epidemiology. New York: Oxford University Press, 1986.
CLINICAL TRIAL (Syn: therapeutic trial) A research activity that involves the administration
of a test regimen to humans to evaluate its efficacy and safety. The term is
subject to wide variation in usage, from the first use in humans without any control
treatment to a rigorously designed and executed experiment involving test and control
treatments and randomization.
See also COMMUNITY TRIAL.
CLINIMETRICS Feinstein,1 who coined this term, defines it as the domain concerned with
indexes, rating scales, and other expressions that are used to describe or measure
symptoms, physical signs, and other distinctly clinical phenomena in clinical medicine.
Such measurements, of course, are an essential part of many epidemiologic
studies.
1 Feinstein AR: Clinimetrics. New Haven and London: Yale University Press, 1987.
CLOSED COHORT A population in which membership begins at a defined time or with a
defined event and ends only through occurrence of the study outcome or the end
of eligibility for membership. An example is a population of women in labor being
studied to determine the vital status of their offspring (i.e., whether live or still-
born).
CLUSTER ANALYSIS A set of statistical methods used to group variables or observations
into strongly interrelated subgroups.
CLUSTERING (Syn: disease cluster, time cluster, time-place cluster) A closely grouped
series of events or cases of a disease or other health-related phenomena with well-defined
distribution patterns, in relation to time or place or both. The term is normally
used to describe aggregation of relatively uncommon events or diseases, e.g.,
leukemia, multiple sclerosis.
CLUSTER SAMPLING A sampling method in which each unit selected is a group of per-
sons (all persons in a city block, a family, etc.) rather than an individual.
CODING Translation of information, e.g., questionnaire responses, into numbered categories
for entry in a data processing system.
COEFFICIENT OF VARIATION The ratio of the standard deviation to the mean. This
is meaningful only if the variable is measured on a ratio scale. See MEASUREMENT
SCALE.
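A minimal Python sketch of this ratio follows. The function name and the data are hypothetical, and the `sample` flag is an added assumption: the entry does not say whether the sample or the population standard deviation is meant, so both are offered.

```python
import statistics


def coefficient_of_variation(values, sample: bool = True) -> float:
    """Standard deviation divided by the mean.

    sample=True  uses the sample standard deviation (n - 1 divisor);
    sample=False uses the population standard deviation (n divisor).
    """
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    sd = statistics.stdev(values) if sample else statistics.pstdev(values)
    return sd / mean


# Hypothetical ratio-scale measurements (e.g., weights in kg).
weights = [2, 4, 4, 4, 5, 5, 7, 9]
print(coefficient_of_variation(weights, sample=False))
```

Because the result is a pure ratio, it is unchanged when the unit of measurement is rescaled, which is why a ratio scale (one with a true zero) is required.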
COHORT [from Latin cohors, warriors, the tenth part of a legion]
1. The component of the population born during a particular period and identified
by period of birth so that its characteristics (e.g., causes of death and
numbers still living) can be ascertained as it enters successive time and age
periods.
2. The term "cohort" has broadened to describe any designated group of persons
who are followed or traced over a period of time, as in COHORT STUDY
(prospective study).
COHORT ANALYSIS The tabulation and analysis of morbidity or mortality rates in relationship
to the ages of a specific group of people (cohort), identified at a particular
period of time and followed as they pass through different ages during part or all
of their life span. In certain circumstances, e.g., studies of migrant populations,
cohort analysis may be performed according to duration of residence of migrants
in a country rather than year of birth, in order to relate health or mortality experience
to duration of exposure.
COHORT COMPONENT METHOD A method of population projection that takes the popu-
lation distributed by age and sex at a base date and carries it forward in time on
the basis of separate allowances for fertility, mortality, and migration.
COHORT EFFECT See GENERATION EFFECT.
COHORT INCIDENCE See INCIDENCE.
COHORT SLOPES Arrangement of data so that when plotted graphically, lines connect
points representing the age-specific rates for population segments from the same
generation of birth (see diagram). These slopes represent changes in rates with age
during the life experience of each cohort.
[Figure: Cohort curves for years of birth, 1860-1950. Death rate plotted against age; the
line associated with each year indicates death rates by age-group for persons born in that year.]
Cohort slopes (tuberculosis mortality rates of successive birth generations). Death rates for
tuberculosis, by age, United States, 1900-1960 (per 100,000 population).
From Susser, Watson, Hopper, 1985.
COHORT STUDY (Syn: concurrent, follow-up, incidence, longitudinal, prospective study)
The method of epidemiologic study in which subsets of a defined population can
be identified who are, have been, or in the future may be exposed or not exposed,
or exposed in different degrees, to a factor or factors hypothesized to influence the
probability of occurrence of a given disease or other outcome. The alternative terms
for a cohort study, i.e., follow-up, longitudinal, and prospective study, describe an
essential feature of the method, which is observation of the population for a sufficient
number of person-years to generate reliable incidence or mortality rates in
the population subsets. This generally implies study of a large population, study
for a prolonged period (years), or both.
COINTERVENTION In a RANDOMIZED CONTROLLED TRIAL, the application of additional diagnostic
or therapeutic procedures to members of either or both the experimental
and the control groups.
COLD CHAIN A system of protection against high environmental temperatures for heat-labile
vaccines, sera, and other active biological preparations. Unless the cold chain
is preserved, such preparations are inactivated and immunization procedures, etc.
will be ineffective. Preservation of the cold chain is an integral part of the WHO
expanded program on immunization in tropical countries.
COLLINEARITY Very high correlation between variables.
COLONIZATION See INFECTION.
COMMENSAL Literally, eating together (sharing the same table); an organism that lives
harmlessly in the gut. See also XENOBIOTIC.
COMMON SOURCE EPIDEMIC (Syn: common vehicle epidemic) See EPIDEMIC, COMMON
SOURCE.
COMMON VEHICLE SPREAD Spread of disease agent from a source that is common
to those who acquire the disease, e.g., water, milk, shellfish, foods, air, or syringe
contaminated by infectious or noxious agents. See also TRANSMISSION OF INFECTION.
COMMUNICABLE DISEASE (Syn: infectious disease) An illness due to a specific infectious
agent or its toxic products that arises through transmission of that agent or its
products from an infected person, animal, or reservoir to a susceptible host, either
directly or indirectly through an intermediate plant or animal host, vector, or the
inanimate environment. See also TRANSMISSION OF INFECTION.
COMMUNICABLE PERIOD The time during which an infectious agent may be transferred
directly or indirectly from an infected person to another person, from an infected
animal to man, or from an infected person to an animal, including arthropods. See
also TRANSMISSION OF INFECTION.
COMMUNITY A group of individuals organized into a unit, or manifesting some unifying
trait or common interest; loosely, the locality or catchment area population for which
a service is provided, or more broadly, the state, nation, or body politic.
COMMUNITY DIAGNOSIS The process of appraising the health status of a community,
including assembly of vital statistics and other health-related statistics and of information
pertaining to determinants of health, such as prevalence of tobacco smoking,
and examination of the relationships of these determinants to health in the
specified community. The term may also denote the findings of this diagnostic process.
Community diagnosis may attempt to be comprehensive, or may be restricted
to specific health conditions, determinants, or subgroups. J.N. Morris1 identified
community diagnosis as one of the uses of epidemiology.
1 Br Med J 2:395-401, 1955.
COMMUNITY HEALTH See PUBLIC HEALTH.
COMMUNITY MEDICINE Since the late 1960s, this term has gained wide currency as the
preferred name for important activities concerning health care in the community.
There are several different definitions, including the following.
1. The field concerned with the study of health and disease in the population of
a defined community or group. Its goal is to identify the health problems and
needs of defined populations, to identify means by which these needs should
be met, and to evaluate the extent to which health services effectively meet
these needs.
2. The practice of medicine concerned with groups or populations rather than
with individual patients. This includes the elements listed in definition 1, together
with the organization and provision of health care at a community or
group level.
3. The term is also used to describe the practice of medicine in the community,
e.g., by a family physician. Some writers equate the terms "family medicine"
and "community medicine"; others confine its use to public health practice.
4. Community-oriented primary health care is an integration of community
medicine with the primary health care of individuals in the community. In
this form of practice the community practitioner or community health team
has responsibility for health care both at a community and at an individual
level.
See also PUBLIC HEALTH; SOCIAL MEDICINE.
COMMUNITY TRIAL Experiment in which the unit of allocation to receive a preventive or
therapeutic regimen is an entire community or political subdivision. Examples in-
clude the trials of fluoridation of drinking water, and of heart disease prevention
in North Karelia (Finland) and California. See also CLINICAL TRIAL.

COMORBIDITY Disease(s) that coexist(s) in a study participant in addition to the index
condition that is the subject of study.
COMPARISON GROUP Any group to which the index group is compared. Usually synonymous
with control group.
COMPETING CAUSE When a previously common cause of death becomes rare, other causes
become more prominent. These other causes are referred to as competing causes.
For instance, among young adults, pneumonia and other infections were a common
cause of death until about midway through the 20th century; their control has
brought to prominence some competing causes of death, notably malignant disease
and suicide.
COMPETING RISK An event that removes a subject from being at risk for the outcome
under investigation. For example, in a study of smoking and cancer of the lung, a
subject who dies of coronary heart disease is no longer at risk of lung cancer, and
in this situation, coronary heart disease is a competing risk.
COMPLETED FERTILITY RATE The number of children born alive per woman in a cohort
of women by the end of their child-bearing years.
COMPLETING THE CLINICAL PICTURE The use of epidemiology to define all modes of
presentation of a disease, and/or all possible outcomes. One of the "uses of epidemiology"
identified by J.N. Morris.1
1 Br Med J 2:395-401, 1955.
COMPLETION RATE The proportion or percentage of persons in a SURVEY for whom
complete data are available for analysis. See also RESPONSE RATE.
COMPOSITE INDEX An index, such as the Apgar score or the Tumor/Nodes/Metastases (TNM)
stage of cancer, that contains contributions from categories of several different variables.
COMPUTER A programmable electronic device that can be used to store and manipulate
data in order to carry out designated functions. The two fundamental components
of a computer are hardware, i.e., the actual electronic device, and software, the
instructions or program used to carry out the function. Computer science has created
a large language of its own, describing types of computers (main-frame, micro,
digital, analogue, etc.) and all aspects of the process. Most of the terms used in this
field are defined by AJ Meadows, M Gordon, and A Singleton.1
1 Dictionary of New Information Technology. London: Century, 1982.
CONCORDANCE Pairs or groups of individuals of identical phenotype. In twin studies, a
condition in which both twins exhibit or fail to exhibit a trait under investigation.
CONCORDANT A term used in TWIN STUDIES to describe a twin pair in which both twins
exhibit a certain trait.
CONCURRENT STUDY See COHORT STUDY.
CONDITIONAL PROBABILITY The probability of an event, given that another event has
occurred. If D and E are two events and P(...) is "the probability of (...)," the
conditional probability of D, given that E occurs, is denoted P(D|E), where the vertical
slash is read "given," and is equal to P(D and E)/P(E). The event E is the "conditioning
event." Conditional probabilities obey all the axioms of probability theory.
See also BAYES' THEOREM; PROBABILITY THEORY.
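The relation P(D and E)/P(E) can be worked through with exact fractions. In this Python sketch the function name and the figures are hypothetical: of 100 persons, 20 experience the conditioning event E and 5 of those also have D.

```python
from fractions import Fraction


def conditional_probability(p_d_and_e: Fraction, p_e: Fraction) -> Fraction:
    """Return P(D | E) = P(D and E) / P(E); undefined when P(E) is zero."""
    if p_e == 0:
        raise ValueError("P(E) must be greater than zero")
    return p_d_and_e / p_e


# Hypothetical population of 100: 20 have E, and 5 have both D and E.
p_e = Fraction(20, 100)
p_d_and_e = Fraction(5, 100)
print(conditional_probability(p_d_and_e, p_e))  # 1/4
```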
CONFIDENCE INTERVAL A range of values for a variable of interest, e.g., a rate, constructed
so that this range has a specified probability of including the true value of
the variable. The specified probability is called the confidence level, and the end
points of the confidence interval are called the confidence limits.
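The entry does not prescribe a method of construction. As one illustration, the Python sketch below computes confidence limits for a proportion using the normal approximation; the choice of method, the function name, and the counts are assumptions made for this example, and other methods may be preferable for small samples or proportions near 0 or 1.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(events: int, n: int, level: float = 0.95):
    """Normal-approximation confidence limits for the proportion events / n."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("require n > 0 and 0 <= events <= n")
    if not 0 < level < 1:
        raise ValueError("level must lie strictly between 0 and 1")
    p = events / n
    # Two-sided critical value, about 1.96 for a 95 percent confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical survey: 40 of 200 respondents report the exposure.
lower, upper = proportion_confidence_interval(40, 200)
print(round(lower, 3), round(upper, 3))
```

Raising the confidence level widens the interval; increasing the sample size narrows it.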
CONFOUNDING [from the Latin confundere, to mix together]
1. A situation in which the effects of two processes are not separated. The distortion
of the apparent effect of an exposure on risk brought about by the
association with other factors that can influence the outcome.
2. A relationship between the effects of two or more causal factors as observed
in a set of data, such that it is not logically possible to separate the contribution
that any single causal factor has made to an effect.
3. A situation in which a measure of the effect of an exposure on risk is distorted
because of the association of exposure with other factor(s) that influence the
outcome under study.
CONFOUNDING VARIABLE (Syn: confounder) A variable that can cause or prevent the
outcome of interest, is not an intermediate variable, and is associated with the
factor under investigation. Such a variable must be controlled in order to obtain an
undistorted estimate of the effect of the study factor on risk.
CONSANGUINE Related by a common ancestor within the previous few generations.
CONSISTENCY
1. Close conformity between the findings in different samples, strata, or populations,
or at different times or in different circumstances, or in studies conducted
by different methods or different investigators. Consistency may be
examined in order to study effect modification. Consistency of results on replication
of studies is an important criterion in judgments of causality.
2. In statistics, an estimator is said to be consistent if the probability of it yielding
estimates close to the true value approaches one as the sample size grows larger.
CONTACT (OF AN INFECTION) A person or animal that has been in such association with
an infected person or animal or a contaminated environment as to have had opportunity
to acquire the infection.
CONTACT, DIRECT A mode of transmission of infection between an infected host and
susceptible host. Direct contact occurs when skin or mucous surfaces touch, as in
shaking hands, kissing, and sexual intercourse. See also CONTAGION; TRANSMISSION
OF INFECTION.
CONTACT, INDIRECT A mode of transmission of infection involving FOMITES or VECTORS.
Vectors may be mechanical (e.g., filth flies) or biological (the disease agent undergoes
part of its life cycle in the vector species). See also TRANSMISSION OF INFECTION.
CONTACT, PRIMARY Person(s) in direct contact or associated with a communicable dis-
ease case.
CONTACT, SECONDARY Person(s) in contact or associated with a primary contact.
CONTAGION The transmission of infection by direct contact, droplet spread, or contaminated
FOMITES. These are the modes of transmission specified by FRACASTORIUS in
De Contagione (1546); contemporary usage is sometimes looser, but use of this term
is best restricted to description of infection transmitted by direct contact.
CONTAGIOUS Transmitted by contact; in common usage, "highly infectious."
CONTAINMENT The concept of regional eradication of communicable disease, first proposed
by Soper in 1949 for the elimination of smallpox.1 Containment of a worldwide
communicable disease demands a globally coordinated effort so that countries
that have effected an interruption of transmission do not become reinfected following
importation from neighboring endemic areas.
1 Pan American Health Organization, OSP, CE7, W-15, Washington DC, 1949.
CONTAMINATION
1. The presence of an infectious agent on a body surface; also on or in clothes,
bedding, toys, surgical instruments or dressings, or other inanimate articles or
substances including water, milk, and food. Pollution is distinct from contamination
and implies the presence of offensive, but not necessarily infectious,
matter in the environment. Contamination of a body surface does not imply a
carrier state. See also TRANSMISSION OF INFECTION.
2. The situation that exists when a population being studied for one condition
or factor also possesses other conditions or factors that modify results of the
study. In a RANDOMIZED CONTROLLED TRIAL, the inadvertent application of the
experimental procedure to members of the control group, or inadvertent failure
to apply the procedure to members of the experimental group.
CONTINGENCY TABLE A tabular cross-classification of data such that subcategories of one
characteristic are indicated horizontally (in rows) and subcategories of another
characteristic are indicated vertically (in columns). Tests of association between the
characteristics in the columns and rows can be readily applied. The simplest contingency
table is the fourfold, or 2 x 2 table. Contingency tables may be extended to
include several dimensions of classification.
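As a minimal sketch of the fourfold table described above, the following Python cross-classifies hypothetical (exposure, disease) records into a 2 x 2 table and computes the Pearson chi-square statistic as one possible test of association. The data, the function names, and the choice of the chi-square statistic are illustrative assumptions, not part of the dictionary entry.

```python
from collections import Counter


def fourfold_table(records):
    """Cross-classify (exposed, diseased) boolean pairs into a 2 x 2 table.

    Rows are exposure (yes, no); columns are disease (yes, no).
    """
    counts = Counter(records)
    return [
        [counts[(True, True)], counts[(True, False)]],
        [counts[(False, True)], counts[(False, False)]],
    ]


def chi_square(table):
    """Pearson chi-square: sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical data: 20 exposed cases, 30 exposed non-cases,
# 10 unexposed cases, 40 unexposed non-cases.
records = (
    [(True, True)] * 20 + [(True, False)] * 30
    + [(False, True)] * 10 + [(False, False)] * 40
)
table = fourfold_table(records)
statistic = chi_square(table)
```

The same two functions work for larger tables if the cross-classification step is generalized; only the fourfold case is shown here.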
CONTINGENT VARIABLE See INTERMEDIATE VARIABLE.
CONTINUING SOURCE EPIDEMIC (OUTBREAK) An epidemic in which new cases of disease
occur over a long period, indicating persistence of the disease source.
CONTINUOUS DATA, CONTINUOUS VARIABLE Data (variable) with a potentially infinite
number of possible values along a continuum. Data representing a continuous variable
include height, weight, and enzyme output.
CONTROL
1. (v.) To regulate, restrain, correct, restore to normal.
2. (n. or adj.) Applied to many communicable and some noncommunicable conditions,
"control" means ongoing operations or programs aimed at reducing
the incidence and/or prevalence, or eliminating such conditions.
3. (n.) As used in the expressions case-control study and randomized control(led)
trial, "control" means person(s) in a comparison group that differ, respectively,
in disease experience or allocation to a regimen, from the subjects of
the study.
4. (v.) In statistics, "control" means to adjust for or take into account extraneous
influences or observations.
5. (adj.) In the expression "control variable" we refer to an independent variable
other than the hypothetical causal variable that has a potential effect on the
dependent variable and is subject to control by analysis.
The use of the noun "control" to describe the comparison groups in a case-control
study and in a randomized control(led) trial can confuse the uninitiated, e.g.,
ethical review committees; the essential ethical distinction is that there may be no
intervention in the lives or health status of the controls in a case-control study,
whereas controls in a randomized controlled trial may be asked to undergo a procedure
or regimen that may affect their health; their informed consent is therefore
essential. Consent may not be required (save to gain access to medical records) to
study controls in a case-control study. As M.W. Susser¹ has pointed out, the use of
the word "control" as verb, adjective, and noun may confuse even careful readers.
The verb is best used in the sense of controlling sources of extraneous variation in
the dependent variable, whether by design or analysis. The verb is also used in the
sense of controlling disease or its causes. The adjective is best used to describe
control variables in contradistinction to uncontrolled and confounding variables.
The adjective also can be used to describe a control group assembled for comparison
with a group of cases or with an experimental group. The noun is best used to
designate the members of a control group.
¹ Causal Thinking in the Health Sciences. New York: Oxford, 1973.
CONTROLS, HISTORICAL Persons or patients used for comparison who had the condition
or treatment under study at a different time, generally at an earlier period than
the study group or cases. Historical controls are often unsatisfactory because other
factors affecting the condition under study may have changed to an unknown extent
in the time elapsed.
CONTROLS, HOSPITAL Persons used for comparison who are drawn from the population
of patients in a hospital. Hospital controls are often a source of SELECTION
BIAS.
CONTROLS, MATCHED Controls who are selected so that they are similar to the study
group, or cases, in specific characteristics. Some commonly used matching variables
are age, sex, race, and socioeconomic status. See also MATCHING.
CONTROLS, NEIGHBORHOOD Persons used for comparison who live in the same locality
as cases and therefore may resemble cases in environmental and socioeconomic
criteria.
CONTROLS, SIBLING Persons used for comparison who are the siblings of cases and
therefore share genetic makeup.
COORDINATES In a two-dimensional graph, the values of ordinate and abscissa that define
the locus or position of a point.
CORDON SANITAIRE The barrier erected around a focus of infection. Used mainly in the
isolation procedures applied to exclude cases and contacts of life-threatening com-
municable diseases from society. Mainly of historical interest.
CORRELATION The degree to which variables change together.
CORRELATION COEFFICIENT A measure of association that indicates the degree to which
two variables have a linear relationship. This coefficient, represented by the letter
r, can vary between +1 and -1; when r = +1, there is a perfect positive linear
relationship in which one variable varies directly with the other; when r = -1,
there is a perfect negative linear relationship between the variables. The measure
can be generalized to quantify the degree of linear relationship between one variable
and several others, in which case it is known as the multiple correlation coefficient.
Kendall's Tau, Spearman's Rank Correlation, and Pearson's Product Moment
Correlation tests are special varieties with occasional applications in
epidemiology. M.G. Kendall and W.R. Buckland's Dictionary of Statistical Terms¹ gives
details.
¹ London: Longman, 1983.
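The behavior of r at its two extremes can be illustrated with a short Python sketch of the Pearson product-moment form of the coefficient. The function name and the sample values are hypothetical and serve only to show r = +1 and r = -1.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of squares and cross-products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: y rises exactly with x, then falls exactly with x.
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```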
CORRELATION, NONSENSE A meaningless correlation between two variables. Nonsense
correlations sometimes occur when social, economic, or technological changes have
the same trend over time as incidence or mortality rates. An example is correlation
between the birth rate and the density of storks in parts of Holland and Germany.
See also CONFOUNDING; ECOLOGICAL FALLACY.
COST-BENEFIT ANALYSIS An economic analysis in which the costs of medical care and
the loss of net earnings due to death or disability are considered. The general rule
for the allocation of funds in a cost-benefit analysis is that the ratio of marginal
benefit (the benefit of preventing an additional case) to marginal cost (the cost of
preventing an additional case) should be equal to or greater than 1.
COST-EFFECTIVENESS ANALYSIS This form of analysis seeks to determine the costs and
effectiveness of an activity, or to compare similar alternative activities to determine
the relative degree to which they will obtain the desired objectives or outcomes.
The preferred action or alternative is one that requires the least cost to produce a
given level of effectiveness, or provides the greatest effectiveness for a given level
of cost. In the health care field, outcomes are measured in terms of health status.
COST-UTILITY ANALYSIS An economic analysis in which outcomes are measured in terms
of their social value.

COVARIATE A variable that is possibly predictive of the outcome under study. A covariate
may be of direct interest to the study or may be a confounding variable or
effect modifier.
COVERAGE A measure of the extent to which the services rendered cover the potential
need for these services in a community. It is expressed as a proportion in which
the numerator is the number of services rendered, and the denominator is the
number of instances in which the service should have been rendered. Example:
$$\text{Annual obstetric coverage in a community} = \frac{\text{Number of deliveries attended by a qualified midwife or obstetrician}}{\text{Expected number of deliveries during the year in a given community}}$$
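The proportion above can be sketched in a few lines of Python. The function name and the delivery counts are hypothetical and chosen only to make the arithmetic visible.

```python
def coverage(services_rendered, services_needed):
    """Coverage as a proportion: services rendered / services that should have been rendered."""
    if services_needed <= 0:
        raise ValueError("services_needed must be positive")
    return services_rendered / services_needed


# Hypothetical community: 450 attended deliveries out of 600 expected in the year.
annual_obstetric_coverage = coverage(450, 600)
```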
COX MODEL See PROPORTIONAL HAZARDS MODEL.
CRITERION A principle or standard by which something is judged. See also STANDARD.
CRONBACH'S ALPHA (Syn: internal consistency reliability) An estimate of the correlation
between the total score across a series of items from a rating scale and the total
score that would have been obtained had a comparable series of items been employed.
CROSS-CULTURAL STUDY A study in which populations from different cultural backgrounds
are compared.
CROSSOVER DESIGN A method of comparing two or more treatments or interventions in
which the subjects or patients, upon completion of the course of one treatment, are
switched to another. In the case of two treatments, A and B, half the subjects are
randomly allocated to receive these in the order A, B and half to receive them in
the order B, A. A criticism of this design is that effects of the first treatment may
carry over into the period when the second is given.
CROSS-PRODUCT RATIO See ODDS RATIO.
CROSS-SECTIONAL STUDY (Syn: disease frequency survey, prevalence study) A study that
examines the relationship between diseases (or other health-related characteristics)
and other variables of interest as they exist in a defined population at one particular
time. The presence or absence of disease and the presence or absence of the other
variables (or, if they are quantitative, their level) are determined in each member
of the study population or in a representative sample at one particular time. The
relationship between a variable and the disease can be examined (1) in terms of the
prevalence of disease in different population subgroups defined according to the
presence or absence (or level) of the variables and (2) in terms of the presence or
absence (or level) of the variables in the diseased versus the nondiseased. Note that
disease prevalence rather than incidence is normally recorded in a cross-sectional
study. The temporal sequence of cause and effect cannot necessarily be determined
in a cross-sectional study. See also MORBIDITY SURVEY.
CRUDE DEATH RATE See DEATH RATE.
CUMULATIVE DEATH RATE The proportion of a group that dies over a specified time
interval. It may refer to all deaths or to deaths from specific cause(s). If follow-up
is not complete on all persons the proper estimation of this rate requires the use of
methods that take account of CENSORING. Distinct from FORCE OF MORTALITY.
CUMULATIVE INCIDENCE, CUMULATIVE INCIDENCE RATE The number or proportion of a
group of people who experience the onset of a health-related event during a specified
time interval; this interval is generally the same for all members of the group,
but, as in lifetime incidence, it may vary from person to person without reference
to age.
CUMULATIVE INCIDENCE RATIO The ratio of the cumulative incidence rate in the exposed
to the cumulative incidence rate in the unexposed.
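A minimal Python sketch of this ratio follows, assuming cumulative incidence is taken as new cases divided by the number of persons at risk at the start of the interval. The function names and cohort counts are hypothetical.

```python
def cumulative_incidence(new_cases, persons_at_risk):
    """Proportion of a group experiencing onset of the event during the interval."""
    if persons_at_risk <= 0:
        raise ValueError("persons_at_risk must be positive")
    return new_cases / persons_at_risk


def cumulative_incidence_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Cumulative incidence in the exposed divided by that in the unexposed."""
    unexposed = cumulative_incidence(cases_unexposed, n_unexposed)
    if unexposed == 0:
        raise ValueError("ratio is undefined when the unexposed group has no cases")
    return cumulative_incidence(cases_exposed, n_exposed) / unexposed


# Hypothetical cohort: 30 cases among 1000 exposed, 10 cases among 1000 unexposed.
ratio = cumulative_incidence_ratio(30, 1000, 10, 1000)
```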
CUSUM Acronym for cumulative sum (of a series of measurements). This is a useful way
to demonstrate a change in trend or direction of a series of measurements.¹ Calculation
begins with a reference figure, e.g. the expected average measurement. As
each new measurement is observed, the reference figure is subtracted, and a cumulative
total is produced by adding each successive difference. This cumulative
total is the cusum.
¹ Alderson M: An Introduction to Epidemiology, 2nd ed. London: Macmillan, 1983.
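The calculation described above can be sketched as follows. The function name, the measurements, and the reference figure are hypothetical.

```python
def cusum(measurements, reference):
    """Running total of (measurement - reference), one entry per measurement."""
    totals = []
    running = 0.0
    for value in measurements:
        running += value - reference  # subtract the reference figure, then accumulate
        totals.append(running)
    return totals


# Hypothetical series with an expected average of 10; the series drifts upward.
series = cusum([10, 12, 9, 14, 15], reference=10)
```

A series that stays near the reference figure keeps the cusum near zero, while a sustained shift shows up as a steadily rising or falling total.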
CYCLICITY, SEASONAL The annual cycling of incidence on a seasonal basis. Certain acute
infectious diseases, if of greater than rare occurrence, peak in one season of the
year and reach the low point six months later (or in the opposite season). The onset
of some symptoms of some chronic diseases also may show this amplitudinal cyclicity.
Demographic phenomena such as marriage and births, and mortality from
all causes and certain specific causes, may also exhibit seasonal cyclicity.
CYCLICITY, SECULAR Long-term (greater than one year) cycling of disease incidence. For
example, measles in a large, unimmunized population has a high incidence every
second year; hepatitis A has a higher incidence every seventh year. Such cycling is
the result of continuous exhaustion and replacement of susceptibles in a relatively
stable population. Secular cyclicity may have large interval swings as in the recurrence
of pandemics of influenza.
CYST COUNT See WORM COUNT.

D
DATA DREDGING A jargon term, meaning analyses done on a post hoc basis without
benefit of prestated hypotheses, as a means of identifying noteworthy differences.
Such analyses are sometimes done when data have been collected on a large num-
ber of variables and hypotheses are suggested by the data; the scientific validity of
data dredging is at best dubious, usually unacceptable.
DATA PROCESSING Conversion (as by computer) of crude information into usable or
storable form. Data generated by epidemiologic studies are usually transferred to
punch cards or optical mark-sense forms and thence to a computer for storage and
retrieval. The term is often loosely used to mean also the statistical analysis of data
by a computer program. See also PUNCH CARD.
DEATH CERTIFICATE A vital record signed by a licensed physician or, in some nations,
by another designated health worker, that includes cause of death, decedent's name,
sex, birthdate, and place of residence and of death. Occupation, birthplace, and
other information may be included. Immediate cause of death is recorded on the
first line, followed by conditions giving rise to the immediate cause; the underlying
cause is entered last. The underlying cause is coded and tabulated in official publications
of cause-specific mortality. Other significant conditions may also be recorded
separately, as is the mode of death, whether accidental or violent, etc. The
most important entries on a death certificate are underlying causes of death and
cause of death. These are defined in the Ninth (1975) Revision of the International
Classification of Diseases, as follows:
Causes of death: The causes of death to be entered on the medical certificate of
cause of death are all those diseases, morbid conditions, or injuries that either resulted
in or contributed to death and the circumstances of the accident or violence
which produced any such injuries.
Underlying cause of death: The underlying cause of death is (1) the disease or injury
that initiated the train of events leading to death, or (2) the circumstances of the
accident or violence that produced the fatal injury.
Personal identifying information such as birthplace, parents' names (last name at
birth), and birthdates are included on death certificates in some jurisdictions; this
extra information makes possible a range of RECORD LINKAGE studies.
[Figure: International Standard Death Certificate. Cause of death, Part I: disease or
condition directly leading to death (a), due to (or as a consequence of) antecedent
causes (b) and (c), i.e., morbid conditions, if any, giving rise to the above cause,
stating the underlying condition last. Part II: other significant conditions contributing
to the death, but not related to the disease or condition causing it.]
DEATH RATE An estimate of the proportion of a population that dies during a specified
period. The numerator is the number of persons dying during the period; the
denominator is the size of the population, usually estimated as the mid-year population.
The death rate in a population is generally calculated by the formula
$$\frac{\text{Number of deaths during a specified period}}{\text{Number of persons at risk of dying during the period}} \times 10^{n}$$
This rate is an estimate of the person-time death rate, i.e., the death rate per $10^{n}$
person-years. If the rate is low, it is also a good estimate of the cumulative death
rate. This rate is also called the crude death rate.
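As a worked illustration of the formula, the sketch below computes a crude death rate from hypothetical counts. The function name, the `per` parameter (standing in for the power-of-ten multiplier), and the figures are assumptions for illustration only.

```python
def crude_death_rate(deaths, midyear_population, per=1000):
    """Deaths during the period per `per` persons, using the mid-year population."""
    if midyear_population <= 0:
        raise ValueError("midyear_population must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole-number inputs.
    return deaths * per / midyear_population


# Hypothetical population: 850 deaths in a year, mid-year population of 100,000.
rate_per_1000 = crude_death_rate(850, 100_000)
rate_per_100000 = crude_death_rate(850, 100_000, per=100_000)
```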
DEATH REGISTRATION AREA A geographic area for which mortality data are published.
DECISION ANALYSIS A derivative of operations research and game theory that involves
identifying all available choices and potential outcomes of each, in a series of decisions
that have to be made about aspects of patient care: diagnostic procedures,
therapeutic regimens, prognostic expectations. Epidemiologic data play a large part
in determining the probabilities of outcomes following each choice that has to be
made. The range of choices can be plotted on a decision tree, and at each branch,
or decision node, the probabilities of each outcome that can be predicted are displayed.
The decision tree thus portrays the choices available to those responsible
for patient care and the probabilities of each outcome that will follow the choice of
a particular action or strategy in patient care. The relative worth of each outcome
is preferably also described as a utility or quality of life, e.g., a probability of life
expectancy or of freedom from disability.¹
¹ Pauker SG, Kassirer JP: Decision analysis. N Engl J Med 316:250-258, 1987.
DECISION TREE The alternative choices expressed in quantitative terms, available at each
stage in the process of thinking through a problem, may be likened to branches,
and the hierarchical sequence of options to a tree. Hence, decision tree. It is a
graphic device used in DECISION ANALYSIS, in which a series of decision options are
represented as branches and subsequent possible outcomes are represented as further
branches. The decisions and the eventualities are presented in the order they
are likely to occur. The junction where a decision must be taken is called a decision
node.
DEDUCTION Reasoned argument proceeding from the general to the particular.
DEGREES OF FREEDOM (df) The number of independent comparisons that can be made
between the members of a sample. This important concept in statistical testing cannot
be defined briefly. It refers to the number of independent contributions to a
sampling distribution (such as χ², t, and F distribution). In a CONTINGENCY TABLE it
is one less than the number of row categories multiplied by one less than the number
of column categories.
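The contingency-table rule in the last sentence can be written as a one-line calculation. The function name is hypothetical.

```python
def contingency_df(n_rows, n_cols):
    """Degrees of freedom of a contingency table: (rows - 1) * (columns - 1)."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)


df_fourfold = contingency_df(2, 2)  # the 2 x 2 (fourfold) table
df_three_by_four = contingency_df(3, 4)
```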
DEMAND (FOR HEALTH SERVICES) Willingness and/or ability to seek, use, and, in some
settings, to pay for services. Sometimes further subdivided into expressed demand
(equated with use) and potential demand, or NEED.
DEMOGRAPHIC TRANSITION The transition from high to low fertility and mortality rates,
usually related to technological change and industrialization.
DEMOGRAPHY The study of populations, especially with reference to size and density,
fertility, mortality, growth, age distribution, migration, and VITAL STATISTICS, and
the interaction of all these with social and economic conditions.
DEMONSTRATION MODEL An experimental health care facility, program, or system with
built-in provision for measuring aspects such as costs per unit of service, rates of
use by patients or clients, and outcomes of encounters between providers and users.
The aim usually is to determine the feasibility, efficacy, effectiveness, and/or efficiency
of the model service.
DENOMINATOR The lower portion of a fraction used to calculate a rate or ratio. The
population (or population experience, as in person-years, passenger-miles, etc.) at
risk in the calculation of a rate or ratio. See also NUMERATOR.
DENSITY OF POPULATION Demographic term meaning numbers of persons in relation to
available space.
DENSITY SAMPLING A method of selecting controls in a CASE CONTROL STUDY in which
cases are sampled only from incident cases over a specific time period, and controls
are sampled and interviewed throughout that period (rather than simply at one
point in time, such as the end of the period). This method can reduce bias due to
changing exposure patterns in the source population.
DEPENDENCY RATIO Proportion of children and old people in a population in comparison
to all others, i.e., the proportion of economically inactive to economically active;
"children" are usually defined as ages under 15 and "old people" as ages 65 and
over.
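Using the age cut-offs given in the entry, a dependency ratio can be sketched as below. The function name and the list of ages are hypothetical.

```python
def dependency_ratio(ages, child_limit=15, old_age=65):
    """Persons under `child_limit` or aged `old_age` and over, relative to all others."""
    dependents = sum(1 for age in ages if age < child_limit or age >= old_age)
    others = len(ages) - dependents
    if others == 0:
        raise ValueError("ratio is undefined when no one is of working age")
    return dependents / others


# Hypothetical population of six people: three dependents, three others.
ratio = dependency_ratio([5, 10, 30, 40, 50, 70])
```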
DEPENDENT VARIABLE
1. A variable the value of which is dependent on the effect of other variable(s)
[independent variable(s)] in the relationship under study. A manifestation or
outcome whose variation we seek to explain or account for by the influence of
independent variables.
2. In statistics, the dependent variable is the one predicted by a regression equation.
See also INDEPENDENT VARIABLE.
DESCRIPTIVE STUDY A study concerned with and designed only to describe the existing
distribution of variables, without regard to causal or other hypotheses. Contrast
analytic study. An example is a community health survey, used to determine the
health status of the people in a community. Descriptive studies, e.g., analyses of
cancer registry data, can be used to measure risks.
DESIGN See RESEARCH DESIGN.
DESIGN VARIABLE
1. A study variable whose distribution in the subjects is determined by the investigator.
2. In statistics, a variable taking on the value 1 to indicate membership in a particular
category and 0 or -1 to indicate nonmembership in the category. Used
primarily in ANALYSIS OF VARIANCE.
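The statistical sense (2) can be illustrated with a small Python sketch that codes category membership as 1 and nonmembership as 0 or -1. The function name, the `off_value` parameter, and the category labels are hypothetical.

```python
def design_variables(values, categories, off_value=0):
    """For each value, return one indicator per category: 1 for membership, else off_value."""
    if off_value not in (0, -1):
        raise ValueError("off_value must be 0 or -1")
    return [[1 if value == category else off_value for category in categories]
            for value in values]


# Hypothetical treatment groups for three subjects.
coded = design_variables(["A", "B", "A"], categories=["A", "B"])
coded_minus_one = design_variables(["A", "B"], categories=["A", "B"], off_value=-1)
```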
DETERMINANT Any factor, whether event, characteristic, or other definable entity, that
brings about change in a health condition, or other defined characteristic. See also
CAUSALITY, FACTORS IN.
DIAGNOSIS The process of determining health status and the factors responsible for
producing it; may be applied to an individual, family, group, or community. The
term is applied both to the process of determination and to its findings. See also
DISEASE LABEL.
DIAGNOSTIC INDEX A system for recording diagnoses, diseases, or problems of patients
or clients in a medical practice or service, usually including identifying information
(name, birthdate, sex) and dates of encounters. See also E-BOOK.
DIFFERENTIAL The difference(s) shown in tabulation of health and vital statistics according
to age, sex, or some other factor; age differentials are the differences revealed
in the tabulations of rates in age-groups, sex differentials are the differences
in rates between males and females, income differentials are differences between
designated income categories, etc.
DIGIT PREFERENCE A preference for certain numbers that leads to rounding off measurements.
Rounding off may be to the nearest whole number, even number, multiple
of 5 or 10, or (when time units like a week are involved) 7, 14, etc. This can
be a form of OBSERVER VARIATION, or an attribute of respondent(s) in a survey.
DIMENSIONALITY The number of dimensions, i.e., scalar quantities, needed for accurate
description of an element of a vector space.
DIRECT ADJUSTMENT, DIRECT STANDARDIZATION See STANDARDIZATION.
DISABILITY Temporary or long-term reduction of a person's capacity to function in society.
See also INTERNATIONAL CLASSIFICATION OF IMPAIRMENTS, DISABILITIES, AND
HANDICAPS for the official WHO definition.
DISCORDANT A term used in TWIN STUDIES to describe a twin pair in which one twin
exhibits a certain trait and the other does not. Also used in matched pair case
control studies to describe a pair whose members had different exposures to the
risk factor under study. Only the discordant pairs are informative about the association
between exposure and disease.
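One way to see why only discordant pairs carry information is the standard matched-pair odds ratio, which is the ratio of the two kinds of discordant pairs. This estimator is not stated in the entry and is brought in here as an assumption. The function names and pair counts are hypothetical.

```python
def discordant_counts(pairs):
    """Count pairs where only the case, or only the control, was exposed.

    Each pair is (case_exposed, control_exposed) as booleans.
    """
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    return case_only, control_only


def matched_odds_ratio(pairs):
    """Matched-pair odds ratio: case-only-exposed pairs / control-only-exposed pairs."""
    case_only, control_only = discordant_counts(pairs)
    if control_only == 0:
        raise ValueError("ratio is undefined with no control-only-exposed pairs")
    return case_only / control_only


# Hypothetical study: 55 concordant pairs and 45 discordant pairs.
pairs = (
    [(True, True)] * 10 + [(False, False)] * 45
    + [(True, False)] * 30 + [(False, True)] * 15
)
odds_ratio = matched_odds_ratio(pairs)
```

Changing the number of concordant pairs in the example leaves the result unchanged, which is the point the entry makes.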
DISCRETE DATA Data that can be arranged into naturally occurring or arbitrarily selected
groups or sets of values, as opposed to data in which there are no naturally
occurring breaks in continuity, i.e., CONTINUOUS DATA. An example is number of
decayed, missing, and filled teeth (DMF).
DISCRIMINANT ANALYSIS A statistical analytic technique used with discrete dependent
variables, concerned with separating sets of observed values and allocating new values;
can sometimes be used instead of regression analysis. Kendall and Buckland¹
refer to this as "discriminatory analysis" and describe it as a rule for allocating
individuals or values from two or more discrete populations to the correct population
with minimal probability of misclassification.
¹ Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.
DISEASE Literally, dis-ease, the opposite of ease, when something is wrong with a bodily
function. The words "disease," "illness," and "sickness" are loosely interchangeable,
but are better regarded as not wholly synonymous. M. W. Susser has suggested that
they be used as follows:
Disease is a physiological/psychological dysfunction.
Illness is a subjective state of the person who feels aware of not being well.
Sickness is a state of social dysfunction, i.e., a role that the individual assumes
when ill.
DISEASE FREQUENCY SURVEY See CROSS-SECTIONAL STUDY; MORBIDITY SURVEY.
DISEASE LABEL The identity of the condition from which a patient suffers. It may be
the name of a precisely defined disorder identified by a battery of tests, a probability
statement based on consideration of what is most likely among several possibilities,
or an opinion based on pattern recognition. Use of the word "label" can convey
stigma, so this term should be used with care, if at all. See also DIAGNOSIS.
DISEASE ODDS RATIO See ODDS RATIO.
DISEASE, PRECLINICAL Disease with no signs or symptoms, because they have not yet
developed. See also INAPPARENT INFECTION.
DISEASE REGISTRY See REGISTER, REGISTRY.
DISEASE, SUBCLINICAL A condition in which disease is detectable by special tests but does
not reveal itself by signs or symptoms.
DISEASE TAXONOMY See TAXONOMY OF DISEASE.
DISINFECTION Killing of infectious agents outside the body by direct exposure to chemical
or physical agents.
Concurrent disinfection is the application of disinfective measures as soon as possible
after the discharge of infectious material from the body of an infected person,
or after the soiling of articles with such infectious discharges, all personal contact
with such discharges or articles being minimized prior to such disinfection.
Terminal disinfection is the application of disinfective measures after the patient
has been removed by death or to a hospital, or has ceased to be a source of infection,
or after other hospital isolation practices have been discontinued. Terminal
disinfection is rarely practiced; terminal cleaning generally suffices, along with airing
and sunning of rooms, furniture, and bedding. Disinfection is necessary only
for diseases spread by indirect contact; steam sterilization or incineration of bedding
and other items is desirable after a disease such as plague or anthrax.¹
¹ Benenson AS (Ed): Control of Communicable Diseases in Man, 14th ed. Washington DC: American
Public Health Association, 1985.
DISINFESTATION Any physical or chemical process serving to destroy or remove undesired
small animal forms, particularly arthropods or rodents, present upon the person,
the clothing, or in the environment of an individual, or on domestic animals.
Disinfestation includes delousing for infestation with Pediculus humanus humanus,
the body louse. Synonyms include the terms "disinsection" and "disinsectization"
when insects only are involved.
DISTRIBUTION The complete summary of the frequencies of the values or categories of
a measurement made on a group of persons. The distribution tells either how many
or what proportion of the group was found to have each value (or each range of
values) out of all the possible values that the quantitative measure can have.
DISTRIBUTION-FREE METHOD A method which does not depend upon the form of the
underlying distribution.
DISTRIBUTION FUNCTION A function that gives the relative frequency with which a random
variable falls at or below each of a series of values. Examples include the
normal distribution, log-normal distribution, chi-square distribution, t distribution,
F-distribution, and binomial distribution, all of which have applications in epidemiology.
DMF The abbreviation DMF stands for decayed, missing, and filled teeth. Lowercase
letters, i.e., dmf, are used for deciduous dentition, upper case for permanent teeth.
The DMF number is widely used in dental epidemiology.
DOSE-RESPONSE RELATIONSHIP A relationship in which a change in amount, intensity,
or duration of exposure is associated with a change, either an increase or a decrease,
in risk of a specified outcome.
DOUBLE-BLIND TRIAL A procedure of blind assignment to study and control groups and
blind assessment of outcome, designed to ensure that ascertainment of outcome is
not biased by knowledge of the group to which an individual was assigned. "Double"
refers to both parties, i.e., the observer(s) in contact with the subjects, and the
subjects in the study and control groups. See also BLIND EXPERIMENT; RANDOMIZED
CONTROLLED TRIAL.
DRIFT See GENETIC DRIFT; SOCIAL DRIFT.
DROPLET NUCLEI A type of particle implicated in the spread of airborne infection. Droplet
nuclei are tiny particles (1-10 µm diameter) that represent the dried residue of
droplets. They may be formed by (1) evaporation of droplets coughed or sneezed
into the air or (2) aerosolization of infective materials. See also TRANSMISSION OF
INFECTION.
DROPOUT A person enrolled in a study who becomes inaccessible or ineligible for follow-up,
e.g., because of inability or unwillingness to remain enrolled in the study.
The occurrence of dropouts can lead to biases in study results.
DUMMY VARIABLE See INDICATOR VARIABLE.
DYNAMIC POPULATION A population that gains and loses members; all natural populations
are dynamic, a fact recognized by the term "population dynamics," used by
demographers to denote changing composition. See also POPULATION DYNAMICS;
STABLE POPULATION.

EARLY WARNING SYSTEM In disease surveillance, a specific procedure to detect as early
as possible any departure from usual or normally observed frequency of phenomena.
For example, the routine monitoring of numbers of deaths from pneumonia
and influenza in large American cities is an early warning system for the identification
of influenza epidemics. In developing countries, a change in children's average
weights is an early warning signal of nutritional deficiency.
E-BOOK Method (developed by Eimerl)¹ of recording encounters in primary medical
care: encounters are arranged by problem or diagnostic category, thus making it
easy to count the number of persons seen (and the number of times each is seen)
according to problem or diagnostic category in a given period of time. Widely used
in epidemiologic studies of primary medical care. See also AGE-SEX REGISTER; DIAGNOSTIC
INDEX.
¹ Eimerl TS: Organized curiosity. J Coll Gen Practit 1:246-252, 1960.
ECOLOGICAL ANALYSIS Analysis based on aggregated or grouped data; errors in inference
may result because associations may be artifactually created or masked by the
aggregation process.
ECOLOGICAL CORRELATION A correlation in which the units studied are populations rather
than individuals. Correlations found in this manner may not hold true for the individual
members of these populations. See also ECOLOGICAL FALLACY.
ECOLOGICAL FALLACY (Syn: aggregation bias, ecological bias)
1. The bias that may occur because an association observed between variables on
an aggregate level does not necessarily represent the association that exists at
an individual level.
2. An error in inference due to failure to distinguish between different levels of
organization. A correlation between variables based on group (ecological)
characteristics is not necessarily reproduced between variables based on individual
characteristics: an association at one level may disappear at another, or
even be reversed. Example: At the ecological level, a correlation has been found
in several studies between the quality of drinking water and mortality rates
from heart disease; it would be an ecological fallacy to infer from this alone
that exposure to water of a particular level of hardness necessarily influences
the individual's chances of getting or dying of heart disease.
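The reversal described in sense 2 can be demonstrated with a small constructed data set. In the Python sketch below, three hypothetical groups each show a perfect negative correlation among individuals, while the correlation between group means is perfectly positive. All values and names are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical individual-level (exposure, outcome) values in three populations.
groups = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Individual level: correlation within each population.
within_r = {name: pearson_r(xs, ys) for name, (xs, ys) in groups.items()}

# Ecological level: correlation between the population means.
mean_exposure = [sum(xs) / len(xs) for xs, _ in groups.values()]
mean_outcome = [sum(ys) / len(ys) for _, ys in groups.values()]
ecological_r = pearson_r(mean_exposure, mean_outcome)
```

Inferring the individual-level relationship from `ecological_r` alone would give the wrong sign in every group.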
ECOLOGICAL STUDY A study in which the units of analysis are populations or groups of
people, rather than individuals. An example is the study of association between
median income and cancer mortality rates in administrative jurisdictions such as
states and counties.
ECOLOGY The study of the relationships among living organisms and their environ-
ment. "Human ecology" means the study of human groups as influenced by envi-
ronmental factors, often including social and behavioral factors.
ECOSYSTEM The plant and animal life of a region considered in relation to the environmental
factors that influence it; more specifically, the fundamental unit in ecology,
comprising the living organisms and the nonliving elements that interact in a de-
fined region.
EFFECT The result of a cause. In epidemiology, frequently a synonym for EFFECT MEASURE.
EFFECTIVENESS The extent to which a specific intervention, procedure, regimen, or service,
when deployed in the field, does what it is intended to do for a defined population.
EFFECT MEASURE A quantity that measures the effect of a factor on the frequency or
risk of a health outcome. Three such measures are attributable fractions, which
measure the fraction of cases due to a factor; risk and rate differences, which measure
the amount a factor adds to the risk or rate of a disease; and risk and rate
ratios, which measure the amount by which a factor multiplies the risk or rate of
disease.
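As a hedged illustration of the three kinds of effect measure named above, the Python sketch below computes a risk difference, a risk ratio, and an attributable fraction among the exposed from cohort counts. The function name and the counts are hypothetical, and the attributable fraction is taken here as (risk in exposed minus risk in unexposed) divided by risk in exposed, which is one common formulation assumed for this sketch.

```python
def effect_measures(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Return three effect measures from simple cohort counts."""
    risk_e = cases_exposed / n_exposed      # risk among the exposed
    risk_u = cases_unexposed / n_unexposed  # risk among the unexposed
    return {
        # amount the factor adds to the risk
        "risk_difference": risk_e - risk_u,
        # amount by which the factor multiplies the risk
        "risk_ratio": risk_e / risk_u,
        # fraction of exposed cases attributed to the factor (assumed formula)
        "attributable_fraction_exposed": (risk_e - risk_u) / risk_e,
    }


# Hypothetical cohort: 30 cases among 1000 exposed, 10 among 1000 unexposed.
measures = effect_measures(30, 1000, 10, 1000)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```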
EFFECT MODIFIER (Syn: conditional variable, moderator variable) A factor that modifies
the effect of a putative causal factor under study. For example, age is an effect
modifier for many conditions, and immunization status is an effect modifier for the
consequences of exposure to pathogenic organisms. Effect modification is detected
by varying the selected effect measure for the factor under study across levels of
another factor. See also CAUSALITY, FACTORS IN; INTERACTION.
EFFECTIVE SAMPLE SIZE Sample size after dropouts, deaths, and other specified exclusions
from an original sample.
EFFICACY The extent to which a specific intervention, procedure, regimen, or service
produces a beneficial result under ideal conditions. Ideally, the determination of
efficacy is based on the results of a RANDOMIZED CONTROLLED TRIAL.
EFFICIENCY 1. The effects or end-results achieved in relation to the effort expended in terms
of money, resources, and time. The extent to which the resources used to
provide a specific intervention, procedure, regimen, or service of known efficacy
and effectiveness are minimized. A measure of the economy (or cost in
resources) with which a procedure of known efficacy and effectiveness is carried
out.
2. In statistics, the relative precision with which a particular study design or es-
timator will estimate a parameter of interest.
EGG COUNT See WORM COUNT.
ELIMINATION See ERADICATION (OF DISEASE).
EMPIRICAL Based directly on experience, e.g., observation or experiment, rather than
on reasoning alone.
ENCOUNTER A face-to-face transaction between a personal health worker and a patient
or client.
ENDEMIC DISEASE The constant presence of a disease or infectious agent within a given
geographic area or population group; may also refer to the usual prevalence of a
given disease within such area or group. See also HOLOENDEMIC DISEASE; HYPERENDEMIC
DISEASE.
END RESULTS See OUTCOMES.
ENVIRONMENT All that which is external to the individual human host. Can be divided
into physical, biological, social, cultural, etc., any or all of which can influence health
status of populations.
EPIDEMIC [from the Greek epi (upon), demos (people)] The occurrence in a community
or region of cases of an illness, specific health-related behavior, or other health-
related events clearly in excess of normal expectancy. The community or region,
and the period in which the cases occur, are specified precisely. The number of
cases indicating the presence of an epidemic varies according to the agent, size, and
type of population exposed, previous experience or lack of exposure to the disease,
and time and place of occurrence; epidemicity is thus relative to usual frequency of
the disease in the same area, among the specified population, at the same season of
the year. A single case of a communicable disease long absent from a population or
first invasion by a disease not previously recognized in that area requires immediate
reporting and full field investigation; two cases of such a disease associated in time
and place may be sufficient evidence to be considered an epidemic.
The word may be used also to describe outbreaks of disease in animal or plant
populations. See also EPIZOOTIC; EPORNITHIC.
EPIDEMIC, COMMON SOURCE (Syn: common vehicle epidemic, holomiantic disease) Outbreak
due to exposure of a group of persons to a noxious influence that is common
to the individuals in the group. When the exposure is brief and essentially simultaneous,
the resultant cases all develop within one incubation period of the disease
(a "point" or "point source" epidemic).
The term "holomiantic disease" was used by Stallybrass (1931) to describe outbreaks
of this type, but as with several other terms created from Greek or Latin
roots, transmission to epidemiologists who lacked a classical education did not take
place.
EPIDEMIC CURVE A graphic plotting of the distribution of cases by time of onset.
EPIDEMIC, MATHEMATICAL MODEL OF See MATHEMATICAL MODEL.
EPIDEMIC, POINT SOURCE See EPIDEMIC, COMMON SOURCE.
EPIDEMIOLOGIST An investigator who studies the occurrence of disease or other health-
related conditions or events in defined populations. The control of disease in populations
is often also considered to be a task for the epidemiologist, especially in
speaking of certain specialized fields such as malaria epidemiology. Epidemiologists
may study disease in populations of animals and plants, as well as among human
populations. See also CLINICAL EPIDEMIOLOGIST.
EPIDEMIOLOGY The study of the distribution and determinants of health-related states
or events in specified populations, and the application of this study to control of
health problems.
There have been many definitions of epidemiology. In the past 50 years or so,
the definition has broadened from concern with communicable disease epidemics
to take in all phenomena related to health in populations.
The Oxford English Dictionary (OED) gives as a definition: "That branch of medical
science which treats of epidemics" and cites Parkin (1873) as a source. However,
there was a "London Epidemiological Society" in the 1850s. The identity of the
scholar who first used the word at that time has been lost. Epidemiologia appears in
the title of a Spanish history of epidemics, Epidemiologia española, Madrid, 1802.
Epidemic is much older. The word appears in Johnson's Dictionary (1775), and
OED gives a citation dated 1603. The word was, of course, used by Hippocrates.
EPIDEMIOLOGY, ANALYTIC See ANALYTIC STUDY.
EPIDEMIOLOGY, DESCRIPTIVE Study of the occurrence of disease or other health-related
characteristics in human populations. General observations concerning the relationship
of disease to basic characteristics such as age, sex, race, occupation, and social
class; also concerned with geographic location. The major characteristics in descriptive
epidemiology can be classified under the headings: persons, place, and time.
See also OBSERVATIONAL STUDY.
EPIDEMIOLOGY, EXPERIMENTAL See EXPERIMENTAL EPIDEMIOLOGY.
EPISODE Period in which a health problem or illness exists, from its onset to its resolu-
tion. See also ENCOUNTER.
EPIZOOTIC An outbreak (epidemic) of disease in an animal population (often with the
implication that it may also affect human populations).
EPORNITHIC An outbreak (epidemic) of disease in a bird population.
ERADICATION (OF DISEASE) Termination of all transmission of infection by extermination
of the infectious agent through surveillance and containment. Eradication, as
in the instance of smallpox, was based on the joint activities of control and surveil-
lance. Regional eradication has been successful with malaria and in some countries
appears close to succeeding for measles. The term "elimination" is sometimes used
to describe eradication of diseases such as measles from a large geographic region
or political jurisdiction.
ERROR
1. A false or mistaken result obtained in a study or experiment. Several kinds of
error can occur in epidemiology, for example, due to bias.
2. Random error is the portion of variation in a measurement that has no ap-
parent connection to any other measurement or variable, generally regarded
as due to chance.
3. Systematic error, which often has a recognizable source, e.g., a faulty measuring
instrument, or pattern, e.g., it is consistently wrong in a particular direction.
See also BIAS.
ERROR, TYPE I (Syn: alpha error) The error of rejecting a true null hypothesis. See also
SIGNIFICANCE LEVEL; STATISTICAL TEST.
ERROR, TYPE II (Syn: beta error) The error of failing to reject a false null hypothesis.
See also POWER; STATISTICAL TEST.
ESTIMATE A measurement or a statement about the value of some quantity is said to be
an estimate if it is known, believed, or suspected to incorporate some degree of
error.
ESTIMATOR In statistics, a function for computing estimates of a parameter from ob-
served data.
ETHICS The branch of philosophy that deals with the distinction between right and
wrong, with the moral consequences of human actions. Ethical principles govern
the conduct of epidemiology, as they do all human activities; the ethical issues that
are specific to epidemiological practice and research include informed consent, con-
fidentiality, and respect for human rights. The issues have been defined, described,
and discussed by many writers and by special committees under the auspices of
research granting agencies and other official bodies in many countries.¹
¹ See, for example, the following:
Curran WJ: Protecting confidentiality in epidemiologic investigations by the Centers for Disease Control. N Engl J Med 314:1027-1028, 1986.
Susser MW, Stein Z, Kline J: Ethics in epidemiology. Ann Am Acad Pol Soc Sci 437:128-141, 1978.
Commonwealth of Australia, National Health and Medical Research Council, Medical Research Ethics Committee: Report on Ethics in Epidemiological Research. Canberra, 1985.
Stolley PD: Faith, evidence and the epidemiologist. J Public Health Pol 6:37-42, 1985.
Gordis L, Gold E, Seltser R: Privacy and protection in epidemiologic and medical research: Challenge and responsibility. Am J Epidemiol 105:163-168, 1977.
National Academy of Sciences, Institute of Medicine: Ethics of Health Care. Washington, DC, 1974.
Tancredi LR (ed): Ethical issues in epidemiologic research (Vol. VII, series in Psychosocial Epidemiology). New Brunswick, NJ: Rutgers University Press, 1986.
ETHNIC GROUP A social group characterized by a distinctive social and cultural tradition,
maintained within the group from generation to generation, a common history and
origin, and a sense of identification with the group. Members of the group have
distinctive features in their way of life, shared experiences, and often a common
genetic heritage. These features may be reflected in their health and disease expe-
rience. See also RACE.
ETIOLOGY Literally, the science of causes, causality; in common usage, cause. See also
CAUSALITY; PATHOGENESIS.
ETIOLOGIC FRACTION (EXPOSED) See ATTRIBUTABLE FRACTION (EXPOSED).
ETIOLOGIC FRACTION (POPULATION) See ATTRIBUTABLE FRACTION (POPULATION).
EVALUATION A process that attempts to determine as systematically and objectively as
possible the relevance, effectiveness, and impact of activities in the light of their
objectives. Several varieties of evaluation can be distinguished, e.g., evaluation of
structure, process, and outcome. See also CLINICAL TRIAL; EFFECTIVENESS; EFFICACY;
EFFICIENCY; HEALTH SERVICES RESEARCH; PROGRAM EVALUATION AND REVIEW TECHNIQUES;
QUALITY OF CARE.
EVANS'S POSTULATES Expanding biomedical knowledge has led to revision of HENLE'S
and KOCH'S POSTULATES. Alfred Evans¹ developed those that follow, based on the
Henle-Koch model.
1. Prevalence of the disease should be significantly higher in those exposed to
the hypothesized cause than in controls not so exposed.
2. Exposure to the hypothesized cause should be more frequent among those
with the disease than in controls without the disease when all other risk
factors are held constant.
3. Incidence of the disease should be significantly higher in those exposed to
the hypothesized cause than in those not so exposed, as shown by prospective
studies.
4. The disease should follow exposure to the hypothesized causative agent with
a distribution of incubation periods on a bell-shaped curve.
5. A spectrum of host responses should follow exposure to the hypothesized
agent along a logical biological gradient from mild to severe.
6. A measurable host response following exposure to the hypothesized cause
should have a high probability of appearing in those lacking this before exposure
(e.g., antibody, cancer cells), or should increase in magnitude if present
before exposure. This response pattern should occur infrequently in persons
not so exposed.
7. Experimental reproduction of the disease should occur more frequently in
animals or man appropriately exposed to the hypothesized cause than in those
not so exposed; this exposure may be deliberate in volunteers, experimentally
induced in the laboratory, or may represent a regulation of natural exposure.
8. Elimination or modification of the hypothesized cause should decrease the
incidence of the disease (i.e., attenuation of a virus, removal of tar from
cigarettes).
9. Prevention or modification of the host's response on exposure to the hypothesized
cause should decrease or eliminate the disease (i.e., immunization, drugs
to lower cholesterol, specific lymphocyte transfer factor in cancer).
10. All of the relationships and findings should make biological and epidemiologic
sense.
¹ Evans AS: Causation and disease: The Henle-Koch postulates revisited. Yale J Biol Med 49:175-195, 1976.
EXACT METHOD A statistical method based on the actual, i.e., "exact" probability distribution
of the study data, rather than on an approximation such as the normal or
chi-square distribution; for example, Fisher's exact test.
EXACT TEST A statistical test based on the actual null probability distribution of the
study data, rather than, say, normal approximation. The most common exact test
is the Fisher-Irwin test for fourfold tables.
EXCESS RATE AMONG EXPOSED See RATE DIFFERENCE.
EXCESS RISK A term sometimes used to refer to the POPULATION EXCESS RATE and sometimes
to RISK DIFFERENCE.
EXPANDED PROGRAMME ON IMMUNIZATION Part of the effort to achieve "Health for All
by the Year 2000," under the auspices of WHO, UNICEF, and other international
and bilateral aid agencies. This is a program of immunizing against diphtheria,
tetanus, measles, pertussis, poliomyelitis, and tuberculosis, conducted especially in
developing countries.
EXPECTATION OF LIFE (Syn: life expectancy or expectation) The average number of
years an individual of a given age is expected to live if current mortality rates continue
to apply. A statistical abstraction based on existing, age-specific death rates.
Life expectancy at birth (e_0): Average number of years a newborn baby can be expected
to live if current mortality trends continue. Corresponds to the total number
of years a given birth cohort can be expected to live, divided by the number of
children in the cohort. Life expectancy at birth is partly dependent on mortality in
the first year of life and is lower in poor than in rich countries because of the higher
infant and child mortality rates in the former.
Life expectancy at a given age, age x (e_x): The average number of additional years a
person age x would live if current mortality trends continue to apply, based on the
age-specific death rates for a given year.
Life expectancy is a hypothetical measure and indicator of current health and
mortality conditions. It is not a rate.
EXPERIMENT A study in which the investigator intentionally alters one or more factors
under controlled conditions in order to study the effects of so doing.
EXPERIMENTAL EPIDEMIOLOGY In modern usage, this term is often equated with RANDOMIZED
CONTROLLED TRIALS. To GREENWOOD and other epidemiologists in the 1920s,
it meant the study of epidemics among colonies of experimental animals such as
rats and mice. The original meaning of the term is preferable; if the word "experiment"
is qualified by the adjective "epidemiologic" it is a synonym for RANDOMIZED
CONTROLLED TRIAL. See also ANIMAL MODEL.
EXPERIMENTAL STUDY A study in which conditions are under the direct control of the
investigator. In epidemiology, a study in which a population is selected for a planned
trial of a regimen whose effects are measured by comparing the outcome of the
regimen in the experimental group with the outcome of another regimen in a control
group. To avoid BIAS members of the experimental and control groups should
be comparable except in the regimen that is offered them. Allocation of individuals
to experimental or control groups is ideally by randomization. In a RANDOMIZED
CONTROLLED TRIAL, individuals are randomly allocated; in some experiments, e.g.,
fluoridation of drinking water, whole communities have been (nonrandomly) allocated
to experimental and control groups.
EXPLANATORY STUDY A study whose main objective is to explain, rather than merely
describe, a situation, by isolating the effects of specific variables and understanding
the mechanisms of action. See also PRAGMATIC STUDY.
EXPLANATORY VARIABLE
1. A variable that causally explains the association or outcome under study.
2. In statistics, a synonym for INDEPENDENT VARIABLE.
EXPOSED In epidemiology, the exposed group (or simply, the exposed) is often used to
connote a group whose members have been exposed to a supposed cause of a disease
or health state of interest, or possess a characteristic that is a determinant of
the health outcome of interest.
EXPOSURE
1. Proximity and/or contact with a source of a disease agent in such a manner
that effective transmission of the agent or harmful effects of the agent may
occur.
2. The amount of a factor to which a group or individual was exposed; sometimes
contrasted with dose, the amount that enters or interacts with the organism.
3. Exposures may of course be beneficial rather than harmful, e.g., exposure to
immunizing agents.
EXPOSURE ODDS RATIO See ODDS RATIO.
EXPOSURE RATIO The ratio of rates at which persons in the case and control groups of
a CASE CONTROL STUDY are exposed to the RISK FACTOR (or to the protective factor)
of interest.
EXPRESSIVITY In genetics, the extent to which a gene is expressed.
EXTRAPOLATE, EXTRAPOLATION To predict the value of a variate outside the range of
observations; the resulting prediction. See also INTERPOLATE.
EXTRINSIC INCUBATION PERIOD Time required for development of a disease agent in a
vector from the time of uptake of the agent to the time when the vector is infective.
See also INCUBATION PERIOD; VECTOR-BORNE INFECTION.
F DISTRIBUTION (Syn: Variance ratio distribution) The distribution of the ratio of two
independent quantities each of which is distributed like a variance in normally dis-
tributed samples. So-named in honor of R.A. Fisher who first described this distri-
bution.
F1 ("F one") Term used in genetics to describe first-generation progeny of a mating.
FACTOR (Syn: determinant)
1. An event, characteristic, or other definable entity that brings about a change
in a health condition or other defined outcome. See also CAUSALITY, CAUSA-
TION OF DISEASE, FACTORS IN.
2. A synonym for (categorical) independent variable, or more precisely, an in-
dependent variable used to identify, with numerical codes, membership of
qualitatively different groups. A causal role may be implied, as in "overcrowd-
ing is a factor in disease transmission" where overcrowding represents the
highest level of the factor "crowding."
FACTOR ANALYSIS A set of statistical methods for analyzing the correlations among several
variables in order to estimate the number of fundamental dimensions that underlie
the observed data and to describe and measure those dimensions. Used frequently
in the development of scoring systems for rating scales and questionnaires.
FACTORIAL DESIGN A method of setting up an experiment or study to assure that all
levels of each intervention or classificatory factor occur with all levels of the others.
FALSE NEGATIVE Negative test result in a subject who possesses the attribute for which
the test is conducted. The labeling of a diseased person as healthy when screening
in the detection of disease. See also SCREENING; SENSITIVITY AND SPECIFICITY.
FALSE POSITIVE Positive test result in a subject who does not possess the attribute for
which the test is conducted. The labeling of a healthy person as diseased when
screening in the detection of disease. See also SCREENING; SENSITIVITY AND SPECIFICITY.
FAMILIAL DISEASE Disease that exhibits a tendency to familial occurrence. Familial occurrence
of disease may be due to genetic transmission, intrafamilial transmission
of infection or culture, interaction within the family, or the family's shared experience,
including its exposure to a common environment.
FAMILY A group of two or more persons united by blood, adoptive or marital ties, or
the common law equivalent; the family may include members who do not share the
household but are united to other members by blood, adoptive or marital, or equivalent
ties. Epidemiologic studies may be concerned with family members or with
those who share the same household or dwelling unit.
FAMILY, EXTENDED A group of persons comprising members of several generations united
by blood, adoptive and marital, or equivalent ties. See also FAMILY, NUCLEAR.
FAMILY CONTACT DISEASE Disease that occurs among members of the family of a worker
who is exposed to a toxic substance and carries this home on his person or his
clothing, causing exposure to other family members.
FAMILY, NUCLEAR A group of persons comprising members of a single or at most two
generations, usually husband-wife-children, united by blood or adoptive and mar-
ital or equivalent ties.
FAMILY OF CLASSIFICATIONS In nosology, a set of related classification systems describing
different aspects of health problems. For example, the International Classification
of Disease, the International Classification of Health Problems in Primary Care,
the International Classification of Impairments, Disabilities and Handicaps, and the
specialty subclassifications for oncology, psychiatry, etc. developed by WHO working
groups constitute a "family of classifications."
FAMILY STUDY An epidemiologic study of a family or a group of families. The term has
been used to describe surveillance of family groups, e.g., for tuberculosis. In genetics,
investigation of families showing an unusual characteristic in order to determine
whether the characteristic clusters in certain families and if so, why.
FARR, WILLIAM (1807-1883) A medical graduate who became the first compiler of abstracts
(statistician) to the Registrar-General in the newly established General Register
Office of England in 1839 and remained there for more than 40 years. In his
Annual Reports, the combination of facts on death rates and vivid language drew
attention to many inequalities of health and sickness experience between "healthy"
and "unhealthy" districts in England. His many contributions to vital statistics and
epidemiology are contained in his monograph Vital Statistics (London, 1885). These
include a statement of the relationship between incidence and prevalence, the concepts
of person-years, retrospective and prospective approaches, observed and expected
numbers of events, the first workable NOSOLOGY, and empirical laws about
the natural history of epidemics.
FATALITY RATE The death rate observed in a designated series of persons affected by a
simultaneous event, e.g., victims of a disaster. A term to be deprecated, because it
can be confused with CASE FATALITY RATE.
FEASIBILITY STUDY Preliminary study to determine practicability of a proposed health
program or procedure, or of a larger study, and to appraise the factors that may
influence its practicability. See also PILOT STUDY.
FECUNDITY The ability to produce live offspring. Fecundity is difficult to measure since
it refers to the theoretical ability of a woman to conceive and carry a fetus to term.
If a woman produces a live birth, it is known that she and her consort were fecund
during some time in the past.
FERTILITY The actual production of live offspring. Stillbirths, fetal deaths, and abortions
are not included in the measurement of fertility in a population. See also
GRAVIDITY; PARITY.
FERTILITY RATE See GENERAL FERTILITY RATE.
FERTILITY RATIO A measure of the fertility of the population that restricts the denominator
to the female population of appropriate age for childbearing. The fertility
ratio is defined as
$$\text{Fertility ratio} = \frac{\text{Number of girls under 15 years of age}}{\text{Number of women in 15-49 age group}} \times 1000$$
(Not to be confused with GENERAL FERTILITY RATE.)
FETAL DEATH (Syn: stillbirth) Death prior to the complete expulsion or extraction from
its mother of a product of conception, irrespective of the duration of pregnancy.
The death is indicated by the fact that after such separation the fetus does not
breathe or show any other evidence of life, such as beating of the heart, pulsation
of the umbilical cord, or definite movement of voluntary muscles. Defined variously
as death after the 20th or 28th week of gestation (the definition of the length of
gestation varies between different jurisdictions, making this event difficult to compare
internationally). See also LIVE BIRTH.
FETAL DEATH CERTIFICATE (Syn: certificate of stillbirth) A vital record registering a fetal
death or stillbirth. Some health jurisdictions require the use of a fetal death certificate
for all products of conception, whereas others require its use only in cases in
which gestation has reached a particular duration, usually the 20th or the 28th
week.
FETAL DEATH RATE (Syn: stillbirth rate) The number of fetal deaths in a year expressed
as a proportion of the total number of births (live births plus fetal deaths) in the
same year.
$$\text{Fetal death rate} = \frac{\text{Number of fetal deaths in a year}}{\text{Number of fetal deaths plus live births in the same year}} \times 1000$$
Note that the denominator is larger than for the FETAL DEATH RATIO and that the
fetal death rate is therefore lower than the fetal death ratio, which is used in some
jurisdictions. International comparisons of stillbirth or fetal death statistics will be
flawed if the distinction is not appreciated.
FETAL DEATH RATIO A measure of fetal wastage, related to the number of live births.
Defined as
$$\text{Fetal death ratio} = \frac{\text{Number of fetal deaths in a year}}{\text{Number of live births in the same year}}$$
(Can be expressed per 1000.)
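The difference between the fetal death rate and the fetal death ratio can be sketched in a few lines of Python. The function names and the counts used here are hypothetical and serve only to show that, for the same events, the rate (fetal deaths plus live births in the denominator) is lower than the ratio (live births alone in the denominator).

```python
def fetal_death_rate(fetal_deaths, live_births, per=1000):
    """Fetal deaths per `per` total births (fetal deaths plus live births)."""
    return fetal_deaths / (fetal_deaths + live_births) * per


def fetal_death_ratio(fetal_deaths, live_births, per=1000):
    """Fetal deaths per `per` live births."""
    return fetal_deaths / live_births * per


# Hypothetical jurisdiction: 50 fetal deaths and 9950 live births in one year.
rate = fetal_death_rate(50, 9950)
ratio = fetal_death_ratio(50, 9950)
print(f"fetal death rate  per 1000: {rate:.3f}")
print(f"fetal death ratio per 1000: {ratio:.3f}")
```

Two jurisdictions reporting the same underlying events would therefore publish different figures if one used the rate and the other the ratio.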
FIELD SURVEY The planned collection of data in "the field," i.e., usually among noninstitutionalized
persons in the general population. A method of establishing a relationship
between two or more variables in a population in numerical terms by eliciting
and collating information from existing sources (not only records but people
who can say how they feel or what happened). See also CROSS-SECTIONAL STUDY.
FINLAY, CARLOS ALBERT (1833-1915) Cuban physician, initial investigator (1888-1891)
of the role of Aedes aegypti (then known as Culex fasciatus) in the transmission of
yellow fever. His experiments were unsatisfactory, but his theory was fully confirmed
by the experiments of the team led by REED in which he took an active part.
FISHER'S EXACT TEST The test for association in a two-by-two table that is based upon
the exact hypergeometric distribution of the frequencies within the table.
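A minimal Python sketch of this calculation follows, assuming fixed row and column totals and a two-sided p-value obtained by summing the probabilities of all tables no more probable than the one observed (one common convention, adopted here as an assumption). The function names and the example table are hypothetical.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the table [[a, b], [c, d]], margins fixed."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value: sum over tables no more probable than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    lowest = max(0, col1 - row2)   # smallest feasible count in the top-left cell
    highest = min(row1, col1)      # largest feasible count in the top-left cell
    total = 0.0
    for x in range(lowest, highest + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed * (1 + 1e-9):  # small tolerance for rounding
            total += p
    return total


# Hypothetical two-by-two table with all margins equal to 4.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p-value: {p_value:.4f}")
```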
FISHING EXPEDITION Exploratory study to find clues and leads for further study. Although
the term is sometimes used pejoratively, "fishing expeditions" may be done
for worthwhile causes, e.g., to seek clues to the cause of a major life-threatening
outbreak. A recent example was the initial investigation of Legionnaires' disease.
FITNESS This word has specific meanings in several fields related to epidemiology.
1. In population genetics, a measure of the relative survival and reproductive
success of a given individual or phenotype, or population subgroup.
2. In health promotion, health risk appraisal, physical fitness is a set of attributes
that people have or achieve, that relate to their ability to perform physical
activity. Intellectual and emotional fitness can also be described and to some
extent measured.
FIXED COHORT A cohort in which membership is fixed by being present at some defining
event ("zero time"); an example is the cohort comprising survivors of the atomic
bomb exploded at Hiroshima. See also CLOSED COHORT.
FOLLOW-UP Observation over a period of time of an individual, group, or initially defined
population whose appropriate characteristics have been assessed in order to
observe changes in health status or health-related variables. See also COHORT.
FOLLOW-UP STUDY
1. A study in which individuals or populations, selected on the basis of whether
they have been exposed to risk, received a specified preventive or therapeutic
procedure, or possess a certain characteristic, are followed to assess the out-
come of exposure, the procedure, or effect of the characteristic, e.g., occur-
rence of disease.
2. Synonym for COHORT STUDY.
FOMITES (singular, fomes) Articles that convey infection to others because they have
been contaminated by pathogenic organisms. Examples include handkerchief,
drinking glass, door handle, clothing, and toys.
FORCE OF MORBIDITY (Syn: hazard rate, instantaneous incidence density, instantaneous
incidence rate, person-time incidence rate) Theoretical measure of the number of
new cases that occur per unit of population-time, e.g., person-years at risk. This is
a measure of the occurrence of disease at a point in time, t, defined mathematically
as the limit, as Δt approaches zero, of
$$\frac{\Pr(\text{a person well at time } t \text{ will develop the disease in the interval } t \text{ to } t + \Delta t)}{\Delta t}$$
The average value of this quantity over the interval t to (t + Δt) can be estimated as
$$\frac{\text{Incident cases observed from } t \text{ to } (t + \Delta t)}{\text{Number of person-time units of experience observed from } t \text{ to } (t + \Delta t)}$$
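The estimate of the average value (incident cases divided by person-time of experience) can be sketched in Python as follows. The record layout, the function name, and the follow-up figures are hypothetical and are given for illustration only.

```python
def incidence_rate(follow_up):
    """Incident cases per unit of person-time.

    follow_up: list of (person_time_at_risk, became_case) tuples, where
    person_time_at_risk is the time each person was observed while well.
    """
    cases = sum(1 for _, became_case in follow_up if became_case)
    person_time = sum(time_at_risk for time_at_risk, _ in follow_up)
    return cases / person_time


# Hypothetical follow-up of five persons, time measured in person-years.
records = [(2.0, True), (5.0, False), (3.0, False), (0.5, True), (4.5, False)]
estimated_rate = incidence_rate(records)
print(f"{estimated_rate:.4f} cases per person-year")
```

Here two cases arise during 15 person-years of observation, so the estimate is 2/15 cases per person-year.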
FOURFOLD TABLE See CONTINGENCY TABLE.
FRACASTORIUS, GIROLAMO (1484-1553) Physician, poet, natural scientist, and a man of
legends, said to have required surgery at birth to open fused lips and to have survived
a lightning bolt that killed his mother while he was in her arms as an infant.
He gave the word "syphilis" to the world in his mock-heroic poem, Syphilis sive
Morbus Gallicus (1530), which explicitly described the transmission of disease by acts
of venery. In De Contagione (1546), he described transmission of infection by direct
contact, by fomites, and "at a distance," by which he meant droplets.
FRAMINGHAM STUDY Probably the best known cohort study of heart disease. Since 1949,
samples of residents of Framingham, Massachusetts, have been subjects of investigations
of risk factors in relation to the occurrence of heart disease and later, other
outcomes.
FRANK, JOHANN PETER (1745-1821) Author of System einer vollständigen medicinischen
Polizey, which established hygiene as a systematic science and contained many suggestions
based on epidemiologic observations. In modern terminology, Frank was
"Director-general of public health" to the Hapsburg empire in eighteenth century
Vienna. His System contained many sensible rules for individual good health, and
detailed specifications for public health practice.
FREQUENCY See OCCURRENCE.
FREQUENCY DISTRIBUTION See DISTRIBUTION.
FREQUENCY MATCHING See MATCHING.
FREQUENCY POLYGON A graphic illustration of a distribution, made by joining a set of
points, for each of which the abscissa is the midpoint of the class and the ordinate,
or height, is the frequency.
FORCE OF MORTALITY (Syn: actuarial death rate) The hazard rate of the occurrence of
death at a point in time t, i.e., the limit as Δt approaches zero, of the probability
that an individual alive at time t will die by time t + Δt, divided by Δt. Distinct from
cumulative death rate.
FORECASTING A method of estimating what may happen in the future that relies on
extrapolation of existing trends (demographic, epidemiologic, etc.). It may be less
useful than SCENARIO BUILDING, which has greater flexibility. For example, extrapolation
of mortality trends for coronary heart disease in the early 1960s in the
United States suggested that the mortality rates would continue to rise, perhaps
indefinitely, whereas in fact the rates began to fall soon after that time.
FORTUITOUS RELATIONSHIP A relationship that occurs by chance and needs no further
explanation.
FORWARD SURVIVAL ESTIMATE A procedure for estimating the age distribution at some
later date by projecting forward an observed age distribution. The procedure uses
survival ratios, often obtained from model life tables.
Frequency polygon (horizontal axis: serum cholesterol, mg/100 ml). From Rimm et al., 1980.
FUNCTION A quality, trait, or fact that is so related to another as to be dependent upon
and to vary with this other.

G
GALTON, FRANCIS (1822-1911) A founder of the modern science of human biology and
the inventor of several statistical methods. Perhaps he is best known as the author
of Hereditary Genius (1889), an analysis of physical and intellectual characteristics of
successive generations of several hundred prominent families. Observing that offspring
of parents of unusual talent, height, etc., tended toward average, he formulated
the "Law of filial regression" (the origin of the term "regression"). His
statistical approaches were refined and extended by his pupil, KARL PEARSON, the
founder of modern BIOMETRY.
GAUSSIAN DISTRIBUTION See NORMAL DISTRIBUTION.
GAME THEORY A branch of mathematical logic concerned with the range of possible
reactions to a particular strategy; each reaction can be assigned a probability and
each reaction can lead to further action by the "adversary" in the game. Used mainly
in systems analysis and such applications as war-gaming, game theory has occasional
applications in disease surveillance and control. It is also one of the underlying
theories used in clinical decision analysis.
GENE A sequence of DNA that codes for a particular protein product or that regulates
other genes. Genes are the biological basis of heredity and occupy precisely defined
locations on chromosomes.
GENE POOL The total of all genes possessed by reproductive members of a population.
GENERAL FERTILITY RATE A more refined measure of fertility than the crude birth rate.
The denominator is restricted to the number of women of childbearing age (i.e.,
15-44 or 15-49). Defined as
General fertility rate = (Number of live births in an area during a year /
Midyear female population age 15-44 in same area in same year) x 1000
The upper age limit for this rate is 44 years in most jurisdictions.
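The definition above can be sketched as a short calculation. The function name and the figures are hypothetical, chosen only to show the arithmetic.

```python
def general_fertility_rate(live_births, midyear_women_15_44):
    """Live births per 1,000 women aged 15-44 in the same area and year."""
    if midyear_women_15_44 <= 0:
        raise ValueError("denominator must be positive")
    return live_births / midyear_women_15_44 * 1000


# Hypothetical area: 6,500 live births, 100,000 women aged 15-44 at midyear.
gfr = general_fertility_rate(6500, 100000)
print(f"General fertility rate: {gfr:.1f} per 1,000 women aged 15-44")
```

Where a jurisdiction uses 15-49 as the childbearing age range, the same function applies with the corresponding denominator.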
GENERATION EFFECT (Syn: cohort effect) Variation in health status that arises from the
different causal factors to which each birth cohort (see COHORT) in the population
is exposed as the environment and society change. Each consecutive birth cohort is
exposed to a unique environment that coincides with its life span.
GENERATION TIME The interval between receipt of infection by and maximal infectivity
of the host. This applies to both clinical cases and inapparent infections.
With person-to-person transmission of infection, the interval between cases is
determined by the generation time. See also INCUBATION PERIOD.
GENETIC DRIFT Random variation in gene frequency from generation to generation;
most often observed in small populations. The process of evolution through random
statistical fluctuation of genetic composition of populations.
GENETIC EPIDEMIOLOGY The science that deals with the etiology, distribution and control
of disease in groups of relatives, and with inherited causes of disease in populations.¹
¹ Morton NE: Outline of genetic epidemiology. New York: Karger, 1982.
GENETIC LINKAGE Particular genes occupy specific sites in chromosomes, one member
of each pair of chromosomes of course coming from each parent. When two genes
are fairly close to each other in the same chromosome pair, they tend to be inherited
together. Such genes are said to be linked, and the phenomenon is called genetic
linkage.
GENETIC PENETRANCE The extent to which a genetically determined condition is expressed
in an individual. This determines the frequency with which genetic effect
is shown in a population.
GENETICS The branch of biology dealing with heredity and variation of individual
members of a species. Its branches include population genetics, which overlaps epidemiology;
therefore we include pertinent genetic terms in this dictionary.
GENOME The array of genes carried by an individual.
GEOGRAPHIC PATHOLOGY (Syn: medical geography) The comparative study of countries,
or of regions within them, with regard to variations in morbidity/mortality.
The (implied) aim of such study is usually to demonstrate that the variations are
caused by or related to differences in the geographic environment.
GEOMETRIC MEAN See MEAN, GEOMETRIC.
GESTATIONAL AGE Strictly speaking, the gestational age of a fetus is the elapsed time
since conception. However, as the moment when conception occurred is rarely known
precisely, the duration of gestation is measured from the first day of the last normal
menstrual period. Gestational age is expressed in completed days or completed weeks
(e.g., events occurring 280-286 days after the onset of the last normal menstrual
period are considered to have occurred at 40 weeks of gestation).
Measurements of fetal growth, as they represent continuous variables, are expressed
in relation to a specific week of gestational age (e.g., the mean birth weight
for 40 weeks is that obtained at 280-286 days of gestation on a
weight-for-gestational-age curve). Some specified variations of gestational age are:
Preterm: Less than 37 completed weeks (less than 259 days). Term: From 37 to less
than 42 completed weeks (259-293 days). Postterm: Forty-two completed weeks or more (294
days or more).
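The day and week boundaries quoted above can be checked with a small sketch. The function names are hypothetical; the thresholds (259 and 294 days) are those given in the entry.

```python
def completed_weeks(days_since_lmp):
    """Completed weeks of gestation, counted from the last normal menstrual period."""
    if days_since_lmp < 0:
        raise ValueError("days_since_lmp must be non-negative")
    return days_since_lmp // 7


def classify_gestational_age(days_since_lmp):
    """Classify as preterm, term, or postterm using the completed-day limits."""
    if days_since_lmp < 259:   # less than 37 completed weeks
        return "preterm"
    if days_since_lmp < 294:   # 37 to less than 42 completed weeks
        return "term"
    return "postterm"          # 42 completed weeks or more


for days in (258, 259, 280, 286, 293, 294):
    print(days, completed_weeks(days), classify_gestational_age(days))
```

Because weeks are counted as completed weeks, days 280 through 286 all fall at 40 weeks.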
"GOLD STANDARD" A jargon term, used to describe a method, procedure, or measurement
that is widely accepted as being the best available. Often used to compare with
new methods.
GOLDBERGER, JOSEPH (1874-1927) A U.S. Public Health Service physician. Responsible
for a brilliant series of investigations of pellagra. After logical deductions led him
to reject the prevailing view that pellagra had an infectious origin, he conducted
studies in several rural communities and in institutions, leading conclusively to the
demonstration that pellagra was a dietary deficiency disease.
GOMPERTZ'S LAW The proportionate relationship of mortality to age. Mortality is high
during the first year of life (infancy), drops to its lowest level in childhood, and
gradually climbs during the third and fourth decade. After age 35 or 40, the increase
in mortality with age tends to be logarithmic for the remainder of the life
span, i.e., the relative increase in mortality in each successive age class (of equal
size) is about constant. This law was first enunciated by the demographer Benjamin
Gompertz, on the basis of survival curves in English villages in the 1840s.
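The statement that the relative increase in mortality is about constant across equal age classes can be illustrated with a sketch that assumes an exponential form for the mortality rate after age 40. The exponential form, the function name, and both parameter values are assumptions made for illustration only; they are not given in the dictionary.

```python
import math


def gompertz_mortality(age, rate_at_40=0.002, b=0.085):
    """Hypothetical mortality rate that grows exponentially with age after 40."""
    return rate_at_40 * math.exp(b * (age - 40))


# Ratio of mortality between successive 10-year age classes.
ages = range(40, 90, 10)
ratios = [gompertz_mortality(a + 10) / gompertz_mortality(a) for a in ages]
for age, ratio in zip(ages, ratios):
    print(f"{age} to {age + 10}: mortality multiplies by {ratio:.3f}")
```

Under this assumed form every ratio equals exp(10 x b), so the logarithm of mortality rises linearly with age.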
GONADOTROPHIC CYCLE One complete round of ovarian development in the mosquito
(or other insect vector) from the time when the blood meal is taken to the time
when the fully developed eggs are laid.
GOODNESS OF FIT Degree of agreement between an empirically observed distribution
and a mathematical or theoretical distribution.
GOODNESS OF FIT TEST A statistical test of the hypothesis that data have been randomly
sampled or generated from a population that follows a particular theoretical distri-
bution or model. The most common such tests are chi-square tests.
GRADIENT OF INFECTION The variety of host responses to infection ranging from inapparent
infection to fatal illness.
GRAPH Visual display of the relationship between variables; the values of one set of
variables are plotted along the horizontal or x axis, of a second variable, along the
vertical or y axis. Three-dimensional graphs of relationships between three variables
can be represented and comprehended visually in two dimensions. The relationship
between x and y may be linear, exponential, logarithmic, etc. See also AXIS, ABSCISSA,
ORDINATE. "Graph" is also a descriptive term for histograms, bar charts, etc.
Graph showing abscissa (x axis), ordinate (y axis), and locus of a point, P,
in relation to the x and y axes.
GRAUNT, JOHN (1620-1674) By profession a haberdasher, he was a member of the
small community of scholars and natural scientists in London who were Fellows of
the Royal Society in its early years and who made important contributions to the
natural sciences. Graunt studied the BILLS OF MORTALITY and used them to conduct
the first analytic studies of vital statistics, identifying differences in mortality rates
between the sexes, between city and country folk, and recording all in Natural and
Political Observations Mentioned in a Following Index and Made upon the Bills of Mortality
(London, 1662).
GRAVIDITY The number of pregnancies (completed or incomplete) experienced by a
woman.
GREENWOOD, MAJOR (1888-1949) Medical epidemiologist, trained in statistics by Karl
Pearson; Greenwood was the first professor of epidemiology at the London School
of Hygiene and Tropical Medicine. He inspired a whole generation of British epi-
demiologists, introducing to the subject a level of mathematical reasoning and sta-
tistical rigor it had not previously known. Author of many papers and several monographs,
best known of which is Epidemics and Crowd Diseases (London, 1933).
GROSS REPRODUCTION RATE The average number of female children a woman would
have if she survived to the end of her childbearing years and if, throughout that
period, she were subject to a given set of age-specific fertility rates and a given sex
ratio at birth. This rate provides a measure of replacement fertility in the absence
of mortality. See also NET REPRODUCTION RATE.
GROWTH RATE OF POPULATION A measure of population growth (in the absence of migration)
comprising addition of newborns to the population and subtraction of deaths.
The result, known as natural rate of increase, is calculated as
((Live births during the year - deaths during the year) / Midyear population) x 100
Alternatively, it is the difference between crude birth rate and crude death rate.
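A brief sketch of the calculation above, with hypothetical figures. It also shows the equivalence with the difference between crude birth and death rates, assuming (as is conventional, though not stated in the entry) that the crude rates are expressed per 1,000 population.

```python
def natural_rate_of_increase(live_births, deaths, midyear_population):
    """Natural rate of increase, expressed per 100 population (percent)."""
    if midyear_population <= 0:
        raise ValueError("midyear_population must be positive")
    return (live_births - deaths) / midyear_population * 100


# Hypothetical population of 1,000,000 with 30,000 births and 10,000 deaths.
births, deaths, population = 30000, 10000, 1000000
increase = natural_rate_of_increase(births, deaths, population)

# Same result from crude rates per 1,000 (assumed convention), rescaled to per 100.
crude_birth_rate = births / population * 1000
crude_death_rate = deaths / population * 1000
from_crude_rates = (crude_birth_rate - crude_death_rate) / 10

print(f"{increase:.1f}% per year; from crude rates: {from_crude_rates:.1f}%")
```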

H
HACKETT SPLEEN CLASSIFICATION A numerical means of recording the size of an enlarged
spleen, especially in malaria. This is a 6-point scale of 0 (no enlargement) to
5 (enlarged to umbilicus or larger). See Terminology of Malaria and of Malaria Eradication.
Geneva: WHO, 1963, pp. 40-41.
HALO EFFECT
1. The effect (usually beneficial) that the manner, attention, and caring of a
provider have on a patient during a medical encounter regardless of what
medical procedures or services the encounter involves. See also PLACEBO, PLACEBO
EFFECT.
2. The influence upon an observation of the observer's perception of the characteristics
of the individual observed (other than the characteristic under study)
or the influence of the observer's recollection or knowledge of findings on a
previous occasion.
HANDICAP Reduction in a person's capacity to fulfill a social role as a consequence of an
IMPAIRMENT, inadequate training for the role, or other circumstances. Applied to
children, the term usually refers to the presence of an impairment or other circumstance
that is likely to interfere with normal growth and development or with the
capacity to learn. See also INTERNATIONAL CLASSIFICATION OF IMPAIRMENTS, DISABILITIES,
AND HANDICAPS for the official WHO definition.
HAPHAZARD SAMPLE Selection of a group of persons for study without thought as to
whether they are representative of the population. The word "haphazard" here
implies selection based on a mixture of criteria such as convenience, accessibility,
turning up at the time an investigation or study is in progress, and belonging to
some existing list or registry, etc. Because they have an unknown chance of being
unrepresentative of the population, haphazard samples are unsatisfactory for generalization.
HARDY-WEINBERG LAW The principle that both gene and genotype frequencies will
remain in equilibrium in an infinitely large population in the absence of mutation,
migration, selection, and nonrandom mating. If p is the frequency of one allele and
q is the frequency of another and p + q = 1, then p² is the frequency of homozygotes
for the allele, q² is the frequency of homozygotes for the other allele, and 2pq is the
frequency of heterozygotes.
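The genotype frequencies in the entry above can be computed directly from one allele frequency. This is a minimal sketch for a two-allele locus; the function name and the genotype labels "AA", "Aa", and "aa" are hypothetical.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for allele frequencies p and q = 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p
    return {
        "AA": p * p,      # homozygotes for the first allele
        "Aa": 2 * p * q,  # heterozygotes
        "aa": q * q,      # homozygotes for the other allele
    }


frequencies = hardy_weinberg(0.7)
for genotype, frequency in frequencies.items():
    print(f"{genotype}: {frequency:.2f}")
```

The three frequencies always sum to 1, since p² + 2pq + q² = (p + q)².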
HARMONIC MEAN See MEAN, HARMONIC.
HAWTHORNE EFFECT The effect (usually positive or beneficial) of being under study
upon the persons being studied; their knowledge of the study often influences their
behavior. The name derives from work studies by Whitehead, Dickson, Roethlisberger,
and others, in the Western Electric Plant, Hawthorne, Illinois, reported by
Elton Mayo in The Social Problems of an Industrial Civilization (London: Routledge,
1949).
HAZARD A factor or exposure that may adversely affect health.
HAZARD RATE (Syn: force of morbidity, instantaneous incidence rate) A theoretical
measure of the risk of occurrence of an event, e.g., death, new disease, at a point
in time, t, defined mathematically as the limit, as Δt approaches zero, of the probability
that an individual well at time t will experience the event by t + Δt, divided by
Δt.
HEALTH The World Health Organization (WHO) described health in the preamble to its
constitution as, "A state of complete physical, mental, and social well-being and not
merely the absence of disease or infirmity." The WHO description of health has
been criticized because of the difficulty of defining and measuring "complete"
well-being.
There are several other definitions, including the following:
A state of dynamic balance in which an individual's or a group's capacity to cope
with all the circumstances of living is at an optimum level.
A state characterized by anatomical, physiological and psychological integrity, ability
to perform personally valued family, work and community roles; ability to deal with
physical, biological, psychological and social stress; a feeling of well-being; and freedom
from the risk of disease and untimely death.
Rene Dubos offered the following definition: "A modus vivendi enabling imper-
fect men to achieve a rewarding and not too painful existence while they cope with
an imperfect world."
The word "health" is derived from the Old English Hal, meaning hale, whole,
sound in wind and limb.
HEALTH BEHAVIOR The combination of knowledge, practices, and attitudes that together
contribute to motivate the actions we take regarding health. Health behavior
may promote and preserve good health, or if the behavior is harmful, e.g., tobacco
smoking, may be a determinant of disease. This combination of knowledge, practices,
and attitudes has been described and discussed by several writers, notably
Becker.¹ See also ILLNESS BEHAVIOR.
¹ Becker MH (ed): The Health Belief Model and Personal Health Behavior. Thorofare NJ: Slack, 1974.
HEALTH CARE Those services provided to individuals or communities by agents of the
health services or professions, for the purpose of promoting, maintaining, monitoring,
or restoring health. Health care is broader than, and not limited to, medical
care, which implies therapeutic action by or under the supervision of a physician.
The term is sometimes extended to include self-care.
HEALTH EDUCATION The process by which individuals and groups of people learn to
behave in a manner conducive to the promotion, maintenance, or restoration of
health.
HEALTH INDEX A numerical indication of the health of a given population derived from
a specified composite formula. The components of the formula may be INFANT
MORTALITY RATES, INCIDENCE RATES for particular diseases, or other HEALTH
INDICATORS.
HEALTH INDICATOR A variable, susceptible to direct measurement, that reflects the state
of health of persons in a community. Examples include infant mortality rates, incidence
rates based on notified cases of disease, disability days, etc. These measures
may be used as components in the calculation of a HEALTH INDEX.
HEALTH PROMOTION The process of enabling people to increase control over and improve
their health. It involves the population as a whole in the context of their
everyday lives, rather than focusing on people at risk for specific diseases, and is
directed toward action on the determinants or causes of health.
HEALTH RISK APPRAISAL (HRA) (Syn: health hazard appraisal [HHA]) A generic term
applied to methods for describing an individual's chances of becoming ill or dying
from selected causes. The many versions now available share several common features:
Starting from the average risk of death for the individual's age and sex, a
consideration of various lifestyle and physical factors indicates whether the individ-
ual is at greater or less than average risk of death from the commonest causes of
death for his age and sex. All methods also indicate what reduction in risk could
be achieved by altering any of the causal factors (such as cigarette smoking) that
the individual could modify.
The premise underlying such methods is that information on the extent to which
an individual's characteristics, habits, and health practices are influencing his future
risk of dying will assist health care workers in counseling their patients.
HEALTH SERVICES Services that are performed by health care professionals, or by others
under their direction, for the purpose of promoting, maintaining, or restoring health.
In addition to personal health care, health services include measures for health
protection and health education.
HEALTH SERVICES RESEARCH The integration of epidemiologic, sociological, economic,
and other analytic sciences in the study of health services. Health services research
is usually concerned with relationships between NEED, DEMAND, supply, use, and
OUTCOME of health services. The aim of health services research is evaluation; several
components of evaluative health services research are distinguished, viz.:
Evaluation of structure, concerned with resources, facilities, and manpower.
Evaluation of process, concerned with matters such as where, by whom, and how
health care is provided.
Evaluation of output, concerned with the amount and nature of health services
provided.
Evaluation of outcome, concerned with the results, i.e., whether persons using health
services experience measurable benefits such as improved survival or reduced
disability.
HEALTH STATISTICS Aggregated data describing and enumerating attributes, events, behaviors,
services, resources, outcomes, or costs related to health, disease, and health
services. The data may be derived from survey instruments, medical records, and
administrative documents. VITAL STATISTICS are a subset of health statistics.
HEALTH STATUS INDEX A set of measurements designed to detect short-term fluctuations
in the health of members of a population; these measurements generally include
physical function, emotional well-being, activities of daily living, feelings, etc.
Most indexes require the use of carefully composed questions designed with reference
to matters of fact rather than shades of opinion. The results are usually expressed
by a numerical score that gives a profile of the well-being of the individual.
HEALTH SURVEY A survey designed to provide information on the health status of a
population. It may be descriptive, exploratory, or explanatory. See also MORBIDITY
SURVEY.
HEALTHY WORKER EFFECT A phenomenon observed initially in studies of occupational
diseases: Workers usually exhibit lower overall death rates than the general population,
due to the fact that the severely ill and disabled are ordinarily excluded from
employment. Death rates in the general population may be inappropriate for comparison
if this effect is not taken into account.
HEBDOMADAL MORTALITY RATE The mortality rate in the first week of life; the denominator
is the number of live births in a year.
HENLE-KOCH POSTULATES See KOCH'S POSTULATES.
HERD IMMUNITY The immunity of a group or community. The resistance of a group to
invasion and spread of an infectious agent, based on the resistance to infection of
a high proportion of individual members of the group. The resistance is a product
of the number susceptible and the probability that those who are susceptible will
come into contact with an infected person. In the herd immunity equation, "probability
of contact" is the intervening factor that reduces susceptibility to infection
among group members to less than that anticipated from their susceptibility as unrelated
individuals.
HETEROSCEDASTICITY Nonconstancy of the variance of a measure over the levels of the
factors under study.
HIBERNATION See VECTOR-BORNE INFECTION.
HIPPOCRATES OF COS (c 460-370 BC) Greek physician, "Father of Medicine," responsible
for careful clinical observation of many important and common diseases:
tetanus, mumps, puerperal septicemia, etc. His writings contain important epidemiologic
observations, as in the books Airs, Waters, Places, and Epidemics. His Aphorisms
also demonstrate considerable empirical epidemiologic knowledge.
HISTOGRAM A graphic representation of the frequency distribution of a variable. Rectangles
are drawn in such a way that their bases lie on a linear scale representing
different intervals, and their heights are proportional to the frequencies of the
values within each of the intervals. See also BAR DIAGRAM.
Histogram (horizontal axis: serum cholesterol, mg/100 ml; men 45-54 years of age).
From National Center for Health Statistics, 1978.
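The class frequencies that set the rectangle heights of a histogram can be tallied as in the sketch below. The function name, the class limits, and the serum cholesterol values are hypothetical and are not the data from the figure.

```python
def histogram_counts(values, start, width, n_classes):
    """Count how many values fall in each of n_classes equal-width intervals."""
    counts = [0] * n_classes
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_classes:  # values outside the scale are ignored
            counts[index] += 1
    return counts


# Hypothetical serum cholesterol readings (mg/100 ml).
readings = [150, 182, 199, 205, 221, 240, 248, 263, 301]
counts = histogram_counts(readings, start=100, width=50, n_classes=5)

for i, count in enumerate(counts):
    lower = 100 + i * 50
    print(f"{lower}-{lower + 49}: {'#' * count}")
```

Joining the midpoints of the same classes at heights equal to these counts would give the corresponding frequency polygon.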
HISTORICAL COHORT STUDY (Syn: historical prospective study, nonconcurrent prospective
study, prospective study in retrospect) A COHORT STUDY conducted by reconstructing
data about persons at a time or times in the past. This method uses existing
records about the health or other relevant aspects of a population as it was at
some time in the past and determines the current (or subsequent) status of members
of this population with respect to the condition of interest. Different levels of
past exposure to risk factor(s) of interest must be identifiable for subsets of the
population. See also COHORT STUDY.
HISTORICAL CONTROL Control subject(s) for whom data were collected at a time preced-
ing that at which the data are gathered on the group being studied. Because of
differences in exposures etc., use of historical controls can lead to bias in analysis.
HOGBEN NUMBER A unique personal identifying number constructed by using a sequence
of digits for birthdate, sex, birthplace, and other identifiers. Suggested by
the English mathematician Lancelot Hogben. Used in primary care epidemiology
in some countries and usable in RECORD LINKAGE. See also IDENTIFICATION NUMBER;
SOUNDEX CODE.
HOLMES, OLIVER WENDELL (1809-1894) Physician, poet, philosopher, autocrat ("of the
Breakfast Table"), and crusader against puerperal fever. He argued that this was
conveyed to patients by the contaminated hands and clothes of attending physicians
and recommended washing the hands and changing clothes as a way to prevent it.
Unlike SEMMELWEIS, he succeeded in convincing the medical profession. His correct
belief was recorded in a paper, "The Contagiousness of Puerperal Fever."¹
¹ N Engl Q J Med Surg 1:503-530, 1842-43.
HOLOENDEMIC DISEASE A disease for which a high prevalent level of infection begins
early in life and affects most of the child population, leading to a state of equilibrium
such that the adult population shows evidence of the disease much less commonly
than do the children. Malaria in many communities is a holoendemic disease.
HOLOMIANTIC INFECTION See COMMON SOURCE EPIDEMIC.
HOMOSCEDASTICITY Constancy of the variance of a measure over the levels of the factors
under study.
HOSPITAL-ACQUIRED INFECTION See NOSOCOMIAL INFECTION.
HOSPITAL DISCHARGE ABSTRACT SYSTEM Abstraction of MINIMUM DATA SET from hospital
charts for the purpose of producing summary statistics about hospitalized patients.
Examples include the Hospital Inpatient Enquiry (HIPE) and Professional Activity
Study (PAS). The statistical tabulations commonly include length of stay by final
diagnosis, surgical operations, specified hospital service (i.e., medical, surgical,
gynecological, etc.) and also give outcomes such as "death" and "discharged alive
shelter, cooking, washing, and sleeping facilities; may or may not be a family. The
term is also used to describe the dwelling unit in which the persons live.
HOUSEHOLD SAMPLE SURVEY A survey of persons in a sample of households. This, in
many variations, is a favored method of gathering data for health-related and for
many other purposes. The households may be sampled in any of several ways, e.g.,
by cluster, use of random numbers in relation to numbered dwelling units. The
survey may be conducted by interview, telephone survey, or self-completed responses
to preset questions. The method is used in developing nations as well as
in the industrial world.
HUMAN BLOOD INDEX Proportion of insect vectors found to contain human blood.
HUMAN ECOLOGY See ECOLOGY.
HUMAN IMMUNODEFICIENCY VIRUS (HIV) The pathogenic organism responsible for the
acquired immunodeficiency syndrome (AIDS); formerly or also known as the
lymphadenopathy virus (LAV), the name given by the original French discoverers
Montagnier et al.¹ in 1983, or the human T-cell lymphotropic virus, type III (HTLV-III),
the name given by Gallo et al.² to the virus they reported in 1984.
¹ Barré-Sinoussi F, Chermann JC, Rey F, et al.: Isolation of a T-lymphotropic retrovirus from a
patient at risk for acquired immune deficiency syndrome (AIDS). Science 220:868-871, 1983.
² Gallo RC, Salahuddin SZ, Popovic M, et al.: Frequent detection and isolation of cytopathic
retroviruses (HTLV-III) from patients with AIDS and at risk for AIDS. Science 224:500-503, 1984.
HYPERENDEMIC DISEASE A disease that is constantly present at a high incidence and/or
prevalence rate and affects all age groups equally.
HYPERGEOMETRIC DISTRIBUTION The exact probability distribution of the frequencies in
a two-by-two contingency table, conditional on the marginal frequencies being fixed
at their observed levels.
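As an illustration of the definition above, the sketch below computes the hypergeometric probability of one two-by-two table with cells a, b (first row) and c, d (second row), holding the row and column totals fixed. The function name and the example counts are hypothetical.

```python
from math import comb


def table_probability(a, b, c, d):
    """Probability of a 2x2 table [[a, b], [c, d]] given fixed marginal totals."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


# Hypothetical table: rows of 4 and 4, first-column total of 4.
p_observed = table_probability(3, 1, 1, 3)
print(f"P(table) = {p_observed:.4f}")

# Probabilities over every table with the same margins sum to 1.
total = sum(table_probability(a, 4 - a, 4 - a, a) for a in range(5))
print(f"Sum over all tables with these margins = {total:.1f}")
```

Summing such probabilities over tables at least as extreme as the one observed is the basis of exact tests for two-by-two tables.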
HYPOTHESIS
1. A supposition, arrived at from observation or reflection, that leads to refutable
predictions.
2. Any conjecture cast in a form that will allow it to be tested and refuted.
See also NULL HYPOTHESIS.
from hospital." This system cannot generally be used for epidemiologic purposes
as it is not possible to infer representativeness or to generalize: this is because the
data usually lack a defined denominator and the same person may be counted more
than once in the event of two or more HOSPITAL SEPARATIONS in the period of study.
HOSPITAL INPATIENT ENQUIRY (HIPE) Statistical tables of a 10% sample of hospital patients
in England and Wales, showing class of hospital, diagnosis, length of stay,
outcomes, etc.
HOSPITAL SEPARATION A term used in commentaries on hospital statistics to describe
the departure of a patient from hospital without distinguishing whether the patient
departed alive or dead (the distinction is unimportant so far as the statistics of
hospital activity such as bed occupancy are concerned).
HOST
1. A person or other living animal, including birds and arthropods, that affords
subsistence or lodgment to an infectious agent under natural conditions. Some
protozoa and helminths pass successive stages in alternate hosts of different
species. Hosts in which the parasite attains maturity or passes its sexual stage
are primary or definitive hosts; those in which the parasite is in a larval or
asexual state are secondary or intermediate hosts. A transport host is a carrier
in which the organism remains alive but does not undergo development.¹
2. In an epidemiologic context, the host may be the population or group; biolog-
ical, social, and behavioral characteristics of this group that are relevant to
health are called "host factors."
¹ Benenson, op cit.
HOUSEHOLD One or more persons who occupy a dwelling, i.e., a place that provides
IATROGENIC DISEASE Illness resulting from a physician's professional activity, or from
the professional activity of other health professionals.
ICD See INTERNATIONAL CLASSIFICATION OF DISEASE.
ICEBERG PHENOMENON That portion of disease that remains unrecorded or undetected
despite physicians' diagnostic endeavors and community disease surveillance procedures
is referred to as the "submerged portion of the iceberg." Detected or diagnosed
disease is the "tip of the iceberg." The submerged portion comprises disease
not medically attended, medically attended but not accurately diagnosed, and
diagnosed but not reported.¹
¹ Last JM: The Iceberg. Lancet 2:28-31, 1963.
ICHPPC See INTERNATIONAL CLASSIFICATION OF HEALTH PROBLEMS IN PRIMARY CARE.
IDENTIFICATION NUMBER, IDENTIFYING NUMBER Unique number given to every individual
at birth or at some other milestone. Sweden has a system based on a sequence
of digits for birthdate, sex, birthplace, and additional digits for each individual.
Other systems, e.g., National Insurance number in the United Kingdom, Social
Security number in the United States, and Social Insurance number in Canada, are
sometimes used but are neither universal nor unique, being sometimes applied to
whole families or at least to more than one individual. See also HOGBEN NUMBER;
SOUNDEX CODE.
IDIOSYNCRASY Webster's Dictionary defines this as a distinctive characteristic or peculi-
arity of an individual. In pharmacoepidemiology, it means an abnormal reaction,
sometimes genetically determined, following the administration of a medication.
ILLNESS See DISEASE.
ILLNESS BEHAVIOR Conduct of persons in response to abnormal body signals. Such behavior
influences the manner in which a person monitors his body, defines and
interprets his symptoms, takes remedial actions, and uses the health care system.
See also HEALTH BEHAVIOR.
IMMUNITY, ACQUIRED Resistance acquired by a host as a result of previous exposure to
a natural PATHOGEN or foreign substance for the host, e.g., immunity to measles
resulting from a prior infection with measles virus.
IMMUNITY, ACTIVE Resistance developed in response to stimulus by an antigen (infecting
agent or vaccine) and usually characterized by the presence of antibody produced
by the host.
IMMUNITY, NATURAL Species-determined inherent resistance to a disease agent, e.g.,
resistance of man to virus of canine distemper.
IMMUNITY, PASSIVE Immunity conferred by an antibody produced in another host and
acquired naturally by an infant from its mother or artificially by administration of
an antibody-containing preparation (antiserum or immune globulin).
IMMUNITY, SPECIFIC A state of altered responsiveness to a specific substance acquired
through immunization or natural infection. For certain diseases (e.g., measles,
chickenpox) this protection generally lasts for the life of the individual.
IMMUNIZATION (Syn: vaccination) Protection of susceptible individuals from communi-
cable disease by administration of a living modified agent (as in yellow fever), a
suspension of killed organisms (as in whooping cough), or an inactivated toxin (as
in tetanus). Temporary passive immunization can be produced by administration
of antibody in the form of immune globulin in some conditions.
IMPAIRMENT A physical or mental defect at the level of a body system or organ. See
also INTERNATIONAL CLASSIFICATION OF IMPAIRMENTS, DISABILITIES, AND HANDICAPS
for the official WHO definition.
INAPPARENT INFECTION (Syn: subclinical infection) The presence of infection in a host
without occurrence of recognizable clinical signs or symptoms. Of epidemiologic
significance because hosts so infected, though apparently well, may serve as silent
or inapparent disseminators of the infectious agent. See also DISEASE, PRECLINICAL;
DISEASE, SUBCLINICAL; VECTOR-BORNE INFECTION.
INCEPTION RATE The rate at which new spells of illness occur in a population; a term
applied principally to short-term spells of illness such as acute respiratory infections,
and preferred by some epidemiologists because an annual incidence rate for
such conditions may exceed the numbers in the population at risk.
INCIDENCE (Syn: incident number) The number of instances of illness commencing, or
of persons falling ill, during a given period in a specified population.¹ More generally,
the number of new events, e.g., new cases of a disease in a defined population,
within a specified period of time. The term incidence is sometimes used to
denote INCIDENCE RATE.
¹ Prevalence and Incidence. WHO Bull 35:783-784, 1966.
INCIDENCE DENSITY The person-time incidence rate; sometimes used to describe the
hazard rate. See FORCE OF MORBIDITY.
INCIDENCE-DENSITY RATIO (IDR) The ratio of two incidence densities. See also RATE RATIO.
INCIDENCE RATE The rate at which new events occur in a population. The numerator
is the number of new events that occur in a defined period; the denominator is the
population at risk of experiencing the event during this period, sometimes ex-
pressed as person-time. The incidence rate most often used in public health prac-
tice is calculated by the formula
$$\frac{\text{Number of new events in specified period}}{\text{Number of persons exposed to risk during this period}} \times 10^{n}$$
In a dynamic population, the denominator is the average size of the population,
often the estimated population at the mid-period. If the period is a year, this is the
annual incidence rate. This rate is an estimate of the person-time incidence rate,
i.e., the rate per $10^{n}$ person-years. If the rate is low, as with many chronic diseases,
it is also a good estimate of the cumulative incidence rate. In follow-up studies with
no censoring, the incidence rate is calculated by dividing the number of new cases
in a specified period by the initial size of the cohort of persons being followed; this
is equivalent to the cumulative incidence rate during the period. If the number of
new cases during a specified period is divided by the sum of the person-time units
at risk for all persons during the period, the result is the person-time incidence
rate.
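The two calculations described in this entry can be sketched in Python. This is only an illustration under stated assumptions: the function names, the multiplier of 1,000, and the cohort figures are hypothetical and do not come from the dictionary.

```python
def cumulative_incidence_rate(new_cases, initial_cohort_size, per=1000):
    """New cases divided by the initial cohort size (follow-up with no censoring)."""
    return new_cases * per / initial_cohort_size


def person_time_incidence_rate(new_cases, person_time_at_risk, per=1000):
    """New cases divided by the summed person-time at risk of all persons."""
    return new_cases * per / sum(person_time_at_risk)


# Hypothetical cohort: 1,000 persons followed for one year, 20 new cases.
ci = cumulative_incidence_rate(20, 1000)

# Hypothetical follow-up: four persons contribute 2.0, 1.5, 0.5 and 4.0
# person-years (8.0 in total) and 2 new cases occur among them.
pt = person_time_incidence_rate(2, [2.0, 1.5, 0.5, 4.0])

print(ci, pt)  # rates per 1,000 persons and per 1,000 person-years
```

The multiplier `per` plays the role of the power of ten in the formula; any convenient power of ten could be substituted.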
INCIDENCE STUDY See COHORT STUDY.
INCIDENT NUMBER See INCIDENCE.
INCUBATION PERIOD
1. The time interval between invasion by an infectious agent and appearance of
the first sign or symptom of the disease in question.
2. In a VECTOR, the period between entry of the infectious agent into the vector
and the time at which the vector becomes infective; i.e., transmission of the
infectious agent from the vector to a fresh final host is possible (extrinsic in-
cubation period).
INDEPENDENCE Two events are said to be independent if the occurrence of one is in no
way predictable from the occurrence of the other. Two variables are said to be
independent if the distribution of values of one is the same for all values of the
other. Independence is the antonym of ASSOCIATION.
INDEPENDENT VARIABLE
1. The characteristic being observed or measured that is hypothesized to influence
an event or manifestation (the dependent variable) within the defined
area of relationships under study; that is, the independent variable is not in-
fluenced by the event or manifestation but may cause it or contribute to its
variation.
2. In statistics, an independent variable is one of (perhaps) several variables that
appear as arguments in a regression equation.
INDEX In epidemiology and related sciences, this word usually means a rating scale, e.g.,
a set of numbers derived from a series of observations of specified variables. Examples
include the many varieties of health status index, scoring systems for severity
or stage of cancer, heart murmurs, mental retardation, etc.
INDEX CASE The first case in a family or other defined group to come to the attention
of the investigator. See also PROPOSITUS.
INDEX GROUP (Syn: index series)
1. In an experiment, the group receiving the experimental regimen.
2. In a case control study, the cases.
3. In a cohort study, the exposed group.
INDICATOR VARIABLE In statistics, a variable taking only one of two possible values, one
(usually 1) indicating the presence of a condition, and the other (usually zero) indicating
absence of the condition. Used mainly in REGRESSION ANALYSIS.
INDIRECT ADJUSTMENT See STANDARDIZATION.
INDIVIDUAL VARIATION Two types are distinguished:
1. Intraindividual variation: The variation of biological variables within the same
individual, depending upon circumstances such as the phase of certain body
rhythms and the presence or absence of emotional stress. These variables do
not have a precise value, but rather a range. Examples include diurnal variation
in body temperature, fluctuation of blood pressure, blood sugar, etc.
2. Interindividual variation: As used by Darwin, the term means variation between
individuals. This is the preferred usage; the first usage is better described as
personal variation.
INDUCTION PERIOD The period required for a specific cause to produce disease. More
precisely, the interval from the causal action of a factor to the initiation of the
disease. For example, a span of many years may pass between (presumably) radiation-induced
mutations and the appearance of leukemia; this span would be the induction
period for radiogenic leukemia. See also INCUBATION PERIOD; LATENT PERIOD.
INDUSTRIAL HYGIENE The science and art devoted to recognition, evaluation, and control
of those environmental factors or stresses arising from or in the workplace,
which may cause sickness, impaired health, and well-being, or significant discomfort
and inefficiency among workers or among persons in the community. Alternatively,
the profession that anticipates and controls unhealthy conditions of work to prevent
illness among employees.
INFANT MORTALITY RATE (IMR) A measure of the yearly rate of deaths in children less
than one year old. The denominator is the number of live births in the same year.
Defined as
$$\text{Infant mortality rate} = \frac{\text{Number of deaths in a year of children less than 1 year of age}}{\text{Number of live births in the same year}} \times 1000$$
This is often quoted as a useful indicator of the level of health in a community.
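As a small worked illustration of the definition, the calculation might be sketched as follows. The function name and the counts of deaths and births are hypothetical.

```python
def infant_mortality_rate(infant_deaths, live_births):
    """Deaths under 1 year of age per 1,000 live births in the same year."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return infant_deaths * 1000 / live_births


# Hypothetical community: 45 infant deaths and 5,000 live births in one year.
imr = infant_mortality_rate(45, 5000)
print(imr)  # deaths per 1,000 live births
```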
INFECTIBILITY The host characteristic or state in which the host is capable of being
infected. See also INFECTIOUSNESS; INFECTIVITY.
INFECTION (Syn: colonization) The entry and development or multiplication of an infectious
agent in the body of man or animals. Infection is not synonymous with
infectious disease; the result may be inapparent or manifest. The presence of living
infectious agents on exterior surfaces of the body is called "infestation" (e.g., pedi-
culosis, scabies). The presence of living infectious agents upon articles of apparel
or soiled articles is not infection, but represents CONTAMINATION of such articles.
See also INAPPARENT INFECTION; TRANSMISSION OF INFECTION.
INFECTION, GRADIENT OF The range of manifestations of illness in the host reflecting
the response to an infectious agent, which extends from death at one extreme to
inapparent infection at the other. The frequency of these manifestations varies with
the specific infectious disease. For example, human infection with the virus of rabies
is almost invariably fatal, whereas a high proportion of persons infected in
childhood with the virus of hepatitis A experience a subclinical or mild clinical
infection.
INFECTION, LATENT PERIOD OF The time between initiation of infection and first shedding
or excretion of the agent.
INFECTION, SUBCLINICAL See INAPPARENT INFECTION.
INFECTIOUS DISEASE See COMMUNICABLE DISEASE.
INFECTIOUSNESS A characteristic of the disease that concerns the relative ease with which
it is transmitted to other hosts. A droplet spread disease, for instance, is more in-
fectious than one spread by direct contact. The characteristics of the portals of exit
and entry are thus also determinants of infectiousness, as are the agent characteristics
of ability to survive away from the host, and of infectivity.
INFECTIVITY
1. The characteristic of the disease agent that embodies capability to enter, survive,
and multiply in the host. A measure of infectivity is the secondary attack
rate.
2. The proportion of exposures, in defined circumstances, that results in infec-
tion.
INFERENCE The process of passing from observations and axioms to generalizations. In
statistics, the development of generalization from sample data, usually with calcu-
lated degrees of uncertainty.
INFESTATION The development on (rather than in) the body of a pathogenic agent, e.g.,
body lice. Some authors use the term also to describe invasion of the gut by parasitic
worms.
INFORMATION SYSTEM A combination of vital and health statistical data from multiple
sources, used to derive information about the health needs, health resources, costs,
use of health services, and outcomes of use by the population of a specified jurisdiction.
The term may also describe the automatic release from computers of stored
information in response to programmed stimuli. For example, parents can be notified
when their children are due to receive booster doses of an immunizing agent
against infectious disease.
INFORMED CONSENT Voluntary consent given by a subject or by a person responsible for
a subject (e.g., a parent) for participation in an investigation, immunization program,
treatment regimen, etc., after being informed of the purpose, methods, procedures,
benefits, and risks. Awareness of risk is necessary for any subject to make
an informed choice. The term also refers to consent for medical care.
INOCULATION See VACCINATION.
INPUT
1. The sum total of resources and energies purposefully engaged in order to
intervene in the spontaneous operation of a system.
2. The basic resources required in terms of manpower, money, materials, and
time.
INSTANTANEOUS INCIDENCE RATE See FORCE OF MORBIDITY.
INSTRUMENTAL ERROR Error due to faults arising in any or in all aspects of a measuring
instrument, i.e., calibration, accuracy, precision, etc. Also applied to error arising
from impure reagents, wrong dilutions, etc.
INTERACTION
1. The interdependent operation of two or more causes to produce or prevent
an effect. Biological interaction means the interdependent operation of two or
more causes to produce, prevent, or control disease. See also ANTAGONISM;
SYNERGISM.
2. Differences in the effects of one or more factors according to the level of the
remaining factor(s). See also EFFECT MODIFIER.
3. In statistics, the necessity for a product term in a linear model.
INTERMEDIATE VARIABLE (Syn: contingent variable, intervening [causal] variable, mediator
variable) A variable that occurs in a causal pathway from an independent to
a dependent variable. It causes variation in the dependent variable, and itself is
caused to vary by the independent variable. Such a variable is statistically associated
with both the independent and dependent variables.
INTERNAL VALIDITY See VALIDITY, STUDY.
INTERNATIONAL CLASSIFICATION OF DISEASE (ICD) The classification of specific conditions
and groups of conditions determined by an internationally representative group
of experts who advise the World Health Organization, which publishes the complete
list in a periodically revised book, the (Manual of the) International Statistical
Classification of Diseases, Injuries, and Causes of Death. Every disease entity is assigned
a number. There are 17 major divisions (chapters) and a hierarchical arrangement
of subdivisions (rubrics) within each. Some chapters are "etiologic," e.g., Infective
and Parasitic Conditions; more relate to body systems, e.g., Circulatory System; and
some to classes of condition, e.g., neoplasms, injury (violence). The heterogeneity
of categories reflects prevailing uncertainties about causes of disease (and classification
in relation to causes). The Ninth Revision of the Manual (ICD-9) was published
by WHO in 1977, after ratification in 1976.
INTERNATIONAL CLASSIFICATION OF HEALTH PROBLEMS IN PRIMARY CARE (ICHPPC) A
classification of diseases, conditions, and other reasons for attendance for primary
care. May be used for labeling conditions in problem-oriented records as used by
primary care health workers. This classification is an adaptation of the ICD but
makes more allowance for the diagnostic uncertainty that prevails in primary care.
This classification is now in its second revision (ICHPPC-2). See also PROBLEM-ORIENTED
MEDICAL RECORD.
INTERNATIONAL CLASSIFICATION OF IMPAIRMENTS, DISABILITIES, AND HANDICAPS
(ICIDH) First published by WHO in 1980, this is an attempt to produce a systematic
taxonomy of the consequences of injury and disease.
An impairment is defined in ICIDH as any loss or abnormality of psychological,
physiological, or anatomical structure or function. It is concerned with abnormalities
of body structure and appearance and with organ or system function resulting
from any cause; in principle, impairments represent disturbances at the organ level.
A disability is defined in ICIDH as any restriction or lack (resulting from an impairment)
of ability to perform an activity in a manner or within the range considered
normal for a human being. The term disability reflects the consequences of
impairment in terms of functional performance and activity by the individual; disabilities
thus represent disturbances at the level of the person.
A handicap is defined in ICIDH as a disadvantage for a given individual, resulting
from an impairment or a disability, that limits or prevents the fulfillment of a role
that is normal (depending on age, sex, and social and cultural practice) for that
individual. The term handicap thus reflects interaction with and adaptation to the
individual's surroundings.
INTERNATIONAL COMPARISON See GEOGRAPHIC PATHOLOGY. See also CROSS-CULTURAL
STUDY.
INTERNAL VALIDITY See VALIDITY, STUDY.
INTERPOLATE, INTERPOLATION To predict the value of variates within the range of observations;
the resulting prediction.
INTERVAL INCIDENCE DENSITY See PERSON-TIME INCIDENCE RATE.
INTERVAL SCALE See MEASUREMENT SCALE.
INTERVENING CAUSE See INTERMEDIATE VARIABLE.
INTERVENING VARIABLE
1. Synonym for INTERMEDIATE VARIABLE.
2. A variable whose value is altered in order to block or alter the effect(s) of
another factor.
See also CAUSALITY, FACTORS IN.
INTERVENTION STUDY An epidemiologic investigation designed to test a hypothesized
cause-effect relationship by modifying a supposed causal factor in a population.
INTERVIEW SCHEDULE The precisely designed set of questions used in an interview. See
also SURVEY INSTRUMENT.
INVOLUNTARY SMOKING (Syn: passive smoking) The inhalation by nonsmokers of tobacco
smoke left in the air by smokers; includes both smoke exhaled by smokers
and smoke released directly from burning tobacco into ambient air; the latter is
called sidestream smoke and contains higher proportions of toxic and other carcinogenic
substances than exhaled smoke. The adjective "involuntary" is preferable to
"passive" as the latter implies acquiescence; increasingly, nonsmokers are anything
but acquiescent about this form of air pollution. "Passive" is, however, customary
WHO usage.
ISLAND POPULATION A group of individuals isolated from larger groups and possessing
a relatively limited gene pool; alternatively, a group that is immunologically isolated
and may therefore be unduly susceptible to infection with alien pathogens.
ISOLATE (noun) Term used in genetics to describe a subpopulation (generally small) in
which matings take place exclusively with other members of the same subpopulation.
ISOLATION
1. In microbiology, the separation of an organism from others, usually by making
serial cultures.
2. Separation, for the period of communicability, of infected persons or animals
from others in such places and under such conditions as to prevent or limit
the direct or indirect transmission of the infectious agent from those infected
to those who are susceptible or who may spread the agent to others. Control of
Communicable Diseases in Man¹ lists seven categories of isolation as follows:
a. Strict isolation: This category is designed to prevent transmission of highly
contagious or virulent infections that may be spread by both air and con-
tact. The specifications, in addition to those above, include a private room
and the use of masks, gowns, and gloves for all persons entering the room.
Special ventilation requirements with the room at negative pressure to surrounding
areas are desirable.
b. Contact isolation: For less highly transmissible or serious infections, for dis-
eases or conditions that are spread primarily by close or direct contact. In
addition to the basic requirements, a private room is indicated but patients
infected with the same pathogen may share a room. Masks are indicated
for those who come close to the patient, gowns are indicated if soiling is
likely, and gloves are indicated for touching infectious material.
c. Respiratory isolation: To prevent transmission of infectious diseases over short
distances through the air, a private room is indicated but patients infected
with the same organism may share a room. In addition to the basic require-
ments, masks are indicated for those who come in close contact with the
patient; gowns and gloves are not indicated.
d. Tuberculosis isolation (AFB isolation): For patients with pulmonary tubercu-
losis who have a positive sputum smear or chest-x-rays that strongly suggest
active tuberculosis. Specifications include use of a private room with special
ventilation and the door closed. In addition to the basic requirements, masks
are used only if the patient is coughing and does not reliably and consistently
cover the mouth. Gowns are used to prevent gross contamination of
clothing. Gloves are not indicated.
e. Enteric precautions: For infections transmitted by direct or indirect contact
with feces. In addition to the basic requirements, specifications include use
of a private room if patient hygiene is poor. Masks are not indicated; gowns
should be used if soiling is likely and gloves are to be used for touching
contaminated materials.
f. Drainage/secretion precautions: To prevent infections transmitted by direct or
indirect contact with purulent material or drainage from an infected body
site. A private room and masking are not indicated; in addition to the basic
requirements, gowns should be used if soiling is likely and gloves used for
touching contaminated materials.
g. Blood/body fluid precautions: To prevent infections that are transmitted by
direct or indirect contact with infected blood or body fluids. In addition to
the basic requirements, a private room is indicated if patient hygiene is
poor; masks are not indicated; gowns should be used if soiling of clothing
with blood or body fluids is likely. Gloves should be used for touching
blood or body fluids.
See also QUARANTINE.
¹ Benenson AS (ed): Control of Communicable Diseases in Man, 14th ed. Washington DC: American
Public Health Association, 1985.
ISOMETRIC CHART A chart or graph that portrays three dimensions on a plane surface.
JACKKNIFE A technique for estimating the variance and the bias of an estimator. If the
sample size is n, the estimator is applied to each subsample of size n - 1, obtained
by dropping a measurement from analysis. The sum of squared differences between
each of the resulting estimates and their mean, multiplied by (n - 1)/n, is the
jackknife estimate of variance; the difference between the mean and the original
estimate, multiplied by (n - 1), is the jackknife estimate of bias.
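The procedure in this entry can be sketched directly in Python. The function name `jackknife`, the choice of the sample mean as the estimator, and the data are hypothetical, chosen only for illustration.

```python
from statistics import mean


def jackknife(data, estimator):
    """Return (variance, bias) jackknife estimates for `estimator` on a list `data`."""
    n = len(data)
    original = estimator(data)
    # Apply the estimator to each subsample of size n - 1 (one measurement dropped).
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = mean(leave_one_out)
    # Sum of squared differences from the mean, multiplied by (n - 1)/n.
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out)
    # Difference between the mean and the original estimate, multiplied by (n - 1).
    bias = (n - 1) * (loo_mean - original)
    return variance, bias


sample = [2.0, 4.0, 6.0, 8.0]  # hypothetical measurements
var_est, bias_est = jackknife(sample, mean)
print(var_est, bias_est)
```

For the sample mean, the jackknife variance coincides with the usual estimate (sample variance divided by n) and the bias estimate is zero, which makes this a convenient check on the code.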
JENNER, EDWARD (1749-1823) An English physician and naturalist. On the basis of the
observation that dairymaids who had had cowpox never got smallpox, he inoculated
a boy aged 10 with cowpox (vaccinia) in 1796. Over the succeeding two years he
inoculated 22 more persons and then attempted to inoculate them with smallpox,
always without inducing this infection. The results of his work were published in
An Inquiry into the Causes and Effects of the Variolae Vaccinae (London, 1798). This
successful method of immunizing persons and populations against smallpox led
directly to the ultimate worldwide eradication of smallpox in 1977.
KAP (KNOWLEDGE, ATTITUDES, PRACTICE) SURVEY A formal survey, using face-to-face
interviews, in which women are asked standardized pretested questions dealing with
their knowledge of, attitudes toward, and use of contraceptive methods. Detailed
reproductive histories and attitudes toward desired family size are also elicited.
Analysis of responses provides much useful information on family planning and
gives estimates of possible future trends in population structure. The term has
sometimes been used to describe other varieties of survey of knowledge, attitudes,
and practice, e.g., health promotion in general or in particular, cigarette smoking.
KAPPA A measure of the degree of nonrandom agreement between observers or measurements
of the same categorical variable
$$\kappa = \frac{P_o - P_e}{1 - P_e}$$
where $P_o$ is the proportion of times the measurements agree, and $P_e$ is the proportion
of times they can be expected to agree by chance alone. If the measurements
agree more often than expected by chance, kappa is positive; if concordance is
complete, kappa = 1; if there is no more nor less than chance concordance, kappa = 0;
if the measurements disagree more than expected by chance, kappa is negative.
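A minimal sketch of the kappa calculation for two observers follows. It assumes that chance agreement is computed from each observer's marginal proportions (the usual two-rater form, which the entry does not spell out). The function name and the ratings are hypothetical.

```python
from collections import Counter


def kappa(ratings_a, ratings_b):
    """Kappa for two equal-length sequences of categorical ratings."""
    n = len(ratings_a)
    # Observed agreement: proportion of items rated identically.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement (assumed form): product of the two observers'
    # marginal proportions, summed over categories.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = counts_a.keys() | counts_b.keys()
    p_chance = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    # Undefined (division by zero) when chance agreement is 1.
    return (p_observed - p_chance) / (1 - p_chance)


complete = kappa(["y", "n", "y", "n"], ["y", "n", "y", "n"])  # full concordance
chance = kappa(["y", "y", "n", "n"], ["y", "n", "y", "n"])    # chance-level agreement
opposed = kappa(["y", "n"], ["n", "y"])                        # systematic disagreement
print(complete, chance, opposed)
```

The three hypothetical cases reproduce the interpretations given in the entry: complete concordance, chance concordance, and disagreement beyond chance.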
KENDALL'S TAU See CORRELATION COEFFICIENT.
KOCH, ROBERT (1843-1910) German physician, pathologist, and bacteriologist. One of
the founders of microbiology and an important contributor to our understanding
of infectious disease epidemiology. His major contributions to medical science include
the life cycle of anthrax, the etiology of traumatic infection, methods of fixing
and staining bacteria, and, in 1882, the discovery of the tubercle bacillus. The paper
reporting this contained the first statement of KOCH'S POSTULATES. In 1883, he discovered
the cholera vibrio. He was awarded the Nobel Prize in 1905.
KOCH'S POSTULATES First formulated by Henle and adapted by Robert Koch in 1877,
with elaborations in 1882. Koch stated that these postulates should be met before a
causative relationship can be accepted between a particular bacterial parasite or
disease agent and the disease in question.
1. The agent must be shown to be present in every case of the disease by isolation
in pure culture.
2. The agent must not be found in cases of other disease.
3. Once isolated, the agent must be capable of reproducing the disease in experimental
animals.
4. The agent must be recovered from the experimental disease produced.
See also CAUSALITY; EVANS'S POSTULATES.
KURTOSIS The extent to which a unimodal distribution is peaked.
L
LARGE SAMPLE METHOD (Syn: asymptotic method) Any statistical method based on an
approximation to a normal or other distribution that becomes more accurate as
sample size increases. An example is a chi square test on a set of frequencies.
LATENT IMMUNIZATION The process of developing immunity by a single or repeated
inapparent asymptomatic infection. Not necessarily related to latent infection. See
also IMMUNITY, ACQUIRED.
LATENT INFECTION Persistence of an infectious agent within the host without symptoms
(and often without demonstrable presence in blood, tissues, or bodily secretions of
host).
LATENT PERIOD (Syn: latency) Delay between exposure to a disease-causing agent and
the appearance of manifestations of the disease. After exposure to ionizing radiation,
for instance, there is a latent period of five years, on average, before development
of leukemia, and more than 20 years before development of certain other
malignant conditions. The term "latent period" is often used synonymously with
"induction period," that is, the period between exposure to a disease-causing agent
and the appearance of manifestations of the disease. It has also been defined as the
period from disease initiation to disease detection. See also INCUBATION PERIOD;
INDUCTION PERIOD.
LATIN SQUARE One of the basic statistical designs for experiments that aim at removing
from the experimental error the variation from two sources, which may be identified
with the rows and columns of the square. In such a design the allocation of k
experimental treatments in the cells of a k by k (latin) square is such that each
treatment occurs exactly once in each row and column. A design for a 5 x 5 square
is as follows:
A B C D E
B A E C D
C D A E B
D E B A C
E C D B A
After Kendall and Buckland.¹
¹ Kendall MG, Buckland AA: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.
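The defining property (each treatment exactly once in every row and every column) is easy to check mechanically. The sketch below is illustrative only: the function names are hypothetical, the 5 x 5 design is the one from the entry read row by row, and the cyclic construction is one simple way to build a latin square, not a method named in the dictionary.

```python
def is_latin_square(square):
    """True if every row and every column contains each symbol exactly once."""
    k = len(square)
    symbols = set(square[0])
    if len(symbols) != k:
        return False
    rows_ok = all(len(row) == k and set(row) == symbols for row in square)
    if not rows_ok:
        return False
    return all(set(column) == symbols for column in zip(*square))


def cyclic_latin_square(treatments):
    """Build a k x k latin square by shifting the treatment list one place per row."""
    k = len(treatments)
    return [[treatments[(r + c) % k] for c in range(k)] for r in range(k)]


design = ["ABCDE", "BAECD", "CDAEB", "DEBAC", "ECDBA"]  # the 5 x 5 design above
print(is_latin_square(design))
print(is_latin_square(cyclic_latin_square(list("ABCD"))))
print(is_latin_square(["AB", "AB"]))  # column 1 repeats A, so not a latin square
```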
LAVERAN, ALPHONSE (1845-1922) French army surgeon who discovered the malaria
parasite (1880) while on service in Algeria. Though initially sceptical, the scientific
community soon accepted the validity of Laveran's discovery, which was confirmed
and enlarged by Golgi, Grassi, and others. Laveran was awarded the Nobel Prize
for medicine in 1907.
LEAD TIME The time gained in treating or controlling a disease when detection is earlier
than usual, e.g., in the presymptomatic stage, as when screening procedures are
used for detection.
LEAD TIME BIAS (Syn: zero time shift) Overestimation of survival time, due to the backward
shift in the starting point for measuring survival that arises when diseases
such as cancer are detected early, as by screening procedures.
LEAST SQUARES A principle of estimation, due to Gauss, in which the estimates of a set
of parameters in a statistical model are those quantities that minimize the sum of
squared differences between the observed values of the dependent variable and the
values predicted by the model.
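For the simplest case, a straight line fitted to one independent variable, the least squares estimates have a closed form. The sketch below assumes that model. The function names and data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_of_squares(xs, ys, intercept, slope):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


xs = [0.0, 1.0, 2.0, 3.0]   # hypothetical independent variable
ys = [1.2, 2.9, 5.1, 6.8]   # hypothetical observed values
intercept, slope = least_squares_line(xs, ys)
best = sum_of_squares(xs, ys, intercept, slope)
# Any other line gives a larger sum of squares than the fitted one.
worse = sum_of_squares(xs, ys, intercept + 0.1, slope)
print(intercept, slope, best, worse)
```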
LEDERMANN FORMULA Ledermann¹ showed empirically that the frequency distribution
of alcohol consumption in the population of consumers may be log-normal: the
curve is sharply skewed; approximately one-third of drinkers consume more than
60% of the total amount of alcohol. Among drinkers the proportion of persons
with alcoholism remains constant at around 7-9%. The pattern of consumption of
illicit drugs among users may also be log-normal. Questions have been raised, however,
about the validity of some assumptions upon which the formula is based.
¹ Ledermann S: Alcool, Alcoolisme, Alcoolisation. Paris: Presses Universitaires de France, 1956.
LEEUWENHOEK, ANTONI VAN (1632-1723) An early microscopist from Delft, in the
Netherlands, the first to use his microscopes to examine and describe small creatures
(animalcules) such as the protozoan organisms in vaginal secretions, spermatozoa,
and with growing ability to make more powerful microscopes, infectious microorganisms.
He was thus a key figure in the development of the germ theory of
disease.
LEVIN'S ATTRIBUTABLE RISK See ATTRIBUTABLE FRACTION (POPULATION).
LIFE EVENTS Changes or disruptions in the pattern of living that may be associated with
or produce changes in health. The relationship of "life stress" and "emotional stress"
to onset of several kinds of serious chronic disease such as coronary heart disease
and hypertension has been the subject of epidemiologic studies. The Rahe-Holmes
Social Readjustment Rating Scale¹ was the first to be developed to assign ranks or
ratings to significant life events such as death of a spouse or other close relative,
loss of regular job, relocation, marriage, divorce, etc. Many other rating scales have
since been developed.
¹ Holmes TH, Rahe RH: The social readjustment rating scale. J Psychosom Res 11:213-218, 1967.
LIFE EXPECTANCY See EXPECTATION OF LIFE.
LIFE EXPECTANCY FREE FROM DISABILITY (LEFD) An estimate of life expectancy adjusted
for activity-limitation (data for which are derived from hospital discharge statistics,
etc.). See also QALY.
LIFE STYLE The set of habits and customs that is influenced, modified, encouraged, or
constrained by the lifelong process of socialization. These habits and customs include
use of substances such as alcohol, tobacco, tea, coffee; dietary habits, exercise,
etc., which have important implications for health and are often the subject of epidemiologic
investigations.
LIFE TABLE A summarizing technique used to describe the pattern of mortality and
survival in populations. The survival data are time specific and cumulative probabilities
of survival of a group of individuals subject, throughout life, to the age-specific
death rates in question. The life table method can be applied to the study
not only of death, but also of any defined endpoint such as the onset of disease or
the occurrence of specific complication(s) of disease. The survivors to age x are
denoted by the symbol $l_x$; the expectation of life at age x is denoted by the symbol
$e_x$; and the proportion alive at age x who die between age x and x + n years is denoted
by the symbol ${}_nq_x$. The life table method is used extensively in epidemiology
and in many assessments of treatment regimens in clinical practice.
The first rudimentary life tables were published in 1693 by the astronomer Ed-
mund Halley. These made use of records of the funerals in the city of Breslau. In
1815 in England, the first actuarially correct life table was published, based on both
population and death data classified by age.
Two types of life tables may be distinguished according to the reference year of
the table: the current or period life table and the generation or cohort life table.
The current life table is a summary of mortality experience over a brief period
(one to three years), and the population data relate to the middle of that period
(usually close to the date of a census). A current life table therefore represents the
combined mortality experience by age of the population in a particular short period
of time.
The cohort or generation life table describes the actual survival experience of a
group, or cohort, of individuals born at about the same time. Theoretically, the
mortality experience of the persons in the cohort would be observed from their
moment of birth through each consecutive age in successive calendar years until all
of them die.
The clinical life table describes the outcome experience of a group or cohort of
individuals classified according to their exposure or treatment history.
Life tables are also classified according to the length of age interval in which the
data are presented. A complete life table contains data for every single year of age
from birth to the last applicable age. An abridged life table contains data by intervals
of five or ten years of age. See also EXPECTATION OF LIFE; SURVIVORSHIP STUDY.
LIFE TABLE, EXPECTATION OF LIFE FUNCTION, $\mathring{e}_x$ (Syn: average future lifetime) The
expectation of life function is a statement of the average number of years of life
remaining to persons who survive to age x.
LIFE TABLE, SURVIVORSHIP FUNCTION, $l_x$ The survivorship function is a statement of
the number of persons out of an initial population of defined size, e.g., 100,000
live births, who would survive or remain free of a defined endpoint condition to
age x under the age-specific rates for the specified year. The value of $l_{40}$, for
example, is determined by the cumulative operation of the specific death rates for all
ages below 40.
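As an illustration of how the survivorship and expectation of life functions follow from age-specific death rates, here is a minimal sketch. It assumes single-year age intervals, that deaths fall on average at mid-interval, and that the last probability of dying is 1 so the table closes. The function name `life_table` and its arguments are hypothetical, not taken from the dictionary.

```python
def life_table(qx, radix=100_000):
    """Return (l_x, e_x) lists from probabilities of dying q_x, one per year of age.

    Assumptions (illustrative only): single-year intervals, deaths at mid-interval,
    and a final q_x of 1.0 so that nobody survives past the last age.
    """
    # Survivors to each exact age, starting from the radix (initial population).
    lx = [float(radix)]
    for q in qx:
        lx.append(lx[-1] * (1.0 - q))

    # Person-years lived in each interval: average of survivors at its two ends.
    person_years = [(lx[i] + lx[i + 1]) / 2.0 for i in range(len(qx))]

    # Expectation of life at age x: remaining person-years divided by survivors at x.
    ex = [sum(person_years[i:]) / lx[i] for i in range(len(qx))]
    return lx[:-1], ex


if __name__ == "__main__":
    survivors, expectation = life_table([0.5, 1.0], radix=1000)
    print(survivors)    # survivors to ages 0 and 1
    print(expectation)  # average years remaining at ages 0 and 1
```

The cumulative product in the first loop is what the entry calls the "cumulative operation" of the death rates for all earlier ages.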
LIFETIME RISK The risk to an individual that a given health effect will occur at any time
after exposure, without regard for the time at which that effect occurs.
LIKELIHOOD FUNCTION A function constructed from a statistical model and a set of
observed data, which gives the probability of the observed data for various values of
the unknown model parameters. The parameter values that maximize the proba-
bility are the maximum likelihood estimates of the parameters.
LIKELIHOOD RATIO TEST A statistical test based on the ratio of the maximum value of
the likelihood function under one statistical model to the maximum value under
another statistical model; the models differ in that one includes, the other excludes,
one or more parameters.
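The likelihood function, its maximum, and the likelihood ratio test can be sketched for the simplest case of a single binomial proportion. This is an illustrative example only: the function names are hypothetical, the observed count must lie strictly between 0 and n, and the p-value uses the usual large-sample chi-square approximation with one degree of freedom.

```python
import math


def binomial_log_likelihood(p, successes, n):
    """Log-likelihood of observing `successes` out of `n` for proportion p.

    The binomial coefficient is omitted because it cancels in likelihood ratios.
    """
    return successes * math.log(p) + (n - successes) * math.log(1.0 - p)


def likelihood_ratio_test(successes, n, p_null):
    """Compare a model with p fixed at p_null against one where p is estimated."""
    p_hat = successes / n  # maximum likelihood estimate of the proportion
    stat = 2.0 * (binomial_log_likelihood(p_hat, successes, n)
                  - binomial_log_likelihood(p_null, successes, n))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return p_hat, stat, p_value


if __name__ == "__main__":
    print(likelihood_ratio_test(30, 100, 0.5))
```

Here the two models differ by one parameter: the null model fixes the proportion, the alternative estimates it from the data.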
LIND, JAMES (1716-1794) British naval surgeon; contributed to improved hygiene aboard
ships. Conducted what amounted to epidemiologic experiments (albeit with small
numbers) which established that scurvy could be prevented by fresh fruits such as
lemons and oranges.
LINEAR MODEL A statistical model in which the value of a parameter for a given value
of a factor, x, is assumed to be equal to a+bx, where a and b are constants.
LINEAR REGRESSION Regression analysis of data using linear models.
LINKAGE See GENETIC LINKAGE; RECORD LINKAGE.
LIVE BIRTH WHO definition adopted by Third World Health Assembly, 1950: Live
birth is the complete expulsion or extraction from its mother of a product of con-
ception, irrespective of the duration of the pregnancy, which, after such separation,
breathes or shows any other evidence of life, such as beating of the heart, pulsation
of the umbilical cord, or definite movement of voluntary muscles, whether or not
the umbilical cord has been cut or the placenta is attached; each product of such a
birth is considered live born.
In the Report of WHO Expert Committee on Prevention of Perinatal Mortality and Morbidity
(Technical Report Series 457, 1970), it is noted that the above definition requires
the inclusion as live births of very early and patently nonviable fetuses and that
accordingly it is not strictly applied. The committee suggested, therefore, that WHO
should introduce a viability criterion into the definition so that very immature fetuses
surviving for very short periods were excluded, even though they showed one
or more of the transitory signs of life.
LOCUS
1. The position of a point, as defined by the coordinates on a graph.
2. The position that a gene occupies on a chromosome.
LOD SCORE In genetics, the log odds ratio of observed to expected distribution of
genetic markers.
LOGISTIC MODEL A statistical model of an individual's risk (probability of disease y) as a
function of a risk factor x:
$$P(y \mid x) = \frac{1}{1 + e^{-\alpha - \beta x}}$$
where $e$ is the (natural) exponential function. This model has a desirable range, 0
to 1, and other attractive statistical features. In the multiple logistic model, the term
$\beta x$ is replaced by a linear term involving several factors, e.g., $\beta_1 x_1 + \beta_2 x_2$ if there are
two factors $x_1$ and $x_2$.
LOGIT (Syn: log-odds) The logarithm of the ratio of frequencies of two different
categorical outcomes such as healthy versus sick.
LOGIT MODEL A linear model for the logit (natural log of the odds) of disease as a
function of a quantitative factor X:
$$\operatorname{Logit}(\text{disease given } X = x) = \alpha + \beta x$$
This model is mathematically equivalent to the LOGISTIC MODEL.
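The equivalence of the logistic and logit forms can be checked numerically. The sketch below is illustrative: `alpha` and `beta` stand for the intercept and slope of the model, and their values are made up for the example.

```python
import math


def logistic(x, alpha, beta):
    """Probability of disease at risk-factor level x under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


if __name__ == "__main__":
    alpha, beta = -1.0, 0.5  # hypothetical coefficients
    for x in (0.0, 2.0, 4.0):
        p = logistic(x, alpha, beta)
        # Taking the logit of the logistic probability recovers alpha + beta * x.
        print(x, round(p, 4), round(logit(p), 4), alpha + beta * x)
```

The probability always stays between 0 and 1, while its logit is an unbounded linear function of x.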
LOG-LINEAR MODEL A statistical model that uses an ANALYSIS OF VARIANCE type of
approach for the modeling of frequency counts in contingency tables.
LOG-NORMAL DISTRIBUTION If a variable Y is such that X = log Y is normally distributed,
it is said to have log-normal distribution. This is a SKEW DISTRIBUTION. See also
NORMAL DISTRIBUTION.
LONGITUDINAL STUDY See COHORT STUDY.
LOUIS, PIERRE-CHARLES-ALEXANDRE (1787-1872) French physician and mathematician.
One of the founders of medical statistics, his research on tuberculosis, which
included dissection of 358 specimens and study of 1960 clinical cases, led to
publication of Recherches anatomico-pathologiques sur la phthisie (Paris, 1825). This work and
others are marked by rigorous numerical precision and demonstration of similarities
and differences based upon numerical distribution of data. The Lilienfelds¹
have pointed out that Louis greatly influenced the development of statistics as
applied in biology and medicine: he either taught or otherwise directly influenced
many European, British, and American workers, including William Farr, John Simon,
William Augustus Guy, and William Budd in England; George Shattuck, Elisha
Bartlett, and Alonzo Clark in the United States; and Joseph Skoda in Hungary;
those he influenced handed on these important concepts to their own pupils.
¹ Lilienfeld AM, Lilienfeld D: Threads of epidemiological history, in Foundations of Epidemiology,
2nd Ed. (New York: Oxford, 1980), pp. 23-45.
LOW BIRTH WEIGHT See BIRTH WEIGHT.
"LUMPING AND SPLITTING" Derisive term describing the propensity of epidemiologists
to group related phenomena or to separate phenomena that hitherto have been
grouped. Epidemiologists are sometimes called "lumpers and splitters."
MALARIA ENDEMICITY Certain terms used to describe the occurrence of malaria, based
on enlarged spleen rates, are categorized by WHO as follows:
1. Hypoendemic: Spleen rate in children 2-9 years <10%.
2. Mesoendemic: Spleen rate 11-50%.
3. Hyperendemic: Spleen rate in children over 50%, in adults usually over 25%.
4. Holoendemic: Spleen rate in children constantly over 75%, adult rate low.
MALARIA PERIODICITY Recurrence at regular intervals of symptoms; periodicity may be
quotidian, tertian, or quartan, according to the interval between paroxysms:
1. Quartan: Recurring every third day, i.e., day 1, day 4, day 7, etc.
2. Quotidian: Recurring daily.
3. Tertian: Recurring every alternate day, i.e., day 1, day 3, etc.
MALARIA PATENT PERIOD Period during which parasites are present in peripheral blood.
MALARIA RErRODUCTION RATE Estimated number of malarial infections potentially dis-
tributed by the average nonimmune infected individual in a community where nei-
ther persons nor mosquitoes were previously infected.
MALARIA SURVEY Investigation in selected age-group samples in randomly selected
localities to assess malaria endemicity; uses spleen and/or parasite rates as measure of
endemicity.
MALTHUS, THOMAS ROBERT (1766-1834) An English clergyman and natural scientist
who argued in An Essay on the Principle of Population (London, 1798) that populations
increase in geometric progression while food supplies increase only in arithmetical
progression, thus making famine inevitable. His work justifies his recognition
as one of the founders of demography, even though events proved his predictions
wrong (at least in the short term).
MANSON, PATRICK (1844-1922) Studied tropical diseases in China and made many
contributions of fundamental importance, notably the transmission of filariasis by
culicine mosquitoes and parts of the life cycle of schistosomes. He investigated and
observed many other tropical parasitic diseases and founded the London School of
(Hygiene and) Tropical Medicine in 1898.
MANTEL-HAENSZEL ESTIMATE, MANTEL-HAENSZEL ODDS RATIO Mantel and Haenszel¹
provided an adjusted ODDS RATIO as an estimate of relative risk that may be derived
from grouped and matched sets of data. It is now known as the Mantel-Haenszel
estimate, one of the few eponymous terms of modern epidemiology.
The statistic may be regarded as a type of weighted average of the individual
odds ratios, derived from stratifying a sample into a series of strata that are inter-
nally homogeneous with respect to confounding factors.
The Mantel-Haenszel summarization method can also be extended to the
summarization of rate ratios and rate differences from follow-up studies.
¹ Mantel N, Haenszel W: Statistical aspects of the analysis of data from retrospective studies of
disease. J Natl Cancer Inst 22:719-748, 1959.
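A minimal sketch of the Mantel-Haenszel odds ratio for a series of 2x2 strata follows. It uses the standard textbook form, the sum of a·d/n over strata divided by the sum of b·c/n, which is not spelled out in the entry itself. The cell layout (a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls), the function name, and the counts are assumptions made for illustration.

```python
def mantel_haenszel_odds_ratio(strata):
    """Summary odds ratio across strata of (a, b, c, d) counts.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls (assumed layout).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d  # stratum size acts as the weight's divisor
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


if __name__ == "__main__":
    # Two hypothetical strata, e.g., younger and older subjects.
    strata = [(10, 20, 5, 40), (15, 10, 10, 20)]
    print(mantel_haenszel_odds_ratio(strata))
```

With a single stratum the result reduces to the ordinary odds ratio a·d / (b·c), which is one way to see it as a weighted average of the stratum-specific odds ratios.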
MANTEL-HAENSZEL TEST A summary CHI-SQUARE TEST developed by Mantel and
Haenszel for stratified data and used when controlling for CONFOUNDING.
MARGIN OF SAFETY An estimate of the ratio of the no-observed-effect level (NOEL) to
the level accepted in regulations.
MARGINALS The row and column totals of a contingency table.
MARKOV PROCESS A stochastic process such that the conditional probability distribution
for the state at any future instant, given the present state, is unaffected by any
additional knowledge of the past history of the system.
MASKED STUDY See BLINDED STUDY.
MASKING Procedure(s) intended to keep participant(s) in a study from knowing some
fact(s) or observation(s) that might bias or influence their actions or decisions re-
garding the study.
MATCHED CONTROLS See CONTROLS, MATCHED.
MATCHING The process of making a study group and a comparison group comparable
with respect to extraneous factors. Several kinds of matching can be distinguished:
Caliper matching is the process of matching comparison group subjects to study
group subjects within a specified distance for a continuous variable (e.g., matching
age to within two years).
Frequency matching requires that the frequency distributions of the matched
variable(s) be similar in study and comparison groups.
Category matching is the process of matching study and control group subjects
in broad classes such as relatively wide age ranges or occupational groups.
Individual matching relies on identifying individual subjects for comparison, each
resembling a study subject on the matched variable(s).
Pair matching is individual matching in which study and comparison subjects are
paired.
MATERNAL MORTALITY (RATE) The risk of dying from causes associated with childbirth
is measured by the maternal mortality rate. For this purpose the deaths used in the
numerator are those arising during pregnancy or from puerperal causes, i.e., deaths
occurring during and/or due to deliveries, complications of pregnancy, childbirth,
and the puerperium. Women exposed to the risk of dying from puerperal causes
are those who have been pregnant during the period. Their number being unknown,
the number of live births is used as the conventional denominator for computing
comparable maternal mortality rates. The formula is
$$\text{Annual maternal mortality rate} = \frac{\text{Number of deaths from puerperal causes in a given geographic area during a given year}}{\text{Number of live births that occurred among the population of the given geographic area during the same year}} \times 1000 \ (\text{or } 100{,}000)$$
There is variation in the duration of the postpartum period in which death may
occur and be certified due to "puerperal causes," i.e., "maternal mortality." Accord-
ing to WHO, a maternal death is defined as the death of a woman while pregnant
or within 42 days of termination of pregnancy, irrespective of the duration and the
site of pregnancy, from any cause related to or aggravated by the pregnancy or its
management but not from accidental or incidental causes.
Maternal deaths should be subdivided into two groups: (1) direct obstetric deaths,
resulting from obstetric complications of the pregnant state, and (2) indirect obstet-
ric deaths, resulting from preexisting disease or conditions not due to direct obstet-
ric causes.
Although WHO defines maternal mortality as death during pregnancy or within
42 days of delivery, in some jurisdictions, a period as long as a year is used.
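The maternal mortality rate formula given in this entry is simple arithmetic; the following sketch only restates it. The function name and the counts are hypothetical, and the multiplier defaults to 100,000, one of the two conventions the entry mentions.

```python
def maternal_mortality_rate(puerperal_deaths, live_births, per=100_000):
    """Deaths from puerperal causes per `per` live births in the same area and year."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return puerperal_deaths / live_births * per


if __name__ == "__main__":
    # Hypothetical area: 12 maternal deaths and 60,000 live births in one year.
    print(maternal_mortality_rate(12, 60_000))            # per 100,000 live births
    print(maternal_mortality_rate(12, 60_000, per=1000))  # per 1,000 live births
```

Because live births stand in for the unknown number of pregnant women, the result is a ratio to births rather than a true rate among those at risk.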
MATHEMATICAL MODEL A representation of a system, process, or relationship in
mathematical form in which equations are used to simulate the behavior of the system
or process under study. The model usually consists of two parts: the mathematical
structure itself, e.g., Newton's inverse square law or Gauss's "normal" law, and the
particular constants or parameters associated with them, such as Newton's gravita-
tional constant or the Gaussian standard deviation.
A mathematical model is deterministic if the relations between the variables in-
volved take on values not allowing for any play of chance. A model is said to be
statistical, stochastic, or random, if random variation is allowed to enter the picture.
See also MODEL.
MAXIMUM ALLOWABLE CONCENTRATION (MAC) See SAFETY STANDARDS.
MAXIMUM LIKELIHOOD ESTIMATE The value for an unknown parameter that maximizes
the probability of obtaining exactly the data that were observed.
McNEMAR'S TEST A form of the CHI-SQUARE TEST for matched-pairs data. It is a special
case of the MANTEL-HAENSZEL TEST.
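For matched-pairs data, the test uses only the discordant pairs. The sketch below uses the common uncorrected form of the statistic, (b - c)² / (b + c), referred to a chi-square distribution with one degree of freedom. The entry does not give the formula, so that form, the function name, and the counts are assumptions for illustration.

```python
import math


def mcnemar_test(b, c):
    """Uncorrected McNemar chi-square for discordant pair counts b and c.

    b = pairs where only the first member is exposed (or positive),
    c = pairs where only the second member is exposed (assumed labelling).
    """
    if b + c == 0:
        raise ValueError("at least one discordant pair is required")
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    print(mcnemar_test(15, 5))
```

Concordant pairs do not enter the calculation, because they carry no information about a difference between the two members of a pair.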
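MEAN, ARITHMETIC A MEASURE OF CENTRAL TENDENCY. It is computed by adding all the
individual values together and dividing by the number of values.
MEAN, GEOMETRIC A MEASURE OF CENTRAL TENDENCY. Calculable only for positive values.
It is calculated by taking the logarithms of the values, calculating their arithmetic
mean, then converting back by taking the antilogarithm.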
MEAN, HARMONIC A MEASURE OF CENTRAL TENDENCY computed by summing the
reciprocals of all the individual values and dividing the resulting sum into the number
of values.
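The mean computed through logarithms and the harmonic mean described above can be compared with the ordinary arithmetic mean in a few lines. This is an illustrative sketch with hypothetical function names; the log-based and reciprocal-based means assume strictly positive values.

```python
import math


def arithmetic_mean(values):
    """Sum of the values divided by their number."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Antilogarithm of the arithmetic mean of the logarithms (positive values only)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    return len(values) / sum(1.0 / v for v in values)


if __name__ == "__main__":
    data = [1.0, 2.0, 4.0]
    print(arithmetic_mean(data), geometric_mean(data), harmonic_mean(data))
```

For positive values that are not all equal, the harmonic mean is the smallest of the three and the arithmetic mean the largest.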
MEASURE OF ASSOCIATION A quantity that expresses the strength of association between
variables. Commonly used measures of association are differences between means,
proportions or rates, the rate ratio, the odds ratio, and correlation and regression
coefficients.
MEASUREMENT The procedure of applying a standard scale to a variable or to a set of
values.
MEASUREMENT, PROBLEMS WITH TERMINOLOGY There is sometimes uncertainty about
the terms used to describe the properties of measurement: accuracy, precision,
validity, reliability, repeatability, and reproducibility. Accuracy and precision are often
used synonymously, validity is defined variously, and reliability, repeatability, and
reproducibility are frequently used interchangeably.
Etymologies are helpful in making a case for preferred usages, but they are not
always decisive. Accuracy is from the Latin cura, care, and while this may be of
interest to those in the health field, it does not illuminate the origins of the standard
definition, that is, "conforming to a standard or a true value" (OED). Accuracy is
distinguished from precision in this way: A measurement or statement can reflect
or represent a true value without detail. A temperature reading of 98.6°F is accu-
rate, but it is not precise if a more refined thermometer registers a temperature of
98.637°F.
Precision (from Latin praecidere, cut short) is the quality of being sharply defined
through exact detail. A faulty measurement may be expressed precisely, but may
not be accurate. Measurements should be both accurate and precise, but the two
terms are not synonymous. Consistency or reliability describes the property of
measurements or results that conform to themselves.
Reliability (Latin religare, to bind) is defined by the OED as a quality that is sound
and dependable. Its epidemiologic usage is similar; a result or measurement is said
to be reliable when it is stable, i.e., when repetition of an experiment or measurement
gives the same results. The terms "repeatability" and "reproducibility" are
synonymous (the OED defines each in terms of the other), but they do not refer to
a quality of measurement, rather only to the action of performing something more
than once. Thus, a way of discovering whether or not a measurement is reliable is
to repeat or reproduce it. The terms "repeatability" and "reproducibility," formed
from their respective verbs, are used inaccurately when they are substituted for
"reliability," a noun that refers to the measuring procedure rather than the
attribute being measured. However, in common usage, both repeatability and
reproducibility refer to the capacity of a measuring procedure to produce the same
result on each occasion in a series of procedures conducted under identical
conditions.
Validity is used correctly when it agrees with the standard definition given by the
OED: "sound and sufficient." If, in the epidemiologic sense, a test measures what it
purports to measure (it is sufficient) then the test is said to be valid. See also
ACCURACY; PRECISION; RELIABILITY; REPEATABILITY; VALIDITY.
MEASUREMENT SCALE The complete range of possible values for a measurement (e.g.,
the set of possible responses to a question, the physically possible range for a set of
body weights). Measurement scales are sometimes classified into five major types,
according to the quantitative character of the scale:
1. Dichotomous scale: One that arranges items into either of two mutually exclusive
categories.
2. Nominal scale: Classification into unordered qualitative categories; e.g., race,
religion, and country of birth as measurements of individual attributes are
purely nominal scales, as there is no inherent order to their categories.
3. Ordinal scale: Classification into ordered qualitative categories, e.g., social class
(I, II, III, etc.), where the values have a distinct order, but their categories are
qualitative in that there is no natural (numerical) distance between their
possible values.
4. Interval scale: An (equal) interval scale involves assignment of values with a natural
distance between them, so that a particular distance (interval) between two
values in one region of the scale meaningfully represents the same distance
between two values in another region of the scale. Examples include Celsius
and Fahrenheit temperature, date of birth.
5. Ratio scale: A ratio scale is an interval scale with a true zero point, so that ratios
between values are meaningfully defined. Examples are absolute temperature,
weight, height, blood count, and income, as in each case it is meaningful to
speak of one value as being so many times greater or less than another value.
MEASURES OF CENTRAL TENDENCY A general term for several characteristics of the
distribution of a set of values or measurements around a value or values at or near
the middle of the set. The principal measures of central tendency are the mean
(average), median, and mode. (See entries under each.)
MECHANICAL TRANSMISSION See VECTOR-BORNE INFECTION.
MEDIAN A MEASURE OF CENTRAL TENDENCY. The simplest division of a set of
measurements is into two parts: the lower and the upper half. The point on the scale that
divides the group in this way is called the "median."
MEDIATOR (MEDIATING) VARIABLE See INTERMEDIATE VARIABLE.
MEDICAL AUDIT A health service evaluation procedure in which selected data from
patients' charts are summarized in tables displaying such data as average length of
stay or duration of an episode of care, the frequency of diagnostic and therapeutic
procedures, and outcomes of care arranged by diagnostic category. These are often
compared with predetermined norms.
MEDICAL CARE See HEALTH CARE.
MEDICAL RECORD A file of information relating to transaction(s) in personal health care.
In addition to facts about a patient's illness, medical records nearly always contain
other information. The full range of data in medical records includes the following:
1. Clinical, i.e., diagnosis, treatment, progress, etc.
2. Demographic, i.e., age, sex, birthplace, residence, etc.
3. Sociocultural, i.e., language, ethnic origin, religion, etc.
4. Sociological, i.e., family (next of kin), occupation, etc.
5. Economic, i.e., method of payment (fee-for-service, indigent, etc.).
6. Administrative, i.e., site of care, provider, etc.
7. "Behavioral," e.g., record of broken appointment may indicate dissatisfaction
with service provided.
MEDICAL STATISTICS See BIOSTATISTICS.
MENDEL'S LAWS Derived from the pioneering genetic studies of Gregor Mendel
(1822-1884). Mendel's first law states that genes are particulate units that segregate; i.e.,
members of the same pair of genes are never present in the same gamete, but
always separate and pass to different gametes. Mendel's second law states that genes
assort independently; i.e., members of different pairs of genes move to gametes
independently of one another.
META-ANALYSIS The process of using statistical methods to combine the results of
different studies. In the biomedical sciences, the systematic, organized and structured
evaluation of a problem of interest, using information (commonly in the form of
statistical tables or other data) from a number of independent studies of the problem.
A frequent application has been the pooling of results from a number of small
randomized controlled trials, none in itself large enough to demonstrate statistically
significant differences, but in aggregate, capable of so doing. Meta-analysis has a
qualitative component, i.e., application of predetermined criteria of quality (e.g.,
completeness of data, absence of biases), and a quantitative component, i.e.,
integration of the numerical information. Meta-analysis includes aspects of an overview,
and of pooling of data, but implies more than either of these processes.
Meta-analysis carries the risk of several biases.
METHODOLOGY The scientific study of methods. Methodology should not be confused
with methods. Sad to say, the word "methodology" is all too often used when the
writer means "method."
MIASMA THEORY An explanation for the origin of epidemics, the "miasma theory" was
implied by many ancient writers, and made explicit by Lancisi in De noxiis paludum
effluviis (1717). It was based on the notion that when the air was of a "bad quality"
(a state that was not precisely defined, but that was supposedly due to decaying
organic matter), the persons breathing that air would become ill. Malaria ("bad air")
is the classic example of a disease that was long attributed to miasmata. "Miasma"
was believed to pass from cases to susceptibles in these diseases considered
contagious.
MIGRANT STUDIES Studies taking advantage of migration to one country by those from
other countries with different physical and biological environments, cultural
background and/or genetic makeup, and different morbidity or mortality experience.
Comparisons are made between the mortality or morbidity experience of the
migrant groups with that of their current country of residence and/or their country
of origin. Sometimes the experiences of a number of different groups who have
migrated to the same country have been compared.
MILL'S CANONS In A System of Logic (1856), J.S. Mill devised logical strategies (canons)
from which causal relationships may be inferred. Four in particular are pertinent
to epidemiology: the methods of agreement, difference, residues, and concomitant
variation.
Method of agreement (first canon): "If two or more instances of the phenomenon
under investigation have only one circumstance in common, the circumstance in
which alone all the instances agree, is the cause (or effect) of the given phenome-
non."
Method of difference (second canon): "If an instance in which the phenomenon
under investigation occurs, and an instance in which it does not occur, have every
circumstance in common save one, that one occurring only in the former, the
circumstance in which alone the two instances differ is the effect, or cause, or a
necessary part of the cause, of the phenomenon."
Method of residues (fourth canon): "Subduct from any phenomenon such part as
is known by previous inductions to be the effect of certain antecedents, and the
residue of the phenomenon is the effect of the remaining antecedents."
Method of concomitant variation (fifth canon): "Whatever phenomenon varies in any
manner whenever another phenomenon varies in some particular manner, is either
a cause or an effect of that phenomenon, or is connected with it through some fact
of causation."
MINIMUM DATA SET (Syn: uniform basic data set) A widely agreed upon and generally
accepted set of terms and definitions constituting a core of data acquired for medical
records and employed for developing statistics suitable for diverse types of analyses
and users. Such sets have been developed for birth and death certificates, ambulatory
care, hospital care, and long-term care. See also BIRTH CERTIFICATE; DEATH
CERTIFICATE; HOSPITAL DISCHARGE ABSTRACT SYSTEM.
MISCLASSIFICATION The erroneous classification of an individual, a value, or an
attribute into a category other than that to which it should be assigned. The
probability of misclassification may be the same in all study groups (nondifferential
misclassification) or may vary between groups (differential misclassification).
MOBILITY, GEOGRAPHIC Movement of persons from one country or region to another.
MOBILITY, SOCIAL Movement from one defined socioeconomic group to another, either
upward or downward. Downward social mobility, which can be related to impaired
health (e.g., alcoholism, schizophrenia, or mental retardation) is sometimes
referred to as "social drift."
MODE One of the MEASURES OF CENTRAL TENDENCY. The most frequently occurring value
in a set of observations.
In epidemiology the use of models began with an effort to predict the onset and
course of epidemics. In the second report of the Registrar-General of England and
Wales (1840), WILLIAM FARR developed the beginnings of a predictive model for
communicable disease epidemics. He had recognized regularities in the smallpox
epidemics of the 1830s. By calculating frequency curves for these past outbreaks,
he estimated the deaths to be expected. See also DEMONSTRATION MODEL;
MATHEMATICAL MODEL; THEORETICAL EPIDEMIOLOGY.
MODERATOR VARIABLE (Syn: qualifier variable) In a study of a possible causal factor
and an outcome, a moderator variable is a third variable exhibiting statistical
interaction by virtue of its being antecedent or intermediate in the causal process under
study. If it is antecedent, it is termed a conditional moderator variable or EFFECT
MODIFIER; if it is intermediate, it is a contingent moderator variable. See also
INTERACTION; INTERMEDIATE VARIABLE.
MONITORING
1. The performance and analysis of routine measurements, aimed at detecting
changes in the environment or health status of populations. Not to be
confused with SURVEILLANCE. To some, monitoring also implies intervention in the
light of observed measurements.
2. Ongoing measurement of performance of a health service or a health
professional, or of the extent to which patients comply with or adhere to advice from
health professionals.
3. In management, the continuous oversight of the implementation of an activity
that seeks to ensure that input deliveries, work schedules, targeted outputs,
and other required actions are proceeding according to plan.
MONOTONIC SEQUENCE A sequence is said to be monotonic increasing if each value is
greater than or equal to the previous one, and monotonic decreasing if each value
is less than or equal to the previous one. If equality of values is excluded, we speak
of a strictly (increasing or decreasing) monotonic sequence.
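The definition translates directly into a comparison of each value with the one before it. The helper below is an illustrative sketch; the function name and its `decreasing` and `strict` flags are hypothetical.

```python
def is_monotonic(values, decreasing=False, strict=False):
    """Check whether a sequence is monotonic in the sense defined above.

    decreasing=False tests for a monotonic increasing sequence;
    strict=True additionally excludes equal neighbouring values.
    """
    values = list(values)
    for previous, current in zip(values, values[1:]):
        step = previous - current if decreasing else current - previous
        # A negative step breaks monotonicity; a zero step breaks strictness.
        if step < 0 or (strict and step == 0):
            return False
    return True


if __name__ == "__main__":
    print(is_monotonic([1, 2, 2, 3]))               # increasing, ties allowed
    print(is_monotonic([1, 2, 2, 3], strict=True))  # ties excluded
    print(is_monotonic([9, 7, 4], decreasing=True, strict=True))
```

A sequence with fewer than two values passes every check, since there are no neighbouring pairs to compare.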
MONTE CARLO STUDY, TRIAL Complex relationships that are difficult to solve by
mathematical analysis are sometimes studied by computer experiments that simulate and
analyze a sequence of events, using random numbers. Such experiments are called
Monte Carlo trials, or studies, in recognition of Monte Carlo as one of the gambling
capitals of the world.
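A small computer experiment of this kind is sketched below. It estimates, by repeated random simulation, the chance that at least two people in a group share a birthday. The problem, the function name, and the simplifying assumptions (365 equally likely birthdays, a fixed seed so the run is repeatable) are chosen purely for illustration.

```python
import random


def shared_birthday_probability(group_size=23, trials=20_000, seed=1):
    """Estimate P(at least two of `group_size` people share a birthday) by simulation.

    Assumes 365 equally likely birthdays; `seed` makes the experiment repeatable.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(365) for _ in range(group_size)]
        # A repeated birthday makes the set of distinct days smaller than the group.
        if len(set(birthdays)) < group_size:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(shared_birthday_probability())
```

The estimate carries sampling error that shrinks as the number of trials grows; for this problem the exact answer for 23 people is known to be about 0.507, which gives a check on the simulation.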
MORBIDITY Any departure, subjective or objective, from a state of physiological or
psychological well-being. In this sense, sickness, illness, and morbid condition are similarly
defined and synonymous (but see DISEASE).
The WHO Expert Committee on Health Statistics noted in its Sixth report (1959)
that morbidity could be measured in terms of three units: (1) persons who were ill;
(2) the illnesses (periods or spells of illness) that these persons experienced; and (3)
the duration (days, weeks, etc.) of these illnesses. See also HEALTH INDEX; INCIDENCE
RATE; NOTIFIABLE DISEASE; PREVALENCE RATE.
MODEL
1. An abstract representation of the relationship between logical, analytical, or
empirical components of a system. See also MATHEMATICAL MODEL.
2. A formalized expression of a theory or the causal situation that is regarded as
having generated observed data.
3. (Animal) model: an experimental system that uses animals, because humans
cannot be used for ethical or other reasons.
4. A small-scale simulation, e.g., by using an "average region" with characteristics
resembling those of the whole country.
MORBIDITY RATE A term, preferably avoided, used indiscriminately to refer to
incidence or prevalence rates of disease.
MORBIDITY SURVEY A method for estimating the prevalence and/or incidence of disease
or diseases in a population. A morbidity survey is usually designed simply to
ascertain the facts as to disease distribution, and not to test a hypothesis. See also
CROSS-SECTIONAL STUDY; HEALTH SURVEY.
MORTALITY RATE See DEATH RATE.
MORTALITY STATISTICS Statistical tables compiled from the information contained in DEATH
CERTIFICATES. Most administrative jurisdictions in all nations produce tables of
mortality statistics. These may be published at regular intervals; they usually show
numbers of deaths and/or rates by age, sex, cause, and sometimes other variables.
MULTICOLLINEARITY In multiple regression analysis, a situation in which at least some
of the independent variables are highly correlated with each other. Such a situation
can result in inaccurate estimates of the parameters in the regression model.
MULTIFACTORIAL ETIOLOGY See MULTIPLE CAUSATION.
MULTINOMIAL DISTRIBUTION The probability distribution associated with the
classification of each of a sample of individuals into one of several mutually exclusive and
exhaustive categories. When the number of categories is two, the distribution is
called binomial. See also BINOMIAL DISTRIBUTION.
MULTIPHASIC SCREENING See SCREENING.
MULTIPLE CAUSATION (Syn: multifactorial etiology) This term is used to refer to the
concept that a given disease or other outcome may have more than one cause. A
combination of causes or alternative combinations of causes may be required to
produce the effect.
MULTIPLE LOGISTIC MODEL See LOGISTIC MODEL.
MULTIPLE RISK Where more than one risk factor for the development of a disease or
other outcome is present, and their combined presence results in an increased risk,
we speak of "multiple risk." The increased risk may be due to the additive effects
of the risks associated with the separate risk factors, or to SYNERGISM.
MULTIPLICATIVE MODEL A model in which the joint effect of two or more causes is the
product of their effects. For instance, if factor a multiplies risk by the amount a in
the absence of factor b, and factor b multiplies risk by the amount b in the absence
of factor a, the combined effect of factors a and b on risk is a x b. See also ADDITIVE
MODEL.
MULTISTAGE MODEL A mathematical model, mainly for carcinogenesis, based on the
theory that a specific carcinogen may affect one of a number of stages in the
development of cancer.
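The arithmetic of the MULTIPLICATIVE MODEL entry above can be contrasted with an additive combination of the same two effects. In this sketch the multiplicative branch follows the entry directly. The additive branch uses the common convention that excess relative risks add (1 + (a - 1) + (b - 1)), which is an assumption here because the ADDITIVE MODEL entry is not part of this section. The function name is hypothetical.

```python
def joint_relative_risk(rr_a, rr_b, model="multiplicative"):
    """Combined relative risk for two factors under a stated joint-effect model."""
    if model == "multiplicative":
        # Joint effect is the product of the separate effects.
        return rr_a * rr_b
    if model == "additive":
        # Assumed convention: excess risks over the baseline of 1 are summed.
        return 1.0 + (rr_a - 1.0) + (rr_b - 1.0)
    raise ValueError("model must be 'multiplicative' or 'additive'")


if __name__ == "__main__":
    # Hypothetical factors that double and triple risk on their own.
    print(joint_relative_risk(2.0, 3.0))
    print(joint_relative_risk(2.0, 3.0, model="additive"))
```

The two models agree only when at least one factor has no effect (a relative risk of 1); otherwise the choice of model changes what counts as interaction.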
MULTIVARIATE ANALYSIS A set of techniques used when the variation in several vari-
ables has to be studied simultaneously. In statistics, any analytic method that allows
the simultaneous study of two or more DEPENDENT VARIABLES.
MUTATION Heritable change in the genetic material not caused by genetic segregation
or recombination, which is transmitted to daughter cells and to succeeding gener-
ations, provided it is not a dominant lethal factor.
MUTATION RATE The frequency with which mutations occur per gene or per genera-
tion.
N
NATIONAL DEATH INDEX A computerized central registry of deaths in the United States,
started in 1979 and operated by the U.S. National Center for Health Statistics, that
facilitates mortality followup; Cf. CANADIAN MORTALITY DATA BASE.
NATURAL EXPERIMENT A term probably derived from JOHN SNOW's account of his inves-
tigation of the practices of water supply companies in relation to the cholera epi-
demics in London in the 1850s. It refers to naturally occurring circumstances in
which populations have different exposures to a supposed causal factor in a situa-
tion resembling an actual experiment in which persons would be assigned to groups.
John Snow was able to trace the London outbreaks of cholera in the 19th century
to water impurity as a result of comparisons made between two water companies.
It would have been unethical to expose "test subjects" to infection, but the situation
at the time afforded him the opportunity to make observations of crucial impor-
tance.
To turn this grand experiment to account, all that was required was to learn the
supply of water to each individual house where a fatal attack of cholera might occur
... I resolved to spare no exertion which might be necessary to ascertain the exact
effect of the water supply on the progress of the epidemic, in the places where all
the circumstances were so happily adapted for the inquiry ... I had no reason to
doubt the correctness of the conclusions I had drawn from the great number of
facts already in my possession, but I felt that the circumstance of the cholera-poison
passing down the sewers into a great river, and being distributed through miles of
pipes, and yet producing its specific effects, was a fact of so startling a nature, and
of so vast importance to the community, that it could not be too rigidly examined
or established on too firm a basis. (Snow, On the Mode of Communication of Cholera,
1855)
NATURAL HISTORY OF DISEASE The course of a disease from onset (inception) to reso-
lution. Many diseases have certain well-defined stages that, taken all together, are
referred to as the "natural history of the disease" in question. These stages are as
follows:
1. Stage of pathological onset.
2. Presymptomatic stage: from onset to the first appearance of symptoms and/or
signs. SCREENING tests may lead to earlier detection.
3. Clinically manifest disease, which may progress inexorably to a fatal termina-
tion, be subject to remissions and relapses, or regress spontaneously, leading
to recovery.
Detection and intervention can alter the natural history of disease. The term has
also been used to mean "descriptive epidemiology of disease."
NATURAL HISTORY STUDY A study, generally longitudinal, designed to yield information
about the natural course of a disease or condition.
NATURAL RATE OF INCREASE (DECREASE) See GROWTH RATE OF POPULATION.
NEAREST NEIGHBOR METHOD A means of analyzing the spatial patterns of a free-living
population. A term from veterinary epidemiology. Random sampling points are
located throughout an area and the distance from each point to the nearest individ-
ual is measured; alternatively, individuals are selected at random and from each of
these the distance to the nearest neighbor is measured.

NECESSARY AND SUFFICIENT CAUSE A causal factor whose presence is required for the
occurrence of the effect and whose presence is always followed by the effect. See
also ASSOCIATION; CAUSALITY.
NEEDS (Syn: health needs, perceived needs, professionally defined needs, unmet needs)
This term has both a precise and an all-but-indefinable meaning in the context of
public health. We speak of needs in precise numerical terms when we refer to
specific indicators of disease or premature death that require intervention because
their level is above that generally accepted in the society or community in question.
For example, an infant mortality rate two or three times greater than the national
average in a particular community is an indicator of unmet health needs of infants
in that community (not to be confused with a need for more or better medical care).
It should be clear that even in this seemingly precise usage there are implied value
judgments. It must be explicitly stated that "needs" always reflect prevailing value
judgments as well as the existing ability to control a particular public health prob-
lem. Thus, sputum-positive pulmonary tuberculosis was not recognized as a health
need in 1850 but was by 1900 in the industrialized nations; the ill effects of ciga-
rette smoking must now be universally acknowledged as a health need; and child
abuse is increasingly regarded as a public health problem, to which we could apply
the term "professionally defined need."
(See Vickers GR: What sets the goals of public health. Lancet 1:599, 1958.)
NEONATAL MORTALITY RATE
1. In VITAL STATISTICS, the number of deaths in infants under 28 days of age in
a given period, usually a year, per 1000 live births in that period.
2. In obstetric and perinatal research the term "neonatal mortality rate" is often
used to denote the cumulative MORTALITY RATE of live-born infants within 28
days of age.
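The vital statistics definition (sense 1) amounts to a simple rate per 1000 live births. A minimal sketch follows; the function name and the example counts are hypothetical and are not drawn from any real population.

```python
def neonatal_mortality_rate(deaths_under_28_days, live_births):
    """Deaths in infants under 28 days of age per 1000 live births,
    both counted over the same period (usually a year)."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 1000 * deaths_under_28_days / live_births


# Hypothetical year: 45 neonatal deaths among 9000 live births.
rate = neonatal_mortality_rate(45, 9000)
```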
NESTED CASE CONTROL STUDY A case control study in which cases and controls are drawn
from the population in a cohort study. As some data are already available about
both cases and controls, the effects of some potential confounding variables are
reduced or eliminated.
NET MIGRATION The numerical difference between immigration and emigration.
NET MIGRATION RATE The net effect of immigration and emigration on an area's pop-
ulation expressed as an increase or decrease per 1000 population of the area in a
given year.
NET REPRODUCTION RATE The average number of female children born per woman in
a cohort subject to a given set of age-specific fertility rates, a given set of age-specific
mortality rates, and a given sex ratio at birth. This rate measures replacement fer-
tility under given conditions of fertility and mortality: it is the ratio of daughters to
mothers assuming continuation of the specified conditions of fertility and mortality.
It is a measure of population growth from one generation to another under con-
stant conditions. This rate is similar to the gross reproduction rate, but takes into
account that some women will die before completing their childbearing years. An
NRR of 1.00 means that each generation of mothers is having exactly enough
daughters to replace itself in the population. See also GROSS REPRODUCTION RATE.
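One common way to approximate the net reproduction rate from grouped data is sketched below. It is a simplification under stated assumptions, not a demographer's life-table method: fertility rates are annual births per woman in equal-width age groups, survival is the probability that a newborn girl lives to each age group, and all names and figures are hypothetical.

```python
def net_reproduction_rate(fertility_rates, survival_to_age_group,
                          interval_years, proportion_female_at_birth):
    """Approximate daughters born per woman, allowing for mortality.
    fertility_rates[i]: births per woman per year in age group i.
    survival_to_age_group[i]: probability a newborn girl survives to group i.
    interval_years: width of each age group in years.
    proportion_female_at_birth: share of births that are girls."""
    if len(fertility_rates) != len(survival_to_age_group):
        raise ValueError("inputs must have the same length")
    return sum(
        interval_years * rate * proportion_female_at_birth * survival
        for rate, survival in zip(fertility_rates, survival_to_age_group)
    )


# Hypothetical two-group example. With survival of 1.0 throughout, the same
# calculation ignores mortality, as the gross reproduction rate does.
nrr = net_reproduction_rate([0.1, 0.1], [0.9, 0.8], 5, 0.5)
no_mortality = net_reproduction_rate([0.1, 0.1], [1.0, 1.0], 5, 0.5)
```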
NEW YORK STATE IDENTIFICATION AND INTELLIGENCE SYSTEM (NYSIIS) A method of
identifying individuals for RECORD LINKAGE based on phonetic spelling of full names,
sequence of digits for birthdate, birthplace, sex, name at birth, and parents' names.
See also HOGBEN NUMBER; SOUNDEX CODE.
NIDUS A focus of infection. The term can be used to describe any heterogeneity in the
distribution of a disease, but is usually applied to a small area in which conditions
favor occurrence and spread of a communicable disease; also, the site of origin of
a pathological process.
NIGHTINGALE, FLORENCE (1820-1910) An English woman who is identified as the founder
of modern nursing, but was much more. In addition to her famous work of elevat-
ing nursing to a noble profession during the Crimean War, and establishing a train-
ing school for nurses at St. Thomas's Hospital in London, she recognized the im-
portance of statistical analysis of hospital records (Notes on Hospitals, London:
Longmans, 1859); her contributions were recognized by election to Fellowship of
the Royal Statistical Society. Her best-known work is Notes on Nursing (1860).
NOISE (IN DATA) This term is used when extraneous uncontrolled variables and/or er-
rors influence the distribution of measurements that are made in a study, thus
rendering difficult or impossible the determination of relationships between vari-
ables under scrutiny.
NOMENCLATURE A list of all approved terms for describing and recording observations.
NOMINAL SCALE See MEASUREMENT SCALE.
NOMOGRAM A form of line chart showing scales for the variables involved in a particular
formula in such a way that corresponding values for each variable lie on a straight
line intersecting all the scales.
[Figure: Nomogram of confidence limits to a rate. From Rosenbaum, Nomograms for rates per 1000, Br Med J 1:169-170, 1963.]

NONCONCURRENT STUDY See HISTORICAL COHORT STUDY.
NONDIFFERENTIAL MISCLASSIFICATION See MISCLASSIFICATION.
NONEXPERIMENTAL STUDY See OBSERVATIONAL STUDY.
NONPARAMETRIC METHODS See DISTRIBUTION-FREE METHOD.
NONPARAMETRIC TEST See DISTRIBUTION-FREE METHOD.
NONPARTICIPANTS (Syn: nonrespondents) Members of a study sample or population who
do not take part in the study for whatever reason, or members of a target popula-
tion who do not participate in an activity. Differences between participants and
nonparticipants have been demonstrated repeatedly in studies of many kinds, and
this is often a source of bias.
NO-OBSERVED-EFFECT LEVEL (NOEL) A term from toxicology, meaning the highest dose
at which no adverse health effects are detected in an animal population. A NOEL-
SF is a no-observed-effect level with an added safety factor for human exposures,
used in setting human safety standards.
NORM This term has two quite distinct meanings:
1. The first is "what is usual," e.g., the range into which blood pressure values
usually fall in a population group, the dietary or infant feeding practices that
are usual in a given culture, or the way that a given illness is usually treated
in a given health care system.
2. The second sense is "what is desirable," e.g., the range of blood pressures that
a given authority regards as being indicative of present good health or as
predisposing to future good health, the dietary or infant feeding practices that
are valued in a given culture, or the health care procedures or facilities for
health care that a given authority regards as desirable.
In the latter sense, norms may be used as criteria when evaluating health care, in
order to determine the degree of conformity with what is desirable, the average
length of stay of patients in hospital, etc. A distinction is sometimes made between
norms, defined as quantitative indexes based on research, and standards, which are
fixed arbitrarily.
NORMAL This term has three distinct meanings. Conceptual difficulties may arise if these
different meanings are not specified, or if the area of their overlap is not clearly
understood.
1. Within the usual range of variation in a given population or population group;
or frequently occurring in a given population or group. In this sense, "nor-
mal" is frequently defined as, "within a range extending from two standard
deviations below the mean to two standard deviations above the mean," or
"between specified (e.g., the 10th and 90th) percentiles of the distribution."
2. In good health, indicative or predictive of good health, or conducive to good
health. For a diagnostic or screening test, a "normal" result is one in a range
within which the probability of a specific disease is low (see also NORMAL LIM-
ITS).
3. (Of a distribution) Gaussian; see also NORMAL DISTRIBUTION.
NORMAL DISTRIBUTION (Syn: Gaussian distribution) The continuous frequency distri-
bution of infinite range represented by the equation
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
where $x$ is the abscissa, $f(x)$ is the ordinate, $\mu$ is the mean, $e$ is the base of the
natural logarithm, 2.718, and $\sigma$ the standard deviation.
[Figure: Normal distribution of heart rate, plotted against standard deviations from the mean. From Rimm et al., 1980.]
The properties of a normal distribution include the following: (1) It is a contin-
uous, symmetrical distribution; both tails extend to infinity; (2) the arithmetic mean,
mode, and median are identical; and (3) its shape is completely determined by the
mean and standard deviation.
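Because the shape is fixed by the mean and standard deviation alone, the Gaussian density and the share of values within a given number of standard deviations of the mean can be computed directly. The sketch below uses only the Python standard library; the function names are hypothetical. It illustrates why "within two standard deviations of the mean" is often used as a working definition of the usual range: that interval holds roughly 95% of a normally distributed population.

```python
import math


def normal_pdf(x, mean, sd):
    """Ordinate of the normal (Gaussian) curve at abscissa x."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))


def normal_cdf(x, mean, sd):
    """Proportion of the distribution lying at or below x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def fraction_within(k):
    """Proportion of values within k standard deviations of the mean."""
    return normal_cdf(k, 0.0, 1.0) - normal_cdf(-k, 0.0, 1.0)


within_one_sd = fraction_within(1)   # roughly 0.68
within_two_sd = fraction_within(2)   # roughly 0.95
```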
NORMAL LIMITS The limits of the "normal" range of a test or measurement, in the sense
of being indicative of or conducive to good health. One way to determine normal
limits is to compare the values obtained when the measurements are made in two
groups, one that is healthy and has been found to remain healthy, the other ill, or
subsequently found to become ill. The result may be two overlapping distributions,
as illustrated. Outside the area where the distributions overlap, a given value clearly
identifies the presence or absence of disease or some other manifestation of poor
health. If a value falls into the area of overlap, the individual may belong to either
the normal or the abnormal group. The choice of the normal limits depends upon
the relative importance attached to the identification of individuals as healthy or
unhealthy. See also FALSE NEGATIVE; FALSE POSITIVE; SENSITIVITY AND SPECIFICITY.
[Figure: Hypothetical distribution of normal and diabetic blood glucose levels. From Lilienfeld and Lilienfeld, 1979.]
NORMATIVE Pertaining to the normal, usual, accepted standard or values. See also NORM.
NOSOCOMIAL Arising while a patient is in a hospital or as a result of being in a hospital;
relating to a hospital; denoting a new disorder (unrelated to the patient's primary
condition) associated with being in a hospital.
NOSOCOMIAL INFECTION (Syn: hospital-acquired infection) An infection originating in a
medical facility, e.g., occurring in a patient in a hospital or other health care facility
in whom the infection was not present or incubating at the time of admission. In-
cludes infections acquired in the hospital but appearing after discharge: it also in-
cludes such infections among staff.
NOSOGRAPHY, NOSOLOGY Classification of ill persons into groups, whatever the criteria
for their classification, and agreement as to the boundaries of the groups, is called
"nosology." The assignment of names to each disease entity in the group results in
a nomenclature of disease entities, or nosography. (Faber K: Nosography in Modern
Internal Medicine. New York: Hoeber, 1923.)
NOTIFIABLE DISEASE A disease that, by statutory requirements, must be reported to the
public health authority in the pertinent jurisdiction when the diagnosis is made.
A disease deemed of sufficient importance to the public health to require that its
occurrence be reported to health authorities.
The reporting to public health authorities of communicable diseases is, unfortu-
nately, very incomplete. The reasons for this include diagnostic inexactitude; the
desire of patients and physicians to conceal the occurrence of conditions carrying a
social stigma, e.g., sexually transmitted diseases; and the indifference of physicians
to the usefulness of information about such diseases as hepatitis, influenza, and
measles. Yet notifications are extremely important. They provide the starting point
for investigations into the failure of preventive measures such as immunizations,
for tracing sources of infection, for finding common vehicles of infection, for de-
scribing the geographic clustering of infection, and for various other purposes, de-
pending upon the particular disease.
N.S., n.s. Abbreviation, usually written lower case, for not statistically significant.
NULL HYPOTHESIS (Syn: test hypothesis) The statistical hypothesis that one variable has
no association with another variable or set of variables, or that two or more popu-
lation distributions do not differ from one another. In simplest terms, the null
hypothesis states that the results observed in a study, experiment, or test are no
different from what might have occurred as a result of the operation of chance
alone.
NUMERATOR The upper portion of a fraction used to calculate a rate or a ratio.
NUMERICAL TAXONOMY The construction of homogeneous groupings or taxa using nu-
merical methods; allied to CLUSTER ANALYSIS.
O
OBSERVATIONAL STUDY (Syn: nonexperimental study, survey) Epidemiologic study in
situations where nature is allowed to take its course; changes or differences in one
characteristic are studied in relation to changes or differences in other(s), without
the intervention of the investigator.
OBSERVER VARIATION (ERROR) Variation (or error) due to failure of the observer to
measure or to identify a phenomenon accurately. Observer variation erodes scien-
tific credibility whenever it appears. Sir Thomas Browne in Pseudodoxia Epidemica
(1646), subtitled "Enquiries into very many commonly received tenents and pre-
sumed truths," recognized several sources of error: "the common infirmity of hu-
man nature, the erroneous disposition of the people, misapprehension, fallacy or
false deduction, credulity, obstinate adherence to authority, the belief in popular
conceits, the endeavours of Satan."
All observations are subject to variation. Discrepancies between repeated obser-
vations by the same observer and between different observers are to be expected;
these can be diminished but probably never absolutely eliminated.
Variation may arise from several sources. The observer may miss an abnormality
or think he has found one where none is present; a measurement or a test may
give incorrect results due to faulty technique or incorrect reading and recording of
the results; or the observer may misinterpret the information. Two varieties of ob-
server variation are interobserver variation, i.e., the amount observers vary from
one another when reporting on the same material, and intraobserver variation, the
amount one observer varies between observations when he reports more than once
on the same material.
OCCAM'S RAZOR (Syn: scientific parsimony) William of Occam's 14th century dictum
was that, "the assumptions introduced to explain a thing must not be multiplied
beyond necessity." This useful maxim does not contradict the conclusion that mul-
tiple causes operate in any system. The number of causes implicated depends on
the frame of reference of the investigator and on the scope of the inquiry.
OCCURRENCE (Syn: frequency) In epidemiology, a general term describing the fre-
quency of a disease or other attribute or event in a population without distinguish-
ing between INCIDENCE and PREVALENCE.
ODDS The ratio of the probability of occurrence of an event to that of nonoccurrence,
or the ratio of the probability that something is so, to the probability that it is not
so. If 60 smokers develop a chronic cough and 40 do not, the odds among these
100 smokers in favor of developing a cough are 60:40, or 1.5; this may be con-
trasted with the probability that these smokers will develop a cough, which is 60/
100 or 0.6.
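The contrast between odds and probability in the smokers example can be checked with a few lines of arithmetic. The function names below are hypothetical; the counts are those given in the entry.

```python
def odds_from_counts(events, non_events):
    """Odds in favor of the event: events to non-events."""
    return events / non_events


def probability_from_counts(events, non_events):
    """Probability of the event: events over everyone observed."""
    return events / (events + non_events)


def odds_from_probability(p):
    """Convert a probability p (0 <= p < 1) to odds, p / (1 - p)."""
    return p / (1 - p)


# 60 smokers develop a chronic cough and 40 do not.
cough_odds = odds_from_counts(60, 40)
cough_probability = probability_from_counts(60, 40)
```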
ODDS RATIO (Syn: cross-product ratio, relative odds) The ratio of two odds. The term
"odds" is defined differently according to the situation under discussion. Consider
the following notation for the distribution of a binary exposure and a disease in a
population or a sample.
              Exposed   Unexposed
Disease          a          b
No disease       c          d
The odds ratio (cross-product ratio) is ad/bc.
The exposure-odds ratio for a set of case control data is the ratio of the odds in
favor of exposure among the cases (a/b) to the odds in favor of exposure among
noncases (c/d). This reduces to ad/bc. With incident cases, unbiased subject selection,
and a "rare" disease (say, under 2% cumulative incidence rate over the study pe-
riod), ad/bc is an approximate estimate of the RISK RATIO. With incident cases, un-
biased subject selection, and DENSITY SAMPLING of controls, ad/bc is an estimate of
the ratio of the person-time incidence rates (FORCES OF MORBIDITY) in the exposed
and unexposed (no rarity assumption is required for this).
The disease-odds (rate-odds) ratio for a cohort or cross section is the ratio of the
odds in favor of disease among the exposed (a/c) to the odds in favor of disease
among the unexposed (b/d). This reduces to ad/bc and hence is equal to the expo-
sure-odds ratio for the cohort or cross section.
The prevalence-odds ratio refers to an odds ratio derived cross sectionally, as, for
example, an odds ratio derived from studies of prevalent (rather than incident)
cases.
The risk-odds ratio is the ratio of the odds in favor of getting disease, if exposed,
to the odds in favor of getting disease if not exposed. The odds ratio derived from
a cohort study is an estimate of this. See also CASE CONTROL STUDY.
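The statement that the exposure-odds ratio and the disease-odds ratio both reduce to the cross-product ad/bc can be verified numerically. In this sketch a, b, c, d follow the 2 × 2 layout of the ODDS RATIO entry (a: exposed with disease, b: unexposed with disease, c: exposed without disease, d: unexposed without disease). The function names and cell counts are hypothetical.

```python
def cross_product_ratio(a, b, c, d):
    """Odds ratio computed as ad / bc."""
    return (a * d) / (b * c)


def exposure_odds_ratio(a, b, c, d):
    """Odds of exposure among cases (a/b) over odds among noncases (c/d)."""
    return (a / b) / (c / d)


def disease_odds_ratio(a, b, c, d):
    """Odds of disease among exposed (a/c) over odds among unexposed (b/d)."""
    return (a / c) / (b / d)


# Hypothetical table: a=30, b=10, c=20, d=40.
table = (30, 10, 20, 40)
ratios = (
    cross_product_ratio(*table),
    exposure_odds_ratio(*table),
    disease_odds_ratio(*table),
)
```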
ONE-TAIL TEST A statistical significance test based on the assumption that the data have
only one possible direction of variability.
OPERATIONAL RESEARCH The systematic study, by observation and experiment, of the
working of a system, e.g., health services, with a view to improvement.
OPERATIONS RESEARCH
1. The fitting of models to data, or the designing of models.
2. Synonym for OPERATIONAL RESEARCH.
OPPORTUNISTIC INFECTION Infection with organism(s) that are normally innocuous, e.g.,
commensals in the human, but become pathogenic when the body's immunologic
defenses are compromised, as happens in the acquired immunodeficiency syn-
drome (AIDS).
ORDINAL SCALE See MEASUREMENT SCALE.
ORDINATE The distance of a point, P, from the horizontal or x axis of a graph, mea-
sured along the vertical or y axis. See also ABSCISSA; GRAPH.
OUTCOMES All the possible results that may stem from exposure to a causal factor, or
from preventive or therapeutic interventions; all identified changes in health status
arising as a consequence of the handling of a health problem. See also CAUSALITY;
CAUSATION OF DISEASE, FACTORS IN.
OUTLIERS Observations differing so widely from the rest of the data as to lead one to
suspect that a gross error may have been committed, or suggesting that these values
come from a different population.
OUTBREAK (Syn: epidemic) Sometimes the preferred word, as it may escape sensation-
alism associated with the word epidemic. Alternatively, a localized as opposed to
generalized epidemic.
OUTPUT The immediate result of professional or institutional health care activities, usu-
ally expressed as units of service, e.g., patient hospital days, outpatient visits, labo-
ratory tests performed.
OVERMATCHING A situation that may arise when groups are matched. Several varieties
can be distinguished:
1. The matching procedure partially or completely obscures evidence of a true
causal association between the independent and dependent variables. Over-
matching may occur if the matching variable is involved in, or is closely con-
nected with, the mechanism whereby the independent variable affects the de-
pendent variable. The matching variable may be an intermediate cause in the
causal chain or it may be strongly affected by, or a consequence of, such an
intermediate cause.
2. The matching procedure uses one or more unnecessary matching variables,
e.g., variables that have no causal effect or influence on the dependent vari-
able, and hence cannot confound the relationship between the independent
and dependent variables.
3. The matching process is unduly elaborate, involving the use of numerous
matching variables and/or insisting on very close similarity with respect to spe-
cific matching variables. This leads to difficulty in finding suitable controls.
See also MATCHING.
OVERWINTERING See VECTOR-BORNE INFECTION.

P
P, P (PROBABILITY) VALUE The probability that a test statistic would be as extreme as
or more extreme than observed if the null hypothesis were true. The letter P, fol-
lowed by the abbreviation n.s. (not significant) or by the symbol < (less than) and a
decimal notation such as 0.01, 0.05, is a statement of the probability that the differ-
ence observed could have occurred by chance, if the groups are really alike, i.e.,
under the NULL HYPOTHESIS.
Investigators may arbitrarily set their own significance levels, but in most biomed-
ical and epidemiologic work, a study result whose probability value is less than 5%
(P<0.05) or 1% (P<0.01) is considered sufficiently unlikely to have occurred by
chance to justify the designation "statistically significant." See also STATISTICAL SIG-
NIFICANCE.
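As a small worked case of "as extreme as or more extreme than observed," the sketch below computes an exact one-sided P value for a count of successes under a binomial null hypothesis. The choice of a binomial test, the function name, and the example (9 successes in 10 trials against a null probability of 0.5) are assumptions made for illustration; the entry itself does not prescribe any particular test.

```python
from math import comb


def one_sided_binomial_p_value(successes, trials, null_probability=0.5):
    """Probability, under the null hypothesis, of seeing at least
    `successes` successes in `trials` independent trials."""
    return sum(
        comb(trials, k)
        * null_probability ** k
        * (1 - null_probability) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical result: 9 successes in 10 trials, null probability 0.5.
p_nine = one_sided_binomial_p_value(9, 10)    # 11/1024, below 0.05
p_eight = one_sided_binomial_p_value(8, 10)   # 56/1024, above 0.05
```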
PAIRED SAMPLES In a CLINICAL TRIAL, pairs of subject patients may be studied. One
member of each pair receives the experimental regimen, and the other receives a
suitably designated control regimen. Pairing should be based on a prognostic vari-
able such as age.
Pairing may similarly be used in a CASE CONTROL STUDY or in a COHORT STUDY.
See also MATCHING.
PANDEMIC An epidemic occurring over a very wide area and usually affecting a large
proportion of the population.
PANEL STUDY A combination of cross-sectional and cohort methods, in which the inves-
tigator conducts a series of cross-sectional studies of the same individuals or study
sample. This method of study permits changes in one variable to be related to
changes in other variables. See also NESTED CASE-CONTROL STUDY.
PANUM, PETER LUDWIG (1820-1885) A Danish physician who observed firsthand an
epidemic of measles in the Faroe Islands in 1846. This was the first outbreak there
for many years, and from the epidemic pattern, Panum deduced some basic, pre-
viously unknown details about the method of spread, and incubation period, the
lasting immunity that followed infection, and the relationship between age and se-
verity of infection.
PARADIGM A typical example, a pattern of thought or conceptualization; an overall way
of regarding phenomena, within which scientists normally work. A paradigm may
dictate what form of explanation will be found acceptable, but a science may change
paradigms. In many contexts in which it is used, the term is both ambiguous and
vague.¹ The word is often used loosely as a synonym for "factor" or "variable."
¹ Kuhn T. The Structure of Scientific Revolutions. Chicago: University of Chicago Press, 1962.
PARAMETER In mathematics, a constant in a formula or model; in statistics and epide-
miology, a measurable characteristic of a population.
PARAMETRIC TEST A statistical test that depends upon assumption(s) about the distri-
bution of the data, e.g., that these are normally distributed.
PARASITE An animal or vegetable organism that lives on or in another and derives its
nourishment therefrom. An obligate parasite is one that cannot lead an indepen-
dent nonparasitic existence. A facultative parasite is one that is capable of either
parasitic or independent existence.
PARASITE COUNT See WORM COUNT.
PARASITE DENSITY The collective degree of parasitemia in a population, calculated by
the use of either the geometric mean or the weighted average of the individual
parasite counts; e.g., by using a frequency distribution based on a geometric pro-
gression.
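The geometric mean mentioned in the PARASITE DENSITY entry can be computed as the exponential of the mean of the logarithms of the individual counts. The sketch below assumes every count is positive (how zero counts are handled is a separate choice the entry does not address); the function name and the counts are hypothetical.

```python
import math


def geometric_mean_count(parasite_counts):
    """Geometric mean of individual parasite counts (all must be positive)."""
    if not parasite_counts:
        raise ValueError("at least one count is required")
    if any(count <= 0 for count in parasite_counts):
        raise ValueError("geometric mean is undefined for zero or negative counts")
    mean_log = sum(math.log(count) for count in parasite_counts) / len(parasite_counts)
    return math.exp(mean_log)


# Hypothetical counts spanning two orders of magnitude: the geometric mean
# sits at 100, whereas the arithmetic mean would be 370.
density = geometric_mean_count([10, 100, 1000])
```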
PARATENIC HOST (Syn: transport host) A second, third, or subsequent intermediate host
of a parasite, in which the parasite does not undergo any development or replica-
tion, but remains, usually encysted, until the paratenic host is ingested by the defin-
itive host of the parasite.
PARITY The status of a woman as regards the fact of having borne viable children. The
number of full-term children previously borne by a woman, excluding miscarriages
or abortions in early pregnancy, but including stillbirths.
PARTICULARIZATION A method of analysis opposite to generalization or abstraction. It
focuses on the specificity of a number of facts and illustrates an issue through the
use of example.
PASSAGE The transfer of micro-organisms from human to animal host(s) either directly
or via laboratory culture; in the laboratory, this procedure is used to establish the
Henle-Koch postulates.
PASSENGER VARIABLE A variable that varies systematically with the dependent variable
under study, without being causally related to it; a third (explanatory) variable, the
common cause of both the dependent and the passenger variable, "explains" or
accounts for their association.
PASSIVE SMOKING See INVOLUNTARY SMOKING.
PASTEUR, LOUIS (1822-1895) A French chemist and biologist. One of the founders of
bacteriology and therefore an important figure also in epidemiology. Starting in
chemistry, he worked out the biological basis for fermentation, and then went on
to make many important discoveries in bacteriology, notably vaccines against an-
thrax and rabies. He is, of course, eponymously honored by the word "pasteuriza-
tion."
PATH ANALYSIS A mode of analysis involving assumptions about the direction of causal
relationships between linked sequences and configurations of variables. This per-
mits the analyst to construct and test the appropriateness of alternative models (in
the form of a path diagram) of the causal relations that may exist within the array
of variables included in the finite system studied. Identification of the less probable
sequences of causal pathways may permit them to be eliminated from further con-
sideration.
PATHOGEN Organism capable of causing disease (literally, causing a pathological pro-
cess).
PATHOGENESIS The postulated mechanisms by which the etiologic agent produces dis-
ease. The difference between ETIOLOGY and pathogenesis should be noted: The
etiology of a disease or disability consists of the postulated causes that initiate the
pathogenetic mechanisms; control of these causes might lead to prevention of the
disease.
PATHOGENICITY The property of an organism that determines the extent to which overt
disease is produced in an infected population, or the power of an organism to
produce disease. Also used to describe comparable properties of toxic chemicals,
etc. Pathogenicity of infectious agents is measured by the ratio of the number of
persons developing clinical illness to the number exposed to infection. See also VIR-
ULENCE, with which pathogenicity is sometimes confused.
PEARSON, KARL (1857-1936) British mathematician, biologist and geneticist. Pearson
was a pupil of Francis Galton, who led the science of statistics further into applica-
tions in biology and genetics. He founded the journal Biometrika, coined the word
"biometry," and taught the next generation of statistician/epidemiologists, including
Major Greenwood, Raymond Pearl, and others.
PEARSON'S PRODUCT MOMENT CORRELATION See CORRELATION COEFFICIENT.
PEDIGREE A diagram showing the ancestral relationships and transmission of genetic
traits over several generations of a family.
PEER REVIEW Process of review of research proposals, manuscripts submitted for pub-
lication, abstracts submitted for presentation at scientific meetings, whereby these
are judged for scientific and technical merit by other scientists in the same field.
Also refers to review of clinical performance, when it is a form of medical audit.
PENETRANCE The frequency, expressed as a percentage, with which individuals of a
given genotype manifest at least some degree of a specific mutant phenotype as-
sociated with a trait. See also GENETIC PENETRANCE.
PERCEIVED NEED A felt need. The term usually refers to need for health care that is
felt by the person or community concerned, but which may not be perceived by
health professionals.
PERCENTILE The set of divisions that produce exactly 100 equal parts in a series of
continuous values, such as children's heights or weights. Thus a child above the
90th percentile has a greater value for height or weight than over 90% of all in the
series.
PERINATAL MORTALITY Literally, mortality around the time of birth. Conventionally this
time is limited to the period between 28 weeks gestation and one week postnatal.
However, as the following discussion indicates, other factors, especially the weight
of the fetus, should be considered. The Ninth (1975) Revision of the International
Classification of Diseases includes the following:
Perinatal mortality statistics
It is recommended that national perinatal statistics should include all fetuses and
infants delivered weighing at least 500 g (or, when birth weight is unavailable, the
corresponding gestational age (22 weeks) or body length (25 cm crown-heel)),
whether alive or dead. It is recognized that legal requirements in many countries
may set different criteria for registration purposes, but it is hoped that countries
will arrange the registration or reporting procedures in such a way that the events
required for inclusion in the statistics can be identified easily. It is further
recommended that less mature fetuses and infants should be excluded from perinatal
statistics unless there are legal or other valid reasons to the contrary.
It is recommended above that national statistics would include fetuses and infants
weighing between 500 g and 1000 g both for their inherent value and because their
inclusion improves the completeness of reporting at 1000 g and over.
Inclusion of this group of very immature births, however, disrupts international
comparisons because of differences in national practices concerning their
registration. Another factor affecting international comparisons is that all live-born
infants, irrespective of birth weight, are included in the calculation of rates,
whereas some lower limit of maturity is applied to infants born dead.
In order to eliminate these factors, it is recommended that countries should
present, solely for international comparisons, "standard perinatal statistics" in which
both the numerator and denominator of all rates are restricted to fetuses and in-
fants weighing 1000 g or more (or, where birth weight is unavailable, the corre-
sponding gestational age (28 weeks) or body length (25 cm crown-heel)).
PERINATAL MORTALITY RATE In most industrially developed nations, this is defined as
\[
\text{Perinatal mortality rate} = \frac{\text{Fetal deaths (28 weeks+ of gestation)} + \text{postnatal deaths (first week)}}{\text{Fetal deaths (28 weeks+ of gestation)} + \text{live births}} \times 1000
\]
The World Health Organization's definition, more appropriate in nations with less
well-established vital records, is
\[
\text{Perinatal mortality rate} = \frac{\text{Late fetal deaths (28 weeks+ of gestation)} + \text{postnatal deaths (first week)}}{\text{Live births in a year}} \times 1000
\]
Note the differences in denominator of the perinatal mortality rate as defined by
WHO and in industrially developed nations. This makes international comparison
difficult. The WHO Expert Committee on the Prevention of Perinatal Mortality
and Morbidity (1970) recommended a more precise formulation: "Late fetal and
early neonatal deaths weighing over 1000 g at birth expressed as a ratio per 1000
live births weighing over 1000 g at birth."
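The two definitions above differ only in the denominator. A minimal sketch of both, assuming the counts are already restricted to the same year and weight or gestation criteria (the function name, parameter names, and example counts are illustrative, not from the dictionary):

```python
def perinatal_mortality_rate(fetal_deaths, first_week_deaths, live_births,
                             who_definition=False):
    """Perinatal deaths per 1000 births.

    fetal_deaths: fetal deaths at 28 weeks+ of gestation
    first_week_deaths: postnatal deaths in the first week
    who_definition: if True, the denominator is live births only (WHO form);
                    otherwise it is fetal deaths + live births.
    """
    numerator = fetal_deaths + first_week_deaths
    denominator = live_births if who_definition else fetal_deaths + live_births
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 1000.0 * numerator / denominator


# Hypothetical counts, chosen only to show the effect of the denominator.
rate_developed = perinatal_mortality_rate(80, 70, 9920)
rate_who = perinatal_mortality_rate(80, 70, 9920, who_definition=True)
```

Because the WHO denominator omits fetal deaths, the WHO rate is always at least as large as the other for the same counts, which is one reason the two are not directly comparable.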
PERIODIC (MEDICAL) EXAMINATIONS Assessment of health status conducted at
predetermined intervals, e.g., annually or at specified milestones in life such as infancy,
school entry, preemployment, or preretirement. This form of medical examination
generally follows a formal protocol, e.g., employing a set of structured questions
and/or a predetermined set of laboratory tests.
PERIOD OF COMMUNICABILITY See COMMUNICABLE PERIOD.
PERMISSIBLE EXPOSURE LIMIT (PEL) An occupational health standard to safeguard
employees against dangerous chemicals or contaminants in the workplace. See SAFETY
STANDARDS.
PERSONAL HEALTH CARE Those services to individuals that are performed on a one-to-one
basis by a health care worker for the purpose of maintaining or restoring health.
PERSONAL MONITORING DEVICE An instrument attached to a person to measure the
exposure of that person to hazardous substance(s).
PERSON-TIME A measurement combining persons and time, used as denominator in
person-time incidence and mortality rates. It is the sum of individual units of time
that the persons in the study population have been exposed to the condition of
interest. A variant is person-distance, e.g., as in passenger-kilometers. The most
frequently used person-time is person-years. With this approach, each subject
contributes only as many years of observation to the population at risk as he is actually
observed; if he leaves after one year, he contributes one person-year; if after ten,
ten person-years. The method can be used to measure incidence over extended and
variable time periods.
PERSON-TIME INCIDENCE RATE (Syn: interval incidence density) A measure of the
incidence rate of an event, e.g., a disease or death, in a population at risk, given by
\[
\frac{\text{Number of events occurring during the interval}}{\text{Number of person-time units at risk observed during the interval}}
\]
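As an illustration of the person-time approach, the following sketch sums each subject's observed follow-up into person-years and divides the event count by that total. The function name and the follow-up data are hypothetical, chosen only to show the arithmetic.

```python
def person_time_incidence_rate(events, follow_up_times):
    """Events per unit of person-time (e.g., per person-year).

    follow_up_times: time each subject was actually observed at risk,
    all in the same unit.
    """
    person_time = sum(follow_up_times)
    if person_time <= 0:
        raise ValueError("total person-time must be positive")
    return events / person_time


# Hypothetical cohort: four subjects observed for 1, 10, 4 and 5 years
# contribute 20 person-years in total; 2 events were observed.
rate = person_time_incidence_rate(2, [1, 10, 4, 5])
rate_per_1000_person_years = 1000 * rate
```

A subject who leaves after one year adds one person-year to the denominator and one who stays ten years adds ten, as the PERSON-TIME entry describes.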
PERSON-TO-PERSON SPREAD OF DISEASE (Syn: prosodemic) See TRANSMISSION OF INFECTION.
PERSON-YEARS See PERSON-TIME.
PETTY, WILLIAM (1623-1687) A member of the same circle as John Graunt, he is equally
recognized as a pioneer in vital statistics and economics. His ideas and concepts of
lifetime earning capability are contained in Political Arithmetic (London, 1691).
PHARMACOEPIDEMIOLOGY The study of the distribution and determinants of drug-related
events in populations, and the application of this study to efficacious drug
treatment.
PHYSICIAN (Syn: medical practitioner, doctor) Professional person qualified by
education and authorized by law to practice medicine.
PIE CHART A circular diagram divided into segments, each representing a category or
subset of data. The amount for each category is proportional to the angle sub-
tended at the center of the circle and hence to the area of the sector.
When several pie charts are used to describe several populations, the area of each
circle is proportional to the size of the population it represents.
PILOT INVESTIGATION, STUDY A small-scale test of the methods and procedures to be
used on a larger scale if the pilot study demonstrates that these methods and
procedures can work.
PLACEBO, PLACEBO EFFECT An inert medication or procedure. The placebo effect
(usually but not necessarily beneficial) is attributable to the expectation that the
regimen will have an effect, i.e., the effect is due to the power of suggestion. See
also HALO EFFECT.
POINT SOURCE EPIDEMIC See EPIDEMIC, COMMON SOURCE.
POISSON DISTRIBUTION A distribution function used to describe the occurrence of rare
events or to describe the sampling distribution of isolated counts in a continuum of
time or space (e.g., sample counts of radioactive disintegration per minute). The
number of events has a Poisson distribution with parameter \lambda (lambda) if the
probability of observing k events (k = 0, 1, ...) is equal to
\[
P(k) = \frac{e^{-\lambda} \lambda^{k}}{k!}
\]
where e is the base of natural logarithm, 2.7183. . . . The mean and variance of
the distribution are both equal to \lambda. This distribution is used in modeling
person-time incidence rates.
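A short sketch of the Poisson probability, assuming the standard form exp(-lambda) * lambda^k / k! for k = 0, 1, ... (the function name and the choice lambda = 2 are illustrative). It also checks numerically that the mean and the variance both equal lambda.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the parameter is lam."""
    if k < 0 or lam <= 0:
        raise ValueError("k must be >= 0 and lam must be > 0")
    return math.exp(-lam) * lam ** k / math.factorial(k)


lam = 2.0
# Truncate the infinite sum at k = 60; the remaining tail is negligible here.
probs = [poisson_pmf(k, lam) for k in range(61)]
mean = sum(k * p for k, p in enumerate(probs))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
```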
POLLUTION Any undesirable modification of air, water, or food by substance(s) that are
toxic or may have adverse effects on health or that are offensive though not nec-
essarily harmful to health.
POLYGENIC INHERITANCE The transmission of a phenotypic trait whose expression
depends upon the additive effect of a number of genes.
PONDERAL INDEX The anthropometric index of body mass. Defined as height divided
by the cube root of the body weight. The BODY MASS INDEX is generally regarded as
a better index of body mass.
[Figure: Pie charts of age structure of the population in developing and developed
countries, 1980 and 2000, with the 65+ years segment marked.
(Figures outside the circle show the population in millions.)
From World Health Organization.]
POPULATION
1. All the inhabitants of a given country or area considered together; the number
of inhabitants of a given country or area.
2. (In sampling) The whole collection of units from which a sample may be drawn;
not necessarily a population of persons; the units may be institutions, records,
or events. The sample is intended to give results that are representative of the
whole population.
POPULATION ATTRIBUTABLE RISK (PAR) This term is used by many epidemiologists(1,2)
in preference to the terms "attributable fraction (population)" or "etiologic fraction
(population)." It is the incidence of a disease in a population that is associated with
(attributable to) exposure to the risk factor. It is often expressed as a percentage.
It is calculated by similar methods to those described for attributable fraction
(population), i.e.,
\[
\mathrm{PAR\%} = \frac{P_e (I_e - I_u)}{P_t \times I_t} \times 100
\]
where P_e = number of persons exposed
P_t = persons in the population
I_e = incidence rate among the exposed
I_u = incidence rate among the unexposed
I_t = incidence rate for the total population
In a case-control study, PAR can be estimated in various ways; Cole and
MacMahon(3) give the following formula:
\[
\mathrm{PAR\%} = \frac{P_c (RR - 1)}{1 + P_c (RR - 1)} \times 100
\]
where P_c = proportion of controls exposed
RR = relative risk for exposed, compared to risk of 1 for the unexposed.
1 MacMahon B, Pugh TF: Epidemiology: Principles and Methods. Boston: Little, Brown, 1970.
2 Fletcher RH, Fletcher SW, Wagner EH: Clinical Epidemiology--the Essentials. Baltimore:
Williams & Wilkins, 1982.
3 Cole P, MacMahon B: Attributable risk percent in case-control studies. Br J Prev Soc Med
25:242-244, 1971.
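The two PAR% calculations can be sketched as follows, assuming the cohort form PAR% = P_e(I_e - I_u) / (P_t x I_t) x 100 and the case-control form PAR% = P_c(RR - 1) / (1 + P_c(RR - 1)) x 100. Function names, parameter names, and the example figures are illustrative and not taken from the cited sources.

```python
def par_percent_cohort(n_exposed, n_total, i_exposed, i_unexposed, i_total):
    """PAR% from cohort data: excess cases among the exposed / all cases x 100."""
    return 100.0 * n_exposed * (i_exposed - i_unexposed) / (n_total * i_total)


def par_percent_case_control(p_controls_exposed, relative_risk):
    """PAR% from case-control data, using the proportion of controls exposed."""
    excess = p_controls_exposed * (relative_risk - 1.0)
    return 100.0 * excess / (1.0 + excess)


# Hypothetical population of 1000 with 300 exposed; incidence 0.02 among the
# exposed and 0.01 among the unexposed, so 6 + 7 = 13 cases overall.
i_total = (300 * 0.02 + 700 * 0.01) / 1000
cohort_value = par_percent_cohort(300, 1000, 0.02, 0.01, i_total)
# Same situation seen through exposure prevalence 0.3 and relative risk 2.
case_control_value = par_percent_case_control(0.3, 2.0)
```

With these figures both functions give the same answer (3 of 13 cases attributable), which is a useful consistency check when the proportion of controls exposed approximates exposure prevalence in the population.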
POPULATION ATTRIBUTABLE RISK PERCENT This is the attributable fraction in the
population, expressed as a percentage. See also ATTRIBUTABLE FRACTION (POPULATION).
POPULATION BASED Pertaining to a general population defined by geopolitical
boundaries; this population is the denominator and/or the sampling frame.
POPULATION DYNAMICS Changes in the structure of a population; loosely used as a
synonym for demography.
POPULATION EXCESS RATE A measure of the amount of disease associated with exposure
to a putative cause of the disease in the population. It is the difference between the
rates of disease in the entire population and among the nonexposed.
POPULATION MEDICINE See COMMUNITY MEDICINE.
POPULATION MOMENTUM In a growing population, the phenomenon of continuing
population growth beyond the time when replacement level fertility has been achieved,
because of the increasing size of child-bearing and younger age cohorts, resulting
[...]
POPULATION PYRAMID A graphic presentation of the age and sex composition of the
population. The population pyramid is constructed by computing the percentage
distribution of a population, simultaneously cross-classified by sex and age. The
percentage that each female age group is of the total is plotted on the right and the
corresponding percentages for males are plotted on the left. A population pyramid
is intended to provide a quick overall comprehension of age and sex structure in
the population. A population whose pyramid has a broad base and narrow apex
may be identified as a high fertility population. Changing shape over time reflects
the changing composition of the population, associated with changes in fertility and
mortality at each age.
Since the figure is two dimensional, the word "pyramid" is incorrectly used, but
the more accurate word "profile" has never caught on.
Top: High fertility, low proportion survive to old age (Mexico).
Bottom: Low fertility, high proportion survive to old age (Sweden).
From Last, 1980.
POPULATION, STUDY The group selected for investigation.
POPULATION, TARGET The group from which a study population is selected.
POSTERIOR ODDS, POSTERIOR PROBABILITY Probability calculated after reference to
results of a study. See BAYES' THEOREM.
POST-MARKETING SURVEILLANCE A procedure implemented after a drug has been
licensed for public use, designed to provide information on the actual use of the drug
for a given indication, and on the occurrence of side effects, adverse reactions, etc.
A method for epidemiologic study of adverse drug reactions.
POSTNEONATAL MORTALITY RATE The number of infant deaths between 28 days and
one year of age in a given year per 1000 live births in that year. It is an important
rate to monitor in developing countries where older infants frequently die of
infections and malnutrition.
[Figure: Population pyramids, Mexico 1970 and Sweden 1970. Horizontal axis: percentage
(males, females); vertical axis: birth cohort.]
POTENCY The strength of a particular drug, toxin, or hazard; the ratio of the dose of a
standard amount required to elicit a specific response, to the dose of the test agent
that elicits the same response.
POTENTIAL YEARS OF LIFE LOST (PYLL) A measure of the relative impact of various
diseases and lethal forces on society. PYLL highlights the loss to society as a result
of youthful or early deaths. The figure for potential years of life lost due to a
particular cause is the sum, over all persons dying from that cause, of the years that
these persons would have lived had they experienced normal life expectation. The
concept derives from Petty's Political Arithmetic (1687) and is elaborated upon in
Dublin and Lotka's Money Value of a Man (1930).
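A minimal sketch of the PYLL sum. It assumes, purely for illustration, that "normal life expectation" is approximated by a single fixed reference age (75 here); the entry itself does not specify how the expectation is chosen, and age-specific life-table values could be substituted.

```python
def potential_years_of_life_lost(ages_at_death, reference_age=75.0):
    """Sum of years lost before reference_age over all deaths from one cause.

    reference_age is an assumed stand-in for normal life expectation; deaths
    at or after that age contribute zero.
    """
    return sum(max(0.0, reference_age - age) for age in ages_at_death)


# Hypothetical deaths from one cause at ages 30, 60 and 80:
# 45 + 15 + 0 years lost.
pyll = potential_years_of_life_lost([30, 60, 80])
```

Early deaths dominate the total, which is why the measure highlights causes that kill the young.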
POWER A characteristic of a statistical hypothesis test, denoting the probability that the
null hypothesis will be rejected if it is indeed false. It is equal to 1 minus the
probability of type II error. See also ERROR, RESOLUTION. Resolving power is the
comparable property of individual measurements.
PRAGMATIC STUDY A study whose aim is to improve health status or health care of a
specified population, provide a basis for decisions about health care, or evaluate
previous actions. See also EXPLANATORY STUDY; COMMUNITY DIAGNOSIS; PROGRAM
REVIEW.
PRECISION
1. The quality of being sharply defined or stated. One measure of precision is
the number of distinguishable alternatives from which a measurement was
selected, sometimes indicated by the number of significant digits in the
measurement. Another measure of precision is the standard error of
measurement, the standard deviation of a series of replicate determinations of the
same quantity. Precision does not imply accuracy. See also MEASUREMENT,
PROBLEMS WITH TERMINOLOGY.
2. In statistics, precision is defined as the inverse of the variance of a
measurement or estimate.
PRECURSOR An early stage in the course of a disease, or a condition or state preceding
pathological onset of a disease; sometimes detectable by SCREENING; may be
identified as a RISK MARKER.
PREDICTIVE VALUE In screening and diagnostic tests, the probability that a person with
a positive test is a true positive (i.e., does have the disease) is referred to as the
"predictive value of a positive test." The predictive value of a negative test is the
probability that a person with a negative test does not have the disease. The
predictive value of a screening test is determined by the sensitivity and specificity of
the test, and by the prevalence of the condition for which the test is used. See also
SCREENING; SENSITIVITY AND SPECIFICITY.
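To show how predictive value depends on sensitivity, specificity, and prevalence, the following sketch derives both predictive values from those three quantities by the usual application of Bayes' theorem. The function name and the example figures are illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value).

    All three inputs are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test: sensitivity 0.90, specificity 0.95.
ppv_rare, npv_rare = predictive_values(0.90, 0.95, 0.01)      # condition is rare
ppv_common, npv_common = predictive_values(0.90, 0.95, 0.20)  # condition is common
```

With the same test, the predictive value of a positive result is much lower when the condition is rare, since false positives from the large disease-free group outnumber true positives.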
PREMUNITION A term used mainly in the epidemiology of parasitic diseases, especially
malaria. It signifies a state of resistance, in a host harboring a parasite, to
superinfection by a parasite of the same species. This state is dependent on the continued
survival of parasites in the body and disappears after their elimination. It may be
complete or partial.
PREPATENT PERIOD In parasitology, the period equivalent to the incubation period of
microbial infections; the corresponding phase may be biologically different from
microbial multiplication when the invading organism is a multicellular parasite that
undergoes developmental stages in the host.
PRESCRIPTIVE SCREENING See SCREENING.
PREVALENCE The number of instances of a given disease or other condition in a given
population at a designated time; sometimes used to mean PREVALENCE RATE. When
used without qualification the term usually refers to the situation at a specified
point in time (point prevalence).
prevalence, annual (An occasionally used index) The total number of persons
with the disease or attribute at any time during a year. It includes cases of the
disease arising before but extending into or through the year as well as those having
their inception during the year.
prevalence, lifetime The total number of persons known to have had the disease
or attribute for at least part of their life.
prevalence, period The total number of persons known to have had the disease
or attribute at any time during a specified period.
prevalence, point The number of persons with a disease or an attribute at a
specified point in time.
PREVALENCE RATE (RATIO) The total number of all individuals who have an attribute
or disease at a particular time (or during a particular period) divided by the
population at risk of having the attribute or disease at this point in time or midway
through the period. A problem may arise with calculating period prevalence rates
because of the difficulty of defining the most appropriate denominator. See also
PREVALENCE.
PREVALENCE STUDY See CROSS-SECTIONAL STUDY.
PREVENTABLE FRACTION (population) In a situation in which exposure to a given factor
is believed to protect against a disease (or other outcome), the preventable fraction
in the population is the proportion of the disease (in the population) that would be
prevented if the whole population were exposed to the factor. This value must be
interpreted with caution, as part or all of the apparent protective effect may be due
to other factors associated with the apparent protective factor.
In a study of a total population, the preventable fraction (population) is
computed as
\[
\frac{I_p - I_e}{I_p}
\]
where I_p is the incidence rate of the disease (or other outcome)
in the population, and I_e is the incidence rate in the exposed persons in the
population.
PREVENTED FRACTION (population) In a situation in which exposure to a given factor is
believed to protect against a disease (or other outcome), the prevented fraction is
the proportion of the hypothetical total load of disease (in the population) that has
been prevented by exposure to the factor. This value must be interpreted with
caution, as part or all of the apparent protective effect may be due to other factors
associated with the apparent protective factor.
In a study of a total population the prevented fraction is computed as
\[
\frac{I_u - I_p}{I_u}
\]
where I_p is the rate of the disease in the population, and I_u is the rate among people
unexposed to the factor.
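A sketch contrasting the two fractions, assuming the forms (I_p - I_e) / I_p for the preventable fraction and (I_u - I_p) / I_u for the prevented fraction, where I_p, I_e, and I_u are the rates in the whole population, the exposed, and the unexposed. Names and figures are illustrative only.

```python
def preventable_fraction(rate_population, rate_exposed):
    """Share of current disease that would be avoided if everyone were exposed."""
    return (rate_population - rate_exposed) / rate_population


def prevented_fraction(rate_population, rate_unexposed):
    """Share of the hypothetical disease load already avoided through exposure."""
    return (rate_unexposed - rate_population) / rate_unexposed


# Hypothetical protective factor: rate 0.01 in the exposed, 0.02 in the
# unexposed, and half the population exposed.
rate_exposed, rate_unexposed = 0.01, 0.02
rate_population = 0.5 * rate_exposed + 0.5 * rate_unexposed
still_preventable = preventable_fraction(rate_population, rate_exposed)
already_prevented = prevented_fraction(rate_population, rate_unexposed)
```

The two fractions use different baselines (the current rate versus the rate with no exposure at all), so they need not sum to any fixed value.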
PREVENTION The goals of medicine are to promote health, to preserve health, to restore
health when it is impaired, and to minimize suffering and distress. These goals are
embodied in the word "prevention," which is easiest to define in the context of
levels, customarily called primary, secondary, and tertiary prevention. Authorities
on PREVENTIVE MEDICINE do not agree on the precise boundaries between these
levels, nor on how many levels can be distinguished, but the differences of opinion
are semantic rather than substantive.
An epidemiologic interpretation of the distinction between primary and secondary
prevention is that primary prevention is aimed at reducing incidence of disease
and other departures from good health, secondary prevention aims to reduce
prevalence by shortening the duration, and tertiary prevention is aimed at reducing
complications.
Primary prevention can be defined as the protection of health by personal and
community-wide effects, e.g., preserving good nutritional status, physical fitness,
and emotional well-being, immunizing against infectious diseases, and making the
environment safe. (But see also HEALTH PROMOTION.)
Secondary prevention can be defined as the measures available to individuals and
populations for the early detection and prompt and effective intervention to correct
departures from good health.
Tertiary prevention consists of the measures available to reduce or eliminate long-
term impairments and disabilities, minimize suffering caused by existing departures
from good health, and to promote the patient's adjustment to irremediable condi-
tions. This extends the concept of prevention into the field of rehabilitation.
PREVENTIVE MEDICINE The application of preventive measures by clinical practitioners.
A specialized field of medical practice composed of distinct disciplines that utilize
skills focusing on the health of defined populations in order to promote and maintain
health and well-being and prevent disease, disability, and premature death.
In addition to the knowledge of basic and clinical sciences and the skills common
to all physicians, the distinctive aspects of preventive medicine include knowledge
of and competence in biostatistics, epidemiology, administration including
planning, organization, management, financing, and evaluation of health programs;
environmental health; application of social and behavioral factors in health and
disease; and the application of primary, secondary, and tertiary prevention measures
within clinical medicine. (The above is the definition and description of the field
that has been adopted by the American College of Preventive Medicine; for
completeness, at least two other items ought to be added, i.e., health education and
nutrition).
PRIMARY CASE The individual who introduces the disease into the family or group un-
der study. Not necessarily the first diagnosed case in a family or group. See also
INDEX CASE.
PRIMARY HEALTH CARE
1. Health care that begins at the time of first encounter between a patient and a
provider of health care. An alternative term is primary medical care.
2. The WHO definition of primary health care includes much more: Primary
health care is essential health care made accessible at a cost the country and
the community can afford, with methods that are practical, scientifically sound,
and socially acceptable. Everyone in the community should have access to it,
and everyone should be involved in it. Related sectors should also be involved
in it in addition to the health sector. At the very least it should include
education of the community on the health problems prevalent and on methods of
preventing health problems from arising or of controlling them; the
promotion of adequate supplies of food and of proper nutrition; sufficient safe water
and basic sanitation; maternal and child health care including family planning;
the prevention and control of locally endemic diseases; immunization against
the main infectious diseases; appropriate treatment of common diseases and
injuries; and the provision of essential drugs. (From Glossary of Terms Used in
the Health for All Series No. 1-8. Geneva: WHO, 1984.)
PRINCIPAL COMPONENT ANALYSIS A statistical method to simplify the description of a
set of interrelated variables. Its general objectives are data reduction and
interpretation; there is no separation into dependent and independent variables; the
original set of correlated variables is transformed into a smaller set of uncorrelated
variables called the principal components. Often used as the first step in a factor
analysis.
PRIOR PROBABILITY Probability calculated or estimated from theory or belief, before a
study is done. See BAYES' THEOREM.
PROBABILITY
1. The limit of the relative frequency of an event in a sequence of N random
trials as N approaches infinity, i.e., the limit of
\[
\frac{\text{Number of occurrences of the event}}{N}
\]
2. A measure, ranging from zero to 1, of the degree of belief in a hypothesis or
statement.
PROBABILITY DENSITY The frequency distribution of a continuous random variable.
PROBABILITY DISTRIBUTION For a discrete random variable, the function that gives the
probabilities that the variable equals each of a sequence of possible values.
Examples include the binomial and Poisson distributions. For a continuous random
variable, often used synonymously with the probability density function.
PROBABILITY SAMPLE (Syn: random sample) See SAMPLE.
PROBABILITY THEORY The branch of mathematics dealing with the purely logical
properties of probability. Its theorems underlie most statistical methods.
PROBAND See PROPOSITUS.
PROBLEM-ORIENTED MEDICAL RECORD (POMR) A medical record in which the patient's
history, physical findings, laboratory results, etc., are organized to give a cumulative
record of problems, e.g., hemoptysis, rather than disease, e.g., pneumonia. The
record includes subjective, objective, and significant negative information,
discussions and conclusions, and diagnostic and treatment plans with respect to each
problem. The record, which was developed by Lawrence Weed,(1) contrasts with the
traditional medical record, which is less formally organized, usually recording all
information from each source (history, physical, and laboratory findings) together
without regard to the problems the information describes.
Since the problems may not be described in terms of conventional disease labels,
their classification and counting for epidemiologic purposes are sometimes difficult.
The INTERNATIONAL CLASSIFICATION OF HEALTH PROBLEMS IN PRIMARY CARE (ICHPPC)
is an attempt to overcome this difficulty.
1 Weed LL: Medical records that guide and teach. New Engl J Med 278:593-600, 652-657, 1968.
PROCATARCTIC CAUSE A term used by epidemiologists of the late 19th and early 20th
centuries, probably last used by GREENWOOD, to describe predisposing causes
associated with habits of life.
PROFESSIONAL ACTIVITY STUDY (PAS) The HOSPITAL DISCHARGE ABSTRACT SYSTEM that
covers many acute short-stay hospitals in the United States. It provides regularly
published statistical tables arranged according to hospital service, diagnostic
category, etc., giving details on diagnostic and therapeutic procedures, length of stay
and outcome.
PROGRAM
1. A (formal) set of procedures to conduct an activity, e.g., control of malaria.
2. An ordered list of instructions directing a computer to carry out a desired
sequence of operations. The objective is normally the solution of a problem.

PROGRAM EVALUATION AND REVIEW TECHNIQUES (PERT) A work-scheduling method that
uses ALGORITHMS and also enunciates general principles of procedure for allocating
resources. Calls for listing specific tasks to be completed and the resources
--personnel, equipment, supplies, and other items--that will be needed, along with their
costs, a time chart indicating when each component task is to begin and end, giving
interim accomplishment levels during that period, and a specification of times for
interim review of the progress of the plan.
PROGRAM REVIEW An evaluative study of a specific health program operating in a
specific setting, performed to provide a basis for decisions concerning the operation of
the program.
PROGRAM TRIAL An experimental or quasi-experimental evaluative study of a (health)
program.
PROLECTIVE Pertaining to data collected by planning in advance. Contrast retrolective.
The terms prolective and retrolective, coined by AR Feinstein,(1) are said to describe
more precisely the actions of research workers than the common terms prospective
and retrospective; use of these terms is limited, and is deprecated by many
epidemiologists.
1 Clin Pharmacol Ther 30:564-577, 1981.
PROPORTION A type of ratio in which the numerator is included in the denominator.
The ratio of a part to the whole, expressed as a "decimal fraction" (e.g., 0.2), as a
"vulgar fraction" (1/5), or, loosely, as a percentage (20%). By definition, a proportion
(p) must be in the range (decimal) 0.0 \le p \le 1.0. Since numerator and denominator
have the same dimension, any dimensional contents cancel out, and a proportion is
a dimensionless quantity. Where numerator and denominator are based upon counts
rather than upon measurements, the originals are also dimensionless, although it
should be understood that proportions can be used for measured quantities, e.g.,
the skin area of the lower limb is x percent of the total skin area, as well as for
counts, e.g., 0.15 of the population died. A prevalence rate is a count-based
proportion. The nondimensionality of a proportion, and its range limitations, do not
necessarily apply to other kinds of ratios, of which "proportion" is a subset. See also
RATE; RATIO.
PROPORTIONAL HAZARDS MODEL (Syn: Cox model) A statistical MODEL in SURVIVAL
ANALYSIS that asserts that the effect of the study factors on the HAZARD RATE in the
study population is multiplicative and does not change over time. For example, the
model for two factors x_1 and x_2 asserts that the rate at time t, \lambda(t), is given by
\[
\lambda(t) = \lambda_0(t) \, e^{\beta_1 x_1 + \beta_2 x_2}
\]
where \lambda_0(t) is the rate when x_1 = x_2 = 0, and e is the (natural) exponential function.
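A small sketch of the multiplicative structure, assuming the model form rate(t) = baseline(t) x exp(b1*x1 + b2*x2). The coefficient values and baseline rates below are hypothetical; the point is that the ratio of rates between two factor patterns does not depend on the baseline, and hence not on time.

```python
import math


def hazard_rate(baseline_rate, coefficients, factors):
    """Hazard at one time point: baseline rate times exp(sum of b_i * x_i)."""
    if len(coefficients) != len(factors):
        raise ValueError("need one coefficient per factor")
    linear_predictor = sum(b * x for b, x in zip(coefficients, factors))
    return baseline_rate * math.exp(linear_predictor)


betas = [0.5, -0.2]  # hypothetical coefficients for x1 and x2

# Ratio of rates for x1 = 1 versus x1 = 0 (x2 = 0), at an early time with
# baseline 0.01 and at a later time with baseline 0.20.
ratio_early = hazard_rate(0.01, betas, [1, 0]) / hazard_rate(0.01, betas, [0, 0])
ratio_late = hazard_rate(0.20, betas, [1, 0]) / hazard_rate(0.20, betas, [0, 0])
```

Both ratios equal exp(0.5), which is what "multiplicative and does not change over time" means in practice.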
PROPORTIONATE MORTALITY RATE, RATIO (PMR) Number of deaths from a given cause
in a specified time period, per 100 or 1000 total deaths in the same time period.
Can give rise to misleading conclusions if used to compare mortality experience of
populations with different distributions of causes of death.
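A sketch of the calculation, with a hypothetical example of the pitfall noted above: the same number of deaths from one cause yields different proportionate figures when deaths from other causes differ. The function name and counts are illustrative.

```python
def proportionate_mortality(deaths_from_cause, total_deaths, per=100):
    """Deaths from one cause per `per` total deaths in the same period."""
    if total_deaths <= 0:
        raise ValueError("total_deaths must be positive")
    return per * deaths_from_cause / total_deaths


# Two hypothetical populations with 50 deaths each from the cause of interest.
pmr_a = proportionate_mortality(50, 400)  # 350 deaths from other causes
pmr_b = proportionate_mortality(50, 200)  # only 150 deaths from other causes
```

Population B shows the higher proportion even though the number of deaths from the cause is identical, because it has fewer deaths from other causes.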
PROPOSITUS (Syn: proband) The family member who first draws attention to a (genetic)
pedigree of a given trait. The INDEX CASE in a genetic study.
PROSPECTIVE STUDY See COHORT STUDY.
PROTOCOL The plan, or set of steps, to be followed in a study or investigation, or in an
intervention program. See also ALGORITHM, CLINICAL.
PROXIMATE DETERMINANT OF FERTILITY Factor having a direct influence on fertility;
such factors include age at marriage, breastfeeding, abortion, and contraceptive
use.
PUBLIC HEALTH Public health is one of the efforts organized by society to protect,
promote, and restore the people's health. It is the combination of sciences, skills, and
beliefs that is directed to the maintenance and improvement of the health of all the
people through collective or social actions. The programs, services, and institutions
involved emphasize the prevention of disease and the health needs of the
population as a whole. Public health activities change with changing technology and social
values, but the goals remain the same: to reduce the amount of disease, premature
death, and disease-produced discomfort and disability in the population. Public health
is thus a social institution, a discipline, and a practice.
PUNCH CARD A card on which data are stored by means of holes punched in specified
positions; useful in storing, processing, and analyzing data. Edge-punch cards have
marginal holes converted to slots by punching so that they can be manually sorted.
The commonly used variety of punch cards have 80 columns and 12 rows. In each
column of the card there are 12 positions at which holes may be punched,
according to a predetermined code. The position of the hole is the means of identifying
the value of a variable. Punch cards of this type are sorted mechanically or
electrically to provide a rapid means of processing and analyzing data, sometimes of great
complexity. See also DATA PROCESSING.
P VALUE See P (PROBABILITY).

Q
/
QALY Acronym for quality-adjusted life years; this is an adjustment of life expectancy
that allows for prevalence of activity-limitation, assessed from hospital discharge
data or by health survey data, in the population subgroup for which QALY is
calculated. For example, the life expectancy of males at birth in Canada in 1978 was
70.8 years; after adjusting for activity-limitation using health survey data,
quality-adjusted life expectancy, or QALY, was 65.8 years.¹
¹ Wilkins R, Adams O: Healthfulness of Life. Montreal, 1983.
QUALITATIVE DATA Observations or information characterized by measurement on a
categorical scale, i.e., a dichotomous or nominal scale, or, if the categories are
ordered, an ordinal scale. Examples are sex, hair color, death or survival, and
nationality. See also MEASUREMENT SCALE.
QUALITY CONTROL The supervision and control of all operations involved in a process,
usually involving sampling and inspection, in order to detect and correct systematic
or excessively random variations in quality.
QUALITY OF CARE A level of performance or accomplishment that characterizes the health
care provided. Ultimately, measures of the quality of care always depend upon
value judgments, but there are ingredients and determinants of quality that can be
measured objectively. These ingredients and determinants have been classified by
Donabedian¹ into measures of structure (e.g., manpower, facilities), process (e.g.,
diagnostic and therapeutic procedures), and outcome (e.g., case fatality rates,
disability rates, and levels of patient satisfaction with the service). See also HEALTH
SERVICES RESEARCH.
¹ Donabedian A: A Guide to Medical Care Administration (Vol. 2). New York: American Public Health
Association, 1969.
QUALITY OF LIFE In a general sense, that which makes life worth living. In a more
"quantitative" sense, an estimate of remaining life free of impairment, disability or
handicap, as used in the expression "quality adjusted life years;" somewhere
between these is an estimate of the utility of life; for instance, in clinical decision
analysis, the utility of life that is impaired by a disabling degree of angina pectoris
may be compared with that of a life that may be shorter in duration but free of
disabling pain, as a result of applying therapeutic procedures. Such trade-offs are
part of clinical decision analysis. See also UTILITY.
QUANTILES Divisions of a distribution into equal, ordered subgroups. Deciles are tenths;
quartiles, quarters; quintiles, fifths; terciles, thirds; and centiles, hundredths.
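As a hedged illustration of this definition, the sketch below uses Python's standard `statistics.quantiles` to find the cut points that divide a distribution into quartiles and deciles. The data are invented, and the helper name `cut_points` is an assumption of this sketch, not a term from the dictionary.

```python
import statistics

def cut_points(values, n_groups):
    """Return the n_groups - 1 cut points that split values into n_groups ordered subgroups."""
    # method="inclusive" treats the data as the whole population of interest
    return statistics.quantiles(values, n=n_groups, method="inclusive")

ages = list(range(1, 101))            # invented data: the integers 1..100
quartile_cuts = cut_points(ages, 4)   # three cut points, four quarters
decile_cuts = cut_points(ages, 10)    # nine cut points, ten tenths
```

The middle quartile cut point coincides with the median, which is one quick way to check that the subgroups are equal in size.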
QUANTITATIVE DATA Data in numerical quantities such as continuous measurements or
counts.
QUARANTINE The 14th edition of Control of Communicable Diseases in Man¹ gives the
following:
Restriction of the activities of well persons or animals who have been exposed to
a case of communicable disease during its period of communicability (i.e., contacts)
to prevent disease transmission during the incubation period if infection should
occur.
a) Absolute or complete quarantine: The limitation of freedom of movement of
those exposed to a communicable disease for a period of time not longer than
the longest usual incubation period of that disease, in such manner as to prevent
effective contact with those not so exposed (see Isolation).
b) Modified quarantine: A selective, partial limitation of freedom of movement of
contacts, commonly on the basis of known or presumed differences in suscepti-
bility and related to the danger of disease transmission. It may be designed to
meet particular situations. Examples are exclusion of children from school, ex-
emption of immune persons from provisions applicable to susceptible persons, or
restriction of military populations to the post or to quarters. It includes: Personal
surveillance, the practice of close medical or other supervision of contacts in or-
der to permit prompt recognition of infection or illness but without restricting
their movements; and Segregation, the separation of some part of a group of
persons or domestic animals from the others for special consideration, control or
observation-removal of susceptible children to homes of immune persons, or
establishment of a sanitary boundary to protect uninfected from infected por-
tions of a population.
See also ISOLATION.
¹ Washington DC: American Public Health Association, 1985.
QUASI-EXPERIMENT An experiment in which the investigator lacks full control over the
allocation and/or the timing of the intervention.
QUESTIONNAIRE A predetermined set of questions used to collect data: clinical data,
social status, occupational group, etc. This term is often applied to a self-completed
survey instrument, as contrasted with an INTERVIEW SCHEDULE.
QUETELET, LAMBERT ADOLPHE JACQUES (1796-1874) Belgian astronomer, statistician,
and social scientist, one of the first to apply statistical thinking to the social and
biological sciences, e.g., in delineating the (normal) distribution of variables such as
height in the population. He influenced others who followed, e.g., FLORENCE
NIGHTINGALE.
QUETELET'S INDEX See BODY MASS INDEX.
QUOTA SAMPLING A method by which the proportions in the sample in various subgroups
(according to criteria such as age, sex, and social status of the individuals to be
selected) are chosen to agree with the corresponding proportions in the population.
The resulting sample may not be representative of characteristics that have not
been taken into account.
QUOTIENT The result of the division of a numerator by a denominator.

R
RACE Persons who are relatively homogeneous with respect to biological inheritance.
See also ETHNIC GROUP.
RADIX The hypothetical size of the birth cohort in a life table, commonly 1000 or 100,000.
RAHE-HOLMES SOCIAL READJUSTMENT RATING SCALE See LIFE EVENTS.
RAMAZZINI, BERNARDINO (1633-1714) An Italian physician, "Father of Occupational
Medicine;" he published De Morbis Artificum (On the Diseases of Workers) in 1700.
Based on observation and anecdote, this was the first systematic account of diseases
related to workplace exposures.
RANDOM Governed by chance; not completely determined by other factors. As opposed
to deterministic.
RANDOM ALLOCATION See RANDOMIZATION.
RANDOMIZATION Allocation of individuals to groups, e.g., for experimental and control
regimens, by chance. Within the limits of chance variation, randomization should
make the control and experimental groups similar at the start of an investigation
and ensure that personal judgment and prejudices of the investigator do not influ-
ence allocation.
Randomization or random assignment should not be confused with haphazard
assignment. Random assignment follows a predetermined plan that is usually
devised with the aid of a table of random numbers. The pattern of assignment may
appear to be haphazard, but this arises from the haphazard nature with which
digits occur in a table of random numbers, and not from the haphazard whim of
the investigator in allocating patients.
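The distinction between random and haphazard assignment can be sketched in code. In the minimal example below, a seeded pseudorandom generator stands in for the table of random numbers, so the allocation follows a predetermined, reproducible plan. The function name `randomize`, the seed, and the choice to split the subjects into two equal groups are all assumptions of this sketch; other allocation schemes exist.

```python
import random

def randomize(subject_ids, seed=None):
    """Allocate subjects to 'study' and 'control' groups by chance."""
    rng = random.Random(seed)   # seeded generator stands in for a table of random numbers
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)       # order is now governed by chance, not by the investigator
    half = len(shuffled) // 2
    return {"study": shuffled[:half], "control": shuffled[half:]}

# Twenty hypothetical subjects, numbered 1..20
groups = randomize(range(1, 21), seed=1988)
```

Because the seed is fixed in advance, the same call always yields the same allocation, which is what separates a documented random plan from an investigator's whim.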
RANDOMIZED CONTROLLED TRIAL (RCT) An epidemiologic experiment in which subjects
in a population are randomly allocated into groups, usually called "study" and
"control" groups, to receive or not to receive an experimental preventive or therapeutic
procedure, maneuver, or intervention. The results are assessed by rigorous com-
parison of rates of disease, death, recovery, or other appropriate outcome in the
study and control groups, respectively. Randomized controlled trials are generally
regarded as the most scientifically rigorous method of hypothesis testing available
in epidemiology. A few authors refer to this method as "randomized control trial."
See also EXPERIMENTAL EPIDEMIOLOGY.
RANDOM SAMPLE A sample that is arrived at by selecting sample units such that each
possible unit has a fixed and determinate probability of selection. See also SAMPLE.
RANGE OF DISTRIBUTION The difference between the largest and smallest values in a
distribution.
RANKING SCALE (Ordinal Scale) A scale that arrays the members of a group from high
to low according to the magnitude of the observations, assigns numbers to the ranks,
and neglects distances between members of the array.
RATE A rate is a measure of the frequency of a phenomenon. In epidemiology,
demography, and vital statistics, a rate is an expression of the frequency with which an
event occurs in a defined population; the use of rates rather than raw numbers is
essential for comparison of experience between populations at different times, dif-
ferent places, or among different classes of persons.
The components of a rate are the numerator, the denominator, the specified time
in which events occur, and usually a multiplier, a power of 10, which converts the
rate from an awkward fraction or decimal to a whole number:
$$\text{Rate} = \frac{\text{Number of events in specified period}}{\text{Average population during the period}} \times 10^{n}$$
All rates are ratios, calculated by dividing a numerator, e.g., the number of deaths,
or newly occurring cases of a disease in a given period, by a denominator, e.g., the
average population during that period. Some rates are proportions, i.e., the
numerator is contained within the denominator. Rate has several different usages in
epidemiology.
1. As a synonym for ratio, it refers to proportions as rates, as in the terms
cumulative incidence rate, prevalence rate, survival rate (cf. Webster's Dictionary,
which gives proportion and ratio as synonyms for rate).
2. In other situations, rate refers only to ratios representing relative changes (ac-
tual or potential) in two quantities. This accords with the OED, which gives
"relative amount of variation" among its entries for rate.
3. Sometimes rate is further restricted to refer only to ratios representing changes
over time. In this usage, prevalence rate would not be a "true" rate because it
cannot be expressed in relation to units of time but only to a "point" in time;
in contrast, the force of mortality or force of morbidity (hazard rate) is a "true"
rate for it can be expressed as the number of cases developing per unit time,
divided by the total size of the population at risk.
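The components of a rate described in this entry (numerator, denominator, specified period, and a power-of-10 multiplier) can be sketched as a short calculation. The function name `rate` and the figures are invented for illustration; this is a minimal sketch, not a prescribed method.

```python
def rate(events, average_population, multiplier=1000):
    """Events per `multiplier` persons in the average population during the period."""
    if average_population <= 0:
        raise ValueError("average population must be positive")
    return events / average_population * multiplier

# Invented figures: 150 deaths in one year, average population 60,000
deaths_per_1000 = rate(150, 60_000, multiplier=1000)
```

With these invented figures the raw fraction 0.0025 becomes about 2.5 deaths per 1000 population per year, the more convenient form that the multiplier is meant to produce.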
RATE DIFFERENCE (RD) The absolute difference between two rates, for example, the
difference in incidence rate between a population group exposed to a causal factor
and a population group not exposed to the factor:
$$RD = I_e - I_u$$
where $I_e$ = incidence rate among exposed, and $I_u$ = incidence rate among
unexposed. In comparisons of exposed and unexposed groups, the term excess rate may
be used as a synonym for rate difference.
RATE-ODDS RATIO See ODDS RATIO.
RATE RATIO (RR) The ratio of two rates. The term is used in epidemiologic research
with a precise meaning, i.e., the ratio of the rate in the exposed population to the
rate in the unexposed population:
$$RR = \frac{I_e}{I_u}$$
where $I_e$ is the incidence rate among exposed, and $I_u$ is the incidence rate among
unexposed. See also RELATIVE RISK.
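The rate difference and the rate ratio compare the same two incidence rates in different ways, one by subtraction and one by division. The sketch below computes both; the function names and the incidence rates are invented for illustration.

```python
def rate_difference(rate_exposed, rate_unexposed):
    """Absolute difference between the rate in the exposed and in the unexposed."""
    return rate_exposed - rate_unexposed

def rate_ratio(rate_exposed, rate_unexposed):
    """Ratio of the rate in the exposed to the rate in the unexposed."""
    if rate_unexposed == 0:
        raise ValueError("rate among unexposed must be nonzero")
    return rate_exposed / rate_unexposed

# Invented incidence rates, per 1000 person-years
incidence_exposed, incidence_unexposed = 12.0, 4.0
rd = rate_difference(incidence_exposed, incidence_unexposed)
rr = rate_ratio(incidence_exposed, incidence_unexposed)
```

The difference keeps the units of the original rates (here, cases per 1000 person-years), whereas the ratio is dimensionless.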
RATIO The value obtained by dividing one quantity by another: a general term of which
rate, proportion, percentage, etc., are subsets. The important difference between a
proportion and a ratio is that the numerator of a proportion is included in the
population defined by the denominator, whereas this is not necessarily so for a
ratio. A ratio is an expression of the relationship between a numerator and a de-
nominator where the two usually are separate and distinct quantities, neither being
included in the other.
The dimensionality of a ratio is obtained through algebraic cancellation,
summation, etc., of the dimensionalities of its numerator and denominator terms. Both
counted and measured values may be included in the numerator and in the
denominator. There are no general restrictions on the dimensionalities or ranges of ratios,
as there are in some of its subsets (e.g., proportion, prevalence). Ratios are
sometimes expressed as percentages (e.g., standardized mortality ratio, FEV1 percent).
In these cases, unlike the special case of a PROPORTION, the value may exceed 100.
See also PROPORTION; RATE.
RATIO SCALE See MEASUREMENT SCALE.
RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE (Syn: relative operating
characteristic curve) A graphic means for assessing the ability of a screening test to
discriminate between healthy and diseased persons. The term "receiver operating
characteristic" comes from psychometry where the characteristic operating response of a
receiver-individual to faint stimuli or nonstimuli was recorded.
RECORD LINKAGE A method for assembling the information contained in two or more
records, e.g., in different sets of medical charts, and in vital records such as birth
and death certificates, and a procedure to ensure that the same individual is counted
only once. This procedure incorporates a unique identifying system such as a
personal identification number and/or birth name(s) of the individual's mother.
Record linkage makes it possible to relate significant health events that are re-
mote from one another in time and place or to bring together records of different
individuals, e.g., members of a family. The resulting information is generally stored
and retrieved by computer, which can be programmed to tabulate and analyze the
data.
"Each person in the world creates a book of life. This book starts with birth and
ends with death. Its pages are made of the records of the principal events in life.
Record linkage is the name given to the process of assembling the pages of this
book into a volume."¹
¹ Dunn HL: Record linkage. Am J Public Health 36:1412, 1946.
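As a hedged sketch of the idea, the example below links records from two invented files by a shared personal identification number, so that each individual is counted only once. The field names `pid`, `event`, and `year`, the function name `link_records`, and the records themselves are all hypothetical; real linkage systems must also handle missing or inconsistent identifiers.

```python
from collections import defaultdict

def link_records(*sources):
    """Group records from several files by a shared personal identification number."""
    linked = defaultdict(list)
    for source in sources:
        for record in source:
            linked[record["pid"]].append(record)   # same pid, same individual
    return dict(linked)

# Invented birth and death records
births = [{"pid": "A17", "event": "birth", "year": 1921},
          {"pid": "B02", "event": "birth", "year": 1930}]
deaths = [{"pid": "A17", "event": "death", "year": 1984}]

book_of_life = link_records(births, deaths)
n_individuals = len(book_of_life)   # individuals, not records
```

Three records collapse to two individuals here, and the entry for `A17` holds both the birth and the death record, remote from each other in time.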
RECRUDESCENCE Reactivation of infection.
REED, WALTER (1851-1902) US Army physician and epidemiologist. Responsible for
epidemiologic investigations and experiments that established the transmission of
yellow fever by a filterable virus carried by culicine mosquitoes. The rigorous logic
applied to both the experimental and incidental observations by Reed and his col-
leagues is recognized as one of the great achievements of medical science.
REFERENCE POPULATION The standard against which a population that is being studied
can be compared.
REFINEMENT The process of identifying new subcategories of study variables for the
purpose of more accurate or more detailed description of relationships. An exam-
ple is refinement of the concept of serum cholesterol level into high, low, and very
low density lipoproteins.
REGISTER, REGISTRY In epidemiology the term "register" is applied to the file of data
concerning all cases of a particular disease or other health-relevant condition in a
defined population such that the cases can be related to a population base. With
this information incidence rates can be calculated. If the cases are regularly fol-
lowed up, information on remission, exacerbation, prevalence, and survival can also
be obtained. The register is the actual document, and the registry is the system of
ongoing registration.
In most developed countries all births and deaths are recorded through birth
and death registration systems. Results and summaries are then tabulated and
published. Examples of registries that have epidemiologic value include the following:
Cancer registries, which secure reports of cancer patients as soon as possible after
first diagnosis. The principal sources for these reports are the hospitals serving the
community, but a few cases are not reported until death.
Twin registries, which have provided the basis for studies attempting to
differentiate genetic from environmental factors in the etiology of cancer, and other
conditions where both genetic and environmental factors may be contributing causes.
Birth defect registries, which seek to document anomalies that are apparent at or
soon after birth. They suffer from incompleteness due to omission of stillbirths and
of anomalies that do not declare their presence until later in life, such as certain
forms of congenital heart lesion, mental deficiency, and neurological disorders.
Other types of registers include blindness and other forms of physical handicap,
high-risk infants, persons addicted to drugs, etc. Most of these, however, are not
truly population based, but merely list those persons known to or attending some
agency or service that provides for them.
REGISTRATION The term "registration" implies something more than notification for the
purpose of immediate action or to permit the counting of cases. A register requires
that a permanent record be established, including identifying data. Cases may be
followed up, and statistical tabulations may be prepared both on frequency and on
survival. In addition, the persons listed on a register may be subjects of special
studies.
REGRESSION
1. As used by FRANCIS GALTON, regression meant the tendency for offspring of
exceptional parents (very tall, very intelligent, etc.) to possess characteristics
closer to the average for the general population. (Hence, "regression to the
mean.")
2. In statistics, regression is a synonym for REGRESSION ANALYSIS.
REGRESSION ANALYSIS Given data on a dependent variable y and one or more
independent variables x1, x2, etc., regression analysis involves finding the "best" mathematical
model (within some restricted class of models) to describe y as a function of the x's,
or to predict y from the x's. The most common form is a linear model; in
epidemiology, the logistic and proportional hazards models are also common.
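For the simplest case, one independent variable and a linear model, "best" is commonly taken to mean least squares. The sketch below fits such a line by the usual closed-form formulas; the choice of least squares, the function name `fit_line`, and the data are assumptions of this illustration, since the entry itself does not specify a fitting criterion.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented data lying exactly on y = 1 + 2x
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
intercept, slope = fit_line(xs, ys)
predicted = intercept + slope * 6   # predict y from a new x
```

The fitted equation serves both purposes named in the entry: it describes y as a function of x and predicts y for values of x not yet observed.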
REGRESSION LINE Diagrammatic presentation of a regression equation, usually drawn
with the independent variable, x, as the abscissa and the dependent variable, y, as
ordinate. Three variables can be shown diagrammatically on an isometric chart or
stereogram.
RELATIONSHIP See ASSOCIATION.
RELATIVE ODDS See ODDS RATIO.
RELATIVE RISK
1. The ratio of the RISK of disease or death among the exposed to the risk among
the unexposed; this usage is synonymous with RISK RATIO.
2. Alternatively, the ratio of the cumulative incidence rate in the exposed to the
cumulative incidence rate in the unexposed, i.e., the cumulative incidence
ratio.
3. The term "relative risk" has also been used synonymously with "odds ratio"
and, in some biostatistical articles, has been used for the ratio of FORCES OF
MORBIDITY. The use of the term "relative risk" for several different quantities
arises from the fact that for "rare" diseases (e.g., most cancers) all the
quantities approximate one another. For common occurrences (e.g., neonatal
mortality in infants under 1500-g birth weight), the approximations do not hold.
See also CUMULATIVE INCIDENCE RATIO; ODDS RATIO; RATE RATIO; RISK RATIO.
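The point that the risk ratio and the odds ratio agree for rare outcomes but diverge for common ones can be checked numerically. The sketch below uses two invented 2x2 tables; the function names and the cell labels a, b, c, d are conventions assumed for this illustration.

```python
def risk_ratio(a, b, c, d):
    """a, b: exposed with / without disease; c, d: unexposed with / without disease."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio of the same 2x2 table."""
    return (a * d) / (b * c)

rare = (20, 9980, 10, 9990)      # invented: risks of 0.2% vs 0.1%
common = (600, 400, 300, 700)    # invented: risks of 60% vs 30%

rr_rare, or_rare = risk_ratio(*rare), odds_ratio(*rare)
rr_common, or_common = risk_ratio(*common), odds_ratio(*common)
```

In the rare-outcome table the two measures differ only in the third decimal place, while in the common-outcome table the odds ratio is markedly larger than the risk ratio even though the risk is doubled in both tables.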
RELIABILITY The degree of stability exhibited when a measurement is repeated under
identical conditions. Reliability refers to the degree to which the results obtained by
a measurement procedure can be replicated. Lack of reliability may arise from di-
vergences between observers or instruments of measurement or instability of the
attribute being measured. See also MEASUREMENT, PROBLEMS WITH TERMINOLOGY;
OBSERVER VARIATION.
REPEATABILITY (Syn: reproducibility) A test or measurement is repeatable if the results
are identical or closely similar each time it is conducted. See also MEASUREMENT,
PROBLEMS WITH TERMINOLOGY; RELIABILITY.
REPLACEMENT LEVEL FERTILITY The level of fertility at which a cohort of women are
having only enough daughters to replace themselves in the population. By
definition, replacement level fertility is equal to a net reproduction rate of 1.00. The total
fertility rate is also used as a measure of replacement level fertility; in the United
States today, a total fertility rate of 2.12 is considered to be replacement level; it is
higher than 2 because of mortality and because of a sex ratio greater than 1 at
birth. The higher the mortality rate, the higher is replacement level fertility.
REPLICATION The execution of an experiment or survey more than once so as to
confirm the findings, increase precision, and obtain a closer estimation of sampling
error. Exact replication should be distinguished from consistency of results on replication.
Exact replication is often possible in the physical sciences, but in the biological and
behavioral sciences, to which epidemiology belongs, consistency of results on
replication is often the best that can be attained. Consistency of results on replication is
perhaps the most important criterion in judgments of causality.
REPRESENTATIVE SAMPLE The term "representative" as it is commonly used is
undefined in the statistical or mathematical sense; it means simply that the sample
resembles the population in some way.
The use of probability sampling will not ensure that any single sample will be
"representative" of the population in all possible respects. If, for example, it is found
that the sample age distribution is quite different from that of the population, it is
possible to make corrections for the known differences. A common fallacy lies in
the unwarranted assumption that, if the sample resembles the population closely
on those factors that have been checked, it is "totally representative" and that no
difference exists between the sample and the universe or reference population.
Kendall and Buckland¹ comment as follows: "In the widest sense, a sample which
is representative of a population. Some confusion arises according to whether
'representative' is regarded as meaning 'selected by some process which gives all
samples an equal chance of appearing to represent the population'; or, alternatively,
whether it means 'typical in respect of certain characteristics, however chosen'. On
the whole, it seems best to confine the word 'representative' to samples which turn
out to be so, however chosen, rather than apply it to those chosen with the object
of being representative."
¹ Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.
REPRODUCIBILITY See REPEATABILITY.
REPRODUCTIVE ISOLATION Absence of interbreeding between populations.
RESEARCH DESIGN The procedures and methods, predetermined by an investigator, to
be adhered to in conducting a research project.
RESERVOIR OF INFECTION
1. Any person, animal, arthropod, plant, soil, or substance, or a combination of
these, in which an infectious agent normally lives and multiplies, on which it
depends primarily for survival, and where it reproduces itself in such a
manner that it can be transmitted to a susceptible host.
2. The natural habitat of the infectious agent.
RESOLUTION (Syn: resolving power) A component of a measuring instrument that helps
determine precision. The degree of refinement of the measuring process is
commonly referred to as the "resolution" or the "resolving power of the system." See
also POWER. The capability of distinguishing between things that are indeed sepa-
rate or distinct from one another.
RESOLVING POWER The capacity of a system to distinguish between truly distinct things
that are close together.
RESPONSE RATE The number of completed or returned survey instruments
(questionnaires, interviews, etc.) divided by the total number of persons who would have
been surveyed if all had participated. Usually expressed as a percentage.
Nonresponse can have several causes, e.g., death, removal out of the survey community,
and refusal. See also BIAS; COMPLETION RATE; NONPARTICIPANTS.
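The definition amounts to a single division, sketched below with invented survey counts. The function name `response_rate` and its input checks are assumptions of this illustration.

```python
def response_rate(completed, eligible):
    """Completed instruments as a percentage of all persons who would have been surveyed."""
    if eligible <= 0:
        raise ValueError("eligible must be positive")
    if not 0 <= completed <= eligible:
        raise ValueError("completed must lie between 0 and eligible")
    return 100.0 * completed / eligible

# Invented survey: 500 persons approached, 412 questionnaires returned
pct = response_rate(412, 500)
nonresponse_pct = 100.0 - pct   # deaths, removals, and refusals combined
```

Note that the denominator is everyone who would have been surveyed had all participated, not only those who could be reached.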
RETROLECTIVE Pertaining to data gathered from medical records or other sources, when
data collection took place without prior planning for the needs of an investigation.
See also PROLECTIVE; term in limited use.
RETROSPECTIVE STUDY A research design that is used to test etiologic hypotheses in
which inferences about exposure to the putative causal factor(s) are derived from
data relating to characteristics of the persons under study or to events or
experiences in their past. The essential feature is that some of the persons under study
have the disease or other outcome condition of interest, and their characteristics
and past experiences are compared with those of other, unaffected persons.
Persons who differ in the severity of the disease may also be compared. There is
disagreement among epidemiologists as to the desirability of using the term
"retrospective study" rather than "case control study" to describe this method. See also
CASE CONTROL STUDY.
RETROVIRUS This name is given to a family of RNA viruses characterized by the
presence of an enzyme, reverse transcriptase, that enables transcription of RNA to DNA
inside an affected cell. Thus, retroviruses can make copies of themselves in host
cells. The most important retrovirus is the human immunodeficiency virus (HIV);
this makes copies of itself in host cells such as T4 "helper" lymphocytes and normal
immune responses are disrupted.
RISK The probability that an event will occur, e.g., that an individual will become ill or
die within a stated period of time or age. Also, a nontechnical term encompassing
a variety of measures of the probability of a (generally) unfavorable outcome. See
also PROBABILITY.
RISK ASSESSMENT The qualitative or quantitative estimation of the likelihood of adverse
effects that may result from exposure to specified health hazards or from the
absence of beneficial influences.
RISK BENEFIT ANALYSIS The process of analyzing and comparing on a single scale the
expected positive (benefits) and negative (risks, costs) results of an action, or lack of
an action.
RISK BENEFIT RATIO The results of a risk benefit analysis, expressed as the ratio of risks
to benefits.
RISK DIFFERENCE (Syn: excess risk) The absolute difference between two risks.
RISK FACTOR An aspect of personal behavior or lifestyle, an environmental exposure,
or an inborn or inherited characteristic, which on the basis of epidemiologic
evidence is known to be associated with health-related condition(s) considered
important to prevent. The term "risk factor" is rather loosely used, with any of the
following meanings:
1. An attribute or exposure that is associated with an increased probability of a
specified outcome, such as the occurrence of a disease. Not necessarily a causal
factor. A RISK MARKER.
2. An attribute or exposure that increases the probability of occurrence of
disease or other specified outcome. A DETERMINANT.
3. A determinant that can be modified by intervention, thereby reducing the
probability of occurrence of disease or other specified outcomes. To avoid
confusion, it may be referred to as a modifiable risk factor.
RISK MANAGEMENT The steps taken to alter, i.e., reduce, the levels of risk to which an
individual or a population is subject.
RISK MARKER (Syn: risk indicator) An attribute that is associated with an increased
probability of occurrence of a disease or other specified outcome and that can be
used as an indicator of this increased risk. Not necessarily a causal factor. See also
RISK FACTOR.
RISK RATIO The ratio of two risks.
ROBUST A statistical test or procedure is said to be robust if it is not very sensitive to
departures from the assumptions on which it is strictly predicated (e.g., that the data
are normally distributed).
ROSS, RONALD (1857-1932) Continued in India the work begun by Laveran and
Manson on mosquitoes as vectors of infectious disease. In a series of experiments and
microscopic dissections, he concluded that only the anopheles mosquitoes carried
the malaria parasite and that a developmental stage of the parasite took place in
the mosquito (On some peculiar pigmented cells found in two mosquitoes fed on
malarial blood. Brit Med J 1786-1788, 1897). Awarded the Nobel prize for medicine
in 1902.
RUBRIC Section or chapter heading. Used in epidemiology with reference to groups of
diseases, e.g., as in the INTERNATIONAL CLASSIFICATION OF DISEASE (ICD).
S
SAFETY FACTOR A multiplicative factor incorporated in risk assessments or safety
standards to allow for unpredictable types of variation, such as variability from test
animals to humans, random variation within an experiment, and person-to-person
variability. Safety factors are often in the range of 10 to 1000.
SAFETY STANDARDS Under the requirements of the Occupational Safety and Health Act
(OSHA, 1970), "occupational safety and health standard" means a standard that
requires conditions, or the adoption of one or more practices, means, methods,
operations, or processes reasonably necessary or appropriate to provide safe or
healthful employment and places of employment. Safety standards may be adopted
by national consensus or established by federal regulation. These standards have
been adopted in many other nations besides the United States, although some
European and other countries have their own standards, which may be lower or higher
than those in the United States.
There are several varieties of safety standards:
1. OSHA-promulgated, mainly for carcinogens, also for cotton dust and lead.
These are Permissible Exposure Limits (PELs).
2. National Institute for Occupational Safety and Health (NIOSH) recommendations,
often lower limits, based on animal toxicity tests, empirical observations,
epidemiologic investigations; these are Recommended Exposure Limits (RELs).
3. An older-established set of criteria has been set by the American Conference
of Governmental Industrial Hygienists; these are Threshold Limit Values
(TLVs) that have now replaced an earlier set of Maximum Allowable
Concentrations (MACs).
SAMPLE A selected subset of a population. A sample may be random or nonrandom
and may be representative or nonrepresentative. Several types of sample can be
distinguished, including the following:
Cluster sample: Each unit selected is a group of persons (all persons in a city block,
a family, etc.) rather than an individual.
Grab sample (Syn: sample of convenience): These ill-defined terms describe
samples selected by easily employed but basically nonprobabilistic methods.
"Man-in-the-street" surveys and a survey of blood pressure among volunteers who drop in
at an examination booth in a public place are in this category. It is improper to
generalize from the results of a survey based upon such a sample for there is no
way of knowing what sorts of bias may have been operating. See also BIAS.
Probability (random) sample: All individuals have a known chance of selection. They
may all have an equal chance of being selected, or, if a stratified sampling method
is used, the rate at which individuals from several subsets are sampled can be varied
so as to produce greater representation of some classes than of others.
A probability sample is created by assigning an identity (label, number) to all
individuals in the "universe" population, e.g., by arranging them in alphabetical
order and numbering in sequence, or simply assigning a number to each, or by
grouping according to area of residence and numbering the groups. The next step
is to select individuals (or groups) for study by a procedure such as use of a table
of random numbers (or comparable procedure) to ensure that the chance of
selection is known.
Simple random sample: In this elementary kind of sample each person has an equal
chance of being selected out of the entire population. One way of carrying out this
procedure is to assign each person a number, starting with 1, 2, 3, and so on. Then
numbers are selected at random, preferably from a table of random numbers, until
the desired sample size is attained.
Stratified random sample: This involves dividing the population into distinct subgroups
according to some important characteristic, such as age or socioeconomic status,
and selecting a random sample out of each subgroup. If the proportion of the
sample drawn from each of the subgroups, or strata, is the same as the proportion
of the total population contained in each stratum (e.g., age group 40-59 constitutes
20% of the population, and 20% of the sample comes from this age stratum), then
all strata will be fairly represented with regard to numbers of persons in the sample.
Systematic sample: The procedure of selecting according to some simple, systematic
rule, such as all persons whose names begin with specified alphabetic letters, born
on certain dates, or located at specified points on a master list. A systematic sample
may lead to errors that invalidate generalizations. For example, persons' names
more often begin with certain letters of the alphabet than with other letters, e.g., q,
x. A systematic alphabetical sample is therefore likely to be biased.
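The three selection procedures described above can be sketched in a few lines of Python. This is only an illustration, not part of the dictionary: the function names, the seeded pseudo-random generator (standing in for a table of random numbers), and the proportional allocation rule are all assumptions made for the example.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Every unit has the same chance of selection (an epsem sample)."""
    rng = random.Random(seed)  # stand-in for a table of random numbers
    return rng.sample(population, n)


def stratified_random_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = round(fraction * len(members))
        sample.extend(rng.sample(members, k))
    return sample


def systematic_sample(master_list, step, start=0):
    """Take every `step`-th unit at fixed points on a master list."""
    return master_list[start::step]


# Hypothetical population of 100 numbered persons; persons 1-20 are aged 40-59.
people = list(range(1, 101))
age_group = lambda person: "40-59" if person <= 20 else "other"

srs = simple_random_sample(people, 10)
strat = stratified_random_sample(people, age_group, 0.10)
syst = systematic_sample(people, 10)
```

With a 10% sampling fraction in each stratum, the 40-59 group supplies 2 of the 10 sampled persons, matching its 20% share of the population.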
SAMPLE, EPSEM ("equal probability of selection method") A sample selected in such a
manner that all the population units have the same probability of selection. A simple
random sample is an EPSEM sample; a stratified sample is not unless the probability
of selection is the same for all strata.
SAMPLING The process of selecting a number of subjects from all the subjects in a
particular group or "universe." Conclusions based on sample results may be attributed
only to the population sampled. Any extrapolation to a larger or different population
is a judgment or a guess and is not part of statistical inference.
SAMPLING ERROR See ERROR.
SAMPLING VARIATION Since the inclusion of individuals in a sample is determined by
chance, the results of analysis in two or more samples will differ, purely by chance.
This is known as "sampling variation."
SANITARY CORDON See CORDON SANITAIRE.
SCATTER DIAGRAM (Syn: scattergram) A graphic method of displaying the distribution
of two variables in relation to each other. The values for one variable are measured
on the horizontal axis and the values for the other on the vertical axis.
SCENARIO BUILDING A method of predicting the future that relies on a series of
assumptions about alternative possibilities, rather than on simple extrapolation of
existing trends. Trend lines for demographic composition, morbidity and mortality
rates, etc., can then be modified by allowing for each assumption in turn, or
combinations of assumptions. The method is claimed to lead to greater flexibility in
long-range health planning than simple forecasting that relies only upon extrapolation
of trends.
SCREENING Screening was defined in 1951 by the US Commission on Chronic Illness as,
"The presumptive identification of unrecognized disease or defect by the application
of tests, examinations or other procedures which can be applied rapidly.
Screening tests sort out apparently well persons who probably have a disease from
those who probably do not. A screening test is not intended to be diagnostic. Persons
with positive or suspicious findings must be referred to their physicians for
diagnosis and necessary treatment."
Screening is an initial examination only, and positive responders require a second,
diagnostic examination. The initiative for screening usually comes from the
investigator or the person or agency providing care rather than from a patient with
a complaint. Screening is usually concerned with chronic illness and aims to detect
disease not yet under medical care.
There are different types of medical screening, each with its own aim: mass,
multiple or multiphasic, and prescriptive.
Mass screening simply means the screening of a whole population.
Multiple or multiphasic screening involves the use of a variety of screening tests
on the same occasion.
Prescriptive screening has as its aim the early detection in presumptively healthy
individuals of disease that can be controlled better if detected early in its natural
history.
The characteristics of a screening test include accuracy, estimates of yield, precision,
reproducibility, sensitivity and specificity, and validity. See entries under these
headings.
SCREENING LEVEL The normal limit or cutoff point at which a screening test is regarded
as positive.
SEASONAL VARIATION Change in physiological status or in disease occurrence that con-
forms to a regular seasonal pattern.
SECONDARY ATTACK RATE The proportion of contacts who get a communicable disease
as a consequence of contact with a case. The secondary attack rate is a measure
of contagiousness and is useful in evaluating control measures. See also ATTACK
RATE.
SECULAR TREND (Syn: temporal trend) Changes over a long period of time, generally
years or decades. Examples include the decline of tuberculosis mortality and the
rise, followed by a decline, in coronary heart disease mortality in the United States
and many other countries in the past 50 years.
SELECTION In genetics, the force that brings about changes in the frequency of alleles
and genotypes in populations through differential reproduction. In epidemiology,
the process and procedure for choosing individuals for study, usually by an orderly
means such as random allocation.
SELECTION BIAS See BIAS.
SEMMELWEIS, IGNAZ PHILIPP (1818-1865) An Austro-Hungarian physician-obstetrician,
who discovered the cause of puerperal fever by carefully comparing infection rates
in two wards of the Allgemeines Krankenhaus in Vienna. In one ward students
customarily came direct from the mortuary or the dissecting room to the patients'
bedside whereas in the other, they did not. Puerperal infection death rates were
much greater in the former. Semmelweis concluded that some morbid factor was
thus transmitted to women in the worse-affected ward. Unhappily, his conclusions
were rejected by his colleagues.
SENSITIVITY AND SPECIFICITY (of a screening test) Sensitivity is the proportion of truly
diseased persons in the screened population who are identified as diseased by the
screening test. Sensitivity is a measure of the probability of correctly diagnosing a
case, or the probability that any given case will be identified by the test (Syn: true
positive rate).
Specificity is the proportion of truly nondiseased persons who are so identified by
the screening test. It is a measure of the probability of correctly identifying a
nondiseased person with a screening test (Syn: true negative rate). The relationships
are shown in the following fourfold table, in which the letters a, b, c, and d
represent the quantities specified below the table.
Screening test results          True status            TOTAL
                          Diseased    Not diseased
Positive                     a             b           a+b
Negative                     c             d           c+d
Total                       a+c           b+d          a+b+c+d

a. Diseased individuals detected by the test (true positives)
b. Nondiseased individuals positive by the test (false positives)
c. Diseased individuals not detectable by the test (false negatives)
d. Nondiseased individuals negative by the test (true negatives)

Sensitivity = a/(a+c)          Specificity = d/(b+d)
Predictive value (positive test result) = a/(a+b)
Predictive value (negative test result) = d/(c+d)
See also YOUDEN'S TEST.
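As a small worked illustration of the fourfold table, the following Python sketch computes the four quantities from the cell counts a, b, c, and d. The function name and the example counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def screening_measures(a, b, c, d):
    """Measures from a fourfold table.

    a: true positives, b: false positives,
    c: false negatives, d: true negatives.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "predictive_value_positive": a / (a + b),
        "predictive_value_negative": d / (c + d),
    }


# Hypothetical screening of 1,000 persons, 100 of whom are truly diseased.
measures = screening_measures(a=90, b=50, c=10, d=850)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example the test has high sensitivity and specificity, yet only about 64% of positive results are true positives, because the nondiseased far outnumber the diseased.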
SENSITIVITY TESTING A study of how the final outcome of an analysis changes as a
function of varying one or more of the input parameters in a prescribed manner.
SENTINEL HEALTH EVENT A condition that can be used to assess the stability or change
in health levels of a population, usually by monitoring mortality statistics. Thus,
death due to acute head injury is a sentinel event for a class of severe traffic injury
that may be reduced by such preventive measures as use of seatbelts and crash
helmets.
SENTINEL PHYSICIAN, SENTINEL PRACTICE In family medicine, a physician, practice, that
undertakes to maintain surveillance for and report certain specific predetermined
events, such as cases of certain communicable diseases, adverse drug reactions.
SEQUENTIAL ANALYSIS A statistical method that allows an experiment to be ended as
soon as an answer of the desired precision is obtained. Study and control subjects
are randomly allocated in pairs or blocks. The result of the comparison of each pair
of subjects, one treated and one control, is examined as soon as it becomes available
and is added to all previous results.
SERENDIPITY The accidental (and happy) discovery of important new information. A
well-known example is Fleming's discovery of the bactericidal properties of penicillin
mould. In case-control studies aimed at testing a specific hypothesis, e.g., about
the relationship between tobacco and cancer, questions on other aspects of life-style
have serendipitously revealed statistically significant associations, e.g., between
alcohol consumption and certain cancers.
SEROEPIDEMIOLOGY Epidemiologic study or activity based on the detection on serological
testing of characteristic change in the serum level of specific antibodies. Latent,
subclinical infections and carrier states can thus be detected, in addition to clinically
overt cases.
SEX RATIO The ratio of one sex to the other. Usually defined as the ratio of males to
females (or of the rates observed in males and females).
"SHOE-LEATHER" EPIDEMIOLOGY Gathering information for epidemiologic studies by
direct inquiry among the people, e.g., walking from door to door and asking questions
of every householder (wearing out shoe leather in the process). JOHN SNOW
did this when investigating the sources of water supply to households in the cholera
epidemic in London in 1854; the method has been successfully used in many
subsequent epidemic investigations. It is especially useful in
transmitted diseases.
SIBLINGS Children borne by the same mother.
SIBSHIP All the brothers and sisters borne by the same mother
SICKNESS See DISEASE.
SIDE EFFECT An effect, other than the intended one, produc
nostic, or therapeutic procedure or regimen.
SIGNAL-TO-NOISE RATIO A jargon term for the relationship of
which is extraneous or irrelevant, or intrudes because
other procedures are insufficiently sensitive.
SIGNIFICANCE See STATISTICAL SIGNIFICANCE.
SIMPSON'S PARADOX A form of confounding, in which the pr
variable changes the direction of an association. Simpson
meta-analysis, because the sum of the data or results fro
studies may be affected by confounding variables that ha
sign features from some studies but not others; if this
analysis will be flawed. Rothman¹ has pointed out that
really a paradox but the logical consequence of failing to
confounding variables.
¹ Rothman KJ: A pictorial representation of confounding in epiden
28:101-108, 1975.
SIMULATION The use of a model system, e.g., a mathematical m
to approximate the action of a real system, often used to
real system.
SITUATION ANALYSIS Study of a situation that may require in
with a definition of the problem, and an assessment or in
severity, causes, and impacts upon the community, and is
interactions between the system and its environment an
Inlance.
SKEW DISTRIBUTION An older and less recommended term f
quency distribution. If a unimodal distribution has a lon
lower values of the variate, it is said to have negative skew
positive skewness. See also LOG-NORMAL DISTRIBUTION.
[Figure: ATTACK RATE plotted against AGE IN YEARS.]
Skew distribution of attack rate of measles in relat
From Lilienfeld and Lilienfeld, 1979
SLOW VIRUS Agent causing degenerative (neurological) disease
incubation period and a prolonged, slowly progressive cou
firmed slow virus diseases are Creutzfeldt-Jakob disease
rosis is possibly a slow virus disease. Some cases of AIDS
ease.

SNOW, JOHN (1813-1858) London general practitioner and early anesthetist (he assisted
Queen Victoria's delivery of two of her children with chloroform). His fame rests
upon his observations, brilliant deductions, painstaking personal enquiries, and
analytic studies of cholera outbreaks in the mid-19th century in London and elsewhere.
All are recorded in On the Mode of Communication of Cholera (London: Churchill,
2nd ed., 1855), which can be regarded as the first definitive working text on
epidemiology and which also contained an explicit statement of the germ theory of
transmission, written 30 years before Koch discovered the cholera vibrio. See also
NATURAL EXPERIMENT.
SOCIAL CLASS A stratum in society composed of individuals and families of equal
standing. See also SOCIOECONOMIC CLASSIFICATION.
SOCIAL DRIFT Downward social class mobility as a result of impaired health often due
to mental disorders.
SOCIAL MEDICINE The practice of medicine concerned with health and disease as a
function of group living. Social medicine is concerned with the health of people in
relation to their behavior in social groups and as such involves care of the individual
patient as a member of a family and of other significant groups in everyday life. It
is also concerned with the health of these groups as such and with that of the whole
community as a community. See also COMMUNITY MEDICINE, PUBLIC HEALTH.
SOCIOECONOMIC CLASSIFICATION Arrangement of persons into groups according to such
characteristics as prior education, occupation, and income. This usually reveals upon
analysis a strong correlation with health-related characteristics such as average length
of life and risk of dying from certain specific causes.
The oldest such classification that is epidemiologically useful is the Registrar-
General's (RG's) occupational classification, developed in 1911 by Stephenson,
Registrar-General of England and Wales. This classified all occupations into five
groups, the five "social classes." Social class III is often further subdivided into
nonmanual and manual groups:
I Professional occupations
II Intermediate occupations
IIIN Nonmanual skilled occupations
IIIM Manual skilled occupations
IV Partly skilled occupations
V Unskilled occupations
This has proven to be a valuable epidemiologic tool; social class is an accurate,
consistent predictor of health experience.
There have been several other attempts to develop a more refined classification;
however, most refinements require collection of more detailed information. For
example, Hollingshead's scale requires details about education and income as well
as occupation, and so is more time-consuming, more likely to be incomplete, and
requires more costly analysis than the RG's classification. In developing countries,
where up to 4prAr of the population may be classified under "agriculturalist" or
"pastoralist" (farming or herding), other types of classifications have been developed.
One's prestige in society, and attitudes or values, e.g., setting a high value on
getting a good education, are generally an integral part of social class or
socioeconomic status. Attitudes toward health are often part of the set of values and may
explain part of the observed difference in health between social classes.
SOCIOECONOMIC STATUS (SES) Descriptive term for a person's position in society, which
may be expressed on an ORDINAL SCALE using such criteria as income, educational
level attained, occupation, value of dwelling place, etc.
SOFTWARE See COMPUTER.
SOUNDEX CODE A sequence of letters used for recording names phonetically, especially
in RECORD LINKAGE.
SOURCE OF INFECTION The person, animal, object, or substance from which an
infectious agent passes to a host. Source of infection should be clearly distinguished
from source of contamination, such as overflow of a septic tank contaminating a
water supply, or an infected cook contaminating a salad. (See RESERVOIR.)¹
¹ From Control of Communicable Diseases in Man, 14th ed. Washington DC: American Public Health
Association, 1985.
SPEARMAN'S RANK CORRELATION See CORRELATION COEFFICIENT.
SPECIFICATION
1. The process of selecting a particular functional form or model for the
relationships to be analyzed in a study.
2. The process of selecting variables for inclusion in the analysis of an effect or
association. This process leads to the identification of MODERATOR VARIABLES
and CONFOUNDING VARIABLES. See also STRATIFICATION.
SPECIFICITY (OF A TEST) See SENSITIVITY AND SPECIFICITY.
SPECTRUM OF DISEASE The full range of manifestations of a disease; a vague term, that
can mean everything from mild or subclinical or precursor states to fulminating,
florid disease, or alternatively the natural history of a disease from onset to
resolution.
SPELL OF SICKNESS An episode of sickness with a well-defined onset and termination.
As used in the monitoring or surveillance of disease, the spell is often defined by
the duration of absence from work or school.
SPLEEN RATE A term used in malaria epidemiology, to define the frequency of enlarged
spleens detected on survey of a population in which malaria is prevalent. In
association with the HACKETT SPLEEN CLASSIFICATION it summarizes the severity of
malaria endemicity.
SPORADIC Occurring irregularly, haphazardly from time to time, and generally
infrequently, e.g., cases of certain infectious diseases.
SPOT MAP Map showing the geographic location of people with a specific attribute, e.g.,
cases of a disease or elderly persons living alone. The making of a spot map is a
common procedure in the investigation of a localized outbreak of disease. Inferences
from such a map depend on the assumption that the population at risk of
developing the disease is fairly evenly distributed over the area, or that at least the
heterogeneities are known and can be considered in interpreting the map.
STABLE POPULATION A population that has constant fertility and mortality rates, no
migration, and consequently a fixed age distribution and constant growth rate. See
also STATIONARY POPULATION.
STANDARD Something that serves as a basis for comparison; a technical specification a
written report drawn up by experts based on the consolidated results of scientific
study, technology, and experience, aimed at optimum benefits and approved by a
recognized and representative body.
STANDARD DEVIATION A measure of dispersion or variation. It is the most widely used
measure of dispersion of a frequency distribution. It is equal to the positive square
root of the VARIANCE. The mean tells where the values for a group are centered.
The standard deviation is a summary of how widely dispersed the values are around
this center.
STANDARD ERROR The standard deviation of an estimate.
STANDARDIZATION A set of techniques used to remove as far as possible the effects of
differences in age or other confounding variables, when comparing two or more
populations. The common method uses weighted averaging of rates specific for
age, sex, or some other potential confounding variable(s) according to some specified
distribution of these variables. There are two main methods, as follows:
Direct method: The specific rates in a study population are averaged, using as weights
the distribution of a specified standard population. The directly standardized rate
represents what the crude rate would have been in the study population if that
population had the same distribution as the standard population with respect to the
variable(s) for which the adjustment or standardization was carried out.
Indirect method: This is used to compare study populations for which the specific
rates are either statistically unstable or unknown. The specific rates in the standard
population are averaged, using as weights the distribution of the study population.
The ratio of the crude rate for the study population to the weighted average so
obtained is the standardized mortality (or morbidity) ratio, or SMR. The indirectly
standardized rate itself is the product of the SMR and the crude rate for the standard
population.
STANDARDIZED MORTALITY (MORBIDITY) RATIO (SMR) The ratio of the number of events
observed in the study group or population to the number that would be expected
if the study population had the same specific rates as the standard population,
multiplied by 100.
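The direct method and the SMR can be illustrated with a short Python sketch. The function names, the two age groups, and every number below are hypothetical; the code only mirrors the weighted-average definitions given in the entries above.

```python
def directly_standardized_rate(study_rates, standard_population):
    """Average the study population's specific rates, weighted by the standard population."""
    total = sum(standard_population.values())
    weighted = sum(study_rates[g] * standard_population[g] for g in standard_population)
    return weighted / total


def standardized_mortality_ratio(observed, study_population, standard_rates):
    """Observed events divided by the events expected at standard rates, times 100."""
    expected = sum(standard_rates[g] * study_population[g] for g in study_population)
    return 100 * observed / expected


# Hypothetical age-specific death rates and population counts.
study_rates = {"under 60": 0.002, "60 and over": 0.012}
standard_population = {"under 60": 8000, "60 and over": 2000}
direct_rate = directly_standardized_rate(study_rates, standard_population)

study_population = {"under 60": 1000, "60 and over": 500}
standard_rates = {"under 60": 0.001, "60 and over": 0.010}
smr = standardized_mortality_ratio(9, study_population, standard_rates)
```

Here the directly standardized rate is 4 per 1,000, and 9 observed deaths against 6 expected give an SMR of 150.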
STANDARDIZED RATE RATIO (SRR) A rate ratio in which the numerator and denominator
rates have been standardized to the same (standard) population distribution.
STANDARD METROPOLITAN STATISTICAL AREA Because of the extensive interactions
between a city and its surrounding areas, a unit encompassing both is needed as a
base for statistical description. The concept of a standard metropolitan statistical
area (SMSA) was introduced in the United States to furnish such a unit. To qualify
as an SMSA an area has to meet criteria related to size, social and economic
integration of the city and surrounding county or counties, minimum population
density, and minimum proportion of the labor force engaged in nonagricultural work.
STATIONARY POPULATION A stable population that has a zero growth rate with constant
numbers of births and deaths each year.
STATISTICS The science and art of collecting, summarizing, and analyzing data that are
subject to random variation. The term is also applied to the data themselves and to
summarizations of the data. Statistical terms are defined by Kendall and Buckland.¹
¹ Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.
STATISTICAL ERROR See ERROR.
STATISTICAL INFERENCE See INFERENCE.
STATISTICAL MODEL See MATHEMATICAL MODEL.
STATISTICAL SIGNIFICANCE Statistical methods allow an estimate to be made of the
probability of the observed or greater degree of association between independent and
dependent variables under the null hypothesis. From this estimate, in a sample of
given size, the statistical "significance" of a result can be stated. Usually the level of
statistical significance is stated by the P VALUE.
STATISTICAL TEST A procedure that is intended to decide whether a hypothesis about
the distribution of one or more populations or variables should be rejected or
accepted. Statistical tests may be parametric or nonparametric.
STEREOGRAM (Syn: isometric chart) A graph or chart that displays more than two vari-
ables in a manner that appears three-dimensional to the eye.
STOCHASTIC PROCESS A process that incorporates some element of randomness.
STRATEGY In game theory, a mathematical function.
STRATIFICATION The process of or result of separating a sample into several subsamples
according to specified criteria such as age groups, socioeconomic status, etc. The
effect of confounding variables may be controlled by stratifying the analysis of
results. For example, lung cancer is known to be associated with smoking. To examine
the possible association between urban atmospheric pollution and lung cancer,
controlling for smoking, the population may be divided into strata according to
smoking status. The association between air pollution and cancer can then be appraised
separately within each stratum. Stratification is used not only to control for
confounding effects but also as a way of detecting modifying effects. In this example,
stratification makes it possible to examine the effect of smoking on the association
between atmospheric pollution and lung cancer.
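A stratified analysis of this kind can be sketched as follows. The risk ratio is used here as the measure of association purely as an assumption for the example; the function names and all counts are invented and carry no empirical meaning.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk among the exposed divided by risk among the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def stratum_specific_risk_ratios(strata):
    """Appraise the association separately within each stratum."""
    return {name: risk_ratio(*counts) for name, counts in strata.items()}


# Hypothetical counts per smoking stratum:
# (urban cases, urban persons, rural cases, rural persons)
strata = {
    "smokers": (30, 1000, 20, 1000),
    "nonsmokers": (3, 1000, 2, 1000),
}
ratios = stratum_specific_risk_ratios(strata)
```

Similar ratios across strata would suggest that smoking does not modify the association; markedly different ratios would point to a modifying effect.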
STRATIFIED RANDOMIZATION (Syn: blocked randomization) A randomization procedure
in which strata are identified and subjects randomly allocated within each. This
produces a situation intermediate between paired allocation and simple random
allocation.
STUDY DESIGN See RESEARCH DESIGN.
SUBCLINICAL DISEASE See DISEASE, SUBCLINICAL.
SURVEILLANCE Ongoing scrutiny, generally using methods distinguished by their
practicability, uniformity, and frequently their rapidity, rather than by complete
accuracy. Its main purpose is to detect changes in trend or distribution in order to
initiate investigative or control measures. See also MONITORING.
SURVEILLANCE OF DISEASE The continuing scrutiny of all aspects of occurrence and
spread of a disease that are pertinent to effective control.
Included are the systematic collection and evaluation of (1) morbidity and mortality
reports, (2) special reports of field investigations of epidemics and of individual
cases, (3) isolation and identification of infectious agents by laboratories, (4) data
concerning the availability, use, and untoward effects of vaccines and toxoids,
immune globulins, insecticides, and other substances used in control, (5) information
regarding immunity levels in segments of the population, and (6) other relevant
epidemiologic data. A report summarizing these data should be prepared and
distributed to all cooperating persons and others with a need to know the results of
the surveillance activities. The procedure applies to all jurisdictional levels of public
health from local to international.¹ Serological surveillance identifies patterns of
current and past infection using serological tests. See also SEROEPIDEMIOLOGY.
¹ Benenson AS (Ed.): Control of Communicable Diseases in Man, 14th ed. Washington DC: American
Public Health Association, 1985.
SURVEY An investigation in which information is systematically collected but in which
the experimental method is not used. A population survey may be conducted by
face-to-face inquiry, by self-completed questionnaires, by telephone, postal service,
or in some other way. Each method has its advantages and disadvantages. For
instance, a face-to-face (interview) survey may be a better way than self-completed
questionnaire to collect information on attitudes or feelings, but it is more costly.
Existing medical or other records may contain accurate information, but not about
a representative sample of the population.
The information that is gathered in a survey is usually complex enough to require
editing (for accuracy, completeness, etc.), coding, keypunching, i.e., entry on
PUNCH CARDS and processing and analysis by machine or computer. The
generalizability of results depends upon the extent to which the surveyed population is
representative.

The term "survey" is sometimes used in a narrow sense to refer specifically to a
FIELD SURVEY.
SURVEY INSTRUMENT The interview schedule, questionnaire, medical examination record
form, etc., used in a survey.
SURVIVAL ANALYSIS A class of statistical procedures for estimating the SURVIVAL
FUNCTION, and for making inferences about the effects on it of treatments, prognostic
factors, exposures, and other covariates.
SURVIVAL CURVE A curve that starts at 100% of the study population and shows the
percentage of the population still surviving at successive times for as long as
information is available. May be applied not only to survival as such, but also to the
persistence of freedom from a disease, or complication or some other endpoint.
SURVIVAL FUNCTION (Syn: survival distribution) A function of time, usually denoted by
S(t), that starts with a population 100% well at a particular time and provides the
percentage of the population still well at later times. Survival functions may be
applied to any discrete event, for example, disease incidence or relapse, death, or
recovery after onset of disease (in which case the population is initially 100%
diseased, and the "survival" function gives the percentage still diseased).
SURVIVAL RATE (Syn: cumulative survival rate) The proportion of survivors in a group,
e.g., of patients, studied and followed over a period. The proportion of persons in
a specified group alive at the beginning of the time interval (e.g., a five-year period)
who survive to the end of the interval. It is equal to 1 minus the CUMULATIVE
MORTALITY RATE. May be studied by current or COHORT LIFE TABLE methods.
SURVIVAL RATIO The probability of surviving between one age and another; when
computed for age groups, the ratios correspond to those of the person-years-lived
function of a life table.
SURVIVORSHIP STUDY Use of a cohort LIFE TABLE to provide the probability that an event,
such as death, will occur in successive intervals of time after diagnosis and,
conversely, the probability of surviving each interval. The multiplication of these
probabilities of survival for each time interval for those alive at the beginning of that
interval yields a cumulative probability of surviving for the total period of study.
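The multiplication of interval survival probabilities can be shown with a minimal Python sketch. The function name and the cohort counts are hypothetical, and the sketch assumes no losses to follow-up, which a real life table would have to handle.

```python
def cumulative_survival(intervals):
    """Cumulative probability of surviving through each successive interval.

    intervals: list of (alive_at_start, deaths_during_interval) pairs.
    """
    cumulative = 1.0
    result = []
    for alive_at_start, deaths in intervals:
        probability_of_surviving = 1 - deaths / alive_at_start
        cumulative *= probability_of_surviving
        result.append(cumulative)
    return result


# Hypothetical cohort of 100 patients followed for three yearly intervals.
survival = cumulative_survival([(100, 10), (90, 9), (81, 0)])
```

In this example 90% survive each of the first two years, so the cumulative probability of surviving two years is 0.9 x 0.9 = 0.81, and it stays at 0.81 through the third year.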
SYDENHAM, THOMAS (1624-1689) A great English physician in the tradition of
Hippocrates and one of the founding fathers of epidemiology (although his ideas about
the meteorological causes of epidemics were wrong). His writings contain many
careful and comprehensive accounts of important epidemic diseases, notably plague,
malaria, measles, dysentery, and scarlet fever. His Opera Omnia have been twice
translated into English; the second (and better) two-volume translation by Latham
was published by the Sydenham Society in 1848-1850.
SYMBIOSIS The biological association of two or more species to their mutual benefit.
SYMMETRICAL RELATIONSHIP An association between variables that does not have
direction.
The following four varieties can be distinguished:
1. Functional interdependence, where one variable cannot exist without the other;
e.g., prevalence is a function of incidence and duration.
2. Common complex, where variables occur together without being interdependent
or necessary to each other; e.g., the occurrence together of air pollution,
poverty, poor housing, and overcrowding.
3. Alternative indicators of the same entity; e.g., antibodies to a microorganism
and history of specific infection caused by that microorganism.
4. The effects of a common cause; e.g., clinical and biochemical changes in
hepatitis.
See also ASSOCIATION, SYMMETRICAL.
SYNDROME A symptom complex in which the symptoms and/or signs coexist more fre-
quently than would be expected by chance on the assumption of independence.
SYNERGISM, SYNERGY The definition of synergism in epidemiology is somewhat
controversial. We offer two definitions, the first a common dictionary definition, the
second a more specific definition encountered in bioassay.
1. A situation in which the combined effect of two or more factors is greater
than the sum of their solitary effects.
2. Two factors act synergistically if there are persons who will get the disease
when exposed to both factors but not when exposed to either alone.
ANTAGONISM, the opposite of synergism, exists if there are persons who will get the
disease when exposed to one of the factors alone, but not when exposed to
both. Note that under these definitions two factors may act synergistically in
some persons and antagonistically in others.
SYSTEMATIC ERROR See BIAS.
SYSTEMS ANALYSIS This term is used with three similar meanings:
1. The examination of various elements of a system with a view to ascertaining
whether the proposed solution to a problem will fit into the system and, in
turn, effect an overall improvement in the system.
2. The analysis of an activity in order to determine precisely what is required of
the system, how this can best be accomplished, and in what ways the computer
can be useful.
3. Systems analysis refers to any formal analysis whose purpose is to suggest a
course of action by systematically examining the objectives, costs, effectiveness
and risks of alternative policies or strategies and designing additional ones if
those examined are found wanting. It is an approach to or way of looking at
complex problems of choice under uncertainty; it is not yet a method.
T
TAKAKI, KANEHIRO (1849-1915) Japanese nobleman who studied medicine at St Thomas's
Hospital Medical School, London. He became a naval surgeon, and later used
his opportunity as director of naval medical services to conduct large-scale dietary
experiments on populations of naval personnel, demonstrating that beriberi could
be prevented by a mixed diet containing protein as well as rice.
TARGET POPULATION
1. The collection of individuals, items, measurements, etc., about which we want
to make inferences. The term is sometimes used to indicate the population
from which a sample is drawn and sometimes to denote any "reference"
population about which inferences are required.
2. The group of persons for whom an intervention is planned.
TAXONOMY A systematic classification into related groups.
TAXONOMY OF DISEASE The orderly classification of diseases into appropriate categories
on the basis of relationships among them, with the application of names. See also
NOSOGRAPHY, NOSOLOGY.
t-DISTRIBUTION, t-TEST The t-distribution is the distribution of a quotient of independent
random variables, the numerator of which is a standardized normal variate
and the denominator of which is the positive square root of the quotient of a chi-square
distributed variate and its number of degrees of freedom. The t-test uses a
statistic that, under the null hypothesis, has the t-distribution, to test whether two
means differ significantly, or to test linear regression or correlation coefficients.
The t-distribution and the t-test were developed by WS Gosset, who wrote under
the pseudonym "Student" as his employment precluded individual publication.
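The comparison of two means described above can be sketched in a few lines. This is an illustrative sketch only: it assumes the equal-variance (pooled) form of the two-sample t-test, and the function name `pooled_t_statistic` and the sample data are hypothetical, not taken from the dictionary.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical measurements for two groups.
t_stat, dof = pooled_t_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"t = {t_stat:.4f} on {dof} degrees of freedom")
```

The statistic is then referred to the t-distribution with the stated degrees of freedom to obtain a P value; that lookup is left out here to keep the sketch free of third-party libraries.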
TERATOGEN A substance that produces abnormalities in the embryo or fetus by disturbing
maternal homeostasis or by acting directly on the fetus in utero.
TEST OF SIGNIFICANCE See P VALUE; STATISTICAL SIGNIFICANCE.
TEST HYPOTHESIS See NULL HYPOTHESIS.
THEORETICAL EPIDEMIOLOGY The development of mathematical/statistical models to explain
different aspects of the occurrence of a variety of diseases. With some infectious
diseases, models have been generated to elucidate the reasons for epidemics
and/or to predict the behavior of the disease in reaction to given control measures.
See also MODEL.
THERAPEUTIC TRIAL See CLINICAL TRIAL.
THRESHOLD LIMIT VALUE See SAFETY STANDARDS.
THRESHOLD PHENOMENA Events or changes that occur only after a certain level of a
characteristic is reached.
TIME CLUSTER See CLUSTERING.
TIME-PLACE CLUSTER See CLUSTERING.
TOTAL FERTILITY RATE (TFR) The average number of children that would be born per
woman if all women lived to the end of their childbearing years and bore children
according to a given set of age-specific fertility rates. It is computed by summing
the age-specific fertility rates for all ages and multiplying by the interval into which
the ages are grouped. The TFR is an important fertility measure, providing the
most accurate answer to the question, "How many children does a woman have, on
average?"
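The computation described in this entry can be sketched as follows. The rates are invented for illustration, and the seven 5-year age groups (15-19 through 45-49) are an assumption, not something the entry specifies.

```python
def total_fertility_rate(age_specific_rates, interval_years=5):
    """Sum the age-specific fertility rates and multiply by the age interval.

    Rates are births per woman per year within each age group.
    """
    return interval_years * sum(age_specific_rates)


# Hypothetical rates for age groups 15-19, 20-24, ..., 45-49.
rates = [0.05, 0.10, 0.12, 0.08, 0.04, 0.01, 0.00]
tfr = total_fertility_rate(rates)
print(f"TFR = {tfr:.2f} children per woman")
```

Multiplying by the interval width is needed because each rate applies to every year a woman spends in that age group.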
TRACER DISEASE METHOD Tracer or indicator conditions as defined by Kessner¹ are
easily diagnosed, reasonably frequent illnesses or health states whose outcomes are
believed to be affected by health care and which taken in aggregate should reflect
the gamut of patients and health problems encountered in a medical practice. The
extent to which the recorded care of these conditions concurs with preset standards
of care is used as an index of the quality of care delivered. However, it should first
be shown that the preset standards contribute to a favorable outcome. See also
SENTINEL HEALTH EVENT.
¹ Kessner DM, Snow CK, Singer J: Assessment of Medical Care for Children. Washington DC: National
Academy of Sciences, Institute of Medicine, 1974.
TRANSMISSION OF INFECTION Transmission of infectious agents. Any mechanism by which
an infectious agent is spread through the environment or to another person. These
mechanisms are defined in Control of Communicable Diseases in Man¹ as follows:
a. Direct transmission
Direct and essentially immediate transfer of infectious agents (other than
from an arthropod in which the organism has undergone essential multiplication
or development) to a receptive portal of entry through which human
infection may take place. This may be by direct contact as by touching, kissing,
or sexual intercourse, or by the direct projection (droplet spread) of droplet
spray onto the conjunctiva or onto the mucous membranes of the nose or
mouth during sneezing, coughing, spitting, singing, or talking (usually limited
to a distance of about 1 m or less). It may also be by direct exposure of susceptible
tissue to an agent in soil, compost, or decaying vegetable matter in
which it normally leads a saprophytic existence (e.g., the systemic mycoses),
or by the bite of a rabid animal. Transplacental transmission is another form
of direct transmission.
b. Indirect transmission
Vehicle-borne--Contaminated materials or objects (fomites) such as toys,
handkerchiefs, soiled clothes, bedding, cooking or eating utensils, and surgical
instruments or dressings (indirect contact); water, food, milk, biological products
including blood, serum, plasma, tissues, or organs; or any substance serving
as an intermediate means by which an infectious agent is transported and
introduced into a susceptible host through a suitable portal of entry. The agent
may or may not have multiplied or developed in or on the vehicle before
being transmitted.
Vector-borne--(1) Mechanical: Includes simple mechanical carriage by a crawling
or flying insect through soiling of its feet or proboscis, or by passage of
organisms through its gastrointestinal tract. This does not require multiplication
or development of the organism. (2) Biological: Propagation (multiplication),
cyclic development, or a combination of these (cyclopropagative) is required
before the arthropod can transmit the infective form of the agent to
man. An incubation period (extrinsic) is required following infection before
the arthropod becomes infective. The infectious agent may be passed vertically
to succeeding generations (transovarian transmission); transstadial transmission
is its passage from the one stage of the life cycle to another, as nymph to
adult. Transmission may be by saliva during biting or by regurgitation or deposition
on the skin of feces or other material capable of penetrating subsequently
through the bite wound or through an area of trauma from scratching
or rubbing. This is transmission by an infected nonvertebrate host and must
be differentiated for epidemiologic purposes from simple mechanical carriage
by a vector in the role of a vehicle. An arthropod in either role is termed a
"vector."
Airborne--The dissemination of microbial aerosols to a suitable portal of entry,
usually the respiratory tract. Microbial aerosols are suspensions in the air
of particles consisting partially or wholly of microorganisms. Particles in the
1-5µ range are easily drawn into the alveoli of the lungs and may be retained
there; many are exhaled from the alveoli without deposition. They may remain
suspended in the air for long periods of time, some retaining and others
losing infectivity or virulence. Not considered as airborne are droplets and
other large particles that promptly settle out (see Direct transmission, above).
The following are airborne and their mode of transmission is direct:
Droplet nuclei: Usually the small residues that result from evaporation of
fluid from droplets emitted by an infected host (see above). Droplet nuclei also
may be created purposely by a variety of atomizing devices, or accidentally as
in microbiology laboratories or in abattoirs, rendering plants, or autopsy rooms.
They usually remain suspended in the air for long periods of time.
Dust: The small particles of widely varying size that may arise from soil (as,
for example, fungus spores separated from dry soil by wind or mechanical
agitation), clothes, bedding, or contaminated floors.¹ See also ACQUAINTANCE
NETWORK; AIRBORNE INFECTION; CARRIER; COMMON VEHICLE SPREAD; CONTACT;
CONTAMINATION; DROPLET NUCLEI.
¹ Benenson AS (Ed.): Control of Communicable Diseases in Man, 14th ed. Washington DC: American
Public Health Association, 1985.
TRANSOVARIAL TRANSMISSION See VECTOR-BORNE INFECTION.
TRANSPORT HOST See PARATENIC HOST.
TREND A long-term movement in an ordered series, e.g., a time series. An essential
feature is that the movement, while possibly irregular in the short term, shows
movement consistently in the same direction over a long term. The term is also
used loosely to refer to an association which is consistent in several samples or strata
but is not statistically significant.
TREND LINE That line that best fits the distribution of a set of values plotted on two
axes.
TRIAL See CLINICAL TRIAL.
TROHOC STUDY A retrospective case-control study. The term, proposed by AR Feinstein,¹
is the inversion of "cohort"; its use is deprecated by the great majority of
epidemiologists.
¹ Clin Pharmacol Ther 30:564-577, 1981.
TYPE I ERROR See ERROR.
TYPE II ERROR See ERROR.
TWIN STUDY Method of detecting genetic etiology in human disease. The basic premise
of twin studies is that monozygotic twins, being formed by the division of a single
fertilized ovum, carry identical genes, while dizygotic twins, being formed by the
fertilization of two ova by two different spermatozoa, are genetically no more similar
than two siblings born after separate pregnancies.
TWO-TAIL TEST A statistical significance test based on the assumption that the data are
distributed in both directions from some central value(s).

U, V
UNBIASSED ESTIMATOR An estimator that for all sample sizes has an expected value equal
to the parameter being estimated. If an estimator tends to be unbiassed as sample
size increases, it is referred to as asymptotically unbiassed.
UNDERLYING CAUSE OF DEATH See DEATH CERTIFICATE.
UNDERREPORTING Failure to identify and/or count all cases, leading to reduction of numerator
in a rate. See also ERROR.
UTILITY In economics, this means satisfaction derived from obtaining some quantity of
a specified article of commerce. When used in decision theory or CLINICAL DECISION
ANALYSIS, the meaning is essentially the same, and can be expressed as the useful-
ness or desirability of an outcome resulting from a decision.
VACCINATION Strictly speaking, vaccination refers to inoculation (from Latin in oculus,
into a bud) with vaccinia virus against smallpox. Nowadays the word is broadly used
synonymously with procedures for immunization against all infectious disease.
VACCINE Immunobiological substance used for active immunization by introducing into
the body a live modified, attenuated, or killed inactivated infectious organism or its
toxin. The vaccine is capable of stimulating immune response by the host, who is
thus rendered resistant to infection. The word "vaccine" was originally applied to
the serum from a cow infected with vaccinia virus (cowpox; from Latin vacca, cow);
it is now used of all immunizing agents.
VALIDATION The process of establishing that a method is sound.
VALIDITY This term, derived from the Latin validus, strong, has several meanings, usually
accompanied by a qualifying word or phrase.
VALIDITY, MEASUREMENT An expression of the degree to which a measurement measures
what it purports to measure.
Several varieties are distinguished, including construct validity, content validity,
and criterion validity (concurrent and predictive validity).
Construct validity: The extent to which the measurement corresponds to theoretical
concepts (constructs) concerning the phenomenon under study. For example, if
on theoretical grounds, the phenomenon should change with age, a measurement
with construct validity would reflect such a change.
Content validity: The extent to which the measurement incorporates the domain
of the phenomenon under study. For example, a measurement of functional health
status should embrace activities of daily living, occupational, family, and social functioning,
etc.
Criterion validity: The extent to which the measurement correlates with an external
criterion of the phenomenon under study. Two aspects of criterion validity can
be distinguished:
1. Concurrent validity: The measurement and the criterion refer to the same point
in time. An example would be a visual inspection of a wound for evidence of
infection validated against bacteriological examination of a specimen taken at
the same time.
2. Predictive validity: The measurement's validity is expressed in terms of its ability
to predict the criterion. An example would be an academic aptitude test
that was validated against subsequent academic performance.
VALIDITY, STUDY The degree to which the inferences drawn from a study, especially
generalizations extending beyond the study sample, are warranted when account is
taken of the study methods, the representativeness of the study sample, and the
nature of the population from which it is drawn. Two varieties of study validity are
distinguished:
1. Internal validity: The index and comparison groups are selected and compared
in such a manner that the observed differences between them on the dependent
variables under study may, apart from sampling error, be attributed only
to the hypothesized effect under investigation.
2. External validity (generalizability): A study is externally valid or generalizable if
it can produce unbiased inferences regarding a target population (beyond the
subjects in the study). This aspect of validity is only meaningful with regard
to a specified external target population. For example, the results of a study
conducted using only white male subjects might or might not be generalizable
to all human males (the target population consisting of all human males). It is
not generalizable to females (the target population consisting of all people).
The evaluation of generalizability usually involves much more subject-matter
judgment than internal validity.
These epidemiologic definitions of the terms "internal validity" and "external validity"
do not correspond exactly to some definitions found in the sociological literature.
VARIABLE Any quantity that varies. Any attribute, phenomenon, or event that can have
different values.
VARIABLE, ANTECEDENT A variable that causally precedes the association or outcome
under study. See also EXPLANATORY VARIABLE; INDEPENDENT VARIABLE.
VARIABLE, CONFOUNDING See CONFOUNDING.
VARIABLE, CONTROL Independent variable other than the "hypothetical causal variable"
that has a potential effect on the dependent variable and is subject to control by
analysis.
VARIABLE, DEPENDENT See DEPENDENT VARIABLE.
VARIABLE, DISTORTER A CONFOUNDING VARIABLE that diminishes, masks, or reverses the
association under study.
VARIABLE, EXPERIENTIAL See INDEPENDENT VARIABLE.
VARIABLE, INDEPENDENT See INDEPENDENT VARIABLE.
VARIABLE, INTERVENING See INTERVENING VARIABLE.
VARIABLE, MANIFESTATIONAL See DEPENDENT VARIABLE.
VARIABLE, MODERATOR See EFFECT MODIFIER.
VARIABLE, PASSENGER See PASSENGER VARIABLE.
VARIABLE, UNCONTROLLED A (potentially) confounding variable that has not been brought
under control by design or analysis. See also CONFOUNDING.
VARIANCE A measure of the variation shown by a set of observations, defined by the
sum of the squares of deviations from the mean, divided by the number of DEGREES
OF FREEDOM in the set of observations.
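A minimal sketch of this definition follows. It assumes the usual sample case, in which the mean is estimated from the same observations so that the degrees of freedom equal n - 1; the function name `sample_variance` and the data are hypothetical.

```python
import statistics


def sample_variance(observations):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(observations)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(observations) / n
    sum_of_squares = sum((x - mean) ** 2 for x in observations)
    return sum_of_squares / (n - 1)  # n - 1 degrees of freedom


# Hypothetical observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]
var = sample_variance(data)
print(f"variance = {var:.4f}")
# Python's standard library uses the same n - 1 divisor.
print(f"statistics.variance = {statistics.variance(data):.4f}")
```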
VARIATE (Syn: random variable) A variable that may assume any of a set of values, each
with a preassigned probability (known as its distribution).
VECTOR
1. In infectious disease epidemiology, an insect or any living carrier that transports
an infectious agent from an infected individual or its wastes to a susceptible
individual or its food or immediate surroundings. The organism may or
may not pass through a developmental cycle within the vector.
2. In statistics, an ordered set of numbers representing the values of a set of
variables.
VECTOR-BORNE INFECTION Several classes of vector-borne infections are recognized, each
with epidemiologic features that are determined by the interaction between the
infectious agent and the human host, on the one hand, and the vector on the other.
Therefore, environmental factors such as climatic and seasonal variations influence
the epidemiologic pattern by virtue of their effects on the vector and its habits.
The terms used to describe specific features of vector-borne infections are:
Biological transmission: Transmission of the infectious agent to susceptible host by
bite of blood-feeding (arthropod) vector as in malaria, or by other inoculation, as
in Schistosoma infection.
Extrinsic incubation period: Time necessary after acquisition of infection by the (arthropod)
vector for the infectious agent to multiply or develop sufficiently so that
it can be transmitted by the vector to a vertebrate host.
Hibernation: A possible mechanism by which the infected vector survives adverse
cold weather by becoming dormant.
Inapparent infection: Response to infection without developing overt signs of illness.
If this is accompanied by viremia or bacteremia in a high proportion of infected
animals or persons, the receptor species is well suited as an epidemiologically
important host in the transmission cycle.
Mechanical transmission: Transport of the infectious agent between hosts by arthropod
vectors with contaminated mouthparts, antennae, or limbs. There is no
multiplication of the infectious agent in the vector.
Overwintering: Persistence of the infectious microorganism in the vector for extended
periods, such as the cooler winter months, during which the vector has no
opportunity to be reinfected or to infect a vertebrate host. Overwintering is an
important concept in the epidemiology of vector-borne diseases since the annual
recrudescence of viral activity after periods (winter, dry season) adverse to continual
transmission depends upon a mechanism for local survival of an infectious microorganism
or its reintroduction from outside the endemic area. To some extent,
the risk of a summertime epidemic may be determined by the relative success of
microorganism survival in the local winter reservoir. Since overwinter survival may
in turn depend upon the level of activity of the microorganism during the preceding
summer-fall, outbreaks sometimes occur for two or more successive years.
Transovarial infection (transmission): Transmission of the infectious microorganism
from the affected female arthropod to her progeny.
VECTOR SPACE An area (or volume) defined by the specified dimensions of two (or
three) vectors.
VEHICLE OF INFECTION TRANSMISSION The mode of transmission of an infectious agent
from its reservoir to a susceptible host. This can be person-to-person, food, vector-
borne, etc.
VENN DIAGRAM A pictorial presentation of the extent to which two or more quantities
or concepts are mutually inclusive and mutually exclusive.
[Figure: Venn diagram of a hypothetical causal (independent) variable X, a control variable Z, and a dependent variable Y, marking the proportion of variance in Y accounted for by X (A), the proportion accounted for by Z, and the overlap between the two associations (C). From Susser, 1973.]
VIRCHOW, RUDOLF (1821-1902) Born in Pomerania, Virchow graduated in medicine
from Berlin in 1843 and rapidly established his reputation as the leading medical
scientist of his time. Modern pathology owes much to his rigorous use of hypothesis-testing
methods, illustrated in his first paper in the journal he founded, Archiv für
pathologische Anatomie, now universally known as Virchow's Archives. Virchow was
also a practicing epidemiologist, who investigated a serious epidemic of typhus in
Silesia in 1848; his recommendations for hygienic and social reform got him into
trouble with the government, but his scientific brilliance made it impossible for the
authorities not to recognize and reward him with promotions and honors. He entered
Parliament in 1862, and during the Franco-Prussian War he organized an
ambulance service. He made many contributions of fundamental importance to the
science of pathology, but deserves to be remembered as a great humanitarian as
well.
VIRGIN POPULATION A population that has never been exposed to a particular infectious
agent.
VIRULENCE The degree of pathogenicity; the disease-evoking power of a microorganism
in a given host. Numerically expressed as the ratio of the number of cases of
overt infection to the total number infected, as determined by immunoassay. When
death is the only criterion of severity, this is the case-fatality rate.
VITAL RECORDS (Literally, "To do with living") Certificates of birth, death, marriage,
and divorce required for legal and demographic purposes.
VITAL STATISTICS Systematically tabulated information concerning births, marriages, divorces,
separations, and deaths based on registrations of these vital events.

W, X, Y, Z
WASHOUT PHASE That stage in a study, especially a therapeutic trial, when treatment is
withdrawn so that its effects disappear and the subject's characteristics return to
their baseline state.
WORM COUNT A method of surveillance of helminth infection of the gut that depends
upon counts of worms, or their cysts or ova, in quantitatively titrated samples of
feces. Other terms used to describe this form of surveillance are "egg count," "cyst
count," and "parasite count."
WU, LIEN-TEH (1879-1960) Chinese epidemiologist, responsible for controlling the plague
pandemic in Manchuria in 1910-11. Later he worked on control of sexually transmitted
diseases and other socioeconomically determined health conditions, developed
a national quarantine service and was one of the founders of the Chinese
Medical Association, thus helping to lay the foundations for health improvements
in modern China.
XENOBIOTIC
1. (Syn: commensal, symbiosis) Pertaining to association of two animal species,
usually insects, in the absence of a dependency relationship, as opposed to
parasitism.
2. A foreign compound that is metabolized in the body. Many pesticides and
their derivatives, some food additives and a number of other complex organic
compounds such as dioxins and PCBs, are xenobiotics.
XENODIAGNOSIS Detection of a (human) pathogenic organism by allowing a noninfected
vector (e.g., mosquito) to consume infected material, and then examining this vector
for evidence of the pathogen.
YATES' CORRECTION An adjustment proposed by Yates (1934) in the chi-square calculation
for a 2x2 table, which brings the distribution based on discontinuous frequencies
closer to the continuous chi-square distribution from which the published
tables for testing chi-squares are derived.
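The entry does not give the formula. A minimal sketch follows, assuming the standard textbook form of the correction, in which N/2 is subtracted from |ad - bc| before squaring. The function name `chi_square_2x2` and the cell counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]].

    With yates=True, N/2 is subtracted from |ad - bc| before squaring
    (and the result is floored at zero).
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be positive")
    difference = abs(a * d - b * c)
    if yates:
        difference = max(0.0, difference - n / 2)
    return n * difference ** 2 / denominator


# Hypothetical cell counts.
corrected = chi_square_2x2(10, 20, 30, 40)
uncorrected = chi_square_2x2(10, 20, 30, 40, yates=False)
print(f"with correction:    {corrected:.4f}")
print(f"without correction: {uncorrected:.4f}")
```

The corrected statistic is never larger than the uncorrected one, so the adjustment makes the test more conservative.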
YEARS OF POTENTIAL LIFE LOST (YPLL) See POTENTIAL YEARS OF LIFE LOST.
YIELD The number or proportion of cases of a condition accurately identified by a
screening test.
YOUDEN'S INDEX When assessing screening tests, in the uncommon case where the risk
of a false negative and that of a false positive result are assumed to be equivalent
(i.e., specificity and sensitivity are assumed to be equally important), it may be possible
to compare screening tests through the Youden index based on the sum of specificity
and sensitivity:
Youden Index = J = specificity + sensitivity - 1
with J ranging from zero (specificity = 0.50 and sensitivity = 0.50) to 1 (sensitivity
= 1.00, specificity = 1.00).
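The formula above can be applied directly to the four cells of a screening-test table. In this sketch the function name `youden_index` and the counts are hypothetical.

```python
def youden_index(true_pos, false_neg, true_neg, false_pos):
    """Return J = specificity + sensitivity - 1 from screening-test counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return specificity + sensitivity - 1


# Hypothetical screening results: 100 diseased and 100 non-diseased persons.
j = youden_index(true_pos=90, false_neg=10, true_neg=80, false_pos=20)
print(f"J = {j:.2f}")
```

A test that is no better than chance (sensitivity and specificity both 0.50) gives J = 0, which matches the lower bound stated in the entry.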
ZERO-TIME SHIFT This concerns the selection of a starting point for the measurement
of survival following the detection of disease. It is a jargon term, denoting the
movement "backward" (toward the starting point of a disease) of time between onset
and detection, that may accompany use of a screening procedure.
ZOONOSIS An infection or infectious disease transmissible under natural conditions from
vertebrate animals to man. Examples include rabies and plague. May be enzootic
or epizootic.

Bibliography
Many of the works on this list contain glossaries, and nearly all contain definitions that
have been adapted and included in this dictionary.
Abramson JH: Survey Methods in Community Medicine, 3rd ed. London: Churchill Livingstone,
1984.
Alderson M: An Introduction to Epidemiology, 2nd ed. London: Macmillan, 1983.
Allaby M: Dictionary of the Environment. Southampton: London Press, 1975.
Armitage P: Statistical Methods in Medical Research. Oxford: Blackwell, 1971.
Bahn AK: Basic Medical Statistics. New York: Grune & Stratton, 1972.
Barker DJP, Rose G: Epidemiology in Medical Practice, 2nd ed. Edinburgh: Churchill Livingstone,
1979.
Benenson AS (Ed): Control of Communicable Diseases in Man, 14th ed. Washington DC:
American Public Health Association, 1985.
Bogue DJ: Principles of Demography. New York: Wiley, 1969.
Breslow NE, Day NE: Statistical Methods in Cancer Research, Vol 1: The Analysis of Case-Control
Data. Lyon: IARC, 1980.
Cavalli-Sforza LL, Bodmer WF: The Genetics of Human Populations. San Francisco: Freeman,
1971.
Colton T: Statistics in Medicine. Boston: Little, Brown, 1974.
Committee on Population and Demography, National Academy of Science/National Research
Council: Collecting Data for the Estimation of Fertility and Mortality. Washington
DC: National Academy Press, Report Number 6, 1981.
Cox DR, Hinkley DV: Theoretical Statistics. New York: Chapman and Hall, 1974.
Davies G: A Dictionary of Veterinary Epidemiology. Mimeographed. To be published.
Dublin LI, Lotka AJ: The Money Value of a Man. New York: Ronald, 1930.
Farr W: Vital Statistics (Ed. NA Humphreys). London: Stanford, 1885. (A modern
abridgement, edited by A Adelstein and MW Susser, was published by the New
York Academy of Medicine in 1975.)
Feinstein AR: Clinical Biostatistics. St Louis: Mosby, 1974.
Feinstein AR: Clinical Epidemiology. Philadelphia: Saunders, 1985.
Feinstein AR: A glossary of neologisms in quantitative clinical science. Clin Pharmacol
Ther 30:564-77, 1981.
Fisher RA: Statistical Methods and Scientific Inference. Edinburgh: Oliver and Boyd, 1956.
Fleiss JL: Statistical Methods for Rates and Proportions, 2nd ed. New York: Wiley, 1981.
Fletcher RH, Fletcher SW, Wagner EH: Clinical Epidemiology-The Essentials. Baltimore:
Williams & Wilkins, 1982.
Friedman GD: Primer of Epidemiology, 2nd ed. New York: McGraw-Hill, 1980.
Froom J: An international glossary for primary care. J Family Pract 13:673-681, 1981.
Garrison FH: An Introduction to the History of Medicine, 4th ed. Philadelphia: Saunders,
1929.
Greenwood M: Epidemics and Crowd Diseases. London: Williams and Norgate, 1933.
Haupt A, Kane TT: Population Handbook, 2nd ed. Washington DC: Population Reference
Bureau, 1985.
Hogarth J: Glossary of Health Care Terminology. Copenhagen: World Health Organization,
1975.
Holland WW (Ed): Data Handling in Epidemiology. London: Oxford, 1970.
Holland WW (Ed): Evaluation of Health Care. Oxford: Oxford University Press, 1983.
Holland WW, Detels R, Knox G (Eds): Oxford Textbook of Public Health, Vol 3. Oxford:
Oxford University Press, 1985.
Ibrahim MA: Epidemiology and Health Policy. Rockville, MD: Aspen, 1985.
Jammal A, Allard R, Loslier G (Eds): Dictionnaire d'épidémiologie. St Hyacinthe, Maloine.
Paris: Edisem, 1988.
Jenicek M, Cléroux R: Épidémiologie. St Hyacinthe, Québec: Edisem, 1982.
Kahn HA: An Introduction to Epidemiologic Methods. New York: Oxford University Press,
1983.
Kelsey JL, Thompson WD, Evans AS: Methods in Observational Epidemiology. New York:
Oxford University Press, 1986.
Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman,
1982.
Kleinbaum DG, Kupper LL, Morgenstern H: Epidemiologic Research-Principles and Quantitative
Methods. Belmont: Lifetime Learning Publications, 1982.
Klug WS, Cummings MR: Concepts of Genetics. Columbus OH: Merrill, 1986.
Knox EG (Ed): Epidemiology in Health Care Planning. London: Oxford University Press,
1979.
Last JM (Ed): Maxcy-Rosenau Public Health and Preventive Medicine, 12th ed. Norwalk,
CT: Appleton-Century-Crofts, 1986.
Last JM: Public Health and Human Ecology. Norwalk CT: Appleton and Lange, 1987.
Lilienfeld AM, Lilienfeld D: Foundations of Epidemiology, 2nd ed. New York: Oxford
University Press, 1979.
MacMahon B, Pugh TF: Epidemiology: Principles and Methods. Boston: Little, Brown, 1970.
Mausner JS, Kramer S: Epidemiology, 2nd ed. Philadelphia: Saunders, 1985.
McDowell I, Newell C: Measuring Health: A Guide to Rating Scales and Questionnaires. New
York: Oxford University Press, 1987.
Meadows AJ, Gordon M, Singleton A: Dictionary of New Information Technology. London:
Century, 1982.
Meinert CL: Clinical Trials. New York: Oxford University Press, 1986.
Miettinen OS: Theoretical Epidemiology. New York: Wiley, 1985.
Morris JN: Uses of Epidemiology, 3rd ed. London: Churchill Livingstone, 1975.
Morrison AS: Screening in Chronic Disease. New York: Oxford University Press, 1985.
Morton NE: Outline of Genetic Epidemiology. New York: Karger, 1982.
Murphy EA: The Logic of Medicine. Baltimore: Johns Hopkins University Press, 1976.
Murphy EA: A Companion to Medical Statistics. Baltimore: Johns Hopkins University Press,
1985.
Oldham PD: Measurement in Medicine. Baltimore: Johns Hopkins University Press, 1976.
Oxford English Dictionary (OED). London: Oxford University Press, 1971.
Pressat R: Dictionary of Demography. English Trans. Ed. Christopher Wilson. Oxford:
Blackwell, 1985.
Rimm AA, Hartz AJ, Kalbfleisch JH, Anderson AJ, Hoffmann RG: Basic Biostatistics in
Medicine and Epidemiology. New York: Appleton-Century-Crofts, 1980.
Rothman KJ: Modern Epidemiology. Boston: Little, Brown, 1986.
Rothman KJ (Ed): Causal Inference. Chestnut Hill, MA: Epidemiology Resources, Inc.,
1988.
Rumeau-Rouquette C, Bréart G, Padieu R: Méthodes en épidémiologie. Paris: Flammarion,
1980.
Schlesselman JJ: Case-Control Studies; Design, Conduct, Analysis. New York: Oxford University
Press, 1982.
Schuman SH: Practice-Based Epidemiology. New York: Gordon and Breach, 1986.
Silverman WA: Human Experimentation: A Guided Step into the Unknown. Oxford: Oxford
University Press, 1985.
Skinner HA: The Origin of Medical Terms, 2nd ed. Baltimore: Williams & Wilkins, 1961.
Snedecor GW, Cochran WG: Statistical Methods, 7th ed. Ames, IA: Iowa University Press,
1979.
Snow J: On the Mode of Communication of Cholera, 2nd ed. London: Churchill, 1855.
(Reprinted 1936 by the Commonwealth Fund of New York, with an Introduction
by Wade Hampton Frost; and again by Hafner, New York, 1973.)
Sohm ED (Ed): Glossary of Evaluation Terms. Geneva: United Nations, 1978.
Stallybrass CO: The Principles of Epidemiology. London: Routledge, 1931.
Stedman's Medical Dictionary, 22nd ed. Baltimore: Williams & Wilkins, 1972.
Susser MW: Causal Thinking in the Health Sciences. New York: Oxford University Press,
1973.
Susser MW, Watson W, Hopper K: Sociology in Medicine, 3rd ed. New York: Oxford
University Press, 1985.
Swaroop S: Introduction to Health Statistics. Edinburgh: Livingstone, 1960.
Toma B (Ed): Glossaire d'épidémiologie animale. Mimeographed. École Nationale Vétérinaire
d'Alfort, 1987.
van de Walle E: Multilingual Demographic Dictionary; English Section. Liège: Ordina Éditions,
1982 (For the International Union for the Scientific Study of Population).
U.S. Department of Health and Human Services Task Force on Health Risk Assessment:
Determining Risks to Health-Federal Policy and Practice. Dover, MA: Auburn
House, 1986.
U.S. House of Representatives: A Discursive Dictionary of Health Care. Washington DC:
US Government Printing Office, 1976.
Webster's Third New International Dictionary, Unabridged. Springfield, MA: Merriam, 1971.
Weed LL: Medical Records, Medical Education and Patient Care. Cleveland: Case Western
Reserve University Press, 1969.
White KL, Henderson M (Eds): Epidemiology as a Fundamental Science. New York: Oxford
University Press, 1976.
Wilson C: The Dictionary of Demography (Translation from the French of Roland Pressat).
Oxford: Blackwell, 1985.
World Bank: A Glossary of Population Terminology. Washington: World Bank, 1985.
World Organization of National Colleges, Academies (WONCA) of Family Practice: International
Classification of Health Problems in Primary Care, 3rd ed. Oxford: Oxford
University Press. In Press, 1987.

News & Numbers
A GUIDE TO REPORTING STATISTICAL CLAIMS AND CONTROVERSIES IN HEALTH
AND OTHER FIELDS
Victor Cohn
SENIOR WRITER AND COLUMNIST, FORMER SCIENCE EDITOR,
Washington Post
FOREWORD BY Frederick Mosteller
ROGER I. LEE PROFESSOR EMERITUS OF MATHEMATICAL STATISTICS,
Harvard University
A Project of the Center for Health Communication
Harvard School of Public Health
IOWA STATE UNIVERSITY PRESS / AMES

© 1989 Victor Cohn. All rights reserved
Composed by Iowa State University Press
Printed in the United States of America
No part of this book may be reproduced in any form or by any electronic or mechanical
means, including information storage and retrieval systems, without written permission
from the publisher, except for brief passages quoted in a review.
First edition, 1989
Library of Congress Cataloging-in-Publication Data
Cohn, Victor, 1919-
News & numbers.
"A project of the Center for Health Communication, Harvard School of Public
Health."
1. Public health-Statistics. 2. Environmental health-Statistics. 3. Vital
statistics. I. Harvard School of Public Health. Center for Health
Communication. II. Title. III. Title: News and numbers.
RA407.C64 1989 362.1'021 88-6807
ISBN 0-8138-1442-1
ISBN 0-8138-1437-3 (pbk.)

A Note to Readers
THE rules of statistics are the rules of good thinking, codified.
They apply to any kind of reporting in which numbers-stated or
implied-are involved: political reporting, science reporting,
business, economics, sports, or whatever.
This guide is an attempt to explain the role, logic, and
language of statistics, so we reporters can ask better questions
about the many alleged facts or findings that rest, or should rest,
on some credible numbers. Because this manual began as a
project of the Harvard School of Public Health, the reporting of
health and the environment is the major example. But the
principles and many of the suggested "questions for reporters" can be
used by inquiring reporters in any field. They can help you read
a scientific report or listen to the conflicting claims of politicians,
environmentalists, physicians, scientists, or almost anyone and
weigh and explain them. And the final chapter specifically
shows how these principles apply in all areas.
VICTOR COHN

Contents
FOREWORD BY Frederick Mosteller
ACKNOWLEDGMENTS
1. Facts and Figures-We Can Do Better
2. The Certainty of Uncertainty
3. The Scientific Way
   Probability
   "Power" and Numbers
   Bias and Confounders
   Variability
4. Studies, Good and Bad
   Experiments versus Seductive Anecdotes
   Clinical Trials
   What Makes a Study Honest?
   Epidemiology: Hippocrates to AIDS
5. Questions Reporters Can Ask
6. Tests and Testing
   Drugs and Drug Trials
   Animals as Models for Us
7. Vital Statistics: The Numbers of Life and Health
   Crude Rates versus Rates That Compare
   Other Ways to Compare
   Reporting Hospital Death Rates
   Cancer Rates and Cancer "Cures"
   The Important Questions about Cancer
   Shifts, Drifts, and Blips
8. The Statistics of Environment and Risk
   Who's Believable?
   Questions to Ask
   Evaluating Environmental Hazards
   Advice from Reporters
9. The Statistics of Politics, Economics, and Democracy
   The State of the Nation's Statistics
   The Bottom Line
WHERE TO LEARN MORE: A Bibliography and Other Sources
NOTES
GLOSSARY/INDEX
Foreword
REPORTERS play an essential role in communicating
science to the public. In common with scientists, they desire
accuracy. Although health and medicine provide many exciting
stories, the biostatistics that scientists must use in their studies
presents special problems for reporters. It gives uncommon and
misleading meanings to common words like "significant,"
"consistent," and "power." Mathematical statistics often produces
results that are disturbingly counterintuitive, at least at first, to
laymen and scientists alike. In vital statistics and epidemiology,
definitions often seem arbitrary, and slight changes make
considerable differences in the findings.
Science writers often take short courses in special topics
such as biostatistics. I have taught in some of these courses and
have been impressed by the seriousness of the participants.
Nevertheless, they need some of this material in an accessible and
permanent form.
Victor Cohn of the Washington Post has prepared this manual
to help all reporters cut through these statistical tangles. He
wants to give them a guide to the ways that statistics can clarify
facts or mystify the reader.
Cohn's book grew out of the Media Project of our Health
Science Policy Working Group of the Division of Health Policy
Research and Education at Harvard University. I am pleased
that faculty members of the Harvard School of Public Health
have been able to help him produce this book as a visiting fellow
in 1978 and 1984 and as a contributor to the Health Science
Policy Working Group.
Through the Media Project, with the help of Jay Winsten,
we have also examined sources of pressures on the science
writer.¹ In the future we want to use what we have learned
through many discussions with science writers to advise
scientists on their role in the media.
By such efforts, including this book, and by many similar
efforts in this and other fields, scientists and writers may gradually
upgrade the whole communication system, scientific and
journalistic. Thus we may clear the communication channel
between science and the public.
FREDERICK MOSTELLER
Acknowledgments
MY main mentor and guide in the preparation of this book
has been Dr. Frederick Mosteller, Roger I. Lee professor emeritus
of mathematical statistics and former chairman of the departments
of Biostatistics and Health Policy and Management,
Harvard School of Public Health. He gave so fully of his time,
energy, and knowledge that he should be listed as coauthor but
for the fact that I sometimes used a journalist's freewheeling
approach rather than a statistician's rigor. This makes any
misstatements mine.
The project was supported by the Russell Sage Foundation,
and by the Council for the Advancement of Science Writing,
which pointed the way in holding seminars on statistics for
journalists, including the first of its kind in 1964.
I did much of the work as a visiting fellow at the Harvard
School of Public Health, where Dr. Jay Winsten, director of the
Center for Health Communication, was another indispensable
guide, and Drs. John Bailar III, Nan Laird, Philip Lavin,
Thomas A. Louis, and Marvin Zelen were valuable helpers. As
were Drs. Gary D. Friedman and Thomas M. Vogt of the
Kaiser organizations, Michael Greenberg of Rutgers University,
and Peter Montague of Princeton University (on all of whose
writings I leaned); Lewis Cope of the Minneapolis Star Tribune;
Cass Peterson of the Washington Post; and my daughter, Deborah
Runkle, no mean statistician.
I also owe thanks to Harvard's Drs. Peter Braun, Harvey
Fineberg, Howard Frazier, Howard Hiatt, William Hsiao,
Herb Sherman, and William Stason. And to Drs. Stuart A.
Bessler, Syntex Corporation; H. Jack Geiger, City University of
New York; Nicole Schupf Geiger, Manhattanville College;
Charles Moertel, Mayo Clinic; Arnold Relman, New England
Journal of Medicine; Eugene Robin, Stanford University; and
Sidney Wolfe, Public Citizen Health Research Group. Also
Katherine Wallman, Council of Professional Associations on
Federal Statistics; Howard L. Lewis, American Heart Association;
Philip Meyer, University of North Carolina; Mildred
Spencer Sanes; Earl Ubell, WCBS-TV, New York City; and
Philip Hilts, Cristine Russell, and Barry Sussman, Washington
Post. I am indebted to my editors at the Washington Post,
particularly Abigail Trafford, Ben Cason, Carol Krucoff, Len Downie,
and Howard Simons for their understanding and support.
The work was also aided by the Andrew W. Mellon Foundation.
The American Cancer Society, American Heart Association,
Commonwealth Fund, Gannett Foundation, Henry J.
Kaiser Family Foundation, Mayo Medical Resources, Milbank
Memorial Fund, Pew Charitable Trusts, Philip L. Graham
Fund, Russell Sage Foundation, and John Cowles, Jr., have
contributed to this manual's initial distribution.

Facts and Figures -
We Can Do Better
Facts and Figures! Put 'em Down!
-Charles Dickens (in The Chimes)
There are lies, there are damned lies, and there are statistics.
-Disraeli
Almost everyone has heard that "figures don't lie, but liars can figure." We need
statistics, but liars give them a bad name, so to be able to tell the liars from the
statisticians is crucial.
-Dr. Robert Hooke
WE journalists like to think we deal mainly in facts and
ideas, but much of what we report is based on numbers.
Politics comes down to votes. Budgets and dollars dominate
government. The economy, business, employment, sports-all
demand numbers.
The environment, pollutants, toxic chemicals. Again, we
see counts and measurements and, most likely, widely varying
estimates, some careful, some questionably high or low. An
environmentalist says a nuclear power plant or toxic waste
dump will cause so many cases of cancer. An industry spokesman
denies it. What are their numbers? Where did they get
them? How valid are they?
A doctor reports a promising, even exciting new treatment.
Is the claim justified or based on a biased or unrepresentative
sample? Or too few patients to justify any claim? Science,
medicine, technology, the weather, intelligence-all are statistical.
Science is observation, experimentation, measurement, and all
these involve numbers, whether we reporters pay attention to
them or not.
Statistics are used or misused even by people who tell us, "I
don't believe in statistics," then claim that all of us or most people
or many do such and such. The question for reporters is, how
should we not merely repeat such numbers, stated or implied,
but also interpret them to deliver the best possible picture of
reality?
We can be better reporters if we understand how the best
statisticians-the best figurers-figure. And if we learn a few
questions to help us separate the wheat from the chaff.
I do not say that telling the truth-describing reality-will
then become easy, for we are constantly bombarded with sweeping
claims in convincing wrappings, and the disputed subjects
are endless. Medical and surgical treatments, radiation, pesticides,
nuclear power, the probability of environmental disasters,
the side effects of medicines-almost nothing seems settled.
Like it or not, we must wade in. Whether we will it or not,
we have in effect become part of the regulatory apparatus. Dr.
Peter Montague of Princeton University tells us, "The environmental
and toxic situation is so complex, we can't possibly have
enough officials to monitor it. Reporters help officials decide
where to focus their activity."
"Journalists opened up" the Love Canal toxic waste issue by
"independent investigation," according to Cornell University's
Dr. Dorothy Nelkin. "The extensive press coverage contributed
to investigations that eventually forced the re-staffing of the
Environmental Protection Agency and the creation of a national
toxic waste disposal program."¹
That very coverage, however, may also have stampeded
public officials into hasty, ill-conceived studies that left
unanswered the crucial question: Did the Love Canal wastes
actually cause birth defects and other physical problems?² The
very way we report a medical or environmental controversy can
affect the outcome. If we ignore a bad situation, the public may
suffer. If we write "danger," the public may quake. If we write
"no danger," the public may be falsely reassured. If we paint an
experimental medical treatment too brightly, the public is given
false hope.
It is not just what we write, it is what we emphasize. A
National Cancer Institute survey indicated that many persons
refuse to consider healthy changes in life-style because they
think "carcinogens are everywhere in the environment." Such
persons probably have read or heard again and again that most
cancers are environmentally related, although, in the opinion of
most informed scientists, most fatal "environmental" cancers are
related mainly to individual behavior, outstandingly smoking,
and very possibly diet. By various estimates, perhaps 5 to 15
percent of all cancers are related to exposures to man-made
carcinogens-chemicals we have inserted into the workplace,
foods, air, and water.³
When it comes to such emotionally charged and complex
issues, or when it simply comes to running for page one or
making the six o'clock news, the best among us sometimes
overstate or understate. Philip Meyer, veteran reporter and author
of Precision Journalism, writes, "Journalists who misinterpret
statistical data usually tend to err in the direction of
overinterpretation.... The reason for this professional bias is
self-evident; you usually can't write a snappy lead upholding [the
negative]. A story purporting to show that apple pie makes you
sterile is more interesting than one that says there is no evidence
that apple pie changes your life."⁴
We also work fast, sometimes too fast, with severe limits on
the space or time we may fill. We find it hard to tell editors or
news directors, "I haven't had enough time. I don't have the
story yet." Even a long-term project or special may be hurriedly
done. In a newsroom "long-term" may mean a few weeks. A
major southern newspaper had to print a long, front-page
retraction after a series of front-page stories alleged that people
who worked at or lived near a plutonium plant suffered in excess
numbers from a blood disease. "Our reporters obviously had

confused statistics and scientific data," the editor admitted. "We
did not ask enough questions."⁵
We tend to oversimplify. We may report "A study showed
that black is white" or "So-and-so announced that ...," when a
study merely suggested that there was some evidence that such
might be the case. We may slight or omit the fact that a scientist
calls a result "preliminary." As scientific unsophisticates, we may
confuse a study that merely suggests a hypothesis that should be
investigated-very frequently the case-with a study that
presents strong and conclusive evidence.
We often omit essential perspective, context, or background.
Dr. Thomas Vogt of the Kaiser Permanente Center for
Health Research tells of seeing the headline "Heart Attacks
From Lack of 'C' " and then, two months later, "People Who
Take Vitamin C Increase Their Chances of a Heart Attack."⁶
Both stories were based on limited, and far from conclusive,
animal studies.
Scientists who do poor studies or overstate their results
deserve part of the blame. But bad science is no excuse for bad
journalism. We tend to rely most on "authorities" who are either
most quotable or quickly available or both, and they often tend
to be those who get most carried away with their sketchy and
unconfirmed but "exciting" data-or have big axes to grind,
however lofty their motives. The cautious, unbiased scientist
who says, "Our results are inconclusive" or "We don't have
enough data yet to make any strong statement" or "I don't know"
tends to be omitted or buried someplace down in the story.
We are influenced too by intense and growing competition
to tell the story first and tell it most dramatically. I was once
asked by a Harvard researcher, "Does competition affect the way
you present a story?" I thought and had to answer, "We have to
almost overstate. We have to come as close as we can within the
boundaries of truth to a dramatic, compelling statement. A
weak statement will go no place." Another reporter said, "The
fact is, you are going for the strong [lead and story]. And, while
not patently absurd, it may not be the lead you would go for a
year later."⁷
We reporters are also subject to human hope and human
fear. A new "cure" comes along, and we want to believe it. A
new alarm is sounded, and we too tremble.
Alarms also make news. We too often obey a sardonic
maxim: Bad news is good news; good news is no news. Dr. H.
Jack Geiger, a respected former science writer and now a professor
of medicine, says,
I know I wrote stories in which I explained or interpreted the results
wrongly. I wrote stories that didn't have the disclaimers I should have
written. I wrote stories under competitive pressure, when it became
clear later that I shouldn't have written them. I wrote stories when I
hadn't asked-because I didn't know enough to ask-Was your study
capable of getting the answers you wanted? Could it be interpreted to
say something else? Did you take into account possible confounding
factors?⁸

The Certainty
of Uncertainty
Too much of the science reporting in the press [blurs] what we're sure of and
what we're not very sure of and what is inconclusive. The notion of
tentativeness tends to drop out of much reporting.
-Dr. Harvey Brooks
The only trouble with a sure thing is the uncertainty.
-Author unknown
THE first thing to understand about science is that it is
almost always uncertain. A scientist, seeking to explain or
understand something-be it the behavior of an atom or the effect
of the toxic chemicals at a Love Canal-usually proposes a
hypothesis, then seeks to test it by experiment or observation. If
the evidence is strongly supportive, the hypothesis may then
become a theory or at some point even a law, like the law of
gravity.
A theory may be so solid that it is generally accepted.
Example: the theory that cigarette smoking causes lung cancer,
for which almost any reasonable person would say the case has
been proved, for all practical purposes. The phrase "for all
practical purposes" is important, for scientists, being practical
people, must often speak at two levels: the strictly scientific level
and the level of ordinary reason that we require for daily
guidance.
Example: In June 1985, 16 forensic experts examined the
bones that were supposedly those of the "Angel of Death," Dr.
Josef Mengele. Dr. Lowell Levine, delegated by the Department
of Justice, then said, "The skeleton is that of Josef
Mengele within a reasonable scientific certainty," and Dr. Marcos
Segre of the University of Sao Paulo explained, "We deal
with the law of probabilities. We are scientists and not
magicians." Pushed by reporters' questions-after all, this was an
important matter, and what should the public believe?-several
of the pathologists said they had "absolutely no doubt" of their
findings.¹ (Later evidence made the case even stronger.)
But all any scientist can scientifically say-say with certainty
in almost any such case-is, there is a very strong probability
that such and such is true.
Widely believed theories or conclusions are often proved
wholly or partly wrong. "When it comes to almost anything we
say," reports Dr. Arnold Relman, editor of the New England
Journal of Medicine, "you, the reporter, must realize-and must
help the public understand-that we are almost always dealing
with an element of uncertainty. Most scientific information is of
a probable nature, and we are only talking about probabilities,
not certainty. What we are concluding is the best we can do, our
best opinion at the moment, and things may be updated in the
future."
Example: Until 1980 the American Cancer Society recommended
that women have an annual Pap smear to detect cervical
cancer. The recommendation was then changed to every
three years for many women, after two initial examinations.
Statistics had shown that this would be equally effective. The
matter is still controversial, and the recommendation has been
changed again in the light of new knowledge.
Scientists are often wrong. In science this is not necessarily
a failing. When new evidence disproves an old theory, or
occasionally shows that some little believed, even kooky notion is
right, the scientific method is doing what it should. It is
working.
The public, and even some reporters and especially editors,
have a hard time understanding these sometimes drastic
revisions. We all hear the question, Why do they say one thing
today and another thing tomorrow? I was once on a radio talk
show discussing unsettled medical controversies when a testy

listener phoned in to exclaim, "'They say' is a damned liar!"
"They" of course may be different theys who arrive at different
conclusions about inconclusive evidence in a thousand
areas: the role of fats and cholesterol in the diet, the effects of
low-level radioactivity, the cause of the extinction of dinosaurs.
Why so much uncertainty? Science is always a continuing
story. Nature is complex, and almost all methods of observation
and experiment are imperfect. "There are flaws in all studies,"
says Harvard's Dr. Marvin Zelen. There may be weaknesses,
often unavoidable ones, in the way a study is designed or
conducted. Observers are subject to human bias and error. Subjects
fluctuate. Measurements fluctuate.
Many studies are thus inconclusive, and virtually no single
study proves anything. "Fundamentally," writes Dr. Thomas
Vogt, "all scientific investigations require confirmation, and until
it is forthcoming all results, no matter how sound they may
seem, are preliminary."
Medicine, in particular, is full of disagreement and
controversy. "No clinical trial is ever perfect," Harvard's Dr. John
Bailar observes. Unlike new drugs, medical treatments and tests
and surgical operations need not even be subjected to experimental
studies before being applied. "Most treatments escape
and will continue to escape rigorous evaluation," Bailar says.⁵
The reasons are many: lack of funds to mount enough
trials; lack of enough patients at any one center to mount a
meaningful trial; the expense and difficulty of doing multicenter
trials; the swift evolution and obsolescence of medical techniques;
the fact that, with the best of intentions, medical data-histories,
physical examinations, interpretations of tests, descriptions
of symptoms and diseases-are notoriously inexact and
vary from physician to physician; and the serious ethical obstacles
to trying a new procedure when an old one is doing some
good, or to experimenting on children, pregnant women, or the
mentally ill.
While all studies have flaws, some have more flaws than
others. Study after study has found that many articles in the
most prestigious medical journals are replete with shaky statistics
and lack of any explanation of such crucial matters as patients'
complications and the number of patients lost to follow-up.
Papers presented at medical meetings, many of them widely
reported by the media, are even less reliable. Many papers are
mere progress reports on incomplete studies. Some state tentative
results that later collapse. Some are given to draw comment
or criticism or get others interested in a provocative but still
uncertain finding.⁶
The upshot, according to Dr. Gary Friedman of the Kaiser
organization's Permanente Medical Group: "Much of health
care is based on tenuous evidence and incomplete knowledge.
... Seemingly authoritative statements and accepted medical
doctrines, perpetuated through textbook and lectures, often turn
out to be supported by the most meager of evidence, if any can
be found."
In general, possible risks tend to be underestimated and
possible benefits overestimated. For decades surgeons swore
that only a radical mastectomy was the treatment for breast
cancer. Only recently were clinical trials mounted to show that
less drastic treatments seem equally effective. Prefrontal lobotomy,
overstrict bed rest, drugs by the carload-medical history
is rich in treatments that were given for years without question
or statistically rigorous study, only to be proved wrong and
discarded.
Occasionally, unscrupulous investigators falsify their results.
More often, they may wittingly or unwittingly play down
data that contradict their theories, or they may search out statistical
methods that give them the results they want. Before
ascribing fraud, says Harvard's Dr. Frederick Mosteller, "keep
in mind the old saying that most institutions have enough
incompetence to explain almost any results."
So some uncertainty almost always prevails. But uncertainty
need not stand in the way of good sense. To live-to
survive on this globe, to maintain our health, to set public
policy, to govern ourselves-we almost always must act on the
basis of incomplete or uncertain information. There is a way we
can do so.

The Scientific Way
Somehow the wondrous promise of the earth is that there are things beautiful in
it, things wondrous and alluring, and by virtue of your trade, you want to
understand them.
-Mitchell Feigenbaum
Cornell University physicist and mathematician
The great tragedy of Science-the slaying of a beautiful hypothesis by an ugly
fact.
-Thomas Henry Huxley
TO reporters, the world is full of true believers, peddling
their "truths." The sincerely misguided and the outright fakers
are often highly convincing, also newsy. How can we tell the
facts, or the probable facts, from the chaff?
We can borrow from science. We can try to judge all possible
claims of fact by the same methods and rules of evidence that
scientists use to derive some reasonable guidance in scores of
unsettled issues.
As a start, we can ask these questions:
How do you know?
Have the claims been subjected to any studies or experiments?
Were the studies acceptable ones, by general agreement? For example:
Were they without any substantial bias?
Have results been fairly consistent from study to study?
Have the findings resulted in a consensus among others in the same
field? Do at least the majority of informed persons agree? Or should we
withhold judgment until there is more evidence?
Always: Are the conclusions backed by believable statistical evidence?
And what is the degree of certainty or uncertainty? How sure can you
be?
Obviously, much of statistics involves attitude or policy
rather than numbers. And much, at least much of the statistics
that reporters can most readily apply, is good sense.
There are many definitions of statistics as a tool. A few
useful ones: The science and art of gathering, analyzing, and
interpreting data; a means of deciding whether an effect is real;
a way of extracting information from a mass of raw data; a set
of mathematical processes derived from probability theory.
Statistics can be manipulated by charlatans, self-deluders,
and inexpert statisticians. Deciding on the truth of a matter can
be difficult for the best statisticians, and sometimes no decision is
possible. Uncertainty will ever rule in some situations and lurk
in almost all.
There are rare situations in which no statistics are needed.
"Edison had it easy," says Dr. Robert Hooke, a statistician and
author. "It doesn't take statistics to see that a light has come on."¹
It did not take statistics to tell 19th-century physicians that
Morton's ether anesthesia permitted painless surgery or to tell
20th-century physicians that the first antibiotics cured infections that
until then had been highly fatal.
Overwhelmingly, however, the use of statistics, based on
probability, is called the soundest method of decision making,
and the use of large numbers of cases, statistically analyzed, is
called the only means for determining the unknown cause of
many events. Birth control pills were tested on several hundred
women, yet the pills had to be used for several years by millions
before it became unequivocally clear that some women would
develop heart attacks or strokes. The pills had to be used for
some years more before it became clear that the greatest risk
was to women who smoked and women over 35.
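A rough calculation suggests why a test on a few hundred people can miss a rare harm entirely. The sketch below is only an illustration: the risk of 1 in 10,000 and the group sizes are hypothetical numbers chosen for the arithmetic, not figures from the pill studies, and the function name is made up for this example. It assumes every person independently carries the same small risk.

```python
def chance_of_at_least_one_case(risk_per_person: float, n_people: int) -> float:
    """Probability that at least one case shows up among n_people,
    assuming each person independently has the same small risk."""
    return 1.0 - (1.0 - risk_per_person) ** n_people


# Hypothetical risk, chosen only to illustrate the arithmetic.
hypothetical_risk = 1 / 10_000

for n in (300, 10_000, 1_000_000):
    p = chance_of_at_least_one_case(hypothetical_risk, n)
    print(f"{n:>9,} people: {p:.1%} chance of seeing at least one case")
```

Under these assumed numbers, a group of a few hundred would most likely show no cases at all, while a very large group would almost surely show some. Seeing a case is still only a first step; deciding whether the treatment caused it takes the comparisons discussed in the rest of this chapter.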
The best statisticians, let alone practitioners on the firing
line (for example, physicians), often have trouble deciding when
a study is adequate or meaningfL Most of us cannot become
N
®
®
®
®
.,...-ow. *:.
®
®

14 CHAPTER 3
statisticians, but we can at least learn that there are studies and'
studies, and the unadorned c]aim~°1Ne made a study" or "We did'
an experiment" may not mean much. We can lcarn to ask more
pointed questions if we understand some basic concepts and
other facts about scientific studies.
These are some bedrock statistical concepts:
Probability
"Power" and numbers
Bias and confounders
Variability
Probability
Scientists cope with uncertainty by measuring probabilities.
Since all experimental results and all events can be influenced
by chance and almost nothing is 100 percent certain in science
and medicine and life, probabilities sensibly describe what has
happened and should happen in the future under similar
conditions. Aristotle said, "The probable is what usually happens," but
he might have added that the improbable happens more often
than most of us realize.
The accepted numerical expression of probability in evaluating
scientific and medical studies is the P (or probability) value.
The P value is one of the most important figures a reporter
should look for. It is determined by a statistical formula that
takes into account the numbers of subjects or events being
compared in order to answer the question, could a difference or
result this great or greater have occurred by chance alone? By more
precise definition, the P value expresses the probability that an
observed relationship or effect or result could have seemed to
occur by chance if there had actually been no real effect. A low P value
means a low probability that this happened, that a medical
treatment, for example, might have been declared beneficial
when in truth it was not.
Here is why the P value is used to evaluate results. A
scientific investigator first forms a hypothesis. Then he or she
commonly sets out to try to disprove it by what is called the null
hypothesis: that there is no effect, that nothing will happen. To
back the original hypothesis, the results must reject the null
hypothesis. The P value, then, is expressed either as an exact
number or as <.05, say, or >.05, meaning "less than" or
"greater than" a 5 percent probability that nothing has
happened, that the observed result could have happened just by
chance - or, to use a more elegant statistician's phrase, by random
variation.
By convention, a P value of .05 or less, meaning there are
only 5 or fewer chances in 100 that the result could have
happened by chance, is most often regarded as low. This value is
usually called statistically significant (though sometimes other
values are used). The unadorned term "statistically significant"
usually implies that P is .05 or less.
A higher P value, one greater than .05, is usually seen as not
statistically significant. The higher the value, the more likely the
result is due to chance.
In common language, a low chance of chance alone calling
the shots replaces the "it's certain" or "close to certain" of
ordinary logic. A strong chance that chance could have ruled
replaces "it can't be" or "almost certainly can't be."
Why the number .05 or less? Partly for standardization.
People have agreed that this is a good cutoff point for most
purposes. And partly out of old friend common sense. Frederick
Mosteller tells us that if you toss a coin repeatedly in a college
class and after each toss ask the class if there is anything
suspicious going on, "hands suddenly go up all over the room" after
the fifth head or tail in a row. There happens to be only 1
chance in 16 - .0625, not far from .05, or 5 chances in 100 -
that five heads or tails in a row will show up in five tosses, "so
there is some empirical evidence that the rarity of events in the
neighborhood of .05 begins to set people's teeth on edge."
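The 1-in-16 figure is easy to check by counting. The short Python sketch below is an illustration added for this purpose, not part of Mosteller's account; it assumes a fair coin, and the function name is made up for the example.

```python
from fractions import Fraction

def prob_same_face_run(tosses: int) -> Fraction:
    """Probability that `tosses` flips of a fair coin all show the same face."""
    # Only two sequences qualify (all heads, all tails) out of 2**tosses.
    return Fraction(2, 2 ** tosses)

p = prob_same_face_run(5)
print(p, float(p))  # 1/16 0.0625
```

Raising the number of tosses to six halves the chance again, which is one way to see why the room's suspicion grows so quickly.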
Another common way of reporting probability is to calculate
a confidence level, as well as a confidence interval (or confidence
limits or range). This is what happens when a political pollster
reports that candidate X would now get 50 percent of the vote
and thereby lead candidate Y by 3 percentage points, "with a
3-percentage-point margin of error plus or minus and a 95 percent
confidence level." In other words, Mr. or Ms. Pollster is 95
percent confident that X's share of the vote would be someplace
between 53 and 47 percent. Similarly, candidate Y's share might
be 3 percentage points greater (or less) than the figure predicted.
In a close election, that margin of error could obviously turn a
predicted defeat into victory. And that sometimes happens.
An important point in looking at the results of political polls
(and any other statements of confidence): In the reports we
read, the plus or minus 3 (or whatever) percentage points is
often omitted, and the pollster merely mentions a "3-point
margin of error." This means there is actually a 6-point range
within which the truth probably lurks.
The more people who are questioned in a political poll or
the larger the number of subjects in a medical study, the greater
the chance of a high confidence level and a narrow, and there-
fore more reassuring, confidence interval.
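How sample size narrows the interval can be sketched with the usual textbook approximation for a proportion. This is an added illustration, not the pollster's actual method: it assumes a simple random sample and the normal approximation, and the sample sizes are hypothetical.

```python
import math

Z_95 = 1.96  # standard normal multiplier for a 95 percent confidence level

def margin_of_error(share: float, sample_size: int, z: float = Z_95) -> float:
    """Approximate half-width of the confidence interval for a proportion."""
    return z * math.sqrt(share * (1.0 - share) / sample_size)

def interval(share: float, sample_size: int):
    """Lower and upper confidence limits around the reported share."""
    moe = margin_of_error(share, sample_size)
    return share - moe, share + moe

# Hypothetical poll sizes for a candidate reported at 50 percent.
for n in (100, 1067, 4000):
    low, high = interval(0.50, n)
    print(f"n={n:5d}: {low:.3f} to {high:.3f}")
```

Under these assumptions a sample of roughly a thousand people gives about the plus-or-minus 3 points quoted above, and quadrupling the sample only halves the margin.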
No matter how reassuring they sound, P values and confi-
dence statements cannot be taken as gospel, for .05 is not a
guarantee, just a number. There are several important reasons
for this.
All that P values measure is the probability that the results
might have been produced by some sneaky random process. In
20 results where only chance is at work, 1, on the average, will
have a reassuring-sounding but misleading P value of <.05.
One, in short, may be a false positive.
Dr. Marvin Zelen points out that there may be 6,000 to
10,000 clinical (medical) trials of cancer treatment under way
today, and if the conventional value of .05 is adopted as the
upper permissible limit for false positives, then every 100 studies
with no actual benefit may, on average, produce 5 false-positive
results. Hence, we may expect 50 false-positive results, on
average, for every 1,000 trials with no beneficial effects! Zelen in
fact has said, "We may now have reached an impasse in cancer
chemotherapy in which there are large numbers of false-positive
therapies in the clinic," leading physicians down many false
paths.
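The arithmetic behind these counts is a single multiplication. The sketch below is an added illustration that assumes every trial tests a treatment with no real effect and uses the conventional .05 cutoff; the function name is made up for the example.

```python
ALPHA = 0.05  # conventional cutoff for "statistically significant"

def expected_false_positives(null_trials: int, alpha: float = ALPHA) -> float:
    """Average number of trials that pass the cutoff by chance alone."""
    return null_trials * alpha

# The trial counts mentioned in the text: 20 results, 100 studies, 1,000 trials.
for trials in (20, 100, 1000):
    print(trials, "trials ->", expected_false_positives(trials), "false positives on average")
```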
Amazingly, most false positives probably remain unde-
tected. Scientists do not profit much professionally by reporting
negative results. Journal editors are not keen on publishing
them. Nor are scientists keen on doing costly and time-consum-
ing studies that merely confirm someone else's work, so "con-
firmatory studies are rare," Zelen reports.
Statistical significance alone does not mean there is a
cause and effect. Correlation or association is not causation.
Remember the rooster who thought his crowing made the sun rise?
Unless an association is so powerful and so constantly repeated
that the case is overwhelming, association is only a clue,
meaning more study or confirmation is needed.
To statisticians, incidentally, there is this important
difference between correlation and association: Association means
there is at least a possible relation between two variables. A
correlation is a measure of the association.
If the number of subjects is too small, an unimpressive P
value may simply mean that there were too few subjects to
detect something that might have shown an effect in more
subjects. Highly "significant" P values can sometimes adorn
negligible differences in large samples.
An impressive P value might also be explained by some
other variable or variables -other conditions or associations -
not taken into account.
Statistical significance does not mean biological,
clinical - that is, medical - or practical significance, though
inexperienced reporters sometimes see or hear the word "significant"
and jump to that conclusion, even reporting that the scientists
called their study "significant." Example: A tiny difference
between two large groups in mean hemoglobin concentration, or
red blood count (say, 0.1 g/100 mL, or a tenth of a gram per
100 milliliters) may be statistically significant yet medically
meaningless.
Eager scientists can consciously or unconsciously manipulate
the P value by failing to adjust for other factors, by choosing
to compare different end points in a study (say, condition on
leaving the hospital rather than length of survival), or by
choosing the way the P value is calculated or reported.
There are several mathematical paths to a P value, such as
the chi-square (χ²), t, F, r, and paired t tests. All may be
legitimate. But be warned. Dr. David Salsburg of Pfizer, Inc., has
written in the American Statistician of the unscrupulous
practitioner who "engages in a ritual known as 'hunting for P values'"
and finds ways to modify the original data to "produce a rich
collection of small P values" even if those that result from simply
comparing two treatments "never reach the magical .05."
"If you look hard enough through your data," contributes
an investigator at a major medical center, "if you do enough
subset analyses, if you go through 20 subsets, you can find
one" - say, "the effect of chemotherapy on premenopausal
women with two to five lymph nodes" - "with a P value less than
.05. And people do this."
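The investigator's point about 20 subsets can be put in numbers. The sketch below is an added illustration that assumes the subset tests are independent and that no real effect exists anywhere; real subsets usually overlap, so the figure is only a rough guide.

```python
def chance_of_spurious_hit(subsets: int, alpha: float = 0.05) -> float:
    """P(at least one P value below alpha) across independent null tests."""
    # Each test stays above the cutoff with probability (1 - alpha).
    return 1.0 - (1.0 - alpha) ** subsets

for k in (1, 5, 20):
    print(f"{k:2d} subsets: {chance_of_spurious_hit(k):.2f}")
```

Under these assumptions, someone who searches 20 subsets is more likely than not to find at least one "significant" result by chance alone.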
"Statistical tests provide a basis for probability statements,"
writes Dr. John Bailar, "only when the hypothesis is fully
developed before the data are examined.... If even the briefest
glance at a study's results moves the investigator to consider a
hypothesis not formulated before the study was started, that
glance destroys the probability value of the evidence at hand."
(At the same time, Bailar adds, "review of data for unexpected
clues ... can be an immensely fruitful source of ideas" for new
hypotheses "that can be tested in the correct way." And
occasionally "findings may be so striking that independent
confirmation ... is superfluous.")
A rather sophisticated - and possibly touchy - line of questioning
that some reporters might want to try if they're skeptical:
How did you arrive at your P value? Did you use the test planned in
advance in your protocol or study design, or did you apply several tests, then
report the best-sounding one?
And you may think of other questions.
The laws of probability also teach us to expect some unusual,
even impossible-sounding events.
We've all taken a trip to New York or London or someplace
and bumped into someone from home. The chance of that? I
don't know, but if you and I tossed for a drink every day after
work, the chance that I would ever win 10 times in a row is 1 in
1,024. Yet I would probably do so sometime in a four- or
five-year period. What I like to call the Law of Unusual Events -
statisticians call it the Law of Small Probabilities - tells us that a
few people with apparently fatal illnesses will inexplicably
recover, there will be some amazing clusters of cases of cancer or
birth defects that will have no common cause, and I may once
in a great while bump into a friend far from home.
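Both figures in the coin-tossing example can be examined with a short calculation. The sketch below is an added illustration, not the author's working: it assumes a fair coin and one toss per working day, and the 250 tosses per year is an assumed figure, so the multi-year results depend on that choice.

```python
def prob_run_of_wins(tosses: int, run: int = 10, p_win: float = 0.5) -> float:
    """P(at least one streak of `run` straight wins within `tosses` tosses)."""
    # state[k] = probability the current streak is k wins long and the
    # target streak has not yet been reached.
    state = [0.0] * run
    state[0] = 1.0
    reached = 0.0
    for _ in range(tosses):
        nxt = [0.0] * run
        for k, mass in enumerate(state):
            nxt[0] += mass * (1.0 - p_win)   # a loss resets the streak
            if k + 1 == run:
                reached += mass * p_win      # the streak is completed
            else:
                nxt[k + 1] += mass * p_win   # the streak grows by one
        state = nxt
    return reached

TOSSES_PER_YEAR = 250  # assumption: one toss per working day

print(prob_run_of_wins(10))  # exactly 10 tosses: 1 chance in 1,024
for years in (1, 4, 5):
    print(years, "years:", round(prob_run_of_wins(years * TOSSES_PER_YEAR), 3))
```

The first line reproduces the 1-in-1,024 figure; the loop shows how the chance of seeing such a streak at least once climbs as the tosses accumulate.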
In a large enough population such coincidences are not
unusual. They are the rule. They produce striking anecdotes
and often striking news stories. In the medical world they
produce unreliable, though often cited, testimonial or anecdotal
evidence. "The world is large," Vogt notes, "and one can find a
large number of people to whom the most bizarre events have
occurred. They all have personal explanations. The vast
majority are wrong."
"We [reporters] are overly susceptible to anecdotal
evidence," Philip Meyer writes. "Anecdotes make good reading,
and we are right to use them.... But we often forget to
remind our readers - and ourselves - of the folly of generalizing
from a few interesting cases.... The statistic is hard to
remember. The success stories are not."
A statistic to ask about is the denominator - the number of
people or, a statistician would say, the population or domain - in
whom such an event might happen. Zelen cites this example:
The chance of any youngster between ages five and nine
developing leukemia is 3 in 100,000 per year. In a school with 100
children of this age group, we would expect only 3 cases in 100
years. But in this nation with thousands of schools, we would
occasionally - such is chance - find schools with 3 or more cases
in a single year. "Then one is faced with the problem of
interpretation," Zelen says. "Is this one of those rare events that is surely
going to be observed? Or is it due to some causal factor?"
A reporter in this instance might ask a statistician at the
National Cancer Institute or a medical center, What is the
chance of such an event in such a population? How many
similar unusual events are probably never reported?
"Power" and Numbers
This gets us to another statistical concept: power.
Statistically, "power" means the probability of finding something if it's
there. Example: Given that there is a true effect, say a difference
between two medical treatments or an increase in cancer caused
by a toxin in a group of workers, how likely are we to find it?
Sample size confers power. Statisticians say, "Funny things
can happen in small samples without meaning very much" ...
"There is no probability until the sample size is there" ...
"Large numbers confer power" ... "Large numbers at least
make us sit up and take notice."
All this concern about sample size can also be expressed as
the law of large numbers, which says that as the number of cases
increases, the probable truth of a conclusion or forecast
increases. The validity (truth or accuracy) and reliability
(reproducibility) of the statistics begin to converge on the truth.
We already learned this when we talked about probability.
*There is another unrelated use of the word "power." Scientists commonly speak of
increasing or "raising" some quantity by a power of 2 or 3 or 100 or whatever. "Power"
here means the product you get when you multiply a number by itself one or more
times. Thus, in 2 x 2 = 4, 4 is the second power of 2, or to put it another way, there
are two 2's in your equation. This is commonly written 2² and known as 2 to the second
power or just 2 to the second. In 2 x 2 x 2 = 8, 2 has been raised to the third power.
When you think about 2¹⁰⁰ you see the need for the shorthand.
But by thinking of power as statisticians do-as a function of
both sample size and the accuracy of measurement, since that
too affects the probability of finding something-we can see that
if the number of treated patients is small in a medical study, a
shift from success to failure in only a few patients could dramati-
cally decrease the success rate.
If six patients have been treated with a 50 percent success
rate, the shift to the failure column of just one would cut the
success rate to 33 percent. And the total number is so small in
any case that the result has little reliability. The result might be
valid or accurate, but it would not be generalizable - it would
not have reliability until confirmed by careful studies in larger
samples. The larger the sample, and assuming there have been
no fatal biases or other flaws, the more confidence a statistician
would have in the result.
One canny science reporter, Lewis Cope, says,
I have my own "rule of two." If someone makes some numerical
claim, I look at the numbers, then see how much I might change the
finding by adding or subtracting two from any of the figures. For
example, someone says there are five cases of cancer in a community:
Would it seem meaningful if there were three?
Or if there were eight cases this year but four the year before - a
100 percent increase - I ask myself, "If I add two cases to last year's
total and subtract two from this year's, is there a chance things haven't
changed, except by chance?" This approach will never supplant refined
analysis. But by playing around with the numbers this way - I
sometimes try three instead of two - a reporter can often spot a potential
problem or error.
A statistician says, "This can help with small numbers but
not large ones." Mosteller contributes "a little trick I use a lot on
counts of any size." He explains, "Let's say some political unit
has 10,000 crimes or deaths or accidents this year. Has
something new happened? The minimum standard deviation [see
page 33] for a number like that is 100 - that is, the square root
of the original number. That means the number may vary by a
minimum of 200 every year without even considering growth,
the business cycle, or any other effect. This will supplement
your reporter's approach."
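Mosteller's trick amounts to a square root and a multiplication. The sketch below is an added illustration; it reads the "minimum of 200" as two standard deviations either way, which is an interpretation rather than something the quotation states, and the function name is made up.

```python
import math

def chance_band(count: int, n_sd: float = 2.0):
    """Range a yearly count could wander over by chance alone."""
    sd = math.sqrt(count)  # minimum standard deviation of a simple count
    return count - n_sd * sd, count + n_sd * sd

low, high = chance_band(10_000)
print(low, high)  # 9800.0 10200.0
```

A change from 10,000 to 10,150, on this reading, is within the range that needs no special explanation.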
Looking for error in reported results, statisticians try to spot
both false positives and false negatives. The false positive (or Type
I or alpha error in statistical language you may see) is to find a
result or effect where there is none. The false negative (or Type II
or beta error) is to miss an effect where there is one. The latter is
particularly common when there are small numbers. "There are
some very well conducted studies with small numbers, even five
patients, in which the results are so clear-cut that you don't have
to worry about power," says Dr. Relman. "You still have to
worry about applicability to a larger population, but you don't
have to doubt that there was an effect. When results are
negative, however, you have to ask, How large would the effect have
to be to be discovered?"
Many scientific and medical studies are underpowered -
that is, they include too few cases. "Whenever you see a negative
result," another scientist says, "you should ask, What is the
power? What was the chance of finding the result if there was
one?" One study found that an astonishing 70 percent of 71
well-regarded clinical trials that reported no effect had too few
patients to show a 25 percent difference in outcome. Half of the
trials could not have detected a 50 percent difference.
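What "too few patients" means can be sketched with a standard approximation for comparing two success rates. This is an added illustration, not the method of the study cited: it assumes equal-sized groups and a normal approximation, and the 40 and 50 percent success rates (a 25 percent relative difference) are hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(p_control: float, p_treated: float, n_per_arm: int,
                 alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing two proportions."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = math.sqrt(p_control * (1 - p_control) / n_per_arm
                   + p_treated * (1 - p_treated) / n_per_arm)
    shift = abs(p_treated - p_control) / se
    # Chance the test statistic lands beyond the cutoff in the true direction.
    return 1.0 - NormalDist().cdf(z_crit - shift)

# Hypothetical trial sizes, patients per treatment group.
for n in (25, 100, 400):
    print(f"{n:3d} per group: power about {approx_power(0.40, 0.50, n):.2f}")
```

Run with small groups, the sketch shows how a real difference of this size would usually be missed, which is the situation the negative trials above were in.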
A statistician scanned an article on colon cancer in a leading
journal. "If you read the article carefully," he said, "you will see
that if one treatment was better than the other - if it would
increase median survival by 50 percent, from five to seven and a
half years, say - they had only a 60 percent chance of finding it
out. That's little better than tossing a coin!"
The weak power of that study would be expressed numerically
as .6, or 60 percent. Scan an article's fine print or footnotes,
and you will sometimes find such a power statement. Most
authors still don't report one, but the practice is growing,
especially when results are negative.
How large is a large enough sample? One statistician
calculated that a trial has to have 50 patients before there is even a 30
percent chance of finding a 50 percent difference in results.
Sometimes large populations indeed are needed.¹⁰ If some
kind of cancer usually strikes 3 people per 2,000, and you
suspect that the rate is quadrupled in people exposed to substance
X, you would have to study 4,000 people for the observed
excess rate to have a 95 percent chance of reaching statistical
significance. The likelihood that a 30-to-39-year-old woman will
suffer a myocardial infarction, or heart attack, while taking an
oral contraceptive is about 1 in 18,000 per year. To be 95
percent sure of observing at least one such event in a one-year trial,
you would have to observe nearly 54,000 women.
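The 54,000 figure follows from the chance of seeing no events at all. The sketch below is an added illustration that assumes each woman's risk is independent and equal to 1 in 18,000 for the year; the function name is made up for the example.

```python
import math

def sample_needed(event_rate: float, assurance: float = 0.95) -> int:
    """Smallest n with P(at least one event among n people) >= assurance."""
    # P(no events in n people) = (1 - rate)**n; require that to be
    # no larger than 1 - assurance and solve for n.
    return math.ceil(math.log(1.0 - assurance) / math.log(1.0 - event_rate))

print(sample_needed(1 / 18_000))
```

Under these assumptions the answer comes out just under 54,000, in line with the text.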
Even the lack of an effect - statistically sometimes called a
zero numerator - can be a trap. Say someone reports, "We have
treated 14 leukemic boys for five years with no resulting
testicular dysfunction" - that is, zero abnormalities in 14. The question
remains, how many cases would they have had to treat to have
any real chance of seeing an effect? The probability of an effect
may be small yet highly important to know about.
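One way to see why zero in 14 settles little is to ask how large the true rate could be and still leave a reasonable chance of seeing no cases. The sketch below is an added illustration using a standard calculation, not something the source spells out; it assumes the patients are independent and share the same risk.

```python
def max_rate_consistent_with_zero(n_observed: int, confidence: float = 0.95) -> float:
    """Largest true rate for which 0 events in n still has
    probability of at least (1 - confidence)."""
    # Solve (1 - rate)**n = 1 - confidence for rate.
    return 1.0 - (1.0 - confidence) ** (1.0 / n_observed)

for n in (14, 140, 1400):
    print(f"0 events in {n:4d}: true rate could be as high as "
          f"{max_rate_consistent_with_zero(n):.1%}")
```

With only 14 patients, a complication affecting nearly one patient in five could go unseen by chance alone.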
All this means you must often ask, What's your denominator?
What's the size of your population? A disease rate of 10 percent in
20 individuals may not mean much. A 10 percent rate in 200
persons would be more impressive. A rate is only a figure.
Always try to get both the numerator and the denominator.
The most important rule of all about any numbers: Ask for
them. When anyone makes an assertion that should include
numbers and fails to give them, when anyone says that most
people, or even X percent, do such and such, you should ask,
And know that to a statistician a population does not necessarily mean a group of
people. Statistically, a population is any group or collection of pertinent units - units with
one or more pertinent characteristics in common - people, events, objects, records, [illegible],
or physiological values (like blood pressure readings). Statisticians also use the
term universe for a whole group of people or units under study.
What are your numbers? After all, some researchers reportedly
announced a new treatment for a disease of chickens by saying,
"33.3 percent were cured, 33.3 percent died, and the other one
got away."
Bias and Confounders
One scientist once said that lefties are overrepresented
among baseball's heavy hitters. He saw this as "a possible result
of their hemispheric lateralization, the relative roles of the two
sides of the brain." A critic who had seen more ball games said
some simpler covariables could explain the difference. When
they swing, left-handed hitters are already on the move toward
first base. And most pitchers are right-handers who throw most
often to right-handed hitters.
Scientist A was apparently guilty of bias, meaning the
introduction of spurious associations and error by failing to consider
other influential factors. The other factors may be called
covariables, covariates, intervening or contributing variables,
confounding variables, or confounders. A simpler term may be "other explanations."
Statisticians call bias "the most serious and pervasive
problem in the interpretation of data from clinical trials" ... "the
central issue of epidemiological research" ... "the most common
cause of unreliable data." Able and conscientious scientists
try to eliminate biases or account for them in some way. But not
everybody who makes a scientific, medical, or environmental
claim is that skilled. Or that honest. Or that all-powerful. Some
biases are unavoidable by the very difficulty of much research,
and the most insidious biases of all, says one statistician, are
"those we don't know exist."
Some biases may be uncovered by assiduous investigation.
A father noticed that every time one of his 11 kids dropped a
piece of bread on the floor, it landed with the buttered side up.
"This utterly defies the laws of chance," he exclaimed. Close
examination disclosed the cause: The kids were buttering their
bread on both sides.
I told this story to one statistician, who said, "I was once
called about a person who had won first, second, and third
prizes in a church lottery. I was asked to assess the probability
that this could have happened. I found out that the winner had
bought nearly all the tickets."
He had of course asked the obvious question for both scientist
and reporters: Could the relationship described be explained by other
factors?
Not everyone will tell you, of course, for bias is a pervasive
human failing. As one candid scientist is said to have admitted,
"I wouldn't have seen it if I hadn't believed it." Enthusiastic
investigators often tell us their findings are exciting. But they
may be so exciting that the investigators paint the results in
over-rosy hues.
Other powerful human drives - the race for academic
promotion and prestige, financial connections - can also create
conscious or unconscious conflicts of interest or attitudes that feed
bias. Dr. Thomas Chalmers of Mount Sinai Medical Center in
New York tells of a drug trial financed by a pharmaceutical
firm, in which both the head of the study committee and the
main statisticians and analysts were the firm's employees,
though not so identified in any credits. He tells of a study of oral
drugs for diabetes in which the fact that the first author had
previously published 14 articles on the subject, and in 7 had
acknowledged support by the drug manufacturers, was "not
known to the reader."
In contrast, Chalmers describes a study also financed by a
drug firm but with a contract specifying a study protocol
designed by independent investigators and monitored by an
outside board less likely to be influenced by a desire for a favorable
outcome. "It is never possible to eliminate" potential conflicts of
interest in biomedical research, he concludes, but they should be
disclosed so others can evaluate them.
Even a genius may be biased! Horace Freeland Judson of
Johns Hopkins University tells how Isaac Newton experimented
with prisms and lenses and developed a theory of color, light,
and the solar spectrum. He did not report seeing some dark
lines - absorption lines, which mark varying wavelengths - that
his instruments must have shown. A modern scientist argues
that Newton's theory, not his instruments, had no place for that
evidence: "To the observing scientist, hypothesis is both friend
and enemy."
For years technicians making blood counts were guided by
textbooks that told them two or more "properly" studied samples
from the same blood should not vary beyond narrow "allowable"
limits. Reported counts always stayed inside those limits. A
Mayo Clinic statistician rechecked and found that at least two
thirds of the time the discrepancies exceeded the supposed
limits. The technicians had been seeing what they had been told
to expect and discounting any differences as mistakes. This also
saved them from the additional labor of doing still more
counting.
Both the biased observer and the biased subject are common in
medicine. A researcher who wants to see a treatment result may
see one. A patient may report one out of eagerness to please the
researcher. There is also the powerful placebo effect. Summarizing
many studies, one scientist found that half the patients with
headaches or seasickness - and a third of those suffering from
coughs, mood changes, anxiety, the common cold, and even the
disabling chest pains of angina pectoris - reported relief from a
"nothing pill." A placebo is not truly a nothing pill; the mere
expectation of relief seems to trigger important effects within the
body. But in a careful study the placebo should not do as well as
a test medication; otherwise the test medication is no better than
a placebo.
Sampling bias is the bugaboo of both political polls and
medical studies. Say you want to know what proportion of the
populace has heart disease, so you stand on a corner and ask people
as they pass. Your sample is biased, if only because it leaves out
those too disabled to get around. Your problem, a statistician
would say, is selection. A political pollster who fails to build a valid
probability sample, easy when questioning only a thousand or
so people from coast to coast, has equally poor selection.
A doctor in a clinic or hospital with an unrepresentative
patient population-healthier or sicker or richer or poorer than
average-may report results that do not represent the popula-
tion as a whole. Veterans Administration hospitals, for example,
treat relatively few women; their conclusions may apply only to
the disproportionate number of lower-income men who
typically seek out the VA hospitals' free care. A celebrated Mayo or
Cleveland or Ochsner clinic sees both a disproportionate number
of difficult cases and a disproportionate number of patients
affluent and well enough to travel. The famed Kinsey reports
were valuable revelations of sexual behavior but flawed because
the samples consisted disproportionately of upper middle-class
men and women and of those willing to talk.
An investigator may also introduce bias by constraining, or
distorting, a sample - by failing to reveal nonresponse or by
otherwise "throwing away data." A surgeon cites his success rate
in those discharged from the hospital after an operation but
omits those who died during or just after the procedure. Many
people drop out of studies - sometimes they just quit - or they
are dropped for various reasons: They could not be evaluated,
they came down with some "irrelevant" disorders, they moved
away, they died. In fact, many of those not counted may have
had unfavorable outcomes had they stayed in the study.
Mosteller tells of a nationwide study of a possibly dangerous
anesthetic. The investigators relied on autopsy results at 38
hospitals. Unfortunately, only about 60 percent of the relevant
dead had been autopsied, and "anything could have been
explained by the missing 40 percent, so that part of the study
wound up with a handful of nothing."
The presence of significant nonresponse can often be
detected, when reading medical papers, by counting the number
of patients treated versus the number of untreated or differently
treated controls - patients with whom the treated patients are
compared. If the number of controls is strikingly greater in a
randomized clinical trial (though not necessarily in an
epidemiological or environmental study), there were probably many
dropouts. A well-conducted study should describe and account
for them. A study that does not may report a favorable treatment
result by ignoring the fate of the dropouts - a confounding
variable.
Age, gender, occupation, nationality, race, income,
socioeconomic status, health status, and powerful behaviors like
smoking are all possible confounding - and frequently ignored -
variables. In the 1970s, foes of adding fluoride to city
water pointed to crude cancer mortality rates in two groups of
10 U.S. cities. One group had added fluoride to water, the other
had not, and from 1950 to 1970 the cancer mortality rate rose
faster in the fluoridated cities. The National Cancer Institute
pointed out that the two groups were not equal: The difference
in cancer deaths was almost entirely explained by differences in
age, race, and sex. The age-, race-, and sex-adjusted difference
actually showed a small, unexplained lower mortality rate in the
fluoridated cities.
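The gap between a crude rate and an adjusted rate can be sketched in a few lines. The example below is only an illustration, assuming invented age-specific death rates and invented age mixes for two groups of cities. It adjusts for age alone, whereas the analysis described above also adjusted for race and sex. None of the numbers or names come from the National Cancer Institute.

```python
# Direct age standardization with made-up numbers (not the NCI's data).
# Rates are cancer deaths per 100,000 people per year in three age bands.
AGE_BANDS = ["under 45", "45 to 64", "65 and over"]

age_specific_rates = {
    "fluoridated": [20.0, 200.0, 1000.0],
    "non_fluoridated": [21.0, 205.0, 1010.0],
}

# Share of each group's population in each age band (hypothetical).
age_mix = {
    "fluoridated": [0.55, 0.28, 0.17],      # older population
    "non_fluoridated": [0.65, 0.25, 0.10],  # younger population
}

# One common "standard" age mix applied to both groups (hypothetical).
standard_mix = [0.60, 0.27, 0.13]


def weighted_rate(rates, weights):
    """Average the age-specific rates, weighting each band by its share."""
    return sum(rate * weight for rate, weight in zip(rates, weights))


# Crude rate: each group weighted by its own age mix.
crude = {g: weighted_rate(r, age_mix[g]) for g, r in age_specific_rates.items()}
# Adjusted rate: both groups weighted by the same standard age mix.
adjusted = {g: weighted_rate(r, standard_mix) for g, r in age_specific_rates.items()}

for group in age_specific_rates:
    print(f"{group}: crude {crude[group]:.1f}, age-adjusted {adjusted[group]:.1f}")
```

In this invented case the fluoridated group has the higher crude rate only because its population is older. Once both groups are weighted by the same age mix, its rate is slightly lower.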
If you look carefully at the fate of women taking birth
control pills, you find that advancing age and smoking are the
two great confounders. You must take both into account to find
the greatest clusters of ill effects. Smoking has been an important
confounder in studies of industrial contaminants like asbestos,
in which, again, the smokers suffer a disproportionate number
of ill effects.
A 1947 survey of Chicago lawyers showed that those who
had mere high school diplomas before entering legal training
earned 6.3 percent more, on the average, than college graduates.
The confounder here - the real explanation - was age. In
1947 there were still many older lawyers without college
degrees, and they were simply older, on the average, and hence
more established.
Occupational studies often confront another seeming
paradox: The workers exposed to some possible adverse effect turn
out to be healthier than a control group of persons without such
exposure. The confounder: the well-known healthy-worker effect:
Workers tend to be healthier and live longer than the population
in general.
Some studies of workers in steel mills showed no overall
increase in cancer, despite possible exposures to various
carcinogens. It took a look at black workers alone to find excess cancer.
They commonly worked at the coke ovens, where carcinogens
were emitted. This was a case where the population had to be
stratified, or broken up in some meaningful way, to find the facts.
Such findings in blacks often may be falsely ascribed to race or
genetics, when the real or at least the most important contributing
or ruling variables - to a statistician, the independent
variables - are occupation and the social and economic plights that
put blacks in vulnerable settings. The excess cancer is the
dependent variable, the result.
"In a two-variable relationship," Dr. Gary Friedman
explains, "one is usually considered the independent variable,
which affects the other or dependent variable." Take the fact
that more people get colds in winter. Here weather is commonly
seen as the underlying, or independent variable, which affects
incidence of the common cold, the dependent variable. Actually,
of course, some people, like children in school who are
constantly exposed to new viruses, are more vulnerable to colds
than others. In the case of these children, then, as in the case of
the black workers at the coke ovens, there is often more than
one independent variable. Also, some people think that an
important underlying reason for the prevalence of colds in winter
may be that children are congregated in school, giving colds to
each other, thence to their families, thence to their families'
coworkers, thence to the coworkers' families, and so on. But
cold weather - and home heating? - may still figure, perhaps by
drying nasal passages and making them more vulnerable to
viruses.
The search for true variables is obviously one of the main
pursuits of the epidemiologist, or disease detective - or of any
physician who wants to know what has affected a patient, or of
any student of society who seeks true causes. Like colds, many
medical conditions, such as heart disease, cancers, and probably
mental illness, have multiple contributing factors. Where many
known, measurable factors are involved, statisticians can use
mathematical techniques - the terms you will see include multiple
regression, multivariate analysis, and discriminant analysis and factor,
cluster, path, and two-stage least-squares analysis - to relate all the
variables and try to find which are the truly important
predictors. Yet some situations, like the striking decline in U.S. heart
disease mortality in recent years, defy such analyses. These
years have seen several major changes in American life that
may play a role: less smoking among men, consumption of a
leaner diet, more recreational exercise (though more sedentary
work). Medical care is far better, including the treatment of
hypertension, which disposes people to heart disease. Many of
these variables cannot be well measured and the effect of some
is debatable, so - a common situation in science - the truth
remains uncertain.
Variability
Doctors always say, "Most things are better in the morning,"
and they're mostly right. Most chronic or recurring conditions
wax and wane. We tend to wake up at night when the condition
is at its worst. Then, no matter what is done by way of
treatment the next day, the odds are that we'll feel better.
This is regression toward the mean: the tendency of all values in
every field of science - physical, biological, social, and
economic - to move toward the average. Tall fathers tend to have
shorter sons, and short fathers, taller sons. The students who get
the highest grades on an exam tend to get somewhat lower ones
the next time. The regression effect is common to all repeated
measurements.
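The exam example can be imitated with a small simulation. This is a sketch under a stated assumption: each score is modeled as a stable ability plus independent luck on the day, a model chosen here for illustration and not taken from the text. The score scale, seed, and group size are likewise invented.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

N_STUDENTS = 10_000
# Hypothetical model: score = stable ability + independent luck on the day.
ability = [rng.gauss(70, 10) for _ in range(N_STUDENTS)]
exam_1 = [a + rng.gauss(0, 10) for a in ability]
exam_2 = [a + rng.gauss(0, 10) for a in ability]

# Pick the top 10 percent of students on the first exam.
ranked = sorted(range(N_STUDENTS), key=lambda i: exam_1[i], reverse=True)
top = ranked[: N_STUDENTS // 10]


def mean(values):
    return sum(values) / len(values)


top_first = mean([exam_1[i] for i in top])
top_second = mean([exam_2[i] for i in top])
everyone_second = mean(exam_2)

print(f"top group, first exam:  {top_first:.1f}")
print(f"top group, second exam: {top_second:.1f}")
print(f"all students, second:   {everyone_second:.1f}")
```

The top scorers on the first exam still beat the overall average on the second, but by less. Part of their first score was luck, and luck does not repeat.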
Regression is part of an even more basic phenomenon:
variation, or variability. Virtually everything that is measured
varies from measurement to measurement. When repeated, every
experiment has at least slightly different results. Take a patient's
blood pressure, pulse rate, or blood count several times in a
row, and the readings will be somewhat different. Take them at
different times of day or on different days, and the readings may
vary greatly.
The important reasons? In part, fluctuating physiology, but
also measurement errors, the limits of measurement accuracy,
and observer variation. Examining the same patient, no two
doctors will report exactly the same results, and the results may
be grossly different. If six doctors examine a patient with a faint
heart murmur, only one or two may have the skill or keen
hearing to detect it. Experimental results so typically differ from
one time to the next that scientific and medical fakers - a Boston
cancer researcher, for example - have been detected by the
unusual regularity of their reported results, with numbers agreeing
too well and the same results appearing time after time, with not
enough variation from patient to patient.
Biological variation is the most important cause of variation in
physiology and medicine. Different patients, and the same
patients, react differently to the same treatment. Disease rates
differ in different parts of the country and among different
populations, and - alas, nothing is simple - there is natural variation
within the same population.
Every population, after all, is a collection of individuals,
each with many characteristics. Each characteristic, or variable,
such as height, has a distribution of values from person to person,
and - if we would know something about the whole
population - we must have some handy summaries of the distribution.
We can't get much out of a list of 10,000 measurements, so we
need single values that summarize many measurements.
Enter here the familiar average or, more exactly, the mean,
median, and mode. These and a few other measures can give us
some idea of the look of the whole and its many measurable
properties, or parameters.
When most of us speak of an average, we mean simply the
mean, or arithmetic average, the sum of all the values divided by the
number of values. The mean is no mean tool; it is a good way
to get a typical number, but it has limitations, especially when
there are some extreme values. There is said to be a memorial
in a Siberian town to a fictitious Count Smerdlovski, the world's
champion at Russian roulette. On the average he won, but his
actual record was 73 and 1.
If you look at the average salary in a hospital, you will not
know that half the personnel may be working for the minimum
wage, while a few hundred persons make $100,000 or more a
year. You may learn more here from the median, the figure that
divides a population into two equal halves. The median can be
of value when a group has a few members with extreme values,
like the 400-pounder at an obesity clinic whose other patients
weigh from 180 to 200 pounds. If he leaves, the patients' mean
weight might drop by 10 pounds, but the median might drop
just 1 pound.
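The obesity-clinic arithmetic is easy to check. The sketch below assumes a clinic of 20 patients spread evenly from 180 to 199 pounds plus the one 400-pound patient. The head count and the even spread are assumptions made for illustration, since the text gives only the weight range.

```python
from statistics import mean, median

# Hypothetical clinic: 20 patients weighing 180 to 199 pounds, plus one
# 400-pound patient. The head count is an assumption, not from the text.
others = list(range(180, 200))
with_outlier = others + [400]

# How far each summary falls when the 400-pound patient leaves.
mean_drop = mean(with_outlier) - mean(others)
median_drop = median(with_outlier) - median(others)

print(f"mean drops by {mean_drop:.1f} pounds")
print(f"median drops by {median_drop:.1f} pounds")
```

With these invented weights the mean falls by about 10 pounds when the heaviest patient leaves, while the median moves by half a pound.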
The most frequently occurring number or value in a
distribution is called the mode. When the median and the mode are
about the same, or even more when mean, median, and mode
are roughly equal, you can feel comfortable about knowing the
typical value.
You still need to know something about the exceptions, in
short, the dispersion (or spread or scatter) of the entire
distribution. One measure of spread is the range. It tells you the lowest
and highest values. It might inform you, for example, that the
salaries in that hospital range from $10,000 to $250,000.
You can also divide your values into 100 percentiles so you
can say someone or something falls into the 10th or 71st
percentile, or into quartiles (fourths) or quintiles (fifths). One useful
measure is the interquartile range, the interval between the 75th
and 25th percentiles - this is the distribution in the middle,
which avoids the extreme values at each end. Or you can divide
a distribution into subgroups - those with incomes from $10,000
to $20,000, for example, or ages 20 to 29, 30 to 39, and so on.
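The range and the interquartile range can be computed side by side. The following is a minimal sketch using nine invented salaries, and it relies on Python's standard `statistics.quantiles` function. Other percentile conventions exist and can give slightly different cut points.

```python
from statistics import quantiles

# Hypothetical hospital salaries, in thousands of dollars.
salaries = [10, 12, 14, 15, 18, 22, 30, 60, 250]

# Range: distance from the lowest value to the highest.
value_range = max(salaries) - min(salaries)

# quantiles(..., n=4) returns the three cut points between quarters:
# the 25th, 50th, and 75th percentiles.
q1, q2, q3 = quantiles(salaries, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: spread of the middle half

print(f"range: {value_range}")
print(f"25th, 50th, 75th percentiles: {q1}, {q2}, {q3}")
print(f"interquartile range: {iqr}")
```

The single $250,000 salary stretches the range enormously, while the interquartile range describes only the middle half of the staff and ignores it.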
All these values can easily be plotted. With many of the
things that scientists, economists, or others measure - IQs, for
example, and other test scores - we typically tend to see a
familiar, bell-shaped normal distribution, high in the middle, low at each
end, or tail. This is the classic Gaussian curve, named after the
19th-century German mathematician Karl Friedrich Gauss.
But you may also find that the plot has two or more peaks or
clusters, a bimodal or multimodal distribution.
A widely used number, the standard deviation, can reveal a
great deal. No matter how it sounds, it is not the average
distance from the mean but a more complex figure.* Unlike the
range, this handy figure takes full account of every value to tell
how spread out things are - how dispersed the measurements.
In what one statistician calls a truly remarkable generalization,
in most sets of measurement "and without regard to what is
being measured," only 1 measurement in 3 will deviate from the
average by more than 1 standard deviation, only 1 in 20 by
more than 2 standard deviations, and only 1 in 100 by more
than 2.57 standard deviations.
"Once you know the standard deviation in a normal,
bell-shaped distribution," according to Thomas Louis, "you can draw
the whole picture of the data. You can visualize the shape of the
curve without even drawing the picture, since the larger the
variation of the numbers, the larger the standard deviation and
the more spread out the curve - and vice versa."
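The 1-in-3, 1-in-20, and 1-in-100 figures quoted above can be checked against an exactly normal curve. The sketch below assumes a perfectly bell-shaped distribution, which real measurements only approximate, and uses `NormalDist` from Python's standard library.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def share_beyond(k):
    """Fraction of a normal distribution lying more than k SDs from the mean."""
    return 2 * (1 - standard_normal.cdf(k))


for k in (1, 2, 2.57):
    share = share_beyond(k)
    print(f"more than {k} SD from the mean: {share:.4f}")
```

For a true normal curve the three shares come out near 0.32, 0.05, and 0.01, in line with the rounded rule of thumb.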
*There is more than one way to calculate it, and there are several variations,
depending on the statistician's purpose. A common one is to add the squares of the
differences between each number and the mean, then divide that number by the total
number of squares, often referred to as the [illegible] (minus 1 if you're looking at a sample
of a population rather than the whole population). Then calculate the square root of the
result.
Sometimes statisticians calculate the standard deviation of the mean - this because the
mean, being an average, is less variable than single measurements. Some call this the
standard error or standard error of the mean.
All these figures are measures of dispersion.
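The recipe in the note above can be written out step by step. This is an illustrative sketch with eight invented measurements. The function name and the `sample` switch are choices made here, and the result is compared with Python's own `pstdev` and `stdev` as a check.

```python
import math
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up measurements


def standard_deviation(values, sample=False):
    """Square root of the averaged squared differences from the mean."""
    m = sum(values) / len(values)
    sum_of_squares = sum((v - m) ** 2 for v in values)
    # Divide by n for a whole population, by n - 1 for a sample of one.
    divisor = len(values) - 1 if sample else len(values)
    return math.sqrt(sum_of_squares / divisor)


population_sd = standard_deviation(data)
sample_sd = standard_deviation(data, sample=True)
# Standard error of the mean: the sample SD shrunk by the square root of n.
standard_error = sample_sd / math.sqrt(len(data))

print(f"population SD: {population_sd:.3f} (library: {pstdev(data):.3f})")
print(f"sample SD:     {sample_sd:.3f} (library: {stdev(data):.3f})")
print(f"standard error of the mean: {standard_error:.3f}")
```

The standard error is smaller than either standard deviation, which is the footnote's point that a mean varies less than single measurements do.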
Example: If the average score of all students who take the
SAT college entrance test is relatively low and the spread - the
standard deviation - relatively large, this creates a very
long-tailed, low-humped curve of test scores, ranging, say, from
around 300 to 1500. But if the average score of a group of
brighter students entering an elite college is high, the standard
deviation of the scores will be less and the curve will be
high-humped and short-tailed, going from maybe 900 to 1500.
"If I just told you the means of two such distributions, you
might say they were the same," another scientist says. "But if I
reported the means and the standard deviations, you'd know
they were different, with a lot more variations in one."
From a human standpoint, variation tells us that it takes
more than averages to describe individuals. Biologist Stephen
Jay Gould learned in 1982 that he had a serious form of cancer.
The literature told him the median survival was only eight
months after discovery. Three years later he wrote in Discover,
"All evolutionary biologists know that means and medians are
the abstractions," while variation is "the reality," meaning "half
the people will live longer" than eight months.
Since he was young, since his disease had been diagnosed
early, and since he would receive the best possible treatment, he
decided he had a good chance of being at the far end of the
curve. He calculated that the curve must be skewed well to the
right, as the left half of the distribution had to be "scrunched up
between zero and eight months, but the upper right half [could]
extend out for years." He concluded, "I saw no reason why I
shouldn't be in that small tail.... I would have time to think,
to plan and to fight." Also, since he was being placed on an
experimental new treatment, he might if fortune smiled "be in
the first cohort of a new distribution with . . . a right tail
extending to death by natural causes at advanced old age."
Statistics cannot tell us whether fortune will smile, only that
such reasoning is sound.

4
Studies, Good and Bad
Why think? Why not try the experiment?
-John Hunter, 18th-century British surgeon
Sit down before fact as a little child, be prepared to give up every preconceived
notion, follow humbly wherever and to whatever abysses nature leads, or you
shall learn nothing.
-Thomas Henry Huxley
This is the part I always hate.
THERE is no disease that strikes older people more
tragically than Alzheimer's disease, which makes a useless tangle of
the brain. At a prestigious New England university a research
team imaginatively inserted catheters into the skulls of four
patients aged 64 to 73 to deliver a continuous infusion of either a
theoretically promising drug or, alternately, an ineffectual saline
solution for comparison.
After 18 months the investigators published a paper saying
that according to observations by the patients' families, three
patients showed marked improvement and the fourth at least
held his own. Fascinating, of course. Some reporters learned of
the work and began inquiring. The investigators let a TV crew
do a story and also held a news conference, with one patient
brought forth for on-camera testimonials. Except for some
newspapers that decided to print nothing, the story flew far and
wide.
The head investigator, a chief resident in neurosurgery,
cautioned that the results, though encouraging, were "very
early" and "certainly do not prove this is an effective treatment."
He advised healthy skepticism. But headlines unequivocally
read: "Alzheimer's Test Found Successful," "Alzheimer's: A New
Promise," "First Breakthrough Against Alzheimer's," "Pump
Offers Hope," "Possible Alzheimer's Cure."
Within two months the medical center logged 2,600 phone
calls, mainly from desperate families, and critics began asking
why a press conference had been held, since a study of only four
patients - with unblinded investigators getting their assessments
from hopeful families - meant little.
Harvard's Dr. Jay Winsten concluded that "the decision to
hold a press conference ... far outweighed in impact the
modulating effect of the investigators' qualifying language. The
visual impact of [one] patient's on-camera testimonials all but
guaranteed that TV coverage would oversell the research,
despite any qualifying language."
When dubious claims are made - about Alzheimer's, a new
cancer drug, a possible AIDS cure - and the claims get widely
reported, there is commonly a lot of postmortem clucking and
soul-searching among reporters and editors. Then someone else
makes some sensational claim, and the same thing may happen
all over again.
The biggest error in medical science, according to Dr.
Thomas Chalmers, is "the uncontrolled pilot study in which the
investigators try a treatment on 10 patients, and if it seems to
work ... are tempted to report it" to fellow scientists, let alone
the media.
All science is only a stab at the truth. Even with the best of
statistics, "We scientists don't know how to tell the whole truth,"
Mosteller reminds us. Outside this honest limitation lie vast
realms of inadequate science with plausible-sounding yet shaky
STUDIES, GOOD AND BAD 37
statistics. A French physician, Pierre Charles Alexandre Louis,
said 150 years ago, "The only reproach which can be made to
the numerical method" is that it "requires much more labor and
time than the most distinguished members of our profession"
often give it. "Some days," says one modern statistician, "I think
every idiot in the country who can put his hands on a computer
program thinks he's a statistician."
The big problems of statistics, say its best practitioners,
have little to do with computations and formulas. They have to
do with judgment, we're told, with how to design a study, how
to conduct it, then analyze and interpret the results. In a day of
frenzied media competition for the public's eye and ear - and
many chances to do harm by shaky reporting - journalism too
calls for sophisticated judgment. How, then, can we have some
hope of telling which studies seem credible, which we should
report?
A fundamental principle is that every conscientiously
conducted study has a careful design: a method or plan of attack to
include the right kind and number of patients or petri dishes
and to try to eliminate bias. Different problems require different
methods, and one of the most basic questions in science is, Can
this kind of experiment, this design, yield the answer?
This is not a simple question for a reporter to answer, but
there is much we can know. What kinds of studies, what kinds
of numbers and controls and methods, should we look for?
Experiments versus Seductive Anecdotes
Students and eggs can be graded, citizens and cities can be
credit-rated, and scientific evidence can be weighed according to
what has been called a hierarchy of evidence. Some kinds of
studies carry little weight, some more, some a great deal.
Science and medicine started with anecdotes, unreliable as far
as generalization is concerned, yet provocative. Anecdotes
matured into systematic observation, the most ancient form of
science. Observation told the ancients much about the stars, it
told the pharaohs' physicians much about the sick, and it is still
important, for simple "eyeballing" has developed into data
collection and the recording of case histories. These are respectable, yea,
indispensable methods yet still only one part of science. Case
histories may not be typical, or they may reflect the beholder.
Medicine continues to be plagued by Big Authorities who insist
"I know what I see."
There can be useful, even inspired, observation and
analysis of natural experiments: Excess fluoride in some waters hardened
teeth, and this observation led to fluoridation of drinking water
to prevent tooth decay. There are also man's inadvertent
experiments, disastrous and benign, to be studied. Hiroshima
triggered wide analysis of the effects of nuclear radiation, invaluable
yet frustrating because there were no good measures of exposure
levels, a gap that has caused confusion and controversy ever
since.
In 1585 or so, Galileo dropped those weights from a tower
and helped invent the scientific experiment, a study in which the
experimenter controls the conditions - controlled conditions are
the heart of the experimental method - and records the effect.
Experiments on objects, animals, germs, and people matured
into the modern experimental study, in which the experimenter
typically changes only one or some other planned number of
variables to see the outcome.
Clinical Trials
The experimental method is the essence of experimental
medicine's current "gold standard": the controlled, randomized
clinical trial. At its best, the investigator tests a treatment or drug or
some other intervention by randomly selecting at least two
comparable groups, the experimental group that is tested or treated and
a control group that is observed for comparison.
True clinical trials are expensive and difficult. It has been
estimated that of 100 scheduled trials, 60 are abandoned, not
implemented, or not completed, whether for lack of funds,
difficulty in recruiting or keeping patients, toxicity or other
problems, or, sometimes, rapid evidence of a difference in effect
(making continued denial of effective treatment to a control
group unethical). Another 20 trials produce no noteworthy
results, and just 20, results worth publishing. Clinical trials
nonetheless are called the strongest, most precise, most decisive way
to evaluate medical interventions and learn true causation.
Randomized clinical trials proved that new drugs could cut the
heart attack death rate, that treating hypertension could prevent
strokes, and that polio, measles, and hepatitis vaccines worked.
No doctor, observing a limited number of patients, could have
shown these things.
Types of clinical studies include the following:
Among the most reliable are parallel studies comparing
similar groups given different treatments, or a treatment versus
no treatment. But such studies are not always possible.
In crossover studies the same patients get two or more
treatments in succession and act as their own controls. Similarly,
self-controlled studies evaluate an experimental treatment by control
observations during periods of no treatment or of some standard
treatment. There are pitfalls here. Treatment A might affect the
outcome of treatment B, despite the usual use of a washout period
between study periods. Patients become acclimated: They may
become more tolerant of pain or side effects or, now more
health-conscious, may change their ways. The controls - the
patients in a control group - don't always behave in parallel
studies either: In one large-scale trial of methods to lower blood
cholesterol and risk of heart disease, many controls adopted
some of the same methods - quitting cigarette smoking, eating
fewer fats - and reduced their risk too.
Investigators often use historical controls (meaning
comparison with old records: historically the cure rate has been 30
percent, say, and the new therapy cures 60 percent) or other
external controls (such as comparison with other studies). These

controls are often misleading - the groups compared are
frequently not comparable, the treatments may have been given
by different methods - but they are still at times useful.
What Makes a Study Honest?
Obviously all studies, including the best, have potential
pitfalls:
Lack of adequate controls is fatal if you really want to put the
results in the bank.
The group or sample studied, 10 people or 10,000, must be
large enough to get a valid result and representative enough to
apply to a larger population. Because people vary so widely in
their reactions, and a few patients can fool you, fair-sized groups
of patients are usually needed. And enough of the right kind of
subjects are needed for a suitable sample. Picking patients for a
medical study is no different from picking citizens to be
questioned in a political poll. In both, a sample is studied, and
inferences - the outcome of an election, the results in patients in
general - are made for a larger population.
To get a large enough sample, medical researchers more
and more try to conduct multicenter trials, which are appealing
because they can include hundreds of patients, but expensive
and tricky because one must try to maintain similar patient
selection and quality control at 10 or 100 institutions. Successful
multicenter trials established the value of controlling
hypertension to prevent strokes. They demonstrated the strong
probability that less extensive surgery is as effective as more drastic
surgery for many breast cancers.
The sample should be randomized - divided by some random
method into comparable experimental and control groups.
Randomization can easily be violated. A doctor assigning patients to
treatment A or B may, seeing a particular type of patient, say or
think, "This patient will be better on B."
If treatment B has been established as better than A, there
should be no random study in the first place and certainly no
study of that doctor's patient. When randomization is violated,
"the trial's guarantee of lack of bias goes down the drain," says
one critique. As a result, patients who consent to randomization
are often assigned to study groups according to a list of
computer-generated random numbers.
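Assignment from a computer-generated list can be sketched as follows. This is a simplified illustration and not how any particular trial was run: the patient IDs, the seed, and the simple shuffle-and-split scheme are all assumptions. Real trials often use more elaborate schemes and keep the list hidden from the doctors who enroll patients.

```python
import random


def randomize(patient_ids, seed):
    """Shuffle the patients with a seeded generator, then split in half."""
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # First half of the shuffled list gets treatment A, the rest B.
    return {"A": sorted(shuffled[:half]), "B": sorted(shuffled[half:])}


patients = [f"P{n:03d}" for n in range(1, 21)]  # hypothetical IDs
groups = randomize(patients, seed=1985)

print("treatment A:", groups["A"])
print("treatment B:", groups["B"])
```

Because the list is fixed in advance by the generator, no doctor's hunch about which patient "will be better on B" can steer the assignment.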
To combat bias - the influence of confounding variables -
and get answers applicable to various populations, the sample or
study population must often be stratified, or separated into
groups by age, sex, socioeconomic status, and so on. Failure to
stratify can hide true associations. The role of high-absorbency
tampons in toxic shock syndrome was clarified only when the
cases were broken down by precise type of tampon used.
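How a pooled figure can hide a real excess is shown in the sketch below. The counts are invented and do not come from the tampon or steel-mill studies. The two job groups and the reference rate are hypothetical, chosen so that the pooled rate looks unremarkable.

```python
# Hypothetical counts: (workers, cancer cases) by job location.
strata = {
    "coke ovens": (1_000, 19),
    "rest of mill": (9_000, 81),
}
REFERENCE_RATE = 100.0  # assumed cases per 10,000 in a comparison population


def rate_per_10k(workers, cases):
    return cases / workers * 10_000


# Pooled rate: all workers lumped together.
total_workers = sum(w for w, _ in strata.values())
total_cases = sum(c for _, c in strata.values())
pooled_rate = rate_per_10k(total_workers, total_cases)

# Stratified rates: one rate per job location.
stratum_rates = {name: rate_per_10k(w, c) for name, (w, c) in strata.items()}

print(f"pooled: {pooled_rate:.0f} per 10,000 (reference {REFERENCE_RATE:.0f})")
for name, rate in stratum_rates.items():
    print(f"{name}: {rate:.0f} per 10,000")
```

Pooled, these workers match the reference rate. Split by job, the smaller group shows nearly double the rate, an excess masked by the lower rate among everyone else.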
The identification of important subcategories of patients
can be tricky indeed. A study of open-heart surgery patients
may fail to separate out those who had to wait for their surgery.
But some patients die waiting, and those left are relatively
stronger patients who do better, on the average, than those
treated immediately after diagnosis.
We reporters may also fail to pay attention to stratification,
or distribution. In early 1985 the President's Council of
Economic Advisers reported that - to quote the page-one lead in a
major newspaper - "elderly Americans have achieved economic
parity with the rest of the population and no longer are a
disadvantaged group." Not for several paragraphs, now on an inside
page, did the story note that "there's a lot of variability," and
older people are also "more likely ... to have members with
incomes below the average of their age group." In short, there
are still many elderly trapped in poverty.
To combat bias in investigators or patients, studies should be
blinded - to the extent feasible, single-, double-, or, best of all,
triple-blinded, so that neither the doctors nor the nurses administering
a treatment nor the patients nor those who assess the results
know whether today's pill is treatment A, treatment B, or an
ineffective placebo. Otherwise, a doctor or patient who yearns for
a good result may see or feel one when the "right" drug is given.
There is a tale of an overzealous receptionist who, knowing

which patients were getting the rr.al drug and not the placebo;
was so encrwraging,to these patients that they began saying'they
felt good, wiIly-nilly.'
Barring observant receptionists, the use of a placebo - from
the Latin meaning "I shall please" - may help maintain blindness.
Placebos actually give some relief in a third of all patients,
on the average, in various conditions. The effect is usually
temporary, however, and a truly effective drug ought to work
substantially better than the placebo.
Blinding is often impossible or unwise. Some treatments
don't lend themselves to it, and some drugs quickly reveal
themselves by various effects. But an unblinded test is a weaker test.
Finally, what makes a study honest is honesty. John Bailar
warns of deliberate or careless deceptions that seem to be
universally accepted today, practices that sometimes have much
value but at other times are "inappropriate and improper and,
to the extent that they are deceptive, unethical." Among them:
the selective reporting of findings, leaving out some that might
not fit the conclusion; the reporting of a single study in multiple
fragments, when the whole might not sound so good; and the
failure to report the low power of some studies, their inability to
detect a result even if one existed.
Dr. Charles Moertel of the Mayo Clinic says,
Probably the majority of cancer patients treated with chemotherapy
today are receiving regimens that have not been proved effective by
randomized trial. ... Many articles published in our major journals
make claims for fantastic therapeutic accomplishments with no
randomized controls. ... Many, if not most, of the randomized studies
... are of such poor quality that their results are unbelievable. ...
Precious few have withstood the scrutiny of carefully designed
confirmatory scientific study.
He calls a multitude of poor methods statistical legerdemain:
"the games we play, trying to squeeze out that little bit of
breakthrough." Why the pressure to play them? "Salvation," Dr.
David Salsburg answers. "Fruit in this world (increases in salary,
prestige, invitations to speak) and beyond this life (continual
references in the citation index)."
Epidemiology: Hippocrates to AIDS
Clinical studies deal with patients. Epidemiology deals with
populations, which sometimes are large groups of patients.
Epidemiology seeks the causes of both health and disease by placing
a population under its own kind of microscope, the
epidemiological investigation.
Epidemiological studies in many ways parallel clinical studies -
some studies are both - and are subject to many of the
same pitfalls and rules, like avoiding bias and stratifying to get
the right answers about the right subgroups. An old saw, in fact,
goes, an epidemiologist is a physician broken down by age and
sex.
Epidemiology in its early days was concerned wholly with
epidemics of typhoid, smallpox, and other infections. But
epidemiologists today also ask, "What should we eat and how should
we live to stay healthy?" and they study large groups to see how
the healthiest and unhealthiest live. Hippocrates has been called
the first environmentalist because he observed that it was
healthier to live in high places than in low ones. Anticipating
today's environmentalists, he blamed bad air and bad water and
may have been partly right. But he failed to stratify; otherwise
he might have noticed that the people who lived high were also
wealthier and better nourished than those who lived low.
In 1740 Percival Pott scored a famous epidemiological
success by observing the high rate of scrotum cancer in London's
chimney sweeps and correctly blaming it on their exposure
to soot - burned organic material, much like a smoked cigarette.
A century later, John Snow, plotting London cholera
cases on a map and noting a cluster around one source of
drinking water, removed the handle from the now famed Broad
Street pump and helped end a deadly epidemic. The 19th-century
French advocate of statistical methods, Pierre Louis,
observed hospital patients and helped stop the use of bleeding as
a treatment. Ignaz Semmelweis showed that doctors' dirty
hands transmitted deadly childbed fever to mothers.
Modern epidemiologists successfully indicted smoking as a
cause of lung cancer and heart disease and identified the
association of fats and cholesterol with clogging of the arteries. They
evaluate vaccines, assess new methods of health care delivery,
and track down the causes of new scourges like AIDS, toxic
shock syndrome, and Legionnaires' disease, all by several
methods. All are valuable. All are full of traps.
Epidemiology, like all of science, started with observational
studies, and these remain important. They are weak and
uncertain, we have noted, when it comes to determining cause and
effect. Yet observation is how we first learned of the unfortunate
effects of toxic rain, Agent Orange, cigarette smoking, and
many sometimes helpful, sometimes harmful medications - and
of certain sexual practices and addicts' use of dirty needles on
AIDS.
Some observational studies are simply descriptive - describing
the incidence, prevalence, and mortality rates of various
diseases, for example. Other, analytic studies seek to analyze or
explain: the Seven-Country Study, for example, that helped
associate high meat and dairy fat and cholesterol consumption
with excess risk of coronary heart disease. Ecological studies look
for links between environmental conditions and illness. Human
migrations - like that of the Japanese who come to the United
States, eat more fat, and get more disease than they did in
Japan - are among valuable natural experiments.
The simplest observational measurement is a count. Sampling
is just a more sophisticated kind of count. You can't count
or question everybody, so you seek a sample that represents the
whole. Many epidemiological surveys rely on samples - among
them government surveys of health and nutritional habits.
Samples and surveys often use questionnaires to get information.
A sample or survey is never more than a snapshot of the
scene at the moment; it can't portray an ever-changing picture
unless frequently repeated. Questionnaires may be no better
than the quality of the answers, written or verbal. One survey
compared patients' reporting of their current chronic illnesses
with those their doctors recorded. The patients failed to mention
almost half of the conditions the doctors detected over the course
of a year. And whether it comes to illness, diets, or drinking,
people tend to put themselves in the best possible light. They
often say both yes and no to the same question in different form.
A survey may stand or fall on the use of sophisticated ways to
get accurate information.
Epidemiologists' studies may also be prevalence studies,
case-control studies, or cohort studies. A prevalence study, also
called a current or cross-sectional study, is a wide-angle snapshot
of a population: a look at the rate of disease X or at toxic
agent X and its possible effects by age, sex, or other variables.
A political poll is such a study: A cross section of the nation is
examined in a period of a few days.
A case-control study examines cases and controls for a close-up of
a disease's relationship to other factors in a small, intensively
examined group. The nation hears of cases of toxic shock
syndrome, mainly in young women. The federal Centers for Disease
Control launches a field investigation to find a series of
patients, or cases, confirm the diagnosis, then interview them and
their families and other contacts to assemble careful case
histories that cover, hopefully, all possible causes or associations.
This group is then compared with a randomly selected but matched
comparison group, or control group, of healthy young women of like
age and other characteristics.
The results need to be interpreted with great caution, but
the case-control study is often a quick, highly useful and
relatively easy, low-cost first approach or fishing expedition to
assemble clues about causes or even a working hypothesis. Or it
may test some hypothesis. A case-control study pinpointed the
use of tampons (later found to be certain high-absorbency ones)
as the main villain in toxic shock. The relationship of cigarette
smoking to lung cancer, the association of birth control pills with
blood vessel problems, and the transmission patterns of AIDS
were identified in case-control studies that pointed to the need
for broader investigation.
Cohort or incidence studies are motion pictures. They pick a
group of people, or cohort - a cohort was a unit of a Roman
legion - often stratify or divide them into subgroups, then follow
them over time, often for years, to see how some disease or
diseases develop. These studies are costly and difficult. Subjects
drop out or disappear. Large numbers must be studied to see
rare events. But cohort studies can be powerful instruments and
substitutes for randomized experiments that would be ethically
impossible. You can't ethically expose a group to an agent that
you suspect would cause a disease. You can watch a group so
exposed.
The noted Framingham study of ways of life that might be
associated with developing heart disease has followed more than
5,000 residents of that Massachusetts town since 1948. The
American Cancer Society's 1952-55 study of 187,783 men aged
50 to 69, with 11,780 of them dying during that period, did
much to establish that cigarette smoking was strongly associated
with developing lung cancer.
Many epidemiological, as well as clinical, studies are
handicapped because they must be retrospective. They look back
in time - at medical records, vital statistics, or people's
recollections (for example, those collected in interviews in a
case-control study). People who have a disease are questioned to
try to find common habits or exposures. Women with cervical cancer
are interviewed to see how many took possibly guilty hormones and
how many did not. People who live around a Love Canal are
asked if they have been ill.
Retrospective studies are notoriously unreliable. Memories
fail or play tricks. Old records are poor and misleading. Defini-
tions of diseases and methods of diagnosis vary sharply over the
years. The patients you find may not be representative. A retro-
spective study, however intriguing, generally only says that
there may be something here that ought to be investigated.
(There are exceptions. Dr. Gary Friedman writes, "A retrospective
study can be quite reliable if based on data carefully
collected in the past. A revealing study of mortality in radiologists
was a retrospective cohort study based on good data.")
A prospective study, in contrast - like the Framingham and
the American Cancer Society studies - looks forward. It focuses
sharply on a selected group who are all followed by the same
statistical and medical techniques. Dr. Eugene Robin at Stanford
tells how four separate retrospective clinical studies affirmed
the accuracy of a test for blood clots in the lungs. When an
adequate prospective clinical trial was done, most of the
backward looks were proved wrong.
Epidemiology also includes experimental studies, the classical
experiments of science on a larger human scale. These are
typically intervention studies. There is some intervention or
manipulation; something is done to some of the subjects.
The massive and hugely successful 1954 field trial of the
Salk polio vaccine was a classic intervention trial and a clinical
trial too, with 401,974 first- to third-graders assigned at random
to either a vaccinated group or a control group injected with a
placebo, or dummy shot - and another 947,171 children
divided between vaccinated second-graders and unvaccinated
first- and third-graders acting as controls. In addition, in all
participating states or counties, the investigators studied and
counted all cases of polio in a grand total of 1,829,916 children:
those who had taken part in the study and those who had not.
In the placebo areas, the study was also triple-blinded: neither
the vaccinators, the subjects, nor the doctors who examined the
subjects later for polio knew which children got which kind of
shot.
Another successful intervention study, a community trial,
established the value of fluoridating water supplies to prevent
tooth decay. Some towns had their water fluoridated; some did
not. Blinding was impossible, but the striking difference in
dental caries that resulted could not have been caused by any
placebo effect.

Just because Dr. Famous or Dr. Bigshot says this is what he found doesn't mean
it is necessarily so.
-Dr. Arnold Relman
Ask to see the numbers, not just the pretty colors.
-Dr. Richard Muoin
tiarxvwl . /n,trauan aJ Mmhb,
ikverihin}; R! I.una w rtponcn,
WHAT questions should we reporters ask to make our
news solid, to report the more valid claims and ignore the weak
and phony? When a scientist or physician or anyone else says,
"I've discovered that ...," what should we ask?
In 1949, a year after Britain's National Health Service -
"socialized medicine" - was launched, my editors sent me to
Britain to see how it was working. A bit stumped, I asked Dr.
Morris Fishbein, the provocative genius who long edited the
Journal of the American Medical Association, "How can I, a reporter,
tell whether a doctor is doing a good job?" He immediately said,
"Ask him how often he has a patient take off his shirt."
His lesson was plain: No physical examination is complete
unless the patient takes off his or her clothes. Most reporters are
not skilled statisticians, but we can ask some similarly revealing
questions. Many of these are not even statistical, just simple
ones that, like Fishbein's, probe soft spots and often disclose
either a conscientious approach or one that can't be trusted.
We can learn here from one method of science. We said
earlier that a properly skeptical scientist, starting a study and
seeking truth, often begins with a null hypothesis - that treatment
A is no better than treatment B, that there's nothing there - then
sees whether or not the evidence disproves it. This approach is
much like the law's presumption of innocence: It is for the
prosecutor to prove beyond reasonable doubt that the suspect is
guilty. A reporter, without being cynical and believing nothing,
should be equally skeptical and greet every claim by saying, in
words or thought, "Show me."
If an investigator or claimant is competent and has a good
case, you may have to ask none or very few of these questions,
since a good scientific presentation should answer most of them
for you. The need for a lot of questions could itself tell you
something.
Here are some possible questions, then, some of them simple
and obvious ones, a few more technical for those who might
want to ask them.
How do you know? Have you done a study? Was that an
experiment? What is the evidence? Or is the approach just anecdotal?
Answers like "In my experience ...," "In my hands ...,"
"I've seen 20 cases ..." and "There are four cases in our
block ..." may be interesting, may be worth scientific
investigation, may be worth a cautious news story, but there is not
yet anything like certainty.
What kind of study was it? Was there a systematic research plan or
design? And a protocol or set of rules?
What was the study design or method: observational, experimental,
case-control, prospective, retrospective, or what? (See the previous
chapter for kinds of studies and their uses and limits.) "A lot of
people just scrounge around and try to come up with some
conclusion without any real plan or design at the start," one
medical editor reports. Was the design chosen before you started your
study? What specific questions or hypotheses did you set out to test or
answer?

Why did you do it that way? Do you think it was the right kind of
study to get the answer to this question or problem?
Was it a true human experiment, if possible, with comparable groups
picked at random for comparison? If not, why not? And what was the
substitute?
If an investigator patiently - you hope - tells you about an
acceptable-sounding design, that's worth a brownie point. If the
answer is "Huh?" or a nasty one, that may tell you something
else.
Are you presenting preliminary data or something fairly conclusive?
Are you presenting a conclusion or a hypothesis for further study?
"Preliminary" and "interesting" can mean "unproved."
If the result is not reasonably conclusive, should there be further studies
and of what kind?
How many subjects, patients, cases, or people are you talking about?
Are these numbers large enough, statistically rigorous enough, to get the
answers you want? Was there an adequate number of patients to show a
difference between treatments? Why are you calling a press conference to
report on four patients?
Small numbers can sometimes carry weight. And they may
sometimes be the only ones possible. "Sometimes small samples
are the best we can do," one researcher says. But larger numbers
are always more likely to pass statistical muster.
The number studied can also depend on the subject. A
thorough physiological study of five cases of some difficult
disorder may be important. One new case of smallpox would be a
shocker in a world in which smallpox has supposedly been
eliminated. In June 1981 the federal Centers for Disease Control
reported that five young men, all active homosexuals, had been
treated for Pneumocystis carinii pneumonia at three Los Angeles
hospitals. This alerted the world to what soon became the
AIDS epidemic.
Who were your subjects? How were they selected? What were your
criteria for admission to the study? Were rigorous laboratory tests used to
define the patients, or were clinical diagnoses (necessarily less reliable) used?
Was the assignment of subjects to treatment or other intervention
randomized? Randomization should give every patient a 50 percent
chance of being assigned to one group or the other of a
two-armed study (one comparing two groups). Were the patients
admitted to the study before the randomization? This helps eliminate bias.
How was the randomization done?
If the subjects weren't randomized, why not? One statistician says,
"If it is a nonrandomized study, a biased investigator can get
some extraordinary results by carefully picking his subjects."
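As a rough illustration of what "assigned at random" can mean in a two-armed study, here is a minimal sketch in Python. The function name, the subject numbers, and the seed are hypothetical, chosen only for this example; real trials often use more elaborate schemes, such as randomizing in blocks.

```python
import random


def randomize_two_arms(subject_ids, seed=None):
    """Split already-admitted subjects at random into two arms.

    Every subject has the same chance of landing in either arm,
    and the investigator never chooses who goes where.
    """
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(subject_ids)   # copy, so the admission list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical example: 20 subjects, all admitted before randomization.
arms = randomize_two_arms(range(1, 21), seed=1954)
print(len(arms["treatment"]), len(arms["control"]))
```

The point of the sketch is the order of events: the list of subjects is fixed first, and only then does chance, not the investigator, decide the groups.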
Was there a control or comparison group? If not, the study will
always be weaker. Who or what were your controls or bases for
comparison? In other words: When you say you have such and such a result,
what are you comparing it with? Are the study or patient group and the
control group similar in all respects but the treatment or other variable being
studied?
Vogt calls "comparison of non-comparable groups probably ...
the single most common error in the medical and popular
literature on health and disease."
Do you have reason to believe your subjects and controls were
representative of the general population? Or the particular population -
those with the disease or condition you are interested in? The answers
here go a long way toward answering these questions: To what populations
are the results applicable? Would the association hold for other groups?
If your groups are not comparable to the general population or some
important population, have you taken steps to adjust for this? Either
statistical adjustment or stratification of your sample to find out about
specific groups, or both? Samples can be adjusted for age, for example,
to make an older- or younger-than-average sample more
nearly comparable to the general populace. (More on applicability
and stratification after a bit.)
Was the study blind? In a study comparing drugs or other forms of
treatment with a placebo or a dummy treatment, did (1) those administering
the treatment, (2) those getting it, and (3) those assessing the outcome know
who was getting what, or were they indeed blinded, knowing only that they
were comparing A and B (or A, B, and C, perhaps)?
Could those giving or getting the treatment have easily guessed which
was which by a difference in reaction or taste or other results?
Not every study can be a blind study. One researcher says,
"There can be ethical problems in not telling patients what drug
they're taking and the possible side effects. People are not guinea
pigs." True enough, but a blinded study will always carry more
conviction.
Were there other accepted quality controls? For example, making
sure (perhaps by counting pills or studying urine samples) that
the patients supposed to take a pill really took it.
Were you able to follow your protocol or study plan?
If there were questionnaires, interviews, or a survey: Were
the questions likely to elicit accurate, reliable answers? Was it really possible
to get accurate answers to these questions?
Sampling is as common in medical studies as in political
polling. Every study examines a sample, not the whole population.
The sample must be reasonably accurate to give valid
results. But badly worded questions can also distort the results.
Respondents' answers can differ sharply, depending on how
questions are asked. Example: In one study 1,153 subjects were
asked which is safer, a treatment that kills 10 of every
100 patients or a treatment with a 90 percent survival rate?
More people voted for the second way of saying precisely the
same thing.
People commonly give inaccurate answers to sensitive
questions, such as those about sexual behavior. They are noto-
riously inaccurate in reporting their own medical histories, even
those of recent months.
Ask: Did you pretest your questions for effectiveness before doing your
actual survey?
Also: What was your nonresponse rate? Do you report it?
QUESTIONS REPORTERS CAN ASK
In any study: How many of your study subjects completed the
course? Do you account for those who dropped out and tell why they did?
Every study has dropouts. McMaster University's Dr.
David Sackett says, "Patients do not disappear ... for trivial
reasons. Rather, they leave ... because they refuse therapy,
recover, die, or retire to the Sunbelt with their permanent
disability." If an investigator ignores those who didn't do well and
dropped out, it can make the outcome look better. If those who
died of "other causes" are listed among "survivors" of the disease
being investigated - this is sometimes done on the theory that,
after all, they didn't die of the target cause - it can make a
treatment look better unless there are equal numbers of such
deaths in every branch of the study.
Sackett adds, "The loss to follow-up of 10 per cent of the
original inception cohort is cause for concern. If 20 per cent or
more are not accounted for, the results ... are probably not
worth reading." (On which Dr. Thomas Vogt comments,
"Generally true, but utterly dependent on the situation.")
Professor Warren Burkett of the University of Texas adds a
few related and pointed questions: "Does the paper or publication
contain all results of all experiments? Support for a hypothesis has
sometimes been made to seem stronger by selective reporting ...
including only the data that most closely fit the theory. To
what extent has the data offered ... been smoothed from the raw
data? ... It is not unknown for researchers to clip and round
data to make them fit [their] predicted results" (italics mine).
How long was the study's follow-up? How long do patients ordinarily
survive with this disease? Were your patients followed long enough to
really know the outcomes, good or bad?
And: How thorough was the follow-up? In one report on
amebiasis - a disease caused by an amoeba - the diagnosis was
made by finding the amoeba in one of three consecutive stools,
but a cure was declared after observing just one negative stool.
"It does pay to read with care," a medical professor observes.
Could your results have occurred just by chance? Have any statistical
tests been applied to test this?
Did you calculate a P value? Was it favorable - .05 or less?
(Reported as <.05; see Chapter 3.) P values and confidence
statements need not be regarded as straitjackets, but like jury
verdicts, they indicate reasonable doubt or reasonable certainty.
Remember that positive findings are more likely to be
reported and published than negative findings. Remember that a
favorable-sounding P value of <.05 means only that there is
just 1 chance in 20, or a 5 percent probability, that the statistics
could have come out this way by pure chance when there was
actually no effect - so 1 in every 20 statistically significant results
may be a misleading false positive.
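One way to see why the 1-in-20 figure matters: when there is truly no effect, each test run at the .05 level still has a 5 percent chance of looking "significant," and those chances accumulate as more tests are run. The Python sketch below works that out. It assumes the tests are independent of one another, which real end points measured on the same patients usually are not, so treat it as an illustration rather than a calculation for any particular study. The function name is hypothetical.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests comes out
    'significant' at level alpha when there is actually no effect."""
    return 1 - (1 - alpha) ** n_tests


# One test, five tests, twenty tests, all with no real effect behind them.
for n in (1, 5, 20):
    print(n, round(chance_of_false_positive(n), 3))
```

Under these assumptions an investigator who tries 20 different tests has roughly a 64 percent chance of at least one favorable-sounding P value by luck alone, which is why it is worth asking how many tests were actually done.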
There are also ways and ways of arriving at P values. For
example, an investigator may choose to report one of several end
points: death, length of survival, blood pressure, other
measurements, or just the patient's condition on leaving the hospital. All
can be important, but a P value can be misleading if the wrong
one is picked or emphasized.
You might want to ask: Are all the important end points and their
P values reported? Also: Was the test giving the P value the appropriate
test, as planned in your written protocol, or did you finally do more than
one kind of test? (And perhaps report only the best answer?) What
were the other values?
Did you collaborate with a statistician in both your design and your
analysis? A statistician's collaboration often may be indicated in a
credit or footnote.
In studies seeking cause and effect, remember that association
is not necessarily causation. Rutgers' Dr. Michael Greenberg
reminds us, "Mathematical methods cannot establish proof of
cause and effect. They can indicate the probability that a
relationship occurred by chance, can sometimes quantify the
existing relationship between actions and effects, and can under the
best circumstances be used to predict the impact of actions even
if the complex phenomena driving them are not understood.
... View mathematical associations with a healthy degree of
skepticism."
A true experiment, controlling all variables, can sometimes
prove cause and effect almost surely. This is easier in physics
and chemistry than in human biology. When, then, does a close
association in an observational study (rather than a controlled
experiment) indicate causation? There are several possible
criteria that you can ask about:
Is the association consistent? Are similar results usually found in
different places and by different research methods?
How strong is the association? If risk is an appropriate way of
describing a particular situation: What is the relative risk, or the risk
ratio? The word "strong" is used here in its mathematical sense.
It mainly means the magnitude of an effect or risk, the odds
favoring the outcome of interest versus no such outcome.
A relative risk, or risk ratio, compares two rates by dividing
one by the other. In an American Cancer Society smoking study
(see page 46), the lung cancer mortality rate in nonsmokers aged
55 to 69 was 19 per 100,000 per year; the risk in smokers was
188 per 100,000. Since 188 divided by 19 equals 9.89, the
smokers were about 9.9 times as likely to die from lung
cancer - their relative risk was 9.9. That's strong!
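The arithmetic behind a relative risk is a single division, which a few lines of Python can make explicit. The two rates are the ones quoted above from the American Cancer Society study; the function name is hypothetical and used only for illustration.

```python
def relative_risk(rate_exposed, rate_unexposed):
    """Divide the rate in the exposed group by the rate in the
    unexposed group. Both rates must use the same base
    (here, deaths per 100,000 per year)."""
    return rate_exposed / rate_unexposed


# Lung cancer deaths per 100,000 per year, ages 55 to 69.
smokers = 188
nonsmokers = 19

rr = relative_risk(smokers, nonsmokers)
print(round(rr, 2))   # 9.89
print(round(rr, 1))   # 9.9
```

Note that the division only makes sense when both rates are stated on the same base and for the same period; a rate per 1,000 cannot be divided by a rate per 100,000 without converting one of them first.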
Is there an impressive dose-response, or cause-and-effect, curve - a
curve or gradient that shows that the greater the exposure to the
agent, or cause, the greater the effect? Heavy smokers are
indeed at greater risk than moderate smokers, and moderate
smokers at greater risk than light smokers. (In some cases - this
is an unsettled matter - there may be a threshold effect, an effect
only after some minimum dose.)
Another way of asking about risk and response: What is the
correlation coefficient - the extent to which a set of measurements of
the association is linear? A perfect linear relationship, or
correlation, between two observations or variables would show up as a
straight, steadily rising set of data points - in everyday language,
a straight line on a graph. A perfect positive correlation, or
linear relationship, is given the value +1; +.5 would be a lesser
but still interesting relationship; -1 or any negative figure
indicates an inverse or negative relationship, such as a runner's speed
going down as his weight goes up. A correlation of zero means
no consistent association.
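For readers who want to see where a figure like +1 or -1 comes from, here is a minimal Python sketch of the usual (Pearson) correlation coefficient, written out by hand rather than taken from a statistics package. The runner's weights and speeds are invented for illustration and deliberately fall on a perfect straight line.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together, and how much each moves alone.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never varies")
    return co_movement / sqrt(spread_x * spread_y)


# Hypothetical runner: speed falls steadily as weight rises.
weight_kg = [60, 65, 70, 75, 80]
speed_kmh = [16, 15, 14, 13, 12]
print(pearson_r(weight_kg, speed_kmh))
```

Real data almost never line up this neatly, so a coefficient well short of 1 in either direction is the normal case, and even a strong correlation says nothing by itself about cause.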
How specific is the association? Does a supposed cause lead to
many supposed effects? Or does an effect depend on many
supposed causes? Such associations are less specific, and thus more
suspect, until positive evidence piles up. Smoking indeed causes
many effects. A lung disease, asbestosis, is most common when
there is exposure to both asbestos and cigarette smoke.
Does the supposed cause precede the effect? Is a supposed biological
association epidemiologically plausible? One strong argument for a
cause-and-effect relationship between high consumption of saturated
fats and cholesterol and coronary heart disease is that
populations on such diets generally develop more such disease
than those on leaner diets.
Does the association make biological sense? Does it agree with
current biological and physiological knowledge? You can't follow
this test out the window. Much biological fact is ill understood.
Also, Mosteller warns, "Someone nearly always will claim to see a
[biological or physiological] association. But the people who
know the most may not be willing to."
Finally, look for the real why. Ask: Are there other possible
explanations? Did you look for other explanations - confounders, or
confounding variables, that may be producing or helping produce the
association? Sometimes we read that married people live longer
than singles. Does marriage really increase life span, or may
medical or other problems make some people less likely to
marry and also die sooner? Maybe the Dutch thought storks
brought babies because better-off families had more chimneys,
more storks, and more babies.
Did you take steps to control or adjust for other possible explanations?
Did you do a stratified analysis - a breakdown of the data by strata
like sex, race, socioeconomic status, geographical area, occupation?
Men commonly have more bronchitis and cirrhosis of the
liver than women because they drink more. They also have
more heart disease, possibly because they've smoked longer,
possibly because some hormones protect women. Only stratified
analyses will bring out such differences.
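What a stratified analysis does can be shown with a small Python sketch. Every count below is hypothetical, invented purely for illustration and not taken from any study; the names strata, pooled, and risk_ratio are likewise assumptions of this sketch. The invented numbers are arranged so that men appear to have about three times the women's risk overall, yet within each drinking stratum the risks are identical.

```python
# Hypothetical counts, invented for illustration only: (cases, people).
strata = {
    "heavy drinkers": {"men": (80, 800), "women": (20, 200)},
    "light drinkers": {"men": (2, 200), "women": (8, 800)},
}


def risk_ratio(group_a, group_b):
    """Risk in group_a divided by risk in group_b; each is (cases, people)."""
    cases_a, people_a = group_a
    cases_b, people_b = group_b
    return (cases_a / people_a) / (cases_b / people_b)


def pooled(group):
    """Add up cases and people for one group across all strata."""
    cases = sum(s[group][0] for s in strata.values())
    people = sum(s[group][1] for s in strata.values())
    return cases, people


# Crude comparison: ignore drinking and lump everyone together.
crude = risk_ratio(pooled("men"), pooled("women"))

# Stratified comparison: compare men with women inside each stratum.
by_stratum = {name: risk_ratio(s["men"], s["women"]) for name, s in strata.items()}

print(f"crude risk ratio: {crude:.2f}")
for name, ratio in by_stratum.items():
    print(f"{name}: {ratio:.2f}")
```

In this made-up example the whole difference between men and women comes from how many of each drink heavily, which is the kind of difference the text says only a stratified breakdown brings out.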
Did you do an analysis (a regression or some other form of multivariate
analysis) to try to identify the important variable or variables? Such
analyses can often reveal the strongest associations. They can
also be misused, and they are not always needed or appropriate.
Some sophisticated questions, when appropriate: How many such
analyses did you have to run to decide on the appropriate one? Sometimes
the more analyses, the worse the study. How many variables did you
consider? How many of these did you wind up reporting? If an
investigator tries enough variables in a kind of statistical fishing
expedition, he or she is almost bound to find something, true or
untrue.
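Why a fishing expedition is "almost bound to find something" can be sketched numerically. The Python below rests on two assumptions that are not in the text: each test uses the conventional 0.05 significance threshold, and the tests are independent of one another. The function name chance_of_false_alarm is hypothetical.

```python
def chance_of_false_alarm(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests looks
    'significant' by chance alone, when no real effect exists anywhere."""
    if num_tests < 0 or not 0 <= alpha <= 1:
        raise ValueError("num_tests must be >= 0 and alpha between 0 and 1")
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 20, 50):
    print(f"{k:>2} variables tried: {chance_of_false_alarm(k):.0%} chance of a spurious finding")
```

Under these assumptions, trying 20 unrelated variables gives roughly a 64 percent chance of at least one spurious "finding." Real variables are rarely independent, so the figure is a rough guide rather than an exact one.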
In cause-and-effect and other studies, ask: Has there been any
reanalysis of the data? "Results, if possible, should be
method-independent," Greenberg believes. "You should recalculate and
see if the results hold up."
A word of caution: Questions about multivariate analyses
or reanalyses can be tricky. Whether or not to do one kind of
analysis or reanalysis or none at all is often a matter of dispute
among authorities. Launch the subject with some humility. A
reasoned answer, affirmative or negative, may tell you more
than the answer's precise content.
In studies of medical treatments or preventives: How did you
know or decide when your patients were cured or improved? Were there
explicit, objective outcome criteria? That is, were there firm
measurements or test results rather than physicians' observations in
interviews, physical examinations, or chart reviews, all techniques
highly subject to great observer variation and inaccuracy? If
improvement or relief from pain - a particularly soft (hard to
quantify) outcome measure - had to be judged by observers:
Was there some systematic way of making an assessment?
If two or more groups were compared for survival, was their starting
point the same at onset? At diagnosis? At start of treatment? Were they
judged by the same disease definitions at the start and the same measures of
severity and outcome?
Did the intervention have the good results that were intended? Has
there been an evaluation to see whether it was a useful result?
Investigators often report that a drug or other measure has
lowered blood cholesterol levels. Fine, but were they able to
show that it reduced the number of heart attacks? Or was reduction
of a supposed risk factor itself taken to mean the hoped-for
outcome? That may often be necessary, but the issue should be
discussed.
Investigators once reported that a new heart drug reduced
the number of recurrent myocardial infarctions (heart attacks),
fatal and nonfatal. But total mortality for all causes was higher
in the treated group than in a placebo group.
Public health officials may announce the success of a cam-
paign to take high blood pressure measurements: X number of
people were found to be hypertensive and were referred to their
doctors. But how many went to their doctors? How many of
those received optimum treatment? Were their blood pressures
reduced? (If they were, the evidence is strong that they should
suffer fewer strokes.)
In short: What was the bottom line? Did you really do any good?
To whom do your results apply? Can they be generalized to a larger
population? Are your patients like the average doctor's patients? Is there any
basis in these findings for any patient to ask his or her doctor for a change in
treatment? Clinic populations, hospital populations, and the
"worst cases" are not necessarily typical of patients in general,
and improper generalization is unfortunately common in the
medical literature.
Again and again, in many of the cases cited in this chapter,
ask: Do other studies back you up? Are your results consistent with other
clinical and experimental findings? Have your results been replicated or
confirmed or supported by other studies? Or have only you been able to get
these results?
Virtually no single study proves anything. Two or 4 or 15
studies add credence, especially if the diagnostic and outcome
criteria and the people studied are similar. Consistency of results
in humans, animals, and laboratory tests also adds credence.
One scientist warns, however, "You have to be wary about a
grab bag of studies with different populations and different
circumstances." To which Harvard's Mosteller adds, "Yes, be wary,
but consistency across such differences cheers me up." And Dr.
John Bailar tells us that, despite possible pitfalls, "meta-analysis of
several low-power reports" - that is, statistically analyzing and
integrating their results - "may come to stronger conclusions
than any one of them alone" (italics mine).
Mostly just good-sense questions? Of course. Some of the
most important questions of all for a reporter to ponder are
these: What do I think? Do the conclusions make sense to me? Do the
data really justify the conclusions? If this person has extrapolated
beyond the evidence, has he or she explained why and
made sense?*
Does the investigator frankly document or discuss the possible biases
and flaws in the study? A good scientific paper should do so. Does
the investigator admit that the conclusion may be tentative or equivocal? Dr.
Robert Boruch of Northwestern University says, "It requires
audacity and some courage to say, 'I don't know.'" Do the authors
use qualifying phrases? If such phrases are important, we are
bound to include them in any responsible story.
Ask the investigators themselves: How much weight should
your work be given? Is it really firm? And how important? An
experienced science reporter says, "I have found that good
researchers generally have an honest and proportionate view of their
own work's importance." But there are many exceptions.

*Frederick Mosteller disagrees with my occasional reference to good sense or
common sense. If something is a commonsense idea, he says, "we all would have
thought of it. So it must be uncommon sense after all." He uses good sense.

Ask others in the same field: How do other informed people
regard this report - and these investigators? Are they speaking in their own
area of expertise, or have they shown real mastery if they have ventured
outside it? Have their past results generally held up? And what are some
good questions I can ask them? True, a lot of brilliant and original
work has been pooh-poohed for a time by others. Still, scientists
survive only by eventually convincing their colleagues.
More formally: Has there been a review of the data and conclusions
by any disinterested parties? Some major clinical studies are
reviewed by independent second parties or committees. Reports
of the National Academy of Sciences must pass muster by a
review committee.
Has there been peer review of the material? That is, has it been
examined by referees who were sent the article by a journal
editor?
And, a very important question: Has the work been published
or accepted by a reputable journal? If not, why not? The New England
Journal of Medicine prints only 15 percent of the papers submitted
to it (many, of course, are rejected because they are not of
enough interest to the journal's readers). Many have been given
at medical or scientific meetings, yet do not pass peer reviewers'
or the editors' muster. Most are eventually published elsewhere,
many in good journals. But there are journals and journals.
In science as a whole, including biology and often basic
medical sciences, Science and the British Nature are indispensable.
In general medicine and clinical science at the physician's level,
the best, most useful journals are probably New England Journal
of Medicine, Journal of the American Medical Association, Annals of
Internal Medicine, Canadian Medical Journal, Journal of Clinical
Investigation, and the British Lancet and British Medical Journal. There
are many equally good specialty journals as well as mediocre
ones. In epidemiology, three good sources are American Journal of
Epidemiology, Journal of Chronic Diseases, and Preventive Medicine.
Ask people in any field: What are the most reliable journals,
those where you would want your work published?
Some of the most valuable journals to a medical reporter
are not journals of original publication but review publications
like Family Practice and Hospital Practice, which mainly print
summary articles for practitioners. With some strong exceptions, the
free-circulation - also known as controlled-circulation - journals
and medical magazines, which depend wholly on advertising for
revenue, are not as rigorously screened as the traditional
journals. They are often on top of the news, however. All
journals print clinkers sometimes. "Scientific journals are
records of work, not of revealed truth," says the New England
Journal's Dr. Arnold Relman.
Read the entire journal article yourself, if there is one. Ask
the investigator for a copy or phone the journal. Or, assuming
the article has already been published, look for it at a medical
library, which can be found at any medical college, most good
hospitals, and the headquarters of many county medical
societies. Too many news releases tout articles that read far more
conservatively than the PR version. Many scientists go much
further in interviews or news conferences than they are willing
to go in their articles. A reporter asked a scientist, "Does peer
review of an article put you at ease?" He said, "It should help
put you at greater ease, but nothing puts me at ease until I've
read the article."
Most reporters can't be scientific referees, but when you read
an article, look for the following:
A credit or footnote indicating collaboration with a
statistician, and a paragraph describing the method of statistical
analysis and its outcomes, such as P value or confidence level, power
to detect treatment effects, and so on. If they're in place, you can
at least assume that some effort was made to apply the rigors of
statistical analysis. If they're missing, should you beware?
Sometimes. Sometimes the statistician is a coauthor whose specialty
isn't identified. And some investigators are well versed in
statistics.
Tables and figures that tell the same story as the
conclusions. Sometimes they don't. One statistician told reporters,
"Don't assume that someone can interpret his own data. You
may do better." And "muddle around in the footnotes and
appendices," Mosteller advises. "You might find a few horrors.
That's how people found out that a much publicized study of
public and private schools included only about 12 private,
nonparochial schools."
Other things described in this chapter, such as the protocol
and study design, the criteria for admitting and randomizing
subjects, the therapy actually received (in contrast to that
planned in the protocol), blinding, complications, loss to
follow-up, follow-up time, and any discussion of reservations or
weaknesses.
Ask, when appropriate: Where did the money to support the study
come from? Many honest investigators are financed by companies
that may profit from the outcome. So are some dishonest or
self-deluding investigators. But the peddler of a biased point of view
is as likely to be an antiestablishment crusader - or an academic
ladder-climber - as a corporate darling. Perhaps the best
question to ask yourself is: Is this investigator a scientist or a
salesman? In any case, the public should know any pertinent
connections.
"What proportion of papers will satisfy [all] the
requirements for scientific proof and clinical applicability?" Sackett
writes. "Not very many. ... After all, there are only a handful
of ways to do a study properly but a thousand ways to do it
wrong."
Despite impeccable design, some studies yield answers that
turn out to be wrong. Some fail for lack of understanding of
physiology and disease. Even the soundest studies may provoke
controversy. No study settles anything for all time.
And according to Sackett, some "may meet considerable
resistance when they discredit the only treatment currently
available.... Clinicians may still elect to do something, even if
it is of no demonstrable benefit. Study results may be rejected,
regardless of their merit, if they threaten the prestige or
livelihood of their audience."
Reporters need to tread a narrow path between believing
everything and believing nothing. Also - we are reporters -
some of the controversies make important stories.

Tests and Testing
Testing is often the only way to answer our questions, but it doesn't produce
unassailable, universal truths that should be carved on stone tablets. Instead,
testing produces statistics, which must be interpreted.
- Robert Hooke
Who knows when thou mayest be tested?
- Ronald Arthur Hopwood
Do physicians always know what they're doing when they
administer tests? Stanford's Dr. Eugene Robin says many tests
"have not been properly evaluated and in fact may be useless or
harmful." He asks, "Is it common practice in medicine to
perform careful clinical trials before introducing tests that can affect
the welfare of masses of patients? Sadly the answer is no."
A good test should detect both health and disease and do so
with high accuracy. The measures of the value of a clinical test,
one used for medical diagnosis, are sensitivity and specificity, or,
simply, the ability to avoid false negatives and false positives.
Sensitivity is how well a test identifies a disease or condition in those who
have it - how well it avoids false negatives, or missed cases. If 100
people with a condition are tested and 90 test positive, the test's
sensitivity is 90 percent. Specificity is how well a test identifies
those who do not have the disease or condition - how well it rules
out false positives, or mistaken identifications. If 100 healthy
people are tested and 90 test negative, the test's specificity is 90
percent.
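The two definitions above reduce to simple fractions, sketched here in Python. The function and parameter names are assumptions made for this illustration; the counts are the ones used in the text (100 people with the condition, 100 healthy people).

```python
def sensitivity(true_positives, false_negatives):
    """Share of people WITH the condition whom the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of people WITHOUT the condition whom the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# 100 people with the condition: 90 test positive, so 10 are missed cases.
sens = sensitivity(true_positives=90, false_negatives=10)

# 100 healthy people: 90 test negative, so 10 are mistaken identifications.
spec = specificity(true_negatives=90, false_positives=10)

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Note that each measure is computed within its own group: sensitivity only among people who have the condition, specificity only among those who do not.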
Sensitivity, in short, tells us about disease present. Specificity tells
us about disease absent. A highly unspecific test will produce
many false positives; a highly insensitive test, many false
negatives.

