Philip Morris
Epidemiology and Environmental Tobacco Smoke
Fields
- Type
- SCRT, REPORT, SCIENTIFIC
- BIBL, BIBLIOGRAPHY
- Area
- SCIENTIFIC AFFAIRS/BLACK LATERAL OLD S&T
- Characteristic
- PARE, PARENT
- Named Organization
- Ahf, American Health Foundation
- Arch Environ Health
- Epa, Environmental Protection Agency
- Medical College of Va
- Named Person
- Binder, R.
- Fisher, R.
- Friedman
- Garfinkel
- Kilpatrick, S.J.
- Lee, P.
- Uberla, K.
- Wynder, E.
- Document File
- 2023512309/2023512515/Ets Issue Binder: Epidemiology
- Litigation
- Okag/Privilege Withdrawn
- Okag/Produced
- Master ID
- 2023512310/2514
- Related Documents
- 2023512316-2317 Statistical Significance and Confidence Intervals
- 2023512329-2340 Environmental Tobacco Smoke and Lung Cancer: A Critical Assessment
- 2023512341-2348 What Is the Epidemiologic Evidence for A Passive Smoking - Lung Cancer Association?
- 2023512361-2362
- 2023512364-2440 A Dictionary of Epidemiology
- 2023512442-2514 News & Numbers A Guide to Reporting Statistical Claims and Controversies in Health and Other Fields
- Site
- R529
- Date Loaded
- 24 May 1999
- UCSF Legacy ID
- tjc02a00
EPIDEMIOLOGY
AND
ENVIRONMENTAL TOBACCO SMOKE

THIS ISSUE BINDER IS INTENDED TO PROVIDE A BASIC,
COMPREHENSIVE REVIEW OF THE SCIENTIFIC LITERATURE
REGARDING A SPECIFIC TOPIC ON ETS AND THE HEALTH OF
NONSMOKERS.
PRIMARY STUDIES AND REVIEWS HAVE BEEN HIGHLIGHTED
TO IDENTIFY (1) USEFUL OR HELPFUL INFORMATION (YELLOW
HIGHLIGHT) AND (2) ADVERSE RESULTS OR OPINIONS (BLUE
HIGHLIGHT).

TABLE OF CONTENTS
TAB
I. STATISTICS AND EPIDEMIOLOGY . . . . . . . . . . . . . . . 1
    "Statistical Significance and Confidence Intervals"
II. ETS* . . . . . . . . . . . . . . . . . . . . . . . . . . 2
    Scientific Method
    Inadequacies of ETS Studies
    References
III. ESSAYS ON ETS EPIDEMIOLOGY . . . . . . . . . . . . . . . 3
    E. Wynder and G. Kabat
    N. Mantel
IV. GLOSSARY OF TERMS . . . . . . . . . . . . . . . . . . . . 4
    Definitions
V. A DICTIONARY OF EPIDEMIOLOGY . . . . . . . . . . . . . . . 5
VI. APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . 6
    News and Numbers (excerpts)
* For analysis and criticism of the epidemiologic studies on
ETS and lung cancer, heart disease and respiratory disease in
children, see specific Issues Binders.

STATISTICS AND EPIDEMIOLOGY
"Nevertheless, in a real sense, statistics is
the study of populations, or aggregates of
individuals, rather than of individuals.
Scientific theories which involve the properties
of large aggregates of individuals, and not
necessarily the properties of the individuals
themselves . . . are essentially statistical
arguments, and are liable to misinterpretations
as soon as the statistical nature of the
argument is lost sight of."
Sir Ronald Fisher

THE MEDICAL JOURNAL OF AUSTRALIA Vol. 144 June 9, 1986
Statistical significance and confidence intervals
Many papers in the Journal use statistical methods and one of the
aims of the review process is to try to ensure that appropriate
methods have been used. Often papers report results of comparative
studies that are designed to answer questions such as whether one
treatment is superior to another for a particular disease, or whether
there is an association between some form of behaviour (for example,
taking regular exercise or smoking) and the occurrence of some
disease. Comparative studies are almost invariably carried out on a
sample of individuals who are chosen from the population of
individuals to whom it is intended to generalize the results. Data are
collected on the sample in order to make inferences on the population.
Valid inferences can only be drawn if the sample is chosen in such a
way that it is representative of the population. Otherwise a bias
could occur; epidemiological methods are designed to eliminate such
biases.
Since the aim of a statistical analysis is to make inferences, it is
paramount to express whatever inferences can be drawn in the most
informative way. There are several methods of statistical inference,
but the two that are most commonly used are significance testing and
confidence interval estimation. The former is well known and is
featured by quoting P values. Many authors appear to be under the
impression that a profusion of P values is necessary; regrettably this
impression has been bolstered in the past by editors of biological
journals. Significance testing has its place but, as mentioned by
Healy in 1978,1 "it is widely agreed among statisticians (if less so
among the more naive users of statistics) that significance testing is
not the be-all and end-all of the subject". In this leading article I
would like to discuss the characteristics of both methods of
inference, show that a confidence interval contains the result of a
significance test, but not vice versa, and suggest that confidence
intervals are the answers to the more interesting questions that data
can be used to answer.
Any particular study is based on a particular sample; however, it is
useful to imagine that the study is repeated with a different sample
being selected each time. These hypothetical studies will give
different results because they contain different individuals, and
individuals vary in any characteristic because of biological
variability. The differences are termed sampling variability. It
follows then that the results that are obtained from a particular
sample can only be taken as an approximation to the actual situation
in the whole population. Statistical methods are concerned with
assessing the degree of approximation and what may be reasonably
inferred, given that a different sample would have produced a
different result.
The methods are based on the assumption that it is a matter of chance
which particular subjects are in the sample that is being studied, and
the sampling variability is thus random variation which is determined
by the laws of probability. Therefore, the inferences are expressed in
terms of probability. The situation is illustrated below.
Population
    |  - - - - - - sampling variation
Sample data
    |  - - - - - - uncertainty
Inferences on population
Taking a sample from the population involves sampling variation. As a
consequence of this, inferences from the sample data back to the
population involve uncertainty.
A statistical analysis may be thought of as asking questions of the
data. In an investigation that compares two groups for the mean value
of, for example, blood pressure or the prevalence of some disease,
three questions may be posed: Is there a difference between the
groups?; How large is the difference?; and How accurately is the size
of the difference known?
As expressed, the first question expects the answer "yes" or "no";
although the answer cannot be given in precisely these terms, it is
often reduced to two possibilities. The appropriate methodology is the
significance test. The second question expects a numerical value to be
the answer. This is an estimate and, as it is a single value, is
referred to as a point estimate. In effect, the third question asks
how reliable this point estimate is; the answer is a range of values
which is referred to as an interval estimate or a confidence interval.
These questions represent two approaches to inference: hypothesis
testing and estimation. Although at first sight they appear to be
quite different, in concept they have much in common. Both make
inferential statements about the value of a parameter. (A parameter is
an unknown quantity which partly or wholly characterizes a population,
for example, a mean or a measure of association.)
The significance test is an appropriate technique when there is an a
priori hypothesis to test. For the purpose of the statistical test
this hypothesis is expressed in null form - such as where no
difference exists between groups - and the test evaluates whether the
data are consistent with the null hypothesis. If the data differ
markedly from those which would be expected under the null hypothesis,
to the extent that the probability of such an extreme result is low,
then it is said that the result is statistically significant.
Probability is measured on a continuum between 0 and 1, but in
significance testing a probability is considered low if it is less
than conventional values such as 0.05 (5%) or 0.01 (1%). A significant
result is equated with the rejection of the null hypothesis or the
claim of a real effect. By definition, when the null hypothesis is
true, significant results will occur by chance with the same relative
frequency as the significance probability. That is, real effects will
be claimed when the null hypothesis is true; however, the probability
of this error (type I) is determined in the data analysis.
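The statement that, under a true null hypothesis, significant results occur by chance at the significance probability can be checked by simulation. What follows is only a minimal sketch, assuming two groups drawn from the same normal population with known standard deviation 1 and a two-sided z test; the function names, group size, number of trials and seed are illustrative choices and do not come from the article.

```python
import math
import random

def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def type_i_error_rate(n_per_group=20, alpha=0.05, trials=5000, seed=1):
    """Fraction of simulated null studies (no true difference) called significant."""
    rng = random.Random(seed)
    # Standard error of a difference between two means when sd = 1 (assumed known).
    se = math.sqrt(2.0 / n_per_group)
    hits = 0
    for _ in range(trials):
        mean_a = sum(rng.gauss(0, 1) for _ in range(n_per_group)) / n_per_group
        mean_b = sum(rng.gauss(0, 1) for _ in range(n_per_group)) / n_per_group
        if two_sided_p((mean_a - mean_b) / se) < alpha:
            hits += 1
    return hits / trials

rate = type_i_error_rate()
print(f"observed type I error rate: {rate:.3f} (nominal 0.05)")
```

The observed rate should sit close to the nominal 5%, varying a little with the seed and the number of trials.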
One disadvantage of a significance test is that it may fail to detect
a real effect; that is, although the null hypothesis is false, the
evidence is not strong enough to reject it. The probability of this
error (type II) can be controlled at the design stage only, by
appropriate selection of the sample size, and may be quite large.
Thus, the trap of equating non-significance with no effect must be
avoided; failure to reject the null hypothesis is not the same as
accepting it.
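The dependence of the type II error on sample size can be illustrated with a normal-approximation power calculation. This is a sketch under stated assumptions: a two-sided z test comparing two means with a common, known standard deviation. The true difference of 0.5 standard deviations and the group sizes are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_two_means(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z test comparing two group means."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # True difference expressed in units of its standard error.
    shift = delta / (sd * sqrt(2.0 / n_per_group))
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

for n in (10, 50, 200):
    power = power_two_means(delta=0.5, sd=1.0, n_per_group=n)
    print(f"n per group = {n:3d}: power = {power:.2f}, type II error = {1 - power:.2f}")
```

With a small sample the type II error is large, so a non-significant result says little about whether an effect of that size exists.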
In the approach of confidence interval estimation no particular
hypothesis is considered; rather the emphasis is on estimating those
values of the parameter with which the data are consistent. These
values form a range - the confidence interval. The range is calculated
so that there is a high probability - conventionally 95% or 99% - that
it contains the true value of the parameter.
A significance test is essentially a test of whether the data are
consistent with a specified parameter value, and the confidence
interval contains those parameter values with which the data are
consistent. Therefore, a 5% significance test and a 95% confidence
interval contain some information in common: significance implies that
the null hypothesis value is outside the confidence interval;
non-significance implies that the null hypothesis value is within the
confidence interval. However, the confidence interval contains more
information because it is equivalent to performing a significance test
for all values of the parameter, not just a single value. A confidence
interval enables a reader to see how large the effect may be, not
simply whether it is different from zero.
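The correspondence between a 5% significance test and a 95% confidence interval can be sketched in a few lines. The example assumes a normally distributed estimate with a known standard error; the estimate (4.0), its standard error (1.5) and the candidate null values are hypothetical numbers, not data from the article.

```python
from statistics import NormalDist

def confidence_interval(estimate, se, level=0.95):
    """Normal-theory confidence limits for an estimate with standard error se."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se

def is_significant(estimate, se, null_value=0.0, alpha=0.05):
    """Two-sided z test of the hypothesis that the parameter equals null_value."""
    z = (estimate - null_value) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

estimate, se = 4.0, 1.5  # hypothetical estimate and standard error
low, high = confidence_interval(estimate, se)
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
for null_value in (0.0, 1.0, 2.0, 6.0, 8.0):
    inside = low <= null_value <= high
    print(f"null value {null_value}: inside interval = {inside}, "
          f"significant at 5% = {is_significant(estimate, se, null_value)}")
```

Each candidate value is rejected by the test exactly when it falls outside the interval, which is the sense in which the interval performs the test for every parameter value at once.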
The limitations of the interpretations that are provided by a
significance test may now be considered.
The difference is significant. This means that there is a difference
or, in other words, the size of the difference is not zero. We know no
more than this. The difference may be large and of great importance or
it may be small and of no practical importance. It is unsatisfactory
that the test provides no way of distinguishing between these quite
different possibilities.
The difference is not significant. This means that there is
insufficient evidence to enable us to conclude that there is a
difference. So the difference may well be zero. But this is not the
same as saying that it is zero. The true difference may be quite
large. Again, it is unsatisfactory that this possibility is not
addressed.
The conclusions that may be drawn from a significance test are
considered to be incomplete because it is rarely that one is
interested solely in whether a null hypothesis is or is not true;
indeed in many cases it may be recognized at the outset that the null
hypothesis is unlikely to be true. Rather, the question is how large
is the difference and is it possibly large enough to be important? The
emphasis is on measuring rather than on testing. The addition of the
concept of an important difference to that of a null hypothesis means
that there are four possible interpretations to an analysis: (a) the
difference is significant and large enough to be of practical
importance; (b) the difference is significant but too small to be of
practical importance; (c) the difference is not significant but may be
large enough to be important; and (d) the difference is not
significant and also not large enough to be of practical importance.
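The four interpretations can be written as a small decision rule on the confidence limits. This is a hedged sketch, not the article's own procedure: it assumes that "significant" means the interval excludes zero and that a difference "may be important" when either limit reaches the chosen threshold in magnitude. The function name, the threshold of 10 and the intervals are all hypothetical.

```python
def interpret(low, high, important):
    """Classify a confidence interval for a difference into cases (a) to (d).

    low, high -- confidence limits for the difference
    important -- smallest magnitude judged practically important (assumed > 0)
    """
    significant = low > 0 or high < 0  # interval excludes zero
    may_be_important = max(abs(low), abs(high)) >= important
    if significant and may_be_important:
        return "a"  # significant and possibly important
    if significant:
        return "b"  # significant but too small to matter
    if may_be_important:
        return "c"  # not significant, yet could be important: inconclusive
    return "d"      # not significant and too small to matter

# Hypothetical intervals, judged against a hypothetical important difference of 10.
examples = {"a": (12.0, 30.0), "b": (1.0, 4.0), "c": (-5.0, 15.0), "d": (-3.0, 4.0)}
for label, (low, high) in examples.items():
    print(f"interval ({low}, {high}) -> case ({interpret(low, high, important=10.0)})")
```

Because the rule takes the threshold as an argument, the same interval can be re-read against any value a reader considers important.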
The size of difference that is considered to be large enough to be
important is a matter for debate, and genuine differences of opinion
may arise. It is a medical, not a statistical, question, although a
medical statistician who is experienced in the subject area could
contribute to setting a value. The fact that agreement on a unique
value may be impossible in no way detracts from the argument. In fact,
expressing the results as a confidence interval enables
interpretations to be made for any particular value that is considered
appropriate.
These possibilities are illustrated in the Figure where the confidence
intervals are shown. The significant and non-significant cases are
distinguished by the confidence intervals that exclude or include zero
respectively. The main point is that in each case the confidence
interval gives the range of possible values for the true difference.
Of particular concern is (c). Here there may be no true difference or
there may be a large, important difference. In other words the study
is completely inconclusive. Such a possibility is missed by the simple
expression "not significant" with its lure of equating this falsely
with "no effect". This situation will arise with a study that is
carried out on too small a sample and this is why good study design
demands attention to sample size to try to prevent the occurrence of
an inconclusive result. Altman found that it was common for undue
emphasis to be placed on "negative" findings from small studies,2
while Freiman et al. noted that "negative" trials were often too small
to constitute a fair test of therapies.3 Similarly, a significance
test will contrast (b) as significant and (d) as not significant but
fails to recognize that they give essentially the same conclusion -
that any difference is too small to be important.
FIGURE: Confidence intervals showing four possible conclusions in
terms of statistical significance and practical importance. Panels:
SIGNIFICANT - (a) Important, (b) Not important; NOT SIGNIFICANT -
(c) Inconclusive, (d) True negative result.
As an example, consider some results which were obtained by Garraway
et al. from a clinical trial for the management of acute stroke in the
elderly.4 Of 155 patients who were managed in a stroke unit, 78 were
assessed as independent when they were discharged from the unit
compared with 49 of 152 who were managed in a medical unit. The
simplest analysis shows that the difference between the success rates
of the two units is significant at the 1% level. Therefore, a genuine
effect has been established. To appreciate the importance of this
effect the advantage of the stroke unit may be measured by the
difference between the two units in the percentage of subjects who
were discharged as independent: 50.3% - 32.2% = 18.1%. This is the
point estimate. The accuracy of this estimate is given by its standard
error (5.5%) and the 95% confidence limits (7.3% and 28.9%). Thus, the
gain could be as large as 29% or as small as 7%.
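The stroke-unit figures can be reproduced with the usual normal-approximation formula for a difference between two independent proportions. The sketch below uses counts of 78 of 155 and 49 of 152, which are the counts that reproduce the quoted difference of 18.1%, standard error of 5.5 and limits of 7.3% and 28.9%. The function name and the choice of this particular formula are assumptions; the article does not state which method was used.

```python
from math import sqrt

def difference_in_proportions(x1, n1, x2, n2, z=1.96):
    """Point estimate, standard error and confidence limits for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference between two independent proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, se, diff - z * se, diff + z * se

diff, se, low, high = difference_in_proportions(78, 155, 49, 152)
print(f"point estimate: {100 * diff:.1f}%")
print(f"standard error: {100 * se:.1f}%")
print(f"95% confidence limits: {100 * low:.1f}% and {100 * high:.1f}%")
```

The interval lies wholly above zero, which agrees with the significance test, and in addition shows the range of gains with which the data are consistent.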
Recently, Gardner and Altman have argued against the excessive use of
hypothesis testing and urged a greater use of confidence intervals.5
In an appendix to their paper they give methods to calculate
confidence intervals for the commonly occurring two-sample
comparisons.
In presenting the main results of a study it is good practice to
provide confidence intervals rather than to restrict the analysis to
significance tests. Only by so doing can authors give readers
sufficient information for a proper conclusion to be drawn; otherwise
readers have to rely upon the authors' own interpretation. Therefore,
intending authors are urged to express their main conclusions in
confidence interval form (possibly with the addition of a significance
test, although strictly that would provide no extra information). One
of the aims of the Journal's statistical review process will be to
ensure that where possible this is done.
GEOFFREY BERRY
Associate Professor of Biostatistics
School of Public Health and Tropical Medicine
The University of Sydney
1. Healy MJR. Is statistics a science? J R Statist Soc A 1978; 141:
385-393.
2. Altman DG. Statistics in medical journals. Stat Med 1982; 1: 59-71.
3. Freiman JA, Chalmers TC, Smith H Jr, Kuebler RR. The importance of
beta, the type II error and sample size in the design and
interpretation of the randomized control trial. N Engl J Med 1978;
299: 690-694.
4. Garraway WM, Akhtar AJ, Prescott RJ, Hockey L. Management of acute
stroke in the elderly: preliminary results of a controlled trial. Br
Med J 1980; 280: 1040-1043.
5. Gardner MJ, Altman DG. Confidence intervals rather than P values:
estimation rather than hypothesis testing. Br Med J 1986; 292:
746-750.

ENVIRONMENTAL TOBACCO SMOKE
Scientific Method
Scientific inquiry within an epidemiologic study begins
with framing what is called a "null hypothesis." The null
hypothesis states, in this instance, that ETS is not associated
with a given disease state (e.g., lung cancer, heart disease, etc.).
Data are then collected and analyzed in order to test, i.e., reject
or accept, the hypothesis.
One method which is used to assess the relationship of
the collected data to a given hypothesis is the test for statistical
significance.* Simply put, if the data examined yield a
statistically significant result (here, the relationship between
ETS exposure and a disease state), then the scientist is permitted,
on the basis of those data, to reject the null hypothesis. If the
statistical test is not significant, then the data do not support
rejection of the null hypothesis.
* By convention, a 'p' (probability) value less than 0.05 is
deemed statistically significant. A 'p' value less than 0.05
means that the observed results would occur by chance less
than 5 times out of 100.
"Confidence limits" are the values between which the risk
value can be expected to fall 95% of the time based on the
variability of the underlying data. When the 95% confidence
limits are both greater and less than 1.00, the risk value is
considered not statistically significant, i.e., the results
are likely to be due to chance and do not support a judgment
regarding an association between exposure and disease.

There is no "absolute proof" involved, and there is
nothing immutable about the concept of significance
testing.
Statistical significance is, after all, a convention. But the
concept is illustrative, especially in the case of the association
between ETS exposure and lung cancer. To date, there have been 28
published reports on ETS and lung cancer, and only five have
achieved statistical significance. It is clear that the
preponderance of data do not permit rejection of the null
hypothesis, i.e., there is no association between ETS exposures
and lung cancer. In addition, virtually all of the individual
risks reported in such studies are less than 2, which, to the
epidemiologist, suggests a "weak" association which is probably
the result of bias or confounding of factors unrelated to ETS.
Inadequacies of ETS Studies
Epidemiologic studies are notoriously unreliable
in outcome. An observed relative risk of less
than 1.5-2.0 (some would say up to 3.0) is
inadequate to reject the hypothesis of no
effect. The overall relative risk calculated
across studies is well below a minimal value
for seriously attributing it to the presence
of a real effect, i.e., it is within the range
easily due to the "noise" in epidemiologic
data resulting from the limitations and vagaries
intrinsic to the methodology and its
application. This same conclusion also applies
to nearly all of the studies on an individual
basis. Another reason for conservative
interpretation of the ETS studies is that
several studies are of poor quality (good
textbook examples of how not to do an
epidemiologic study) and some were originally
designed for a different, or broader, purpose
than assessing health risks from ETS exposure.

Sources of bias are present to varying degrees
in most of the studies. Lung cancer patients
may tend to overstate their exposure to spousal
smoking as an explanation for their illness.
Bias may result from depending on memory recall
of a subject's exposure to spousal smoking.
Estimates of relative risk may differ markedly
between data collected from the subjects and
data obtained from a surrogate, such as their
children. Histologic verification of lung
cancer was not conducted in all studies and
the error rate may be substantial, e.g., 13%
of the lung cancer cases in the case-control
study of Garfinkel et al. were found to be
incorrectly diagnosed when the histology was
reviewed by one of the authors. (From:
Summary of Public Docket Comments, Draft Risk
Assessment, U.S. EPA, Dec. 1990.)
Peter Lee, a statistician and epidemiologist from the
United Kingdom, has argued that the increased risks reported in
various epidemiologic studies are the result of an inherent bias
in study design rather than the result of any genuine effect from
exposure to ETS.1-5 Lee presents data which indicate that the
reported risks cannot be explained on the basis of either ETS
exposure or dose for the nonsmoker. It is Lee's contention that the
reported "risks" are the result of bias caused by a small number
of smokers who are misreported in the studies as nonsmokers.
Other kinds of misclassification may contribute to the
reported increase in lung cancer risks among nonsmokers, according
to several scientists. For example, none of the studies on ETS
and lung cancer provides direct observational information on ETS
exposures. Instead, spouses, next-of-kin or friends are asked to
estimate the amount of ETS to which they think the subject was
exposed. Such estimates may lead to a kind of misclassification.
