Philip Morris

Date: 28 Dec 1962
Length: 4 pages
2015068091-2015068094

Fields

Author
Wilson, E.B.
Area
LEGAL DEPT/CARLSTADT
Type
LETT, LETTER
Site
N28
Named Person
Dorn
Parran
Pearl
Reed
Named Organization
Symposium on Smoking + Cancer
Recipient
Hockett
Document File
2015068090/2015068113/3806-12 Edwin B. Wilson, Ph.D Office of Naval Research and Harvard School of Public Health Boston, Mass.
Litigation
Txag/Produced
Author (Organization)
Harvard Univ
Master ID
2015068091/8112
Related Documents:
Characteristic
MARG, MARGINALIA
Date Loaded
24 May 1999
UCSF Legacy ID
efv61f00

Page 1: efv61f00
COPY

December 28, 1962

Dear Dr. Hockett:

I have been re-reading your contribution to the symposium on Smoking and Cancer under the chairmanship of Dr. Parran, and note that on p. 67, column 2, first new paragraph you state, about in its middle, "the moral is that statistical association alone is not able to indicate whether a specific factor is actually a part of the causal complex or to distinguish between a direct and major factor and one that is involved in an indirect, incidental or accidental manner." This is true, but it is particularly true when one does not apply the criteria which all good statisticians have for many years considered necessary.

I will take an old case. Pearl and Reed fitted the autocatalytic curve $P = \frac{K}{1 + m e^{-nt}}$ to census populations to find the limiting population K, when $t = \infty$. At times they said they fitted by least squares. Most of their alleged least squares fits could be improved by inspection methods which showed that they were not least squares fits. This did not make much difference in the goodness of the fit to the given data; but one does not take the trouble to make a least squares fit just to get a better fit -- although a lot of poor scientists may think so. One really makes the least squares fit to obtain an estimate of the mean error of the unknown parameters -- in this case K, m, n.

Now in all minimum problems it is known that the variation of the value of the function in the neighborhood of the minimum is infinitely smaller than the variation of the variable which determines the minimum, i.e., that a considerable error in the variable may make very little difference in the value of the function. When there are several variables, as in the case of shooting at a target, when errors in altitude and azimuth occur, it
Page 2: efv61f00
often happens that the errors are correlated. In the population problem the asymptotic value K is the one Pearl and Reed were after. It may be highly correlated with m or n, about which they did not care. Even if it were not so correlated, the value of K might be very badly determined because of having a large mean error. And, of course, if one changes the formula, as they did, to $P = d + \frac{K}{1 + m e^{-nt}}$, allowing the population to start at d at time $t = -\infty$, thus introducing another constant, one may make a great difference in the asymptotic population. Their work on determining some asymptotic populations was very poor, notable for quantity rather than quality or even common sense.*

This is true of most of the work done by most of the statistical biologists and medical men; they do not even apply the statistical precautions that have been applied for years by good statisticians, who themselves are not too dogmatic about their conclusions even after applying all the precautions they can.

One of these precautions is to figure partial correlations or associations -- thus taking care of the possible and indeed probable correlations between the variables one uses. Thus if alcohol usage and tobacco usage both have an effect statistically upon mortality, it is necessary to have the correlation between them, or to resolve the population into four groups, before one can get any picture of the statistical effects of either. This is usually not done -- indeed, if there are many variables it is a tiresome, though necessary, thing to do. (Some of the variables one may eliminate by "controls," age is generally eliminated that way and sometimes social class (in England) or an estimate of socio-economic status or intelligence or other variable here.)

In the old days, and still to some extent, there is neglect of a significance test, but at present we usually have significance tests. They may merely show that the samples are not too small for the inference to be made; they have nothing to say about bias in the cases or controls or in the variability due to some causes not specified.

* Illustration: They got about 70 for the asymptotic population of England & Wales. Without d they should have got 107 ± 12 million, and with the d the asymptote is at 383 million but its value has become really totally indeterminate.
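Editorial illustration of the partial-correlation precaution described on this page: a minimal sketch, assuming the standard first-order partial correlation formula, of how the tobacco-mortality correlation can be adjusted for alcohol usage. The function name and all three input correlations are hypothetical and do not come from the letter or from any study.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held fixed."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical zero-order correlations (illustrative values only).
r_mortality_tobacco = 0.50   # x = mortality, y = tobacco usage
r_mortality_alcohol = 0.50   # x = mortality, z = alcohol usage
r_tobacco_alcohol = 0.50     # y = tobacco usage, z = alcohol usage

r_partial = partial_correlation(
    r_mortality_tobacco, r_mortality_alcohol, r_tobacco_alcohol
)
print(f"zero-order r = {r_mortality_tobacco:.3f}, partial r = {r_partial:.3f}")
```

With these made-up inputs the adjusted correlation is smaller than the zero-order one, which is the kind of change the letter says cannot be seen unless the correlation between the two usages is known.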
Page 3: efv61f00
But almost never does one estimate how much of the total causation is covered by the variables used, and how much is left undetermined. Thus if we are dealing with correlations, a zero order correlation r, such as that between mortality ratio and tobacco consumption, will explain only $r^2$ of the variance in the phenomenon and leave $1 - r^2$ unexplained. Thus in Dorn's Table II, p. 53, mortality ratios for regular cigarette smokers who begin at ages -20, 20-24, 25-34 and 35+ on 1 pack or less are 1.65, 1.51, 1.20, 1.18 respectively. If the age of starting be taken serially just as 1, 2, 3, 4, there is obviously a high degree of correlation if we take no account of the numbers of deaths involved, which are 644, 391, 147, 58 respectively. There would also be a high correlation if we took account of them. This means that so far as "explaining" the variation in the mortality ratios, the age of starting must take care of everything except a small percentage of the variance -- maybe 10% -- and all other causes would have to be in that 10%. This is nonsense, of course. It shows that we cannot work in this matter with mortality ratios averaged for the four groups. It shows that other variables like the amount smoked would have to be highly correlated with this variable. Really, one should use some other method.

The fundamental difficulty we are up against is the poor scientists with whom we have to deal so far as concerns solving our problem of the relation of tobacco to health. They may be very good scientists for solving their problems, and in some cases certainly are.

Smoking cigarettes may well shorten one's life. Or it may not, for we have no idea when or of what the regular cigarette smoker would die if he did not smoke them. At any rate, for each whose life may be shortened there are several who smoke heavily for years with no apparent ill effects and with much pleasure.
Page 4: efv61f00
It is the same with automobile driving, which kills more people, I believe, than die of lung cancer, but with this difference, that the driver often kills some non-driver whereas the cigarette smoker does not kill the non-smoker.

Sincerely,

/s/ E. B. Wilson
Emeritus Professor of Vital Statistics
Harvard University
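Editorial illustration of the arithmetic on page 3: a minimal sketch that recomputes the correlation between the serial starting-age groups (1, 2, 3, 4) and the four mortality ratios quoted from Dorn's Table II, first ignoring the numbers of deaths and then using them as weights. Treating "taking account of" the deaths as a weighted Pearson correlation is an assumption, and the function and variable names are hypothetical.

```python
import math

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    s_yy = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return s_xy / math.sqrt(s_xx * s_yy)

# Figures quoted in the letter from Dorn's Table II, p. 53.
start_group = [1, 2, 3, 4]                  # age of starting, taken serially
mortality_ratio = [1.65, 1.51, 1.20, 1.18]
deaths = [644, 391, 147, 58]

# Equal weights reproduce the ordinary (unweighted) correlation.
r_plain = weighted_pearson(start_group, mortality_ratio, [1, 1, 1, 1])
# Assumption: weight each group by its number of deaths.
r_weighted = weighted_pearson(start_group, mortality_ratio, deaths)

for label, r in (("unweighted", r_plain), ("weighted by deaths", r_weighted)):
    print(f"{label}: r = {r:.3f}, r^2 = {r * r:.3f}, unexplained = {1 - r * r:.3f}")
```

The unweighted correlation comes out near -0.96, leaving a little under 10% of the variance unexplained, in line with the letter's "maybe 10%"; the weighted version is similarly strong.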
